
Thursday 25 October 2018

Why Node.js®? Several reasons for Node.js's popularity


Node.js®, commonly known as Node, is gaining the attention of developers. Node has proved to be a good option for writing highly scalable network solutions. We recently chose Node.js as our server-side language.

Asynchronous, event-driven JavaScript runtime 


Node is designed to build scalable network applications. Using Node.js, many concurrent connections can be handled. On each connection a callback is fired, but if there is no work to be done, Node sleeps. This is in contrast to the more common concurrency model, where OS threads are employed. Thread-based networking is relatively inefficient and very difficult to use.

Node users are free from worries of dead-locking the process


Users of Node are free from worries of dead-locking the process, since there are no locks. Almost no function in Node directly performs I/O, so the process never blocks.
Because nothing blocks, scalable systems are very reasonable to develop in Node.
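
For illustration (not part of the original post), here is a minimal sketch in TypeScript for Node of this non-blocking style: the I/O call returns immediately and hands its result to a callback later. The file name data.txt is just an assumed example.

 import { readFile } from "fs";

 // Ask for the file contents; readFile returns immediately instead of blocking.
 readFile("data.txt", "utf8", (err, contents) => {
   // This callback fires later, when the read has completed.
   if (err) {
     console.error("read failed:", err.message);
     return;
   }
   console.log("file length:", contents.length);
 });

 console.log("readFile was called, but the script kept running without waiting");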

Similarity with Ruby's EventMachine and Python's Twisted 


Node is similar in design to, and influenced by, systems like Ruby's EventMachine and Python's Twisted. Node takes the event model a bit further: it presents an event loop as a runtime construct instead of as a library. In other systems there is always a blocking call to start the event loop. Typically, behavior is defined through callbacks at the beginning of a script, and at the end a server is started through a blocking call like EventMachine::run().

In Node there is no such start-the-event-loop call. Node simply enters the event loop after executing the input script, and exits the event loop when there are no more callbacks to perform. This behavior is like browser JavaScript - the event loop is hidden from the user.
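
A tiny sketch of that behavior (again TypeScript for Node, not from the original post): nothing starts the loop explicitly, and the process exits on its own once the last scheduled callback has run.

 // The script ends here, but the process stays alive until this callback fires.
 setTimeout(() => {
   console.log("last callback done - Node now exits the event loop");
 }, 1000);

 console.log("end of script - the event loop takes over from here");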


HTTP - the most important aspect of Node.js


HTTP is a first class citizen in Node, designed with streaming and low latency in mind. This makes Node well suited for the foundation of a web library or framework.
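
As a minimal sketch (port 3000 is just an assumed example), Node's built-in http module lets a single event loop serve many connections, one callback per request:

 import { createServer } from "http";

 const server = createServer((req, res) => {
   // Each incoming request is handled by this callback on the event loop.
   res.writeHead(200, { "Content-Type": "text/plain" });
   res.end("hello from node\n");
 });

 server.listen(3000, () => console.log("listening on http://localhost:3000"));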

Note - Just because Node is designed without threads, that doesn't mean you can't take advantage of multiple cores in your environment. Child processes can be spawned using the child_process.fork() API, and are designed to be easy to communicate with. Built upon that same interface is the cluster module, which allows you to share sockets between processes to enable load balancing over your cores. 
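
A minimal sketch of that idea, assuming Node 16+ (cluster.isPrimary; older versions use isMaster) and an example port of 3000:

 import cluster from "cluster";
 import { cpus } from "os";
 import { createServer } from "http";

 if (cluster.isPrimary) {
   // Spawn one worker process per CPU core.
   for (let i = 0; i < cpus().length; i++) {
     cluster.fork();
   }
 } else {
   // All workers listen on the same port; the shared socket load-balances across cores.
   createServer((req, res) => {
     res.end(`handled by worker ${process.pid}\n`);
   }).listen(3000);
 }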

Monday 3 September 2018

Fixing TypeScript warning - Parameter 'param' implicitly has an 'any' type

Use of noImplicitAny and suppressImplicitAnyIndexErrors inside tsconfig.json

TypeScript developers disagree about whether the noImplicitAny flag should be true or false. There is no correct answer and you can change the flag later. But your choice now can make a difference in larger projects, so it merits discussion.

When the noImplicitAny flag is false (the default), and if the compiler cannot infer the variable type based on how it's used, the compiler silently defaults the type to any. That's what is meant by implicit any.

The documentation setup sets the noImplicitAny flag to true. When the noImplicitAny flag is true and the TypeScript compiler cannot infer the type, it still generates the JavaScript files, but it also reports an error. Many seasoned developers prefer this stricter setting because type checking catches more unintentional errors at compile time.

You can set a variable's type to any even when the noImplicitAny flag is true.
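
As a quick sketch (the function names here are just examples), this is the kind of code that triggers the warning under noImplicitAny, along with the two usual fixes:

 // Error under "noImplicitAny": true -> Parameter 'param' implicitly has an 'any' type.
 function logLength(param) {
   console.log(param.length);
 }

 // Fix 1: give the parameter a real type.
 function logLengthTyped(param: string) {
   console.log(param.length);
 }

 // Fix 2: an explicit 'any' is still allowed even when the flag is true.
 function logLengthAny(param: any) {
   console.log(param.length);
 }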

When the noImplicitAny flag is true, you may get implicit index errors as well. Most developers feel that this particular error is more annoying than helpful. You can suppress them with the following additional flag:

"suppressImplicitAnyIndexErrors":true
The documentation setup sets this flag to true as well.
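
For illustration (the object and key names are just examples), this is the kind of index access that produces an implicit-any index error under noImplicitAny, and that the flag above suppresses:

 const config = { host: "localhost", port: 8080 };

 function read(key: string) {
   // Under noImplicitAny this reports: Element implicitly has an 'any' type because
   // expression of type 'string' can't be used to index type '{ host: string; port: number; }'.
   // With suppressImplicitAnyIndexErrors set to true, the error is silenced.
   return config[key];
 }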

Monday 19 February 2018

Spark tutorial - Understanding and exploring Spark core components: RDDs

Spark is a fast, easy-to-use data processing engine. Spark was developed as a replacement for Hadoop MapReduce and runs on top of Hadoop, using the Hadoop Distributed File System (HDFS). Spark is an open source Apache product, available since 2014, and its popularity is growing at a rapid pace. Spark doesn't need an extra skill set; knowledge of core Java and distributed computing is enough.

There is some terminology that Spark developers find difficult to understand. I am going to explain it in my own words.

Spark RDDs 


A Spark RDD is a resilient distributed dataset. RDDs are immutable datasets that can be created from an internal source (i.e. by parallelizing a collection) or an external source (e.g. Spark Streaming, a text file, an input file format, etc.). An RDD's elements can be distributed across the Spark cluster for processing. RDDs can only be created by reading data from stable storage such as a TCP stream or files, or by transformations on existing RDDs.


For example, if as a Spark developer you write a Spark TCP streaming job with an interval of 1 second, Spark continuously receives data and forms an RDD from the data received in that one second.
Look at the image below. The RDD consists of elements E1, E2 ... En, which can be JSON, text, a line, or any other string. This RDD can be transformed into a new RDD, but that is not covered here.
The elements of the RDD can be processed using a foreach loop over its elements, and these elements are distributed across the executors in the Spark cluster.

Spark RDDs


Point (1) - This explains how an RDD is created from the data received on the Spark TCP socket in one second. 


Point (2) - This explains that the elements of the RDD are assigned to executors for processing. 


Below is the code for simple Spark streaming, where a socket is opened to an IP and port and RDDs are formed every 3 seconds. Note where points (1) and (2) lie in the code.

 SparkConf sparkConf = new SparkConf().setMaster("spark://10.1.0.5:8088").setAppName("App name");
 JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, new Duration(3000));
 JavaDStream<String> stream = ssc.socketTextStream("socket IP", 9000, StorageLevels.MEMORY_AND_DISK_SER);

 stream.foreachRDD(new VoidFunction<JavaRDD<String>>() {
  private static final long serialVersionUID = 1L;

  public void call(JavaRDD<String> rdd) throws Exception {

   /** Point (1) **/

   rdd.foreach(new VoidFunction<String>() {
    private static final long serialVersionUID = 1L;

    public void call(String s) throws Exception {

     /** Point (2) **/

     System.out.println(s);
    }
   });
  }
 });

 ssc.start();
 ssc.awaitTermination();

Thursday 8 February 2018

Spark TCP streaming example without Kafka

Spark Streaming is an extension of the core Spark API that enables scalable, high-throughput, fault-tolerant stream processing of live data streams. Data can be ingested from many sources like Kafka, Flume, Kinesis, or TCP sockets. Finally, processed data can be pushed to filesystems, databases, and live dashboards. You can also apply Spark's machine learning and graph processing algorithms to data streams.

Spark Streaming is useful for reading data from a producer and distributing it over multiple machines in cluster or YARN mode.

A few terms related to Spark Streaming -

RDD stands for Resilient Distributed Dataset. An RDD is created from the data that arrives while Spark Streaming executes during a batch interval.

A simple example of TCP socket streaming is given below -



 @SuppressWarnings("resource")
 public static void main(String[] args) {
  
  SparkConf sparkConf = new SparkConf().setMaster("spark-master-url").setAppName("xyz")
    .set("spark.executor.memory", "1g").set("spark.cores.max", "5").set("spark.driver.cores", "2")
    .set("spark.driver.memory", "2g");
 
  JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, new Duration(3000));

  JavaDStream<String> JsonReq1 = ssc.socketTextStream("bindIP", bindport, StorageLevels.MEMORY_AND_DISK_SER);
  JavaDStream<String> JsonReq2 = ssc.socketTextStream("bindIP", bindport, StorageLevels.MEMORY_AND_DISK_SER);
  ArrayList<JavaDStream<String>> streamList = new ArrayList<JavaDStream<String>>();
  streamList.add(JsonReq1);
  JavaDStream<String> UnionStream = ssc.union(JsonReq2, streamList);

  UnionStream.foreachRDD(new VoidFunction<JavaRDD<String>>() {

   private static final long serialVersionUID = 1L;

   public void call(JavaRDD<String> rdd) throws Exception {

   
    rdd.foreach(new VoidFunction<String>() {

     private static final long serialVersionUID = 1L;

     public void call(String s) throws Exception {
      System.out.println(s);
     }

    });
   }
  });

  // count() returns a DStream of per-batch element counts; print() is the output operation that shows them.
  UnionStream.count().print();
  ssc.start();
  ssc.awaitTermination();
 }


Terms like bindIP and bindport should be replaced with your specific Spark bind IP/port. To test this application you can create a basic server socket program that listens for client connections from the Spark executors and writes lines to them; a minimal sketch is given below.
spark-master-url should be the URL of the machine where the Spark master is running. A Spark master URL generally looks like spark://machineip:port
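
For reference, here is a minimal test feeder, sketched in TypeScript for Node (any language that can open a TCP server works). It assumes port 9000 and simply writes a line every second, which socketTextStream turns into elements of the streamed RDDs:

 import { createServer } from "net";

 // Spark's socketTextStream connects to this port; every line written to the
 // socket becomes one element of the streamed RDDs.
 const server = createServer((socket) => {
   const timer = setInterval(() => socket.write(`test message ${Date.now()}\n`), 1000);
   socket.on("close", () => clearInterval(timer));
   socket.on("error", () => clearInterval(timer));
 });

 server.listen(9000, () => console.log("test feeder listening on port 9000"));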
