Posts

Showing posts from August, 2018

What Does Spark Replace Exactly?

Spark and Storm are two related Apache projects that provide abstractions over Hadoop. Hadoop is an Apache project: an open-source, distributed, Java-based programming framework sponsored by the Apache Software Foundation. It supports the storage and processing of large data sets in a distributed computing environment. Spark and Storm make this framework easier to use. Hadoop can take in lots of data and lets you build a scalable system and run analytics on that data. If you build your system on Hadoop in the right way, you can increase its capacity by adding more servers, spinning machines up or down depending on the size of your load. Hadoop is extremely flexible and can be used in many different ways; for the same reason, it is also hard to work with. Hadoop includes the Hadoop Common package, which contains the JAR files and scripts needed to start Hadoop. Every file should
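
To make the analytics point concrete, here is a minimal sketch, in Scala, of the kind of batch job Spark lets you run over data stored in Hadoop's file system. The HDFS paths and the word-count logic are illustrative assumptions, not taken from any particular system.

```scala
import org.apache.spark.sql.SparkSession

// A minimal Spark batch job: count word occurrences in a file stored on HDFS.
// The input and output paths are hypothetical placeholders.
object WordCount {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("WordCount")
      .getOrCreate()

    // Spark reads the file as a distributed collection (RDD) spread across the cluster.
    val lines = spark.sparkContext.textFile("hdfs:///data/input.txt")

    // Split lines into words, pair each word with 1, and sum the counts per word.
    val counts = lines
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // Write the results back to HDFS; the same code handles more data
    // simply by running on a larger cluster.
    counts.saveAsTextFile("hdfs:///data/word-counts")

    spark.stop()
  }
}
```

Packaged as a JAR, a job like this would be handed to the cluster with spark-submit; scaling it out is a matter of adding worker nodes, not changing the job.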

What Is The Difference Between Apache Storm And Apache Spark?

Let us first discuss the similarities between Apache Storm and Apache Spark. Streaming jobs for both run until there is an unrecoverable failure or the user shuts them down, and both are implemented in JVM-based languages: Scala for Spark and Clojure for Storm.

The differences between Apache Spark and Apache Storm, on the basis of definition, are as follows:

1. Apache Spark is an open-source processing engine that provides an interface for programming an entire cluster with implicit fault tolerance and data parallelism. Apache Storm makes it simple to reliably process unbounded streams of data.
2. Spark is a general-purpose batch processing engine. Storm is a task-parallel continuous processing engine.
3. Spark defines its workflow in the style of MapReduce. Storm defines its workflow as a DAG called a topology.
4. Spark applications run in their own server processes. Storm uses Apach
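
The "DAG called a topology" point is easiest to see in code. Below is a rough sketch of a Storm topology written in Scala against what I believe is the Storm 2.x Java API: a spout emitting a continuous stream of words wired to a bolt that prints them. The word source, component names, and local-cluster setup are my own illustrative assumptions, not a definitive example.

```scala
import java.util.{Map => JMap}

import org.apache.storm.{Config, LocalCluster}
import org.apache.storm.spout.SpoutOutputCollector
import org.apache.storm.task.TopologyContext
import org.apache.storm.topology.{BasicOutputCollector, OutputFieldsDeclarer, TopologyBuilder}
import org.apache.storm.topology.base.{BaseBasicBolt, BaseRichSpout}
import org.apache.storm.tuple.{Fields, Tuple, Values}

// A spout: the source of the stream. Here it just emits random words forever,
// a stand-in for a real feed such as a message queue.
class WordSpout extends BaseRichSpout {
  private var collector: SpoutOutputCollector = _
  private val words = Array("spark", "storm", "hadoop")

  // Signature assumes the Storm 2.x API (Map[String, AnyRef] config).
  override def open(conf: JMap[String, AnyRef], context: TopologyContext,
                    collector: SpoutOutputCollector): Unit = {
    this.collector = collector
  }

  override def nextTuple(): Unit = {
    Thread.sleep(100)
    collector.emit(new Values(words(scala.util.Random.nextInt(words.length))))
  }

  override def declareOutputFields(declarer: OutputFieldsDeclarer): Unit =
    declarer.declare(new Fields("word"))
}

// A bolt: one processing step in the DAG. This one just prints each word.
class PrintBolt extends BaseBasicBolt {
  override def execute(tuple: Tuple, collector: BasicOutputCollector): Unit =
    println(tuple.getStringByField("word"))

  override def declareOutputFields(declarer: OutputFieldsDeclarer): Unit = ()
}

object WordTopology {
  def main(args: Array[String]): Unit = {
    // The spout-to-bolt wiring below is the DAG that Storm calls a topology.
    val builder = new TopologyBuilder
    builder.setSpout("words", new WordSpout)
    builder.setBolt("printer", new PrintBolt, 2).shuffleGrouping("words")

    // Runs continuously on an in-process cluster until it is explicitly killed.
    val cluster = new LocalCluster()
    cluster.submitTopology("word-topology", new Config, builder.createTopology())
  }
}
```

Unlike the Spark batch job sketched earlier, this topology keeps processing tuples until it is explicitly killed, which is exactly the "continuous engine" distinction in the comparison above.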