Why Did Google Stop Using MapReduce and Start Encouraging Cloud Dataflow?

First of all, it is important to know that the conventional MapReduce model is still being used
for certain batch computing activities. Nevertheless, there are specific tasks that MapReduce
handles poorly, and this is the main reason Google has largely moved away from MapReduce and
started using Cloud Dataflow. For example, updating the Google web search index involves a huge
amount of data, but it requires incremental updates on a constant basis, and re-running a full
batch job for every small change is wasteful. As per reports, Google built an incremental
computing mechanism known as Percolator to carry out this large-scale data activity.
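
To make the difference concrete, here is a minimal sketch in plain Python. It is not Percolator's or MapReduce's actual API; the function names (map_phase, reduce_phase, batch_index, incremental_update) and the toy word-count "index" are assumptions made up for illustration. It only shows the idea that a batch job re-reads every document, while an incremental update touches just the document that changed.

```python
from collections import Counter
from itertools import chain


def map_phase(doc):
    # Map step: emit a (word, 1) pair for every word in one document.
    return [(word, 1) for word in doc.split()]


def reduce_phase(pairs):
    # Reduce step: sum the counts emitted for each word.
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return counts


def batch_index(docs):
    # Batch style: rebuild the whole index by re-reading every document.
    return reduce_phase(chain.from_iterable(map_phase(d) for d in docs.values()))


def incremental_update(index, old_doc, new_doc):
    # Incremental style: undo the old document's contribution,
    # then apply the new one. No other document is read.
    for word, n in map_phase(old_doc):
        index[word] -= n
        if index[word] == 0:
            del index[word]
    for word, n in map_phase(new_doc):
        index[word] += n
    return index


docs = {"page1": "big data", "page2": "big index"}
index = batch_index(docs)

# One page changes: patch the index instead of rebuilding it.
incremental_update(index, docs["page2"], "fresh index")
docs["page2"] = "fresh index"
```

After the update, the patched index matches what a full rebuild over the new documents would produce, but the work done is proportional to the changed page rather than to the whole collection.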

It is important to note that Google has also come up with a streaming computing mechanism known
as MillWheel for carrying out low-latency computing jobs. Different services such as Google Maps
views now depend on MillWheel. One thing worth noting is that Google's applications are not
created out of nowhere. They rely heavily on distributed systems that provide basic
functionality. Google has distributed storage systems such as the Google File System and its
successor Colossus, along with Bigtable and Chubby, which are used by Gmail. Mesa is a
geo-replicated data warehousing mechanism used in Google's ads business.

Therefore, it can clearly be said that Google, one of the leading internet companies, is not
only about making effective use of MapReduce but is also encouraging the use of Cloud Dataflow.
More papers published on this subject should provide more details on Google's encouragement of
Cloud Dataflow.

The Google Cloud Dataflow service competes with Kinesis, Amazon's streaming data processing
service, and with other big data products such as Hadoop. It stands out because Cloud Dataflow is
built on technology that Google claims replaces the algorithms behind Hadoop.

A closer look at this mechanism will give you the idea that Cloud Dataflow is actually a
better-thought-out tool. This is because Google's users can use it to enrich the applications
they develop, and the data they store, with analytics elements. Therefore, it can fairly be said
that Google's Cloud Dataflow is a MapReduce killer. It could eventually result in the complete
replacement of MapReduce and various other big data processing mechanisms.


Hope this blog post helped you understand why Google stopped using MapReduce and started encouraging Cloud Dataflow. Enroll for the big data masters program with NPN Training and become a successful Hadoop Developer.
