Why did Google Stop Using MapReduce and Start Encouraging Cloud Dataflow?

July 11, 2018

First of all, it is important to know that the conventional MapReduce model is still being used for certain batch computing activities. Nevertheless, there are some specific tasks that cannot be carried out well with MapReduce, and it is for this reason that Google has stopped using MapReduce for them and has started using Cloud Dataflow.

For example, updating Google’s web search index involves a huge amount of data, but it requires incremental updates on a constant basis rather than a full recomputation each time. As per reports, Google has come up with an incremental computing mechanism known as Percolator for carrying out this activity.
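To make the difference between batch and incremental processing concrete, here is a minimal Python sketch. It is only a toy illustration and not how Percolator works internally: the function names `build_index` and `apply_update`, and the tiny word-count "index", are assumptions made up for this example.

```python
from collections import Counter


def build_index(docs):
    """Batch style: rebuild the whole word -> count index from every document."""
    index = Counter()
    for text in docs.values():
        index.update(text.split())
    return index


def apply_update(index, docs, doc_id, new_text):
    """Incremental style: adjust only the counts touched by one changed document."""
    old_text = docs.get(doc_id, "")
    index.subtract(old_text.split())   # remove the old version's words
    index.update(new_text.split())     # add the new version's words
    docs[doc_id] = new_text
    # Drop words whose count fell to zero so the result matches a full rebuild.
    for word in [w for w, c in index.items() if c <= 0]:
        del index[word]
    return index


docs = {"a": "big data", "b": "data flow"}
index = build_index(docs)

# One document changes: update the index in place instead of rebuilding it.
apply_update(index, docs, "b", "cloud data flow")
```

The batch function has to read every document again, while the incremental one only looks at the document that changed. With a handful of documents the difference does not matter, but it is the reason a constantly changing data set is a poor fit for repeated full batch runs.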

Google has also come up with a streaming computing mechanism known as MillWheel for carrying out low-latency computing jobs. Different services such as Google Map views now depend on MillWheel. One thing that is worth noting is that Google’s applications are not created out of nowhere.

The applications rely heavily on distributed systems for their basic functionality. Google has distributed storage systems such as Google File System along with successors like Colossus, Bigtable and Chubby used by Gmail. Mesa is a geo-replicated data warehousing system that is used in Google’s ads business.

Therefore, it can clearly be said that Google, one of the leading internet companies, is not only about making effective use of MapReduce but is also encouraging the use of Cloud Dataflow. Further papers published on this subject should provide more details on Google’s encouragement of the use of Cloud Dataflow.

Google’s Cloud Dataflow service competes with Amazon’s streaming data processing service, Kinesis, and with other big data products such as Hadoop. It stands out because Cloud Dataflow is built on technology that Google claims is replacing the algorithms behind Hadoop.

A closer look at this mechanism suggests that Cloud Dataflow is a better thought-out tool. This is because Google users can use it to enrich the applications they develop, and the data they store, with analytics elements. Therefore, Google’s Cloud Dataflow can fairly be called a MapReduce killer. It could eventually replace MapReduce and various other big data processing mechanisms.

