A new Microsoft data-crunching framework is set to launch on the company's Azure cloud, according to a report from Redmond pundit Mary Jo Foley on ZDNet. Dubbed Cosmos, it’s a potential competitor to Hadoop and, eventually, to Google’s homegrown Dataflow.
Microsoft Cosmos is used extensively within the company to aggregate data from every major service into a shared pool. These services include Azure, Skype, and search engine Bing.
It is similar to MapReduce, the heart of Hadoop, in that it uses a structured query interface. However, Cosmos adds support for directed acyclic graphs (DAGs), a modeling method for connecting different kinds of information. The DAG approach is said to reduce the time and effort involved in complex analysis and potentially improve performance. Based on Foley’s claims, Cosmos may also have a stream-processing component. The closest contemporary competitor offering similar functionality and benefits is Apache Spark.
Spark allows in-memory analytics and is supported in Mesosphere, which treats data centers as one big computer. Spark creators Databricks recently unveiled Spark-as-a-Service.
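To make the DAG idea concrete, here is a minimal Python sketch of a job described as a graph of stages rather than a fixed map-then-reduce pair. It is an illustration only and not Cosmos's or Spark's actual API: the stage names and the `execution_order` and `run` helpers are hypothetical, and the "work" done by each stage is a placeholder. It relies on the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical stage names. Each stage maps to the set of stages it depends on.
# This is an illustrative graph, not a real Cosmos or Spark job definition.
dag = {
    "load_search_logs": set(),
    "load_call_logs": set(),
    "clean_search": {"load_search_logs"},
    "clean_calls": {"load_call_logs"},
    "join_by_user": {"clean_search", "clean_calls"},
    "report": {"join_by_user"},
}


def execution_order(graph):
    """Return one valid order in which every stage follows its dependencies."""
    return list(TopologicalSorter(graph).static_order())


def run(graph):
    """Visit each stage once, in dependency order."""
    results = {}
    for stage in execution_order(graph):
        # Placeholder for real work: record which upstream stages fed this one.
        results[stage] = {"inputs": sorted(graph[stage])}
    return results


order = execution_order(dag)
results = run(dag)
print(order)
```

Because the whole pipeline is declared up front, a scheduler can see that the two loading and cleaning branches are independent of each other and only need to meet at the join, which is the kind of planning freedom the DAG approach is said to offer for multi-step analysis.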
Over 5,000 of the company's engineers, as well as several businesses, use Microsoft Cosmos. Foley suggests it’s ready for wider release.
Prior to Cosmos, Microsoft developed a homegrown alternative to Hadoop’s batch-processing platform until 2011; it was hailed as a potential Hadoop challenger.
Another potential challenger is Google’s Dataflow, a system for analyzing both historical and real-time data that replaced MapReduce. Google called Dataflow the next evolution of Hadoop ecosystem technologies. Dataflow is believed to be undergoing commercialization after internal successes.
Hadoop momentum has been building rapidly over the past several years.
Google continues to support Hadoop financially through its investment arm. Numerous services based on different Hadoop distributions are available on the Google Cloud Platform, including, most recently, official support for Hortonworks.
Microsoft Cosmos might end up as either a competitor or a complement to Hadoop, depending on how the company chooses to move.
Microsoft CEO Satya Nadella outlined the company’s path to deliver a platform for ambient intelligence at a past customer event, stressing a “data culture.”
“The era of ambient intelligence has begun, and we are delivering a platform that allows companies of any size to create a data culture and ensure insights reach every individual in every organization,” Nadella said in April, prior to launching a Data Platform and Internet of Things Service.