By The VAR Guy
Google is open-sourcing more code by proposing to contribute Cloud Dataflow to the Apache Software Foundation. The move, a first for Google, opens new cloud-based data analytics options and integration opportunities for big data companies.
Cloud Dataflow is a platform for processing large amounts of data in the cloud. It features an open-source, Java-based SDK, which makes it easy to integrate with other cloud-centric analytics and big data tools.
Although the Dataflow SDK has been open source for more than a year, Google took the bigger step this week of proposing to turn the platform into an Apache Incubator project. That move paves the way for Dataflow’s codebase to eventually become a full-fledged Apache Software Foundation project.
Google has partnered with Cloudera, data Artisans, Talend, Cask and PayPal in issuing the proposal. Those partners are already celebrating it. If approved, which it certainly should be, the proposal will make it simpler to build Dataflow’s scalability and integration features into commercial big data platforms in an open-source, vendor-neutral way.
Talend, for instance, had this to say: “Developers leveraging the Dataflow framework won’t be ‘locked-in’ with a specific data processing runtime and will be able to leverage new data processing framework as they emerge without having to rewrite their Dataflow pipelines, making it Future-proof.”
For the channel, Google’s proposal means the cloud and big data are set to grow closer together — and that it will be easier for open source big data companies to keep the future of data analytics open.