This article originally appeared at The WHIR
Google this week launched two Big Data services to general availability, part of its pitch to developers and enterprise customers. Google Cloud Dataflow and Cloud Pub/Sub join the company's offerings for businesses with Big Data processing workloads, which Google said in a blog post include financial fraud detection, genomic analysis, inventory management, click-stream analysis, A/B user interaction testing, and cloud-scale ETL.
Cloud Dataflow, launched in beta in April, provides a unified programming model that avoids the complexity of developing separate systems for batch and streaming data sources. In addition to fully managed, fault-tolerant, highly available, and SLA-backed batch and stream processing, Cloud Dataflow provides a model for balancing correctness, latency, and cost with massive-scale, unordered data, Google said. The company also touts its performance versus Hadoop, its extensible SDK, and native Google Cloud Platform integration with other Google services such as Cloud Datastore, BigQuery, and Cloud Pub/Sub.
“Streaming Google Cloud Dataflow perfectly fits requirements of time series analytics platform at Wix.com, in particular, its scalability, low latency data processing and fault-tolerant computing,” said Gregory Bondar, Ph.D., Sr. Director of Data Services Platform at Wix. “Wide range of data collection transformations and grouping operations allow to implement complex stream data processing algorithms.”
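The "unified programming model" Google describes means one transform definition can run over both a bounded batch source and an unbounded stream. The sketch below is a plain-Python analogy of that idea, not the Dataflow SDK itself; the `word_count` transform and both sources are illustrative inventions.

```python
# Illustrative sketch (NOT the Dataflow SDK): one transform definition,
# applied unchanged to a batch source and a stream-like source.

def word_count(records):
    """A single transform, agnostic to whether its source is finite or arriving over time."""
    counts = {}
    for line in records:  # accepts any iterable: an in-memory list or a generator
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts

# Batch source: a finite, in-memory collection.
batch_source = ["click buy click", "view click"]

# Stream-like source: a generator yielding records as they "arrive".
def stream_source():
    for line in ["click buy click", "view click"]:
        yield line

# The same pipeline code handles both.
print(word_count(batch_source))      # {'click': 3, 'buy': 1, 'view': 1}
print(word_count(stream_source()))   # same result from the streaming source
```

In the real service, the same separation holds at much larger scale: the pipeline author writes transforms once, and the runner decides how to execute them against batch or streaming input.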
Cloud Pub/Sub delivers reliable real-time messaging between applications, Google APIs, and third-party services at up to 1 million message operations per second. Google says that in addition to integrating applications and services, Pub/Sub supports real-time Big Data stream analysis by replacing traditionally separate queuing, notification, and logging systems with a single API. The service costs as little as 5 cents per million message operations at sustained usage levels.
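The "single API" claim rests on the publish/subscribe pattern: one publish call fans a message out to every subscriber, so queuing, notification, and logging consumers all hang off the same topic. Here is a minimal in-process sketch of that pattern; the `Topic` class and subscriber names are hypothetical and are not the Cloud Pub/Sub client API.

```python
from collections import defaultdict

# Illustrative sketch (NOT the Cloud Pub/Sub API): the pub/sub pattern that
# lets one publish call feed formerly separate queuing, notification, and
# logging systems.

class Topic:
    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        """Register a consumer; each gets its own copy of every message."""
        self._subscribers.append(callback)

    def publish(self, message):
        """Fan the message out to all registered subscribers."""
        for callback in self._subscribers:
            callback(message)

received = defaultdict(list)
topic = Topic()
topic.subscribe(lambda m: received["queue"].append(m))          # queuing consumer
topic.subscribe(lambda m: received["notifications"].append(m))  # notification consumer
topic.subscribe(lambda m: received["log"].append(m))            # logging consumer

# One publish reaches all three consumers.
topic.publish({"event": "order_created", "id": 42})
```

The managed service adds what this toy version lacks: durable delivery, acknowledgments, and the throughput figures Google cites.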