During Data Week in New York, the O’Reilly Strata Conference and Hadoop World events are taking place to showcase tools and techniques that make data work. The Twitter conversation for these events can be found under the hashtags #strataconf and #hw2012.
MapR Launches Big Data Platform
MapR Technologies announced that it is bringing Hadoop and NoSQL capabilities together on an easy, dependable and fast platform. With MapR M7, big data operations ranging from batch analytics to real-time database functions can be performed with enterprise-grade reliability and protection. M7 uses data structures that minimize read and write amplification, which makes inserts and updates much faster. It also supports in-memory columns and eliminates the need for compactions.
“M7 is taking Hadoop and HBase to the next level,” said Jan Gelin, vice president of Technical Operations, Rubicon Project, a leading real-time advertising platform that was recently named number one in advertising reach by comScore. “The enterprise-grade capabilities of M7 give us a more complete platform and the ability to do new things with data.”
M7 users can create more than a trillion tables. With M7, HBase has more than 20X the number of column families and has increased row and cell sizes to handle large data objects. M7 also expands HBase use cases and applications. “The complexity of deploying and optimizing Apache Hadoop has inhibited organizations from integrating it into their business intelligence ecosystems,” said Manoj Goyal, senior director, Converged Application Solutions Engineering, Enterprise Group, HP. “HP solutions for Hadoop are built to enable rapid deployment, and innovations such as the HBase enhancements in MapR M7 further help customers integrate Hadoop into their data centers.”
EMC Greenplum and Kaggle Join Forces
EMC announced the availability of the EMC Greenplum Chorus open source code and, continuing its goal of enabling organizations to derive greater insight and economic value from big data, announced a collaboration with Kaggle, a platform for data science competitions. EMC and Kaggle will work to address the short supply of data scientists by integrating Greenplum Chorus, the social platform for collaborative data science, with Kaggle’s community of over 55,000 data scientists. Greenplum Chorus users who want to engage the Kaggle community will be able to search, browse and drill into profiles of Kaggle community members who are interested in collaborating, and Kaggle community members can opt in to doing contract work through Chorus.
“Collaboration by individuals, organizations and communities is essential in achieving success with big data analytics,” said Scott Yara, Senior Vice President of Products, Greenplum, a division of EMC. “The OpenChorus Project is part of a wave of big data technologies, strategies, and tools announced by EMC Greenplum, all with one unified mission: to expand big data opportunities that help customers drive greater business insight and economic value from their data than ever before. Success depends on having a collaboration platform and solving the number one problem of the big data era: the supply and demand for data scientists. And today with Kaggle and their community of over 55,000 data scientists, we believe we are forever changing the way data science will be done.”
In addition to Kaggle, a number of EMC Greenplum partners have voiced support for the OpenChorus Project and intend to integrate their tools and solutions with Chorus. Those partners include Actuate, ADVIZOR Solutions, Alpine Data Labs, Gnip, Informatica, Pentaho, Pervasive, SAS, Syncsort, and Tableau Software.
EMC Vice President of Global Marketing CTO Chuck Hollis discusses the ecosystem play with Kaggle, and the true value of big data analytics and data science.