Recognizing the needs of organizations' evolving data management architectures, Oracle has launched Big Data SQL, software for integrating a variety of data sources, including Hadoop, NoSQL and Oracle Database.
One option is a full-stack solution: an engineered system that combines Oracle's Big Data Appliance with Big Data SQL, Cloudera's Hadoop distribution and Oracle's own NoSQL Database. At launch, Oracle Big Data SQL supports only Apache Hive and the Hadoop Distributed File System (HDFS). Other vendors have ported SQL relational databases to run on top of Hadoop.
Single, optimized SQL query for distributed data
The goal of this Big Data management system is to run a single SQL query across diverse data sources, so that organizations can leverage existing skills and maintain enterprise-grade data security and governance for sensitive or regulated information. To help speed data analysis and distribution, Oracle says its unique architecture, together with Smart Scan technology inherited from Oracle Exadata, lets Oracle Big Data SQL query all forms of structured and unstructured data while minimizing data movement.
This approach also extends Oracle Database security capabilities, including an organization's existing security policies, to Hadoop and NoSQL data.
Oracle's Dan McClary said the product has been in development for some time, and that it goes beyond the connectors Oracle already offers for moving data between Hadoop, NoSQL and other platforms. He said Big Data SQL is co-resident with HDFS DataNodes and YARN NodeManagers, and that queries from the new external tables are sent to these services to ensure that reads are direct-path and data-local.
Cloudera founder, chairman and chief strategy officer Mike Olson said running Cloudera's software suite on Oracle's Big Data Appliance was "more cost-effective and quicker to deploy than a DIY cluster."
"When it comes to querying data in Hadoop, we've seen overwhelming demand from customers for SQL," he said. "This is why Cloudera has developed Impala, which Oracle includes on Oracle Big Data Appliance, to enable customers to query data with SQL natively and efficiently in Hadoop."