Looking to make it much simpler to deploy Apache Spark in-memory computing clusters in production environments, Mesosphere today announced a partnership with Typesafe to provide support for an instance of Spark that can be deployed on top of the Mesosphere Data Center Operating System (DCOS) running in the Amazon Web Services (AWS) cloud.
Matt Trifiro, senior vice president of marketing for Mesosphere, says the goal is to enable IT organizations to deploy Spark in a few minutes. The Mesosphere DCOS itself can be downloaded and installed in about the same amount of time, says Trifiro.
“Using a single command, Spark can now be deployed on Docker containers on AWS,” says Trifiro. “We’re also working on an implementation that can be deployed on premise.”
Mesosphere is partnering with Typesafe because Spark is written in Scala, a programming language that runs on the Java Virtual Machine and is commercially backed by Typesafe. Mesosphere has also had its implementation of Spark certified by Databricks, the company founded by the original developers of Spark. For its part, Mesosphere earlier this month began shipping a commercially supported version of its software along with a community edition that runs on AWS.
While interest in Spark as a foundation for processing Big Data analytics applications in the cloud is high, the amount of expertise IT organizations have with frameworks such as Spark is often limited. By making use of AWS and Mesosphere, Trifiro says organizations can at the very least begin developing Spark applications in the cloud and then determine where they might want to deploy them in a production environment later.
Spark is emerging as a faster alternative to the MapReduce programming construct originally developed for Hadoop. Although closely associated with Hadoop, Spark itself does not store any data. Instead, data is processed in memory and then stored back in the Hadoop cluster from which it was originally pulled.
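To make the comparison concrete, here is a minimal plain-Python sketch of the map, shuffle and reduce steps behind the MapReduce construct, using a simple word count. It is an illustration only: it uses neither Spark's nor Hadoop's APIs, and every function name in it is hypothetical.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)


def shuffle_phase(pairs):
    """Group the emitted values by key, between the map and reduce steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's list of values into a single count."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Run the three steps in sequence; all intermediate data stays in memory."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["spark on mesos", "spark on aws"]
    print(word_count(sample))
```

In this sketch the intermediate pairs never leave memory between steps. That is loosely analogous to how Spark avoids writing the results of each step back to disk, which is where much of its speed advantage over Hadoop MapReduce comes from.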
Trifiro notes that Mesosphere and Spark share a common University of California at Berkeley heritage that has generated a number of open source technologies, all aimed at solving challenges associated with deploying and managing applications at scale.
In general, platforms such as Mesosphere are taking advantage of advances in IT automation to greatly simplify not only the provisioning and orchestration of IT infrastructure, but also the deployment of the application frameworks that invoke the application programming interfaces those platforms expose. The end result is a simplification of the overall IT infrastructure environment at a time when application frameworks themselves are becoming more distributed than ever.