Coho, Intel Rev Up Docker Containers with Flash and Hadoop

Reference architecture tackles storage I/O bottlenecks exacerbated by containers

Michael Vizard

March 30, 2015

Docker founder Solomon Hykes appears on stage at TechCrunch Disrupt Europe/London 2014. (Photo by Anthony Harvey/Getty Images for TechCrunch)

Coho Data, a provider of Flash memory arrays, is now working with Intel on a way to redefine how I/O will be managed in the data center in the age of containers.

The two companies have created a reference architecture that allows Docker containers running Cloudera's distribution of Hadoop to run directly on a Coho storage cluster built on Flash memory technologies.

Coho CTO Andy Warfield says the goal is to be able to scale container-based microservices across a storage cluster using the Data Plane Development Kit (DPDK) created by Intel.

“We’ve built it from the ground up for containers,” says Warfield. “We’re essentially using HDFS (Hadoop Distributed File System) as a protocol.”

Warfield says the fundamental problem the two companies are trying to address is that, on the one hand, processors are faster than ever, while on the other, network bandwidth keeps increasing. But he notes there hasn't been a corresponding increase in storage I/O performance to eliminate the bottlenecks that will inevitably occur.

The Flash storage arrays developed by Coho plug directly into a PCIe slot. Coho Data then built what it describes as a DataStream NFS datastore that can be accessed via a single IP address. The company says the DataStream platform enables data to be accessed over a linearly scalable NFS implementation that can grow from 180,000 IOPS across two Coho Data MicroArrays to 1.8 million IOPS across 20 MicroArrays.
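The linear scaling Coho claims implies a fixed per-node contribution of roughly 90,000 IOPS (180K across two MicroArrays). A minimal sketch of that arithmetic, using only the figures quoted above:

```python
# Illustrative sketch of the linear IOPS scaling Coho describes.
# 180K IOPS across two MicroArrays implies ~90K IOPS per array.
IOPS_PER_MICROARRAY = 90_000

def cluster_iops(num_arrays: int) -> int:
    """Aggregate IOPS for a linearly scaling DataStream cluster."""
    return num_arrays * IOPS_PER_MICROARRAY

print(cluster_iops(2))   # 180000, as quoted for two MicroArrays
print(cluster_iops(20))  # 1800000, the quoted 1.8 million across 20
```

The per-array figure and the function name are inferred from the article's two data points, not published Coho specifications.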

But Coho can also define multiple data profiles that use, for example, containers rather than NFS. Each data profile connects directly to the DataStream DirectConnect application programming interfaces (APIs).

In general, more primary storage is moving into Flash memory because it is both faster than magnetic storage and requires far less effort to optimize I/O performance. Magnetic storage utilization rates are often kept low to optimize I/O performance across spinning media. Flash storage may be more expensive than magnetic storage, but reductions in I/O management overhead make it a more appealing alternative for primary storage.

Containers such as Docker further exacerbate I/O challenges by increasing the number of applications trying to access storage resources on a server. Where there might once have been 30 virtual machines on a physical server, it's now not uncommon to find 100 Docker containers. Unless new approaches to managing I/O access are found, the emergence of containers will inevitably lead to I/O bottlenecks that constrain the number of containers that can be deployed per physical server.
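The density jump above can be made concrete with a back-of-the-envelope calculation: a fixed per-server storage budget divided among more tenants leaves each one less headroom. The 100,000-IOPS budget below is an assumed figure for illustration; the workload counts (30 VMs, 100 containers) come from the article.

```python
# Hypothetical illustration: a fixed per-server storage I/O budget
# shared among more workloads. The budget figure is assumed, not sourced.
SERVER_IOPS_BUDGET = 100_000

def iops_per_workload(num_workloads: int) -> float:
    """Even share of the server's storage I/O per running workload."""
    return SERVER_IOPS_BUDGET / num_workloads

print(round(iops_per_workload(30)))   # 3333 IOPS per VM
print(round(iops_per_workload(100)))  # 1000 IOPS per container
```

The point is the ratio, not the absolute numbers: tripling workload density cuts each application's I/O share by the same factor unless the storage layer scales with it.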

Andreessen Horowitz-backed Coho came out of stealth in 2013 with an on-premises storage solution built on commodity hardware with sophisticated software-defined storage capabilities. Also that year, the company raised $25 million.
