As containers from Docker and other vendors grow in popularity, so does the need for enterprise-ready data storage solutions that work well with containers. Here's an overview of the challenges on this front, and how developers are solving them.
You may be wondering why data storage for containers is an issue at all. After all, in our era of scale-out storage, automatic failover and redundant arrays, figuring out ways to store and protect data is not usually difficult.
The Container Data Storage Paradox
Containers, however, pose a paradox when it comes to data storage. One of the chief advantages of using containers is that they are ephemeral: they can be spun up or down quickly, which gives your data center scalability. Yet because containerized apps are not persistent, neither is the data inside them. Anything an app writes to its container's own file system is lost when the container is removed.
In other words, you can't rely on ordinary containers to store data over the long term. You need another solution.
That makes containers different from traditional, VMware-style virtualization. With traditional virtual machines, you can easily create persistent data storage by configuring virtual disks.
A related problem is that, because containers are designed to be isolated from one another, as well as from the host, there is no simple way for an app running inside one container to share data with an app in a different one.
Container Data Solutions
So far, the solutions to this conundrum mostly fall into one of two categories.
The first is to create special containers whose purpose is to store data, rather than run apps. This is essentially the approach that Docker takes with data volume containers, which are built on its data volumes feature.
The advantage of this method is that Docker does a lot of the dirty work required to share data between containers for you. With a few Docker commands, you can create and share data volumes between containers.
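To make "a few Docker commands" concrete, here is a minimal sketch in Python that only assembles and prints the commands; it does not run Docker. The container name datastore, the mount point /data and the busybox image are assumptions chosen for illustration, not anything the approach prescribes. The flags used (-v, --name, --volumes-from) are standard Docker CLI options.

```python
import shlex

# Illustrative names only; none of them is required by Docker.
DATA_CONTAINER = "datastore"
MOUNT_POINT = "/data"
IMAGE = "busybox"


def data_container_cmd(name=DATA_CONTAINER, mount=MOUNT_POINT, image=IMAGE):
    """Build the command that creates (without starting) a container owning a volume."""
    return ["docker", "create", "-v", mount, "--name", name, image, "/bin/true"]


def app_container_cmd(command, source=DATA_CONTAINER, image=IMAGE):
    """Build the command that runs an app with the data container's volumes mounted."""
    return ["docker", "run", "--volumes-from", source, image, *command]


def plan():
    """Return three commands: create the volume, write to it, read it back."""
    greeting = f"{MOUNT_POINT}/greeting"
    return [
        data_container_cmd(),
        app_container_cmd(["sh", "-c", f"echo hello > {greeting}"]),
        app_container_cmd(["cat", greeting]),
    ]


if __name__ == "__main__":
    # Print shell-ready commands; paste them into a terminal to try them.
    for cmd in plan():
        print(shlex.join(cmd))
```

The second and third commands start two separate containers, yet both see the same /data directory because each inherits it from the data container.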
The main disadvantage is that you're still relying on containers to store your data, and those containers do not exist forever. Relatively speaking, data volumes provide more persistent storage than you'd get by storing app data inside the app's own container. But that storage is still not persistent in the full sense of the word.
The second main approach to data storage is to create a networked or cloud file system and allow containerized apps to access it over the network. This is more or less the method promoted by CoreOS, whose rkt container runtime is Docker's main competitor, although Docker containers also run on CoreOS.
Sharing data via the network is nice because it's familiar to anyone who has worked with clusters or the cloud in the last decade. The drawback is that containerized apps themselves need to be written to work with networked data storage. Many are, but it's not a sure bet that a given app image will support data shares delivered via the network instead of the local file system inside the container.
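One simple way an app can be written with this in mind is to avoid hard-coding where its data lives. The sketch below, in Python, reads the data directory from an environment variable so that the same image can be pointed at a network mount or a local path. The variable name APP_DATA_DIR, the default path and the function names are hypothetical, and the demo uses a temporary local directory as a stand-in for a network share.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical variable name and default; neither comes from any real product.
ENV_VAR = "APP_DATA_DIR"
DEFAULT_DIR = "/var/lib/app"


def data_dir() -> Path:
    """Resolve the storage location at call time from the environment."""
    return Path(os.environ.get(ENV_VAR, DEFAULT_DIR))


def save_record(name: str, payload: bytes) -> Path:
    """Write a record under the configured directory, creating it if needed."""
    target_dir = data_dir()
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / name
    target.write_bytes(payload)
    return target


def load_record(name: str) -> bytes:
    """Read a record back from the configured directory."""
    return (data_dir() / name).read_bytes()


def demo() -> bytes:
    """Round-trip one record through a temporary directory standing in for a share."""
    with tempfile.TemporaryDirectory() as share:
        os.environ[ENV_VAR] = share
        save_record("greeting.txt", b"hello")
        return load_record("greeting.txt")


if __name__ == "__main__":
    print(demo())
```

An app built this way does not care whether the directory is backed by a local disk or a mounted network file system, although real network storage adds latency and failure modes that this sketch ignores.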
Future Storage Solutions
If the outline of container data storage above sounds complicated, that's because it is. Data storage in the container world has come far in the last two years. But there is certainly room for more elegant, enterprise-friendly solutions to emerge in this area.
Those solutions will no doubt be key in convincing more enterprises to use containers in production environments. In turn, they'll also provide another way for organizations to cope with the data deluge.