Most of the data your organization is storing probably doesn’t need to be available instantly. IDC estimates that only 10 percent of corporate data is “hot,” only 30 percent is “warm,” and the other 60 percent belongs in cold storage. But a lot of that inactive cold data is often stored in your expensive primary storage tier.
With disk prices falling faster than the cost of data management software, IT shops often cannot take full advantage of cheaper storage tiers, be they tape or cloud, because of the productivity hit (and recovery costs) if the wrong data is archived. At the same time, they struggle with keeping backups up to date, as the growing volume of data makes backup windows longer and longer – sometimes that means backups never complete.
Spectra Logic, a Boulder, Colorado-based enterprise storage company, thinks its new StorCycle storage management software can deliver the cost savings of longer-term storage options without the productivity hit to both users and storage admins. It also simplifies storage down to just two tiers: “primary” and “perpetual.”
“I’ve got all this big primary tier-one storage, I don’t even know what’s on it, and I have very few choices,” is something the company’s product management VP David Feller tells us he often hears from customers. “I either have to impose quotas on people, or I just have to buy more disks,” they tell him. Even if they have the budget, extra capacity only makes the management issue worse. “It’s impossible for a human to sort through a petabyte of data and decide what to delete. We think at least 80 percent of the data that’s on expensive tier-one storage now shouldn’t be there.”
StorCycle scans any primary storage that’s mountable from a Windows server (Linux and OpenID support are on the roadmap) and automatically moves files that haven’t been used recently to slower, cheaper storage, according to user-defined policies.
“Go find the things that nobody's touched in two years that are more than a gigabyte every Saturday, with a moving window, so I just keep the latest two years on primary storage and everything else goes to perpetual,” Feller explains. You can apply metadata, either manually or through software using the StorCycle API, to keep important files on the primary tier or retrieve them for regular events like quarterly finance reviews.
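To make the policy Feller describes concrete, here is a minimal Python sketch of that kind of scan: walk a directory tree and list files larger than a gigabyte that have not been read or modified in two years. It is an illustration only and does not use the StorCycle API; the function name `find_cold_files` and its parameters are hypothetical, and treating the later of a file's access and modification times as its "last touched" time is an assumption.

```python
import os
import time
from pathlib import Path

TWO_YEARS_SECONDS = 2 * 365 * 24 * 60 * 60
ONE_GIGABYTE = 1024 ** 3


def find_cold_files(root, min_size=ONE_GIGABYTE, max_idle=TWO_YEARS_SECONDS, now=None):
    """List files under root that are larger than min_size bytes and
    have been idle for longer than max_idle seconds.

    Hypothetical helper for illustration; not part of any vendor API.
    """
    now = time.time() if now is None else now
    cold = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                # Skip files that vanish or cannot be read during the scan.
                continue
            # Assumption: "touched" means the later of last access and last write.
            last_touched = max(info.st_atime, info.st_mtime)
            if info.st_size > min_size and now - last_touched > max_idle:
                cold.append(path)
    return sorted(cold)
```

Run on a weekly schedule, a scan like this yields the moving window in the quote: each pass evaluates idleness against the current time, so only the most recent two years of active data stays on the primary tier.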
It can also do project-based migration, moving for example raw video footage or scientific data to the perpetual tier and leaving post-processed files on the primary one.
Depending on what storage medium makes up the perpetual tier, files can be replaced either by symbolic links that retrieve them on demand or by HTML files with the same name that explain where the file is (including a button a user can click to retrieve it).
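The two placeholder styles described above can be sketched in a few lines of Python. This is a hedged illustration of the general technique, not StorCycle's implementation: the function name `archive_with_stub` is hypothetical, the "perpetual tier" is assumed to be just another directory, the HTML placeholder is assumed to take the original file name plus an `.html` suffix, and the retrieve button is omitted.

```python
import html
import shutil
from pathlib import Path


def archive_with_stub(path, archive_dir, use_symlink):
    """Move a file to archive_dir and leave a placeholder behind.

    Hypothetical helper for illustration; returns the placeholder's path.
    """
    source = Path(path)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = (archive_dir / source.name).resolve()
    shutil.move(str(source), str(target))

    if use_symlink:
        # The link keeps the original path usable; opening it reads the archived copy.
        source.symlink_to(target)
        return source

    # Assumption: the placeholder keeps the original name and adds ".html".
    stub = source.with_name(source.name + ".html")
    stub.write_text(
        "<html><body><p>{name} was archived to {where}.</p></body></html>".format(
            name=html.escape(source.name),
            where=html.escape(str(target)),
        ),
        encoding="utf-8",
    )
    return stub
```

A symbolic link suits archive media that can be read in place, such as a slower disk tier. A placeholder page fits media like tape, where retrieval takes long enough that the user should be told where the file went and asked to request it.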
Flexible Perpetual Storage
Software options for storage tiering have existed for some time, but their costs have not gone down along with the cost of physical storage media, according to Feller.
StorCycle is offering an “affordable” option, ESG principal analyst Mark Peters tells us. It’s also easy to use and comes from a trusted supplier, he says. “It offers users a way to continue to keep, use and protect data without spending lots of money storing (relatively) inactive data and projects on expensive tiers of high-performance flash and disk.”
Peters also pointed out StorCycle’s strong data protection capabilities, characterizing them as “a valuable addition to a necessary trend.”
Other storage tiering offerings, such as Druva, claim to have more intelligence by way of machine learning-driven data classification, but they tie you to AWS for storage. StorCycle works with public cloud (AWS initially, with Azure, Google Cloud, and other public clouds on the roadmap, according to Feller) and also with your existing network-attached storage, as well as object storage on disk and tape.
The two-tier notion might seem at odds with the proliferation of new media types, from Intel Optane to research into using crystals or DNA for the longest-term storage. The idea, though, is to group multiple media types into either instant-access, high-IOPS storage or long-term storage of whatever kind, abstracted behind a RESTful interface that makes it easy to integrate new types of persistent storage as they come on the market.
“The perpetual tier can’t be proprietary, you can’t be locked into a vendor,” Feller says. “It has to be accessible, so you can get to your data even if Spectra Logic isn’t in business in 20 years. Once data is in the perpetual tier, there's a large focus on protection, multiple copies, verification, migration to new technologies. If DNA storage actually happens, no problem, you push a button and migrate to that technology.”
Ease of using tiers is more important than the number of tiers, Peters says. “Consider driving a car – whether it has three gears or nine, or even CVT – it is the addition of automatic that makes things easy. Fewer tiers but still with a ‘clutch’ isn’t easy… and isn’t more efficient.”