We can all be a little hesitant to delete old photos or other valuable assets off our devices. Who knows when you might need to access that information again, right?
Businesses are the same way. As organizations across all verticals become more data-intensive (the volume of data created globally continues to grow around 23% annually), holding onto this data for the long run is becoming more of a challenge.
The solution is cold storage. By taking a closer look into an organization’s data needs, IT leaders can leverage this approach to ensure that valuable data is around – and accessible – whenever they may need it.
The Rise of Cold Storage
So what exactly is cold storage? Think of it as a use case, one that entails retaining data that is not actively in use. The data is stored in archives, or “cold” storage, where it is infrequently accessed. It still needs to be readily accessible, but I/O performance is not a priority. This is in contrast to “hot” storage, which often needs high-performance characteristics for processing, and to “frozen” storage, where data is likely never needed and access within hours or days is acceptable.
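To make the three temperatures concrete, here is a minimal sketch of how a dataset might be sorted into one of them. The thresholds, the field names and the `classify` function are all hypothetical and exist only for illustration; a real policy would be tuned against actual access logs and service-level requirements.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
HOT_MAX_IDLE_DAYS = 30            # touched within the last month counts as hot
FROZEN_MIN_RETRIEVAL_HOURS = 1.0  # owners who can wait an hour or more tolerate frozen


@dataclass
class Dataset:
    name: str
    days_since_last_access: int
    acceptable_retrieval_hours: float  # how long users can wait for the data


def classify(ds: Dataset) -> str:
    """Return 'hot', 'cold' or 'frozen' for a dataset."""
    # Recently used data is hot and needs high I/O performance.
    if ds.days_since_last_access <= HOT_MAX_IDLE_DAYS:
        return "hot"
    # Idle data whose owners can wait hours or days is frozen.
    if ds.acceptable_retrieval_hours >= FROZEN_MIN_RETRIEVAL_HOURS:
        return "frozen"
    # Idle data that must still be readily accessible is cold.
    return "cold"
```

The point of the sketch is that cold data is defined by two properties at once: it is idle, yet it still cannot wait long to be retrieved.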
For a growing number of organizations, there’s good reason to explore cold storage. As more data is generated, thanks in large part to rapid technology innovation and digital transformation, there is more reason to keep it, even if it’s not in use at the time.
For the surveillance and security industry, for example, you can’t take the same video twice. Once something happens, it won’t occur exactly the same way again, and you never know when you’ll need to go back and check a recording, so you retain it indefinitely. For the automotive industry, when you test drive an autonomous vehicle and have an abundance of data to work with, you don’t want to have to go back and perform the same test again, right? Instead you retain the data so it can be leveraged as autonomous algorithms continuously evolve. And for a TV network broadcasting a pro football game, you never know when you’ll have to pull footage of a player’s college football days, so you keep the video, no matter how old it may be.
Regardless of the industry or the specific scenario, there could always be a reason to go back and access older data. Some companies might even know when that moment will be; they just don’t need the data right now. Machine learning is one example. Huge catalogs of training data can be very time-consuming to create. Once a dataset has been accumulated, it may be accessed infrequently, but it is worth retaining for retraining or simply to avoid the time it would take to recreate such a large dataset. It could also be valuable to sell to others for training purposes.
Companies that need to hold onto this data for the long run face a key question: can they afford to put that data on their most expensive storage infrastructure and treat it like it is being used right now, or would they rather put it in a place that is cost-effective and revisit it later? That is the benefit of cold storage: infrequently accessed data can be stored at a lower cost than live, “hot” production data such as financial transactions, which needs to be accessed immediately and repeatedly. With this option, organizations can more easily avoid the tug-of-war that frequently occurs in data storage: expanding storage resources at a higher cost to keep data accessible versus making the difficult decision to delete information that may prove invaluable. Now, it doesn’t have to be either-or.
For these reasons, cold storage is taking off, and will continue to accelerate. In fact, according to industry analysts, today at least 60% of all digital data can be classified as archival, and it could reach 80% or more by 2025. As such, cold storage is becoming one of the fastest-growing segments in the industry, and cloud providers are reinventing their architectures with accessible archives to keep pace and ensure effective management of the data that resides in cold storage.
The Key Steps
So an organization wants to go on a path toward cold storage. Now what?
Cold storage is best understood as a use case, so organizations have to explore specific technologies and solutions to make that use case a reality. They should keep a few things in mind.
First, treat cold data like primary data. It must be online and easily searchable and accessible. Storing data is one thing, but being able to access it efficiently when you need it can be a whole other challenge, especially as these datasets only continue to grow.
Think of that TV network that needs to pull up an old replay of a sporting event that happened years ago. It needs an infrastructure in place that allows it to pinpoint where this data resides. Just because the data is older doesn’t mean it’s not primary data. It still needs to be treated as such, especially in terms of accessibility.
Failing to treat archival data as primary data is also a common misstep when it comes to the next important consideration: data protection.
Even though this data is being stored on secondary storage, it still needs to be protected like primary data, because that is ultimately what it is. Whether it’s ensuring data isn’t lost in a site disaster or safeguarding it against threats like ransomware, cold storage can only be bulletproof if the data is treated as if it were being used right now.
Another key area to keep in mind is cost. While cold storage provides a more cost-effective solution to keeping data on hand for the long term, there are still different considerations that must be addressed to ensure the best efficiency.
For example, people in the industry often believe that cloud storage is going to be the most cost-effective cold-storage repository. For some organizations and situations, however, it might not be, and as you keep adding data to the cloud, your monthly bill can keep growing with it. While the cloud certainly has incredible benefits, especially around elasticity and protection, it may not be the panacea for all situations or organizations.
On the other hand, IT leaders might assume that because tape is the lowest-cost storage technology, their organization must deploy tape for all of its cold storage needs. In many situations, however, this can drastically overcomplicate these efforts. Organizations must do a thorough cost-of-ownership analysis and dig into the service level agreements of the technologies being leveraged because, for some, investing in a number of storage tiers might not save that much and could add unnecessary complexity. Simply put, there is no one-size-fits-all solution.
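A cost-of-ownership comparison of this kind can be roughed out in a few lines. The sketch below is only a starting point under stated assumptions: the `StorageOption` fields, the `total_cost` function and the simple cost model (an upfront cost, a per-terabyte monthly charge and a per-terabyte retrieval charge, with capacity growing once a year) are all hypothetical. Real analyses would also account for power, staffing, media refresh, egress and SLA penalties.

```python
from dataclasses import dataclass


@dataclass
class StorageOption:
    """A hypothetical, simplified cost model for one storage choice."""
    name: str
    upfront: float           # one-time hardware or setup cost
    per_tb_month: float      # recurring cost per terabyte per month
    retrieval_per_tb: float  # cost to read one terabyte back


def total_cost(option: StorageOption, initial_tb: float, annual_growth: float,
               years: int, retrieved_tb_per_year: float) -> float:
    """Estimate the cost of keeping a growing archive for a number of years."""
    total = option.upfront
    capacity = initial_tb
    for _ in range(years):
        # Pay for a full year of storage at the current capacity.
        total += capacity * option.per_tb_month * 12
        # Pay for whatever is read back during the year.
        total += retrieved_tb_per_year * option.retrieval_per_tb
        # The archive grows before the next year begins.
        capacity *= 1 + annual_growth
    return total


if __name__ == "__main__":
    # Placeholder figures only; substitute real quotes before drawing conclusions.
    options = [
        StorageOption("option_a", upfront=0.0, per_tb_month=4.0, retrieval_per_tb=20.0),
        StorageOption("option_b", upfront=50_000.0, per_tb_month=1.0, retrieval_per_tb=0.0),
    ]
    for opt in options:
        # 0.23 mirrors the global growth rate cited earlier, used as a stand-in.
        print(opt.name, round(total_cost(opt, 500.0, 0.23, 5, 50.0), 2))
```

Running the comparison with an organization’s own prices, growth rate and retrieval habits shows how quickly the ranking can flip, which is why no single technology wins in every case.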
The Future is Cold
Given the growth and importance of this segment, substantial investments are being made toward new technologies and innovations in cloud storage, HDD-based solutions, and tape that will address the needs of the market going forward.
Regardless of the path taken, one thing is clear. With use cases such as AI, machine learning, autonomous driving, smart video, IoT and smart city applications now maturing, the world produces more data than it ever has, and as a result, cold storage is only going to become more critical over time.
Scott Hamilton is senior director, product management and marketing at Western Digital