Mohit Aron is CEO and Founder of Cohesity.
As companies’ data grows exponentially, and different ways to use data continue to multiply, businesses are waking up to the urgent need to consolidate their storage. A hyperconverged approach, which is being used successfully for consolidating primary mission-critical infrastructure, provides a compelling answer to the chaotic state of secondary data storage.
While the issue of data sprawl among non-primary data use cases has been obvious to most storage administrators and CIOs, the solution has often been murky. That’s because most vendors have approached secondary storage use cases as a set of separate problems with separate answers, leading to the current fragmented landscape of point solutions. To recognize the value of hyperconvergence for secondary storage, companies and vendors must take a holistic view of these workloads and apply the same principles of hyperconvergence to deliver the same benefits that have been so effective with primary storage.
What Is Secondary Storage and Why Is It Such a Headache?
The first step toward consolidation is defining secondary storage workloads. The idea of secondary storage data as a distinct category is relatively new, but the simplest definition is that it includes all data that isn’t directly used by mission-critical (or “primary”) business applications. Common secondary use cases include backup, file shares, development and testing, object stores and analytics. Unlike mission-critical data, which typically requires the highest-performing and most expensive on-premises architecture (often all-flash arrays), secondary data has requirements that vary significantly in response time, cost per terabyte, retention and many other factors. Secondary data therefore needs to draw on a broader range of storage infrastructure, from SSD and HDD to cloud storage and tape.
The varying requirements for different secondary workloads are one of the main reasons vendors have approached the problem with separate point solutions. However, this has led to enormous data sprawl across companies that must juggle dozens of different data management systems. That creates a lot of extra work and headaches for IT departments managing these different architectures (especially as they grow in number and size). It also means that companies waste money on storage resources because data is needlessly duplicated across different architectures, and admins have difficulty understanding what data is being stored where and how it overlaps.
The Principles of Hyperconvergence and How They Deliver Value for Users
Hyperconvergence is commonly defined as being able to bring together compute, storage and networking on a single system, but the details – and the principle behind it – require more explanation. The term came about to describe a radical new approach I pioneered at Nutanix to tightly integrate compute with storage into scalable infrastructure building blocks. The concept of hyperconvergence garnered greater attention as other vendors emerged with solutions for primary storage consolidation, and legacy providers scrambled to offer solutions of their own.
There are three principles that define hyperconvergence, and each is closely connected to the value it delivers across the data center. First, a hyperconverged system must be able to run any data center workload on any portion of its infrastructure. This translates to better performance because compute and storage are tightly coupled and workloads are not held up by network bottlenecks. This also delivers greater data storage efficiency because companies don’t have to provision separate resources (each with separate buffer space) for each individual workload.
The second core characteristic of a hyperconverged architecture is that it is fully software-defined. Software-defined architectures separate the control plane from the underlying compute and data plane. This approach allows users to manage data through automated policies rather than manual adjustments to the underlying infrastructure, simplifying system administration for IT personnel.
Finally, true hyperconverged architecture consolidates network, compute and storage into scale-out blocks that can be extended indefinitely (and removed individually without disrupting the data center). This makes it simpler and more transparent for companies to grow or shrink their data footprint, adding units only as they need them rather than deciding whether to build major new data centers that might not be used to full capacity for months or even years. This characteristic also enables application provisioning through a single interface, eliminating time wasted on performance tuning across siloed systems.
How Hyperconvergence Can Dramatically Improve Secondary Storage Infrastructure
The key principles that make hyperconvergence so valuable for primary or mission-critical data can also be applied to secondary data to deliver similar benefits. Hyperconverged secondary storage also offers a few added bonuses that will become more important as the industry moves towards a hybrid-cloud future.
First, a single software-defined control plane covering the entire array of secondary storage workloads delivers enormous efficiencies. The use cases for secondary storage are far more diverse than those for primary storage, which means that consolidating secondary solutions unlocks even greater value. It dramatically reduces the work admins devote to administering each secondary storage point solution separately (for tasks like disaster recovery, file shares and development), making data management much simpler. A single control plane also provides much clearer insight into data that was previously stored across different systems, allowing for more intelligent resource allocation.
Hyperconverged secondary storage also eliminates redundant data copies across the organization by consolidating all workloads on a single architecture, thereby maximizing storage resources. Different non-critical use cases, like disaster recovery and analytics, have traditionally been spread across separate, siloed architectures, each of which requires its own copy of the same data. By taking a hyperconverged approach, the same data stored for disaster recovery can also be used for analytics or any other secondary application. Secondary use cases typically account for about 80 percent of most organizations’ data, so the benefits a typical enterprise can realize through consolidation are substantial.
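To make the copy-elimination argument concrete, here is a minimal Python sketch of a store that keeps identical data once, no matter how many secondary workloads reference it. It is an illustration under stated assumptions, not a description of any vendor's product: the `ConsolidatedStore` class, the workload names and the sample data are all hypothetical, and whole-object SHA-256 hashing stands in for whatever deduplication technique a real platform would use.

```python
import hashlib


class ConsolidatedStore:
    """Toy content-addressed store (hypothetical): identical data is kept once."""

    def __init__(self):
        self.blobs = {}  # content digest -> stored bytes
        self.refs = {}   # (workload, name) -> content digest

    def put(self, workload, name, data):
        # Identical content hashes to the same digest, so it is stored only once.
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)
        self.refs[(workload, name)] = digest
        return digest

    def get(self, workload, name):
        return self.blobs[self.refs[(workload, name)]]

    def logical_bytes(self):
        # What siloed systems would hold: one full copy per workload reference.
        return sum(len(self.blobs[d]) for d in self.refs.values())

    def physical_bytes(self):
        # What the consolidated store actually holds.
        return sum(len(b) for b in self.blobs.values())


store = ConsolidatedStore()
snapshot = b"customer-orders-2016" * 1000  # made-up sample data

# Three secondary workloads (names are illustrative) use the same snapshot.
for workload in ("disaster_recovery", "analytics", "dev_test"):
    store.put(workload, "orders.db", snapshot)

print(store.logical_bytes(), store.physical_bytes())
```

In this toy case the three workloads share one physical copy, so the store holds a third of what three separate silos would. Real savings depend on how much data actually overlaps across workloads.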
Finally, hyperconverged secondary storage architecture provides a foundation for the hybrid cloud model that most companies are looking towards for the future. Keeping secondary data spread across a collection of point solutions makes it much more complicated to move data between on-premises and cloud infrastructure. This forces companies to choose between storing different data sets on-premises or in the cloud and makes a dynamic combination of both infrastructures practically impossible. However, the single, software-defined architecture of hyperconverged secondary storage enables automatic, policy-based movement of data between cloud and on-premises infrastructure.
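As a rough illustration of what policy-based movement could look like, the Python sketch below applies one age threshold per workload and reports which data sets should move in either direction. Everything in it is an assumption made for the example: the `Policy` and `DataSet` types, the `apply_policies` function, the age-based rule and the sample values are hypothetical, and a real platform would weigh many more factors such as cost, access patterns and compliance.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    workload: str
    max_age_days_on_prem: int  # hypothetical rule: older data belongs in the cloud


@dataclass
class DataSet:
    name: str
    workload: str
    age_days: int
    location: str = "on_prem"


def apply_policies(datasets, policies):
    """Move each data set to the location its workload's policy calls for.

    Returns a list of (name, from_location, to_location) moves.
    """
    by_workload = {p.workload: p for p in policies}
    moves = []
    for ds in datasets:
        policy = by_workload.get(ds.workload)
        if policy is None:
            continue  # no policy for this workload: leave the data where it is
        target = "cloud" if ds.age_days > policy.max_age_days_on_prem else "on_prem"
        if ds.location != target:
            moves.append((ds.name, ds.location, target))
            ds.location = target
    return moves


policies = [Policy("backup", 30), Policy("analytics", 90)]
datasets = [
    DataSet("nightly-backup", "backup", 45),
    DataSet("clickstream", "analytics", 10),
    DataSet("old-archive", "backup", 400, location="cloud"),
    DataSet("restore-test", "backup", 5, location="cloud"),
]

moves = apply_policies(datasets, policies)
for name, src, dst in moves:
    print(f"{name}: {src} -> {dst}")
```

The point of the sketch is that administrators state the rule once per workload and the system works out the moves, rather than someone relocating each data set by hand across separate point solutions.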
The seamless movement of data across cloud and on-premises infrastructure is crucial for companies to be able to take advantage of the flexibility and cost-effectiveness of cloud infrastructure when it’s appropriate, and the performance and accessibility of on-premises infrastructure when that’s required. It simply doesn’t make sense for most businesses to move their entire storage infrastructure to the cloud (even those that tried it, like Dropbox, are moving back to a hybrid model). However, organizations cannot afford to ignore the benefits of cloud storage that hyperconvergence will unlock.
The move to consolidate secondary storage solutions and workloads is inevitable given the way that data – and how it’s used – continues to expand in volume and complexity. Applying the principles of hyperconvergence, companies can counter the problem of data sprawl that has become a major focus for many IT organizations. In fact, the benefits of consolidating secondary storage on a hyperconverged platform extend beyond the obvious resource and management efficiencies. A unified platform empowers companies to create a seamless connection between cloud and on-premises infrastructure that will be even more important as we move toward a hybrid cloud future. The question is not whether enterprises will decide to consolidate secondary storage workloads but how and when they will do it.
Opinions expressed in the article above do not necessarily reflect the opinions of Data Center Knowledge and Penton.