Why is Data Storage Such an Exciting Space?
Srivibhavan (Vibhav) Balaram is the Founder and CEO of CloudByte Inc. He is a general manager with more than 25 years of industry experience, including five years in the United States with companies such as Hewlett-Packard, IBM and AT&T Bell Labs.
For a while, the storage industry appeared to be fairly stable (read: little technology innovation), with consolidation around a few large players. Several smaller companies were bought out by larger players – 3PAR by HP, Isilon by EMC, Compellent by Dell. However, in the last year, we’ve seen renewed action in the space, with promising new start-ups dedicated to solving storage problems in new-age data centers. So, what exactly is the problem with legacy storage solutions in new-age data centers?
Evolution of Storage Technology
For better perspective, let’s start with a quick recap of data storage technology evolution. In the late 1990s and early 2000s, storage was first separated from the server to remove bottlenecks on data scalability and throughput. NAS (Network Attached Storage) and SAN (Storage Area Network) architectures came into existence, Fibre Channel (FC) protocols were developed and large-scale deployments followed. With a dedicated external controller (SAN) and a dedicated network (based on FC protocols), the new storage solutions provided data scalability, high availability, higher throughput for applications and centralized storage management.
Server Virtualization and the Inadequacy of Legacy Solutions
Legacy SAN/NAS-based storage solutions scaled well and proved adequate until the advent of server virtualization. With server virtualization, the number of applications grew rapidly, and external storage was now being shared among multiple applications to manage costs. Here, the monolithic controller architecture of legacy solutions proved a misfit, as it resulted in noisy-neighbor issues within shared storage. For example, if a back-up operation was initiated for a particular application, other applications received lower storage access and eventually timed out. Further, storage could no longer be tuned for a particular workload, as applications with disparate workloads shared the storage platform.
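The back-up example above can be sketched as a toy simulation. This is a minimal illustration with entirely hypothetical numbers, not a model of any real controller: a monolithic controller drains one shared FIFO queue, so a burst from a backup job delays every other application's I/O behind it.

```python
# A toy sketch of the "noisy neighbor" effect on a monolithic controller.
# All numbers are hypothetical, chosen only to make the contrast visible.
from collections import deque

CONTROLLER_IOPS = 1000  # total I/Os the shared controller can serve per second


def time_to_serve(queue, app):
    """Seconds until `app`'s first request is served, given strict FIFO order."""
    ahead = 0
    for req_app in queue:
        if req_app == app:
            break
        ahead += 1
    return ahead / CONTROLLER_IOPS


# Normal load: two applications interleave a handful of requests each.
queue = deque()
for _ in range(5):
    queue.extend(["oltp", "web"])
print(f"oltp wait under normal load: {time_to_serve(queue, 'oltp'):.3f}s")

# A backup job floods the same shared queue with 10,000 requests;
# the OLTP application now waits behind all of them and may time out.
queue = deque(["backup"] * 10_000)
queue.extend(["oltp", "web"])
print(f"oltp wait behind a backup burst: {time_to_serve(queue, 'oltp'):.3f}s")
```

With the queue shared, the OLTP request that was served almost instantly under normal load now waits ten seconds behind the backup burst, which is exactly the timeout scenario described above.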
Rising Costs and Nightmarish Management
Legacy vendors attacked the above issues through several workarounds – including faster controller CPUs and recommending additional memory with fancy acronyms. Though these workarounds helped to an extent, the brute-force way to guarantee storage quality of service (QoS) was to either ridiculously over-provision storage controllers (with utilization below 30-40 percent) or dedicate physical storage to performance-sensitive applications. Obviously, these negated the very purpose of sharing storage and containing storage costs in virtualized environments. Subsequently, storage costs relative to overall data center costs increased dramatically. With their hardware-based business models, legacy vendors saw no reason to change this situation. With dedicated storage for different workloads, data centers accumulated several storage islands that were chronically under-utilized. Soon, “LUN” management became a hot new skill – and a nightmare for storage administrators.
The New-Age Storage Solutions
With the advent of the cloud, today’s data centers typically have hundreds of VMs which require guaranteed storage access, performance and QoS. Given legacy solutions’ inability to scale in these virtualized environments, it was inevitable that a new breed of storage start-ups would crop up. Many of these start-ups chose to simplify the “nightmarish” management either by providing tools to observe and manage “hot LUNs” (a term denoting LUNs that serve demanding VMs) or by providing granular storage analytics on a per-VM basis. However, the management approach does not really cure the “noisy neighbor” issue; it merely leaves many of its symptoms easier to observe.
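The observation-based approach can be sketched in a few lines. This is a hypothetical illustration of the idea, not any vendor's actual tool: per-VM IOPS samples are rolled up to their backing LUN, and any LUN above a threshold is flagged as "hot." Note that the sketch also shows the approach's limit: it identifies where contention is happening, not how to prevent it.

```python
# A hypothetical sketch of "hot LUN" detection from per-VM analytics.
# The threshold and sample format are illustrative assumptions.
from collections import defaultdict

HOT_IOPS_THRESHOLD = 5000  # assumed cutoff for flagging a LUN as "hot"


def hot_luns(samples, threshold=HOT_IOPS_THRESHOLD):
    """samples: iterable of (vm_name, lun_id, iops) tuples.

    Returns the sorted ids of LUNs whose aggregate IOPS exceed the threshold.
    """
    per_lun = defaultdict(int)
    for _vm, lun, iops in samples:
        per_lun[lun] += iops
    return sorted(lun for lun, total in per_lun.items() if total > threshold)


samples = [
    ("vm-web-1", "lun-a", 1200),
    ("vm-db-1",  "lun-a", 4500),  # together these push lun-a past the threshold
    ("vm-ci-1",  "lun-b", 800),
]
print(hot_luns(samples))  # flags the LUN, but does nothing about the noisy VM on it
```

Tools like this make the symptom visible to an administrator, but resolving it still requires manual migration or dedicated storage, which is precisely the management burden described earlier.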
Multi-tenant Storage Controllers
There is a desperate need for solutions which attack the noisy-neighbor problem at its root cause, i.e., by making storage controllers truly multi-tenant. These controllers should be able to isolate and dedicate storage resources to every application based on its performance demands. Here, storage endpoints (LUNs) will be defined in terms of both capacity and performance (IOPS, throughput and latency). Such multi-tenant controllers will then be able to guarantee storage QoS for every application right from a shared storage platform.
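To make the idea concrete, here is a minimal sketch, with hypothetical class names and numbers, of what defining a LUN by performance as well as capacity could look like. The controller admits a new LUN only if its guaranteed IOPS still fit within the platform's budget, which is one simple way such a guarantee could be enforced; real multi-tenant controllers would need far more, such as runtime scheduling and throttling.

```python
# A minimal sketch (hypothetical names and numbers) of QoS-defined LUNs
# and admission control on a shared, multi-tenant storage platform.
from dataclasses import dataclass, field


@dataclass
class QosLun:
    """A storage endpoint defined by capacity AND performance guarantees."""
    name: str
    capacity_gb: int
    iops: int              # guaranteed I/Os per second
    throughput_mbps: int   # guaranteed bandwidth
    max_latency_ms: float  # latency ceiling


@dataclass
class MultiTenantController:
    total_iops: int                 # performance budget of the shared platform
    luns: list = field(default_factory=list)

    def provision(self, lun: QosLun) -> bool:
        """Admit the LUN only if its guarantee still fits the remaining budget."""
        committed = sum(l.iops for l in self.luns)
        if committed + lun.iops > self.total_iops:
            return False  # would over-commit; the guarantee could not be honored
        self.luns.append(lun)
        return True


ctrl = MultiTenantController(total_iops=50_000)
print(ctrl.provision(QosLun("oltp-db", 500, 30_000, 400, 2.0)))   # fits the budget
print(ctrl.provision(QosLun("backup", 4000, 25_000, 800, 50.0)))  # rejected: over-commit
```

The key contrast with the legacy model is that performance, not just capacity, is a first-class, accounted-for resource, so a backup workload cannot silently consume the headroom promised to a latency-sensitive application.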
Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena. See our guidelines and submission process for information on participating. View previously published Industry Perspectives in our Knowledge Library.
The short answer is that storage is interesting now because of pent-up “technical debt” from the years in which it was largely ignored. Many of the people trying to innovate in this space now should have been doing so years ago. The need was there, as any user could have told you, but most people were off chasing The New Shiny in applications or networking. Welcome to the party, folks. Now get to work.