Delivering Effective Quality of Service

Brandon Salmon works in the Office of the CTO for Tintri.

When most folks in the data center think about Quality of Service (QoS), networking is what comes to mind. QoS typically refers to the capability of a network to provide better service to selected network traffic. But the term is increasingly being adopted in storage, where you can think of it as QoS for IOPS (input/output operations per second), a measure of performance.

Why does QoS for IOPS matter? Businesses depend on applications, and applications depend on storage; they need sufficient performance from storage in order to operate smoothly. As a result, storage admins need to understand how QoS can help them guarantee application performance and even differentiate the services they offer to (internal and external) customers.

LUNs and the Noisy Neighbor Problem

You’ve probably heard of noisy neighbors—in fact you’ve probably dealt with them yourself. Why do noisy neighbors exist? LUNs.

Conventional storage is built on LUNs, which made sense as a storage management unit when there were a few physical workloads. But today, more than 80% of workloads have been virtualized, and so organizations are instead stuffing those same LUNs with tens or hundreds of virtual machines (VMs). Within a LUN, if a single VM goes rogue and starts demanding more than its share of performance, it can drag down the performance of the other VMs in that same LUN. It’s a noisy neighbor. Even worse, you’ll only see that the LUN is behaving badly; you won’t know which resident VM is the real troublemaker.
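To make the noisy neighbor effect concrete, here is a minimal Python sketch. It assumes a deliberately simplified model in which a LUN has a fixed IOPS budget and, under contention, splits it among VMs in proportion to demand. The function name, VM names, and numbers are all hypothetical and do not describe any particular array.

```python
def share_lun(capacity_iops, demands):
    """Split a LUN's IOPS budget among VMs with no per-VM controls.

    Simplifying assumption: when total demand exceeds capacity, each VM
    receives a share proportional to what it asked for.
    """
    total = sum(demands.values())
    if total <= capacity_iops:
        return dict(demands)  # no contention: everyone gets what they ask for
    # Integer math keeps the illustration free of rounding noise.
    return {vm: d * capacity_iops // total for vm, d in demands.items()}


# Hypothetical workloads sharing one 10,000-IOPS LUN.
quiet = share_lun(10_000, {"web": 2_000, "db": 3_000, "print": 500})
noisy = share_lun(10_000, {"web": 2_000, "db": 3_000, "print": 45_000})

print(quiet["db"], noisy["db"])  # 3000 600
```

In this toy model the database VM drops from 3,000 to 600 IOPS even though its own demand never changed, and the LUN-level total (10,000 IOPS delivered) gives no hint as to which VM is responsible.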

Fortunately, there’s an alternative—just move out of the LUN neighborhood. More organizations are turning to VM-aware storage (VAS), which uses individual VMs as the unit of management. There are no LUNs, and so there are no neighbors. If an individual VM goes wrong, it doesn’t affect any other VMs on the VAS storage platform.

Band-aids for Bottlenecks

You can eliminate the conflict over resources, or you can simply increase the performance resources available. That’s one of the reasons for the explosion in all-flash storage; organizations are throwing more and more all-flash at their performance problems.

But all-flash alone is not enough—it’s a band-aid. It postpones having to deal with the underlying problem (LUNs), and you have to apply more and more over time. It’s easy to see how costs can spiral out of control.

Now, some storage providers tout QoS despite having a LUN-based architecture, but that’s not a solution either. You can set QoS only for an entire LUN. If a VM within that LUN goes rogue, the most you can do is use QoS to assign the entire LUN even more performance. Since you can’t see which specific VM is causing problems, you’re just pouring performance resources at the LUN, not addressing the root cause.

Use Cases for VM-level QoS

With VM-aware storage you have visibility into every VM, and that means when behavior changes you can take action. That’s because you can specifically set the minimum and maximum QoS for IOPS on any individual VM. For example:

When a VM Goes Rogue …

On occasion a VM will start misbehaving. Maybe it’s expected (a finance server at the end of the month), or perhaps it’s not (a print server that goes awry). Either way, with VM-level QoS, you can set a ceiling on that VM’s IOPS. To keep things contained, clamp that ceiling down.
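As a rough illustration of what a per-VM ceiling does, the sketch below caps each VM's demand before any sharing takes place. The proportional-sharing model, the `allocate_with_ceilings` name, and all figures are assumptions made for this example, not a description of how any product implements QoS.

```python
def allocate_with_ceilings(capacity_iops, demands, max_iops):
    """Apply per-VM IOPS ceilings, then share whatever contention remains.

    `max_iops` maps a VM name to its ceiling; VMs without an entry are
    uncapped. Any leftover contention is split in proportion to demand
    (a simplifying assumption).
    """
    capped = {vm: min(d, max_iops.get(vm, d)) for vm, d in demands.items()}
    total = sum(capped.values())
    if total <= capacity_iops:
        return capped
    return {vm: d * capacity_iops // total for vm, d in capped.items()}


demands = {"web": 2_000, "db": 3_000, "print": 45_000}  # print server gone rogue
result = allocate_with_ceilings(10_000, demands, max_iops={"print": 1_000})

print(result)  # {'web': 2000, 'db': 3000, 'print': 1000}
```

With the rogue VM clamped to 1,000 IOPS, total demand fits inside the 10,000-IOPS budget and the other VMs receive everything they ask for.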

When a VM is Mission Critical …

If you’ve got a mission critical VM that must get sufficient performance, you need the ability to raise its minimum IOPS to a set level. With that floor in place, you’ve used QoS to guarantee performance for that VM.
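One way to picture a minimum-IOPS guarantee is as a reservation that is honored before anything else is shared. The sketch below is a simplified model under that assumption; the function name `allocate_with_floors` and the numbers are hypothetical.

```python
def allocate_with_floors(capacity_iops, demands, min_iops):
    """Honor per-VM IOPS floors first, then share the remainder.

    Step 1 reserves each VM's guaranteed minimum (never more than it asks
    for). Step 2 splits the leftover capacity in proportion to unmet
    demand, which is a simplifying assumption.
    """
    granted = {vm: min(d, min_iops.get(vm, 0)) for vm, d in demands.items()}
    reserved = sum(granted.values())
    if reserved > capacity_iops:
        raise ValueError("guaranteed minimums exceed available capacity")

    leftover = capacity_iops - reserved
    unmet = {vm: demands[vm] - granted[vm] for vm in demands}
    total_unmet = sum(unmet.values())
    if total_unmet <= leftover:
        return dict(demands)  # everything fits; no contention
    return {
        vm: granted[vm] + unmet[vm] * leftover // total_unmet
        for vm in demands
    }


demands = {"web": 2_000, "db": 3_000, "print": 45_000}
result = allocate_with_floors(10_000, demands, min_iops={"db": 3_000})

print(result["db"])  # 3000
```

Here the mission critical database keeps its full 3,000 IOPS even while another VM is flooding the system; only the VMs without a floor compete for what is left.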

When You Want to Differentiate Tiers of Service …

And when you’ve got VM-level QoS you can even create multiple tiers of service on a single platform. That’s incredibly difficult with LUNs, since the residents might be a mix of mission critical and less critical VMs. In the past, enterprises and service providers typically bought multiple storage devices, with some dedicated to “gold” applications, others to “silver” applications and so on. The device itself was the dividing line. But with VM-aware storage you can establish gold, silver, bronze and/or other tiers on one device, and then assign each VM to a tier.
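A tier can be thought of as nothing more than a named pair of minimum and maximum IOPS settings that is applied per VM. The sketch below shows that idea; the tier values, VM names, and the `qos_for` helper are purely illustrative assumptions, not recommended settings or a real product API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    min_iops: int  # guaranteed floor
    max_iops: int  # enforced ceiling


# Hypothetical tier definitions on a single storage platform.
TIERS = {
    "gold": Tier(min_iops=5_000, max_iops=20_000),
    "silver": Tier(min_iops=1_000, max_iops=5_000),
    "bronze": Tier(min_iops=0, max_iops=1_000),
}


def qos_for(vm_tiers):
    """Translate a VM-to-tier assignment into per-VM QoS settings."""
    return {vm: TIERS[tier] for vm, tier in vm_tiers.items()}


policy = qos_for({"payroll-db": "gold", "intranet": "silver", "print": "bronze"})
print(policy["payroll-db"])  # Tier(min_iops=5000, max_iops=20000)
```

Because the settings attach to each VM rather than to a device, moving a VM between tiers is just a change of label rather than a migration to different hardware.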

Importantly, in any of the above scenarios you receive immediate, visual feedback. You can see whether your changes are causing any contention or latency, and exactly where it originates. That way you know that your actions are having the intended effect.

To guarantee the performance of your virtualized applications, you need more than all-flash alone; you need all-flash with VM-aware storage capabilities. That way you can add per-VM QoS to your toolbox and rapidly fix (or prevent) the problems that might otherwise plague performance.

Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena.
