Advantages of Disaster Recovery as a Service

Nitin Mishra, VP, Product Management & Solutions Engineering, Netmagic Solutions Pvt. Ltd.

This column is part one of a two-part series on disaster recovery as a service (DRaaS). See Part II, Cloud-based Disaster Recovery as a Service.

Many businesses rely on Disaster Recovery (DR) services to prevent man-made or natural disasters from causing expensive service disruptions. Unfortunately, current DR services come either at very high cost or with weak guarantees about the amount of data lost and the time required to restart operations after a failure. However, with cloud computing and virtualization opening up new opportunities, enterprises are discovering that many applications can be consumed as services, and DR is no exception.

This has resulted in an emerging model of delivering Disaster Recovery as a Service (DRaaS), also called DR as a cloud service or DR on demand. DRaaS is gaining popularity among enterprises mainly because its pay-as-you-go pricing can lower costs and its use of automated virtual platforms can minimize the recovery time after a failure.

Virtualized cloud platforms are well suited to providing DR. Under normal operating conditions, a cloud-based DR service may need only a small share of resources to synchronize state from the primary site to the cloud. The full amount of resources required to run the application needs to be provisioned (and paid for) only if a disaster actually happens. The use of automated virtualization platforms for disaster recovery means that additional resources can be rapidly brought online once the disaster is detected. This can dramatically reduce the recovery time after a failure, which is key to business continuity.

Exhibit 1 – Disaster Recovery

Key Requirements for Effective DR Service

Some requirements for an effective DR service are based on business decisions, such as the monetary cost of system downtime or data loss, while others are tied directly to application performance and accuracy.

Exhibit 2 – Requirements for DR Service

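The level of data protection and the speed of recovery depend on the type of backup mechanism used and the nature of the resources available at the backup site. Data protection is usually expressed as a recovery point objective (RPO), the maximum window of data loss that can be tolerated, and speed of recovery as a recovery time objective (RTO), the maximum acceptable time to restore service. In general, DR services fall under one of the following categories: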

  • Hot Backup Site: A hot backup site typically provides a set of mirrored stand-by servers that are always available to run the application once a disaster occurs, providing minimal RTO and RPO. Hot standbys typically use synchronous replication to prevent any data loss due to a disaster.
  • Warm Backup Site: A warm backup site may keep state up to date with either synchronous or asynchronous replication schemes depending on the necessary RPO. Standby servers to run the application after failure are available, but are only kept in a “warm” state where it may take minutes to bring them online.
  • Cold Backup Site: In a cold backup site, data is often only replicated on a periodic basis, leading to an RPO of hours or days. In addition, servers to run the application after failure are not readily available, and there may be a delay of hours or days as hardware is brought out of storage or re-purposed from test and development systems, resulting in a high RTO. It can be difficult to support business continuity with cold backup sites, but they are a very low cost option for applications that do not require strong protection or availability guarantees.
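The trade-off between the three categories can be sketched in a few lines of Python. This is only an illustration: the `BackupTier` type, the `cheapest_tier` helper and every RTO, RPO and cost figure below are assumptions chosen to match the rough orders of magnitude described above, not measured values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BackupTier:
    name: str
    rto_minutes: float   # assumed worst-case time to restart the application
    rpo_minutes: float   # assumed worst-case window of lost data
    relative_cost: int   # 1 = cheapest; a ranking only, not a price


# Illustrative figures only; the article gives no exact numbers.
TIERS = [
    BackupTier("cold", rto_minutes=48 * 60, rpo_minutes=24 * 60, relative_cost=1),
    BackupTier("warm", rto_minutes=30, rpo_minutes=15, relative_cost=2),
    BackupTier("hot", rto_minutes=1, rpo_minutes=0, relative_cost=3),
]


def cheapest_tier(required_rto: float, required_rpo: float) -> Optional[BackupTier]:
    """Return the lowest-cost tier whose RTO and RPO both meet the requirement."""
    candidates = [
        t for t in TIERS
        if t.rto_minutes <= required_rto and t.rpo_minutes <= required_rpo
    ]
    return min(candidates, key=lambda t: t.relative_cost) if candidates else None


if __name__ == "__main__":
    # An application that tolerates 60 minutes of downtime and 30 minutes of data loss.
    print(cheapest_tier(required_rto=60, required_rpo=30).name)  # warm
```

The helper returns `None` when no tier is fast enough, which is a signal that the requirement itself needs to be revisited rather than a recommendation.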

Cloud-based Disaster Recovery (DR)

The on-demand nature of cloud computing means that it provides the greatest cost benefit when peak resource demands are much higher than average case demands. This means that cloud platforms can provide the greatest benefit to DR services that require warm stand-by replicas. In this case, the cloud can be used to cheaply maintain the state of an application using low cost resources under ordinary operating conditions.

Only after a disaster occurs does a cloud-based DR service pay for the more powerful, and more expensive, resources required to run the full application. These resources can be provisioned in a matter of seconds or minutes. In contrast, an enterprise using its own private resources for DR must always have servers available to meet the resource needs of the full disaster case, resulting in a much higher cost during normal operation.
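The cost argument can be made concrete with a rough model. The sketch below is a simplification under stated assumptions: the hourly rates are hypothetical, the function names are invented for illustration, and the model ignores replication bandwidth, storage and any capacity reservation fees a provider may charge.

```python
HOURS_PER_YEAR = 24 * 365


def cloud_dr_cost(standby_rate, full_rate, disaster_hours, hours=HOURS_PER_YEAR):
    """Pay the cheap standby rate normally and the full rate only during a disaster."""
    if not 0 <= disaster_hours <= hours:
        raise ValueError("disaster_hours must be between 0 and hours")
    return standby_rate * (hours - disaster_hours) + full_rate * disaster_hours


def private_dr_cost(full_rate, hours=HOURS_PER_YEAR):
    """A private DR site keeps full capacity available for the whole period."""
    return full_rate * hours


if __name__ == "__main__":
    # Hypothetical rates: 1.0 per hour on standby, 10.0 per hour at full capacity,
    # with 48 hours spent running in the cloud after a disaster.
    print(cloud_dr_cost(1.0, 10.0, disaster_hours=48))  # 9192.0
    print(private_dr_cost(10.0))                        # 87600.0
```

The gap between the two totals narrows as the standby rate approaches the full rate, which is why the comparison depends heavily on how a given provider prices idle capacity.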

Disaster Recovery as a (Cloud) Service

“Cloud-based DR moves the discussion from data center space and hardware to one about cloud capacity planning,” noted Lauren Whitehouse, senior analyst at Enterprise Strategy Group (ESG) in Milford, Massachusetts.

Although the concept of cloud-based disaster recovery (DR), along with some of its products and services, is still nascent, some companies, especially smaller organizations, are discovering and starting to leverage cloud services for DR. Cloud-based DR can be an attractive alternative for companies that are strapped for IT resources, because the usage-based cost of cloud services is well suited to DR, where the secondary infrastructure is parked and idling most of the time. Having DR sites in the cloud reduces the need for data center space, IT infrastructure and IT resources. This leads to significant cost reductions and enables smaller companies to deploy disaster recovery options that were previously found only in larger enterprises.

Devising a Blueprint for Cloud-based DR

Just as with traditional DR, there isn’t a single blueprint for cloud-based disaster recovery. Every company is unique in the applications it runs, the relevance of those applications to its business, and the industry it is in. Therefore, a cloud disaster recovery plan (cloud DR blueprint) will be specific to each organization.

Triage is the overarching principle used to create traditional as well as cloud-based DR plans. The process of devising a DR plan starts with identifying and prioritizing applications, services and data, and determining for each one the amount of downtime that’s acceptable before there’s a significant business impact. Priority and required recovery time objectives (RTOs) will then determine the disaster recovery approach.

Identifying critical resources and recovery methods is the most important part of this process, since an organization needs to ensure that all critical applications and data are included in the blueprint. With applications identified and prioritized, and RTOs defined, the organization can then determine the best and most cost-effective method of achieving the RTO for each application and service. A combination of cost and recovery objectives drives different levels of disaster recovery.
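The triage step described above can be sketched as a small inventory check. Everything here is hypothetical: the `Application` record, the sample application names and the downtime figures are placeholders, and a real plan would also cover services and data, not only applications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Application:
    name: str
    max_downtime_minutes: int   # acceptable downtime before significant business impact
    critical: bool


def triage(apps):
    """Order applications so those tolerating the least downtime come first."""
    return sorted(apps, key=lambda a: a.max_downtime_minutes)


def missing_critical(apps, blueprint_names):
    """Names of critical applications not yet covered by the DR blueprint."""
    covered = set(blueprint_names)
    return [a.name for a in triage(apps) if a.critical and a.name not in covered]


# Hypothetical inventory used only to exercise the functions.
inventory = [
    Application("reporting", 24 * 60, critical=False),
    Application("order-entry", 15, critical=True),
    Application("email", 4 * 60, critical=True),
]

if __name__ == "__main__":
    print([a.name for a in triage(inventory)])
    print(missing_critical(inventory, blueprint_names=["order-entry"]))
```

Sorting by tolerable downtime gives the order in which recovery methods should be chosen, and the second function flags critical applications the blueprint has not yet accounted for.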

Traditional DR vs. DR as a (Cloud) Service

Exhibit 3 – Traditional DR vs. DR as a (Cloud) Service

The choice of a cloud disaster recovery service will be governed purely by the business imperative. If an organization has critical applications that must be available again within minutes of an outage, it should consider cloud-based DR.

Part two of this series will cover cloud-based disaster recovery options.

Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena.

4 Comments

  1. I went through the "cloud DR as a service" thing about a year ago. The results were not pretty. The bottom line is: someone has to buy the hardware. I talked to two very large, well-known cloud service providers, as well as two smaller ones. The standard for cloud DR (at the time, and I imagine this has not changed) is that they hard-allocate all resources up front whether or not you're actually using the service. This certainly helps their profit margins, but more importantly it allows them to guarantee to customers that the capacity they purchased will be available when the time comes to push the big red button. I was in Seattle at the time, so if Seattle burned to the ground and 100 different companies in the area using cloud DR all hit their big red buttons, the cloud provider would have a lot of egg on their face if they didn't have the capacity to handle that hit all at once. So would the CIOs of those companies whose DR strategy just blew up in their faces at the worst possible time.
I was very up front with all 4 of the service providers I talked to: I told them what we needed, how much it would cost me to do it myself, and estimates for hosting. For perspective, it was a small deployment: 1.5 racks of equipment, densely populated blade servers, some nice 3PAR storage, switches and load balancers. Two of the cloud providers almost immediately declined to provide real pricing after doing research on their own and determining that they could not compete. A 3rd cloud provider (one of the major ones) declined after about 3 weeks of analysis. I had a good long chat with one of their senior engineers; it was pretty refreshing, and we were in agreement on so many things. He said to me, "Just give me a price that you know your management won't accept, so we can stop wasting people's time, and I'll quote that price to you." I said - anything over $1M and you've got a deal. He sent the quote and that was the last we heard from them.
The 4th provider, which was actually the first one we contacted, was the only one to give real pricing. Their 'discounted' pricing was such that I could build four full DR sites (literally) for the cost of their installation charges alone. Their lower up-front solution ($100 install fee, which I found amusing) was going to cost more than a quarter million dollars a month to host. I explained the situation to the provider and they asked, "Well, do you have the staff to manage that?" I said, are you kidding? It's 12 high-end servers, 1 storage array, 4-6 switches, 2 load balancers, not even two full racks of equipment. The provider (or at least the people we were talking to) could not understand how we could manage our own DR for less than they could do it. At the same time, my own internal management could not understand why external cloud DR was so much more expensive. Bottom line: someone has to buy the hardware. Do you want to bet your business on a cloud that is built around "on demand" capacity? It's your risk to take.
It's similar to another DR plan at another company I was at years ago. They signed a deal with a big DR provider; I forget the name. The deal was that this provider was going to pull up two 18-wheeler tractor trailers to our "DR site" and hook servers up in the event we needed to hit the big red button. The tractor trailers had generators and cooling built in. My company knew, IN ADVANCE, that our "DR site", which was a small colocation facility, would never allow such trucks on their property. But they signed the contract anyway, so they could tell their customers "yes, we have DR". Fortunately for them, they never had to push the big red button while they had that DR plan in place; they later went to active-active facilities and better replication. The costs for that original DR plan were very high too: after something like the first 30 days it was something like $50,000/day (this was back in 2004).
At the time, even assuming the colo would have allowed the trucks, it probably would have taken us at least a week or a week and a half to get things running again, and probably another 6-8 weeks to spec and build a new facility to transition to from the DR gear.

  2. I don't think there's such a thing as preventing disasters. Nothing is foolproof. The most that can be guaranteed is a much lower chance that man-made or natural disasters will ruin the data, and the business along with it. Regardless, you do have a point. In fact, even small businesses should look into investing in disaster recovery services, especially if they plan to expand.