Nitin Mishra, VP, Product Management & Solutions Engineering, Netmagic Solutions Pvt. Ltd.
This column is part one of a two-part series on disaster recovery as a service (DRaaS). See Part II, Cloud-based Disaster Recovery as a Service.
Many businesses rely on disaster recovery (DR) services to prevent man-made or natural disasters from causing expensive service disruptions. Unfortunately, current DR services come either at very high cost or with weak guarantees about the amount of data lost and the time required to restart operations after a failure. However, with cloud computing and virtualization opening up new opportunities, enterprises are discovering that many applications can be consumed as services, and DR is no exception.
This has given rise to an emerging delivery model known as Disaster Recovery as a Service (DRaaS), also called DR as a cloud service or DR on demand. DRaaS is gaining popularity among enterprises mainly because of its pay-as-you-go pricing, which can lower costs, and its use of automated virtual platforms, which can minimize the recovery time after a failure.
Virtualized cloud platforms are well suited to providing DR. Under normal operating conditions, a cloud-based DR service may need only a small share of resources to synchronize state from the primary site to the cloud. The full amount of resources required to run the application needs to be provisioned (and paid for) only if a disaster actually happens. The use of automated virtualization platforms for disaster recovery means that additional resources can be rapidly brought online once the disaster is detected. This can dramatically reduce the recovery time after a failure, which is a key component in enabling business continuity.
Exhibit 1 – Disaster Recovery
Key Requirements for Effective DR Service
Some requirements for an effective DR service are based on business decisions, such as the monetary cost of system downtime or data loss, while others are tied directly to application performance and accuracy.
Exhibit 2 – Requirements for DR Service
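The level of data protection and the speed of recovery depend on the type of backup mechanism used and the nature of the resources available at the backup site. These are usually expressed as a recovery point objective (RPO), the amount of recent data an organization can afford to lose, and a recovery time objective (RTO), the amount of downtime it can tolerate. In general, DR services fall under one of the following categories: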
- Hot Backup Site: A hot backup site typically provides a set of mirrored stand-by servers that are always available to run the application once a disaster occurs, providing minimal RTO and RPO. Hot standbys typically use synchronous replication to prevent any data loss due to a disaster.
- Warm Backup Site: A warm backup site may keep state up to date with either synchronous or asynchronous replication schemes depending on the necessary RPO. Standby servers to run the application after failure are available, but are only kept in a “warm” state where it may take minutes to bring them online.
- Cold Backup Site: In a cold backup site, data is often only replicated on a periodic basis, leading to an RPO of hours or days. In addition, servers to run the application after failure are not readily available, and there may be a delay of hours or days as hardware is brought out of storage or re-purposed from test and development systems, resulting in a high RTO. It can be difficult to support business continuity with cold backup sites, but they are a very low cost option for applications that do not require strong protection or availability guarantees.
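The three categories above can be compared with a small sketch. The class, the field names and every number here are illustrative assumptions, not figures from any DR provider; the only rules taken from the text are that synchronous replication loses no data and that periodic replication can lose whatever was written since the last copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupSite:
    name: str
    replication: str          # "synchronous", "asynchronous" or "periodic"
    replication_gap_s: float  # assumed lag or interval between copies, in seconds
    startup_s: float          # assumed time to bring standby servers online, in seconds

    def worst_case_rpo_s(self) -> float:
        # With synchronous replication a write is confirmed only after the
        # backup site has it, so no confirmed data is lost.
        if self.replication == "synchronous":
            return 0.0
        # Otherwise anything written since the last copy may be lost
        # (transfer time is ignored in this simplified model).
        return self.replication_gap_s


# Hypothetical example values, chosen only to mirror the descriptions above.
SITES = [
    BackupSite("hot", "synchronous", 0.0, 30.0),
    BackupSite("warm", "asynchronous", 60.0, 600.0),
    BackupSite("cold", "periodic", 24 * 3600.0, 48 * 3600.0),
]


def sites_meeting(rpo_s: float, rto_s: float) -> list[str]:
    """Return the names of the sites that satisfy both objectives."""
    return [
        site.name
        for site in SITES
        if site.worst_case_rpo_s() <= rpo_s and site.startup_s <= rto_s
    ]


if __name__ == "__main__":
    # An application that tolerates 5 minutes of data loss and 15 minutes of downtime.
    print(sites_meeting(rpo_s=300, rto_s=900))
```

Under these assumed numbers, tightening either objective narrows the list toward the hot site, while relaxing both lets the low-cost cold site qualify.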
Cloud-based Disaster Recovery (DR)
The on-demand nature of cloud computing means that it provides the greatest cost benefit when peak resource demands are much higher than average-case demands. This means that cloud platforms can provide the greatest benefit to DR services that require warm stand-by replicas. In this case, the cloud can be used to cheaply maintain the state of an application using low-cost resources under ordinary operating conditions.
Only after a disaster occurs does the customer of a cloud-based DR service pay for the more powerful, and more expensive, resources required to run the full application. These resources can be provisioned in a matter of seconds or minutes. In contrast, an enterprise using its own private resources for DR must always have servers available to meet the resource needs of the full disaster case, resulting in a much higher cost during normal operation.
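The cost argument can be made concrete with a rough calculation. This is a simplified sketch: the hourly rates, the 48 hours of disaster operation and the function names are all hypothetical, and real pricing would include storage, bandwidth and licensing.

```python
HOURS_PER_YEAR = 24 * 365


def private_dr_cost(full_rate: float, hours: float = HOURS_PER_YEAR) -> float:
    """Private DR site: full capacity is kept available all year."""
    return full_rate * hours


def cloud_dr_cost(
    sync_rate: float,
    full_rate: float,
    disaster_hours: float,
    hours: float = HOURS_PER_YEAR,
) -> float:
    """Cloud DR: cheap synchronization normally, full capacity only during a disaster."""
    normal_hours = hours - disaster_hours
    return sync_rate * normal_hours + full_rate * disaster_hours


# Hypothetical rates in arbitrary currency units per hour.
private = private_dr_cost(full_rate=10.0)
cloud = cloud_dr_cost(sync_rate=1.0, full_rate=12.0, disaster_hours=48)

if __name__ == "__main__":
    print(f"private DR: {private:.0f} per year")
    print(f"cloud DR:   {cloud:.0f} per year")
```

Even with a higher assumed hourly rate for full cloud capacity, the cloud total stays well below the private total in this example because the expensive resources run for only a small fraction of the year.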
Disaster Recovery as a (Cloud) Service
“Cloud-based DR moves the discussion from data center space and hardware to one about cloud capacity planning,” noted Lauren Whitehouse, senior analyst at Enterprise Strategy Group (ESG) in Milford, Massachusetts.
Although the concept of cloud-based disaster recovery (DR), along with some of the products and services built on it, is still nascent, some companies, especially smaller organizations, are discovering and starting to leverage cloud services for DR. Cloud-based DR can be an attractive alternative for companies that are strapped for IT resources, because the usage-based cost of cloud services is well suited to DR, where the secondary infrastructure is parked and idling most of the time. Having DR sites in the cloud reduces the need for data center space, IT infrastructure and IT resources, which leads to significant cost reductions and enables smaller companies to deploy disaster recovery options that were previously found only in larger enterprises.
Devising a Blueprint for Cloud-based DR
Just as with traditional DR, there isn’t a single blueprint for cloud-based disaster recovery. Every company is unique in the applications it runs, the relevance of those applications to its business, and the industry it is in. Therefore, a cloud disaster recovery plan (cloud DR blueprint) will be specific to each organization.
Triage is the overarching principle used to create traditional as well as cloud-based DR plans. The process of devising a DR plan starts with identifying and prioritizing applications, services and data, and determining for each one the amount of downtime that’s acceptable before there’s a significant business impact. Priority and required recovery time objectives (RTOs) will then determine the disaster recovery approach.
Identifying critical resources and recovery methods is the most important part of this process, since an organization needs to ensure that all critical applications and data are included in the blueprint. With applications identified and prioritized, and RTOs defined, the organization can then determine the most cost-effective method of achieving the RTO for each application and service. A combination of cost and recovery objectives drives different levels of disaster recovery.
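The triage process described above can be sketched in a few lines: order the applications by priority, then pick the cheapest recovery method that still meets each one's RTO. The application names, method names, recovery times and costs below are hypothetical and serve only to illustrate the selection logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class App:
    name: str
    priority: int        # 1 = most critical
    rto_minutes: float   # acceptable downtime before significant business impact


@dataclass(frozen=True)
class Method:
    name: str
    recovery_minutes: float  # assumed time to restore service
    monthly_cost: float      # assumed cost, arbitrary currency units


def cheapest_method(app: App, methods: list[Method]) -> Optional[Method]:
    """Return the lowest-cost method that meets the app's RTO, or None."""
    candidates = [m for m in methods if m.recovery_minutes <= app.rto_minutes]
    return min(candidates, key=lambda m: m.monthly_cost, default=None)


def build_blueprint(apps: list[App], methods: list[Method]) -> list[tuple]:
    """List (app name, chosen method name) pairs, most critical app first."""
    plan = []
    for app in sorted(apps, key=lambda a: (a.priority, a.rto_minutes)):
        method = cheapest_method(app, methods)
        plan.append((app.name, method.name if method else None))
    return plan


# Hypothetical catalogue and application inventory.
METHODS = [
    Method("hot", recovery_minutes=5, monthly_cost=5000),
    Method("warm", recovery_minutes=30, monthly_cost=1500),
    Method("cold", recovery_minutes=2880, monthly_cost=200),
]
APPS = [
    App("intranet", priority=3, rto_minutes=4320),
    App("billing", priority=1, rto_minutes=15),
    App("email", priority=2, rto_minutes=60),
]

if __name__ == "__main__":
    for app_name, method_name in build_blueprint(APPS, METHODS):
        print(f"{app_name}: {method_name}")
```

An application whose RTO no method can meet is reported with `None`, which flags it for a decision: either relax the objective or add a faster (and costlier) recovery option to the catalogue.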
Exhibit 3 – Traditional DR vs. DR as a (Cloud) Service
The decision to adopt a cloud disaster recovery service should be driven by business need. If an organization has critical applications that must be restored within minutes of an outage, it should consider cloud-based DR.
Part two of this series will cover cloud-based disaster recovery options.
Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena.