Cloud-based Disaster Recovery as a Service Options
October 27th, 2011 By: Industry Perspectives
Nitin Mishra, VP, Product Management & Solutions Engineering, Netmagic Solutions Pvt. Ltd.
This column is part two of a two-part series on disaster recovery as a service (DRaaS). See Part I, Advantages of Disaster Recovery as a Service.
The emerging model of delivering Disaster Recovery as a Service (DRaaS) is gaining popularity among enterprises, mainly because of its pay-as-you-go pricing model, which can lower costs, and its use of automated virtual platforms, which can minimize recovery time after a failure.
What are Cloud-based Disaster Recovery Options?
An increasingly popular option is to put both primary production and disaster recovery instances into the cloud and have both handled by a managed service provider. By doing this, enterprises can get all the benefits of cloud computing, from usage-based pricing to the elimination of on-premises infrastructure.
However, in this case the choice of service provider and the negotiation of appropriate service level agreements (SLAs) are of utmost importance. Because it hands over control to the service provider, an enterprise needs to be certain that the provider can deliver uninterrupted service within the defined SLAs for both primary and DR instances.
Back Up to and Restore from the Cloud
Applications and data remain on-premises in this approach, with data being backed up into the cloud and restored onto on-premises hardware when a disaster occurs. In other words, the backup in the cloud becomes a substitute for tape-based off-site backups.
Back Up to and Restore to the Cloud
In this approach, data is not restored back to on-premises infrastructure; instead it is restored to virtual machines in the cloud. This requires both cloud storage and cloud compute resources. The restore can be done when a disaster is declared or on a continuous basis (pre-staged). Pre-staging DR VMs and keeping them relatively up-to-date through scheduled restores is crucial in cases where aggressive RTOs need to be met.
Replication to Virtual Machines in the Cloud
For applications that require aggressive recovery time objectives (RTOs) and recovery point objectives (RPOs), as well as application awareness, replication is the data movement option of choice. Replication to cloud virtual machines can be used to protect both cloud and on-premises production instances. In other words, replication is suitable for both cloud-VM-to-cloud-VM and on-premises-to-cloud-VM data protection.
Exhibit 5 – Cloud-based DR Approaches
An ideal cloud backup and DR service provides the following key elements:
- A replica of all protected systems frequently updated by incremental backups or snapshots at intervals set by the user for each system. The user determines the settings according to recovery point objectives (RPO).
- Full site, system, disk, and file recovery via a completely user-driven, self-service portal. This portal gives the user the flexibility to choose which file, disk, or system they want to recover.
- Fast SLA-based data recovery. Recovery is, after all, what backup is all about, and there can be no compromise when choosing a cloud service for backup and DR. The SLA is negotiated up front, and the customer pays for the SLA required. No data, whether a file, a disk, or a system, should take more than 30 minutes to recover.
- WAN optimization between the customer site and the cloud that enables full data mobility at reduced bandwidth and storage utilization and cost.
- Data validation. There must be an automated or user-initiated validation protocol that allows the customer to check their data at any time to ensure the data’s integrity.
- DR rehearsal that demonstrates the viability of the DR plan.
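Two of the elements above, RPO-driven snapshot intervals and data validation, lend themselves to simple automated checks. The following is a minimal sketch in Python, not any provider's actual tooling: the function names are hypothetical, the RPO check simply compares the age of the newest snapshot with the objective, and the validation step assumes that comparing SHA-256 digests of source and replica is an acceptable integrity test.

```python
import hashlib
from datetime import datetime, timedelta


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def replica_is_valid(source: bytes, replica: bytes) -> bool:
    """Data validation: the replica passes if its digest matches the source's."""
    return sha256_of(source) == sha256_of(replica)


def meets_rpo(last_snapshot: datetime, now: datetime, rpo: timedelta) -> bool:
    """RPO check: the newest snapshot must be no older than the RPO."""
    return now - last_snapshot <= rpo


if __name__ == "__main__":
    # Hypothetical values chosen for illustration only.
    now = datetime(2011, 10, 27, 12, 0)
    print(meets_rpo(datetime(2011, 10, 27, 11, 50), now, timedelta(minutes=15)))  # True
    print(meets_rpo(datetime(2011, 10, 27, 11, 30), now, timedelta(minutes=15)))  # False
    print(replica_is_valid(b"payroll records", b"payroll records"))               # True
```

In practice a service would hash files or disk blocks in chunks rather than whole byte strings held in memory, but the comparison logic stays the same.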
The Benefits of Cloud-based DR
The cloud can facilitate disaster recovery by significantly lowering costs:
- The cloud’s pay-as-you-go pricing model significantly lowers costs, because far fewer resources are needed before a disaster than during one.
- Cloud resources can quickly be added with fine granularity and have costs that scale smoothly without requiring large upfront investments.
- The cloud platform manages and maintains the DR servers and storage devices, lowering IT costs and reducing the impact of failures at the disaster site.
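The pay-as-you-go argument can be made concrete with a little arithmetic. The sketch below compares a dedicated DR site, billed at the full rate all year, with a cloud DR site that is billed at a low standby rate in normal operation and at the full rate only while a disaster is in progress. All rates and durations are hypothetical numbers picked for illustration, not vendor prices.

```python
HOURS_PER_YEAR = 24 * 365  # 8760


def dedicated_dr_cost(full_rate_per_hour: float) -> float:
    """Yearly cost of a dedicated DR site that runs at full capacity all year."""
    return full_rate_per_hour * HOURS_PER_YEAR


def cloud_dr_cost(standby_rate_per_hour: float,
                  full_rate_per_hour: float,
                  disaster_hours: int) -> float:
    """Yearly cost of cloud DR: standby rate normally, full rate during a disaster."""
    normal_hours = HOURS_PER_YEAR - disaster_hours
    return (standby_rate_per_hour * normal_hours
            + full_rate_per_hour * disaster_hours)


if __name__ == "__main__":
    # Hypothetical: $5.00/hour at full capacity, $0.50/hour on standby,
    # and 48 hours per year spent running in disaster mode.
    print(dedicated_dr_cost(5.00))        # 43800.0
    print(cloud_dr_cost(0.50, 5.00, 48))  # 4596.0
```

With these assumed figures the cloud option costs roughly a tenth of the dedicated site. The gap narrows as the standby rate rises or as the time spent in disaster mode grows.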
The benefits of virtualization, while not necessarily specific to cloud platforms, still provide important features for disaster recovery:
- VM startup can be easily automated, lowering recovery times after a disaster.
- Virtualization eliminates hardware dependencies, potentially lowering hardware requirements at the backup site.
- Application-agnostic state replication software can be run outside of the VM, treating it as a black box.
These characteristics can simplify the replication and deployment of resources in a cloud DR site, and enable business continuity by reducing recovery times.
Evaluating DR as a (Cloud) Service
While evaluating disaster recovery as a service, there are some aspects which a company should check internally, as well as other factors which it should check with the service provider. Internally, as the first checkpoint, a company should find out whether its own data security policies comply with regulatory requirements. If this isn’t an issue, it should then assess the TCO of maintaining dedicated DR infrastructure for itself, so that a comparative study can be made.
With regard to the service provider, the company should check the portfolio of services the vendor offers. It is advisable to check that the provider is competent to bring the company’s systems to at least a warm state of operations, should the DR be invoked. The company and the service provider should work together during periodic drills to build such competencies.
After selecting a vendor to provide disaster recovery as a service, the next step is to decide the service level agreements (SLAs). Some of the key specifics that SLAs should include are:
- Lead time to allocate the minimum required resources, should DR be invoked
- Lead time to scale up resources to the defined (or full) level
- Duration for which such resources will be retained on a dedicated basis for the company
- Additional fees for occupancy beyond the pre-defined period
- Additional facilities such as conference rooms and video conferencing
- Capability to provide additional hardware as and when needed
- Parameters related to work area recovery can also be included if such services are used
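The measurable items in this list can be recorded in a structured form and compared against the results of a DR drill. The following is a minimal sketch: the class names, field names, and thresholds are all hypothetical, and it covers only the two lead-time items and the retention period, not the full SLA.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DrSla:
    """Agreed SLA terms (hypothetical fields, times in minutes)."""
    min_resources_lead_min: int  # lead time to allocate the minimum resources
    full_scale_lead_min: int     # lead time to scale up to the full level
    retention_days: int          # days resources stay dedicated to the company


@dataclass(frozen=True)
class DrillResult:
    """Lead times actually measured during a DR drill, in minutes."""
    min_resources_lead_min: int
    full_scale_lead_min: int


def sla_breaches(sla: DrSla, drill: DrillResult) -> List[str]:
    """Return the names of SLA items that the drill failed to meet."""
    breaches = []
    if drill.min_resources_lead_min > sla.min_resources_lead_min:
        breaches.append("minimum-resource lead time")
    if drill.full_scale_lead_min > sla.full_scale_lead_min:
        breaches.append("full scale-up lead time")
    return breaches


if __name__ == "__main__":
    # Hypothetical SLA: 30 minutes to minimum resources, 4 hours to full scale.
    sla = DrSla(min_resources_lead_min=30, full_scale_lead_min=240, retention_days=30)
    drill = DrillResult(min_resources_lead_min=25, full_scale_lead_min=300)
    print(sla_breaches(sla, drill))  # ['full scale-up lead time']
```

Running such a comparison after every periodic drill gives both parties a simple record of whether the agreed lead times are being met.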
Exhibit 7 – Disaster Recovery as a Service Enables Business Continuity
Disaster Recovery as a Service, or DRaaS, is an emerging category for organizations that wish to control their own infrastructure but not maintain the disaster recovery systems themselves. With a DRaaS offering, an IT organization does not directly build a contingency site, but instead relies on a vendor to do so on a dedicated or utility computing infrastructure. The cloud’s advantages in elasticity and cost-reduction are significant benefits in a disaster recovery scenario, and service offerings allow organizations to outsource portions of contingency planning to vendors with expertise in the area. However, many of the complexities remain and it is essential to perform the due diligence to ensure that the contingency plan will work and provide a sufficient level of service if called upon.
Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena.
In your experience what have you found really validates that a disaster recovery plan will work? A lot of people I talk to have disaster recovery plans, and everyone knows you should test them, but it’s incredibly expensive. Any thoughts would be appreciated.
Nitin Mishra Posted November 29th, 2011
The cost of DR has been the key reason why a lot of enterprises have not built a full-fledged DR site and have settled for off-site tape backup to protect their data. They are aware that the RTO will be long with this approach, but it is a trade-off against cost. The cloud-based model addresses this dilemma by providing on-tap resources at low cost. The second aspect is replication tools: providers like us have made back-to-back arrangements for these on a pay-as-you-use model and bundled them as part of the service. The third aspect is DR drills to test whether DR is available for immediate switchover. We have tied up with a technology partner that provides a dashboard that constantly reports compliance with the RTO and RPO objectives, and this is backed up by DR drills assisted by the NOC. All of this will work, and that has been our goal, if it comes at a fraction of the cost of the primary site.