Richard Dolewski, Disaster Recovery SME and Vice President Business Continuity Services for Velocity Technology Solutions.
It’s that time again. Hurricane season is upon us, which should always trigger a very important question in the world of IT: “Does our current Disaster Recovery (DR) plan demonstrate confidence to our business that we can recover if needed?” Last summer, Hurricane Irene wreaked havoc along the East Coast, leaving many businesses, employees and families with irreversible damage and loss. Irene was certainly no “lady.” Irene taught, or should have taught, IT a few very important lessons.
Lessons from Irene
First and foremost, all companies must have a fully tested DR plan in place. You can’t afford not to when disasters can have such a dramatic impact on the overall health of the business. What’s more, it’s critical that the plan supports your current business requirements. IT and business units must communicate and coordinate. IT must recognize that its plans need to be integrated across the enterprise to mitigate vulnerability and minimize data loss. Companies should also make sure they have an infrastructure they can recover to, by maintaining a recovery facility in an alternate FEMA geographic region outside the disaster zone.
DR Plan Failures
Many DR plans fail in these five categories:
- Incomplete: Plan does not include all critical systems
- Outdated: Plan does not protect current IT infrastructure
- Delivery Gap: IT staff training has not been completed
- Testing: Plan has not been recently tested in its entirety
- Coordination: Plan lacks integration with the business
So, how can you ensure that your DR plan will not fail? First of all, it’s important to make sure that you have skilled technical resources that are available to perform the recovery. After all, during a disaster like Hurricane Irene, you can’t predict the availability of key IT staff. The DR plan should designate team members, either internal or from a service provider, who reside outside the disaster zone and have the expertise to manage the recovery for you.
It’s just as important to understand that a DR plan must be tested regularly, so that both systems and staff are ready if the plan has to be activated. You must always ask yourself, “If I were to invoke the plan, am I 100 percent confident I can recover the business within stated objectives?”
Create AND Test Your DR Plan
So, what should your plan look like? And, how should you test it? Here are my 10 suggestions to consider when creating and testing your DR plan:
- Current, complete and comprehensive: Make sure your plan is up to date, detailed and easy to follow, and that it supports all critical aspects of your business.
- Prioritize, categorize and distribute: Not all servers in your computer room are of equal importance to your business. Prioritize servers and mission-critical apps and identify those in the plan. Distribute your plan to all plan holders and ensure it is easily accessible—and that they receive updated copies whenever the plan changes.
- Book a test date! Enough said.
- Test often: To ensure business continuity, DR plans should be tested at least once a year, and more often if major business or infrastructure changes occur or if you have very short recovery time requirements. Book your DR tests in advance of known pending events.
- Test differently: Incorporate a variety of tests with the goal of exercising/testing all components of the plan. Test in the context of simulated, realistic disaster scenarios, so you get practice before the real thing hits.
- Test actively and passively: Conduct two categories of testing: active tests, where you exercise the procedures and actions of the plan, and passive tests, where you talk and walk through the procedures with key participants. Both are equally important.
- Incorporate surprise: Because disasters often come as a surprise, incorporate an element of surprise into your tests and see how the plan and the people executing it hold up. To truly be prepared, you need to experience a simulated disaster and evaluate the effectiveness of current procedures.
- Perform basic routine exercises and logistical checkups: Perform a call tree exercise to confirm that contact information for anyone potentially involved (including vendors) is up to date. Make sure you can readily recall backup tapes from offsite storage. Don’t forget to check conference bridges.
- Test generators: Test the generator (if you have one) under full load to see how it will react. Ensure you have agreements in place with multiple fuel suppliers so that when you need to refill the tanks, diesel fuel is available and delivered within stated Service Level Agreements (SLAs).
- Examine backup strategies: Take a look at your backup strategies on a regular basis and ensure that they reflect the priority, recovery time and recovery point objectives of your data correctly.
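Two of the suggestions above lend themselves to a routine, automated check: confirming that backup frequency actually matches each system's recovery point objective (RPO), and flagging systems whose last DR test is more than a year old. The following is a minimal Python sketch of such a check, not a prescribed tool. The `System` record, its field names, the priority scale and the `audit` function are all hypothetical, and it assumes that worst-case data loss is roughly equal to the interval between backups.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class System:
    """Hypothetical inventory record for one system covered by the DR plan."""
    name: str
    priority: int                 # assumed scale: 1 = most critical
    rpo_hours: float              # maximum tolerable data loss, in hours
    backup_interval_hours: float  # how often backups actually run
    last_dr_test: date            # date of the most recent DR test


def audit(systems, today, max_test_age_days=365):
    """Return plain-text findings, most critical systems first."""
    findings = []
    for s in sorted(systems, key=lambda item: item.priority):
        # Assumption: worst-case data loss is about one backup interval.
        if s.backup_interval_hours > s.rpo_hours:
            findings.append(
                f"{s.name}: backup interval {s.backup_interval_hours:g}h "
                f"exceeds RPO {s.rpo_hours:g}h"
            )
        # "At least once a year" is modeled as a 365-day limit by default.
        if (today - s.last_dr_test).days > max_test_age_days:
            findings.append(
                f"{s.name}: last DR test is more than "
                f"{max_test_age_days} days old"
            )
    return findings


if __name__ == "__main__":
    inventory = [
        System("erp", 1, 4, 24, date(2012, 1, 15)),
        System("intranet", 3, 48, 24, date(2010, 6, 1)),
    ]
    for line in audit(inventory, today=date(2012, 6, 1)):
        print(line)
```

A report like this does not replace a real test, but running it on a schedule makes it harder for backup schedules and test dates to drift quietly out of line with the plan.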
The bottom line: companies rely on technology to run their businesses, so downtime is a business issue, not just a technology issue. Any disaster, whether natural (hurricane, flood or earthquake) or related to equipment/hardware failure, will cause downtime and, even worse, can negatively affect a company’s bottom line. Time and time again, I have seen companies in the midst of a DR nightmare wishing they had been better prepared. And all these companies have one thing in common: they never thought it would happen to them.
My advice is to assume that disaster will affect you at some point. Forrester research shows that 60 percent of businesses have invoked their DR plans in the past five years. Internalize the important lessons learned from Hurricane Irene and use the onset of the 2012 hurricane season to kick-start your DR planning and testing. I promise you won’t regret it.
Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena.