The discussion following last Sunday's New York Times story on web site outages may have made the industry more self-conscious about downtime, but that doesn't seem to have translated into fewer problems. This week we've seen outages at Apple's Mobile Me service, Facebook (which went down again today), Google Docs, 37signals and LiveSide.
When downtime strikes, what's a data center operator or online service to do? Finding the right way to acknowledge and manage downtime can be crucial to maintaining the confidence of your users, according to companies that have weathered outages. A good first step is to get over "downtime denial" and accept that your communications strategy is often as important as your efforts to restore service.
"One of the critical areas is listening to your users," said Sandy Jen, co-founder and VP of engineering at Meebo, whose instant messaging service has more than 35 million users. "It's all about expectations. The more honest you are, the more forgiving your customers are going to be."
"When you provide a compelling service to your user base, you become an essential part of a user's life," said Raj Patel, Vice President, Network Systems of Yahoo. "You have to develop trust. There's really no other way."
That trust gets tested when a site goes down. The basic framework of provider downtime messaging usually looks a lot like this:
1. Sorry. We're offline at the moment.
2. We know this site/service is important to you.
3. We have our best people working really hard on fixing the problem.
4. We'll keep you informed as best we can.
5. Once we're back online, we'll sort out what went wrong and how to prevent it from happening again.
The past year has had its share of outages and downtime, and some data center providers have responded to customer concerns by publishing unusually detailed information in the wake of incidents. The days of brief outage advisories on customer-only support forums seem to be numbered.
Paying attention to customer expectations is critical. But caring about your customers doesn't mean you'll satisfy them. And once they lose trust, things can get ugly. During the recent outage at The Planet, wild rumors offering alternate explanations for the extended downtime circulated and were picked up by at least one well-read blog. This prompted threads on The Planet's support forum in which customers sought photographic evidence of the damage to the data center.
Sound crazy? Not if you ask the folks at HostDime/Surpass Hosting, which after a May outage encountered customer skepticism about whether it really had the backup infrastructure it described. The company posted a video of its data center to address the criticism.
You can't satisfy everyone. But being straight with your customers/users and acknowledging their pain is better than heavy spin, according to marketing expert Seth Godin, whose thoughts on lessons learned from a lengthy 2006 outage at DreamHost are worth repeating.
"Lesson one: when things get messed up, being clear, self-critical and apologetic is really the only way to deal with customers if you expect them to give you another chance," Seth writes. "Lesson two: your story is all you've got. If you sell the 'up-time' story, better over-invest in whatever it takes to be sure your story is true."