The Outage and Dashboards

Tuesday's 40-minute outage, which left 900,000 subscribers offline, raises questions about the effectiveness of status dashboards.

Rich Miller

January 8, 2009

Is a network uptime dashboard useful if it goes down during an outage? That's the question raised by Tuesday's 40-minute outage, which left 900,000 subscribers without access to their applications. The downtime was attributed to a network device that failed because of memory allocation errors.

"The failure caused it to stop passing data but did not properly trigger a graceful fail over to the redundant system as the memory allocation errors were present on the failover system as well," the company reported on its status dashboard. "This resulted in a full service failure for all instances. [The company] had to initiate manual recovery steps to bring the service back up."

The Register said the outage "exposed the dark side of cloud computing," demonstrating the vulnerability of the cloud. Others took a more practical view of the issues raised by the downtime. 

"This event clearly shows us why hosting your own public health dashboard is a problem," writes Lenny Rachitsky at Transparent Uptime. "The dashboard was down along with the site itself."

Rachitsky is not without an interest in the topic, as he's a senior engineer at WebMetrics, which provides third-party performance monitoring. But his blog provides some useful analysis of web site performance, including a list of public health dashboards and status sites for SaaS providers. The company, for its part, says it will "continue to work with hardware vendors to fully detail the root cause and identify if further patching or fixes will be needed."
