Windows Azure Cloud Crashed by Expired SSL Certificate
Microsoft has spent more than $15 billion building one of the largest global cloud computing infrastructures. An SSL certificate can be had for as little as $70 a year from a commercial certificate authority, or can be effectively free if you issue your own, as Microsoft does.
So how did an expired SSL certificate crash the Windows Azure storage cloud computing platform Friday and Saturday? It’s an expensive question for Microsoft.
“Given the scope of the outage, we will proactively provide credits to impacted customers in accordance with our SLA,” wrote Steven Martin, General Manager of Windows Azure Business & Operations, in a blog post. “The credit will be reflected on a subsequent invoice.”
The global outage for encrypted storage traffic began Friday at 12:44 PM PST, and services were restored to 99% worldwide by 1:00 AM PST on Saturday. Full availability wasn’t restored until 8:00 PM PST Saturday, placing the outage duration at about 12 hours for most users but more than 31 hours for some. Up to 52 different Microsoft services reported performance problems during the outage, including Xbox Live.
“Windows Azure Storage experienced a worldwide outage impacting HTTPS traffic due to an expired SSL certificate,” Martin reported. “HTTP traffic was unaffected but the event impacted a number of Windows Azure services that are dependent on Storage.”
This marks the second time in a year that Microsoft has issued a service credit for problems associated with a security certificate. Last Feb. 29, a “Leap Year Bug” caused a date-related failure involving a security certificate. The incident left Azure customers unable to manage their applications for about 8 hours and knocked Azure-based services offline for some North American users.
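An expiration like the one behind this outage can, in principle, be caught ahead of time by checking a certificate's expiry date on a schedule. The following is a minimal sketch in Python using only the standard library. It is not how Microsoft monitors its certificates. The function names, the 30-day warning window, and the sample timestamp are all assumptions made for illustration.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Whole days left before a certificate's notAfter time (negative once expired).

    `not_after` uses the format returned by getpeercert(),
    for example "Feb 22 20:44:00 2013 GMT" (a made-up timestamp).
    """
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return (expires - now).days


def needs_renewal(not_after, warn_days=30, now=None):
    """True when fewer than `warn_days` days remain (30 is an arbitrary default)."""
    return days_until_expiry(not_after, now) < warn_days


def fetch_not_after(host, port=443, timeout=5.0):
    """Read the notAfter field from the certificate a live server presents.

    The default context verifies the certificate, so a certificate that has
    already expired raises an error during the handshake instead of returning.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call `needs_renewal(fetch_not_after("storage.example.com"))` (a placeholder hostname) once a day and raise an alert when it returns `True`, treating a handshake error as an alert as well.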
To help make sure Microsoft’s silly mistake doesn’t happen to anybody else, we just launched a free SSL certificate monitor: http://www.stackify.com/stackify-launches-free-certalert-me-service-to-monitor-ssl-certificates/
To ensure service availability and avoid unnecessary downtime caused by SSL certificate expiration, cross-organization certificate management is needed. DCM is the solution administrators need to accomplish the challenging task of certificate management. DCM manages certificates issued by Microsoft CAs and certificates discovered by scanning predefined IP ranges and ports.
For more information: http://www.advice-tech.com