Rolling Outage for Google
May 14th, 2009 By: Rich Miller
Many users are experiencing trouble reaching Google today in a rolling outage that is affecting some regions more than others. The troubles were first seen at Google News, which came back online after an outage this morning, apparently to add video links to news searches.
Meanwhile, there are widespread reports on Twitter of trouble reaching other Google services, including the home page. The Google Apps status page is acknowledging a “service disruption” for Gmail and says a problem with Google Calendar has been resolved.
UPDATE: Urs Holzle, who oversees the company’s data center operations, has posted an explanation on the official Google blog. “An error in one of our systems caused us to direct some of our web traffic through Asia, which created a traffic jam,” Holzle wrote. “As a result, about 14% of our users experienced slow services or even interruptions. We’ve been working hard to make our services ultrafast and ‘always on,’ so it’s especially embarrassing when a glitch like this one happens. We’re very sorry that it happened, and you can be sure that we’ll be working even harder to make sure that a similar problem won’t happen again.”
Google updates its software and systems on an ongoing basis, usually without incident. Holzle spoke about the process in a March interview with Data Center Knowledge.
“Configuration issues and rate of change play a pretty significant role in many outages at Google,” Holzle said. “Someone once likened the process of upgrading our core websearch infrastructure to ‘changing the tires on a car while you’re going at 60 down the freeway.’”
UPDATE: In earlier versions of our story we mentioned reports that Google’s performance problems may have been related to issues with AT&T’s network. Here’s a statement from AT&T: “After receiving speculative reports in the media that Google experienced an outage related to the AT&T network, we looked into the matter. We have not identified any specific problems in our network that could have caused the reported outage.”