Urs Hölzle, Senior Vice President for Technical Infrastructure at Google, speaks during the Google I/O 2014 conference in San Francisco. (Photo by Stephen Lam/Getty Images)

Google Traces Sunday’s Cloud Outage to Faulty Patch

Google Compute Engine, the company’s Infrastructure-as-a-Service cloud, suffered its second outage in less than a month. The incident was less serious than the Google cloud outage a few weeks ago, but the network was again the culprit. Some Google cloud users experienced disruptions for 45 minutes beginning Sunday around 10 a.m. PST.

Google traced the network egress issues behind the outage to a faulty configuration change. The change was tested before deployment to production, but it still disrupted some VMs once it went live.

The configuration change introduced to the network stack was designed to provide greater isolation between VMs and projects by capping the traffic volume allowed by an individual VM, according to Google.
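Google did not describe how the cap is enforced. As an illustration only, one common way to limit per-VM traffic is a token bucket; the sketch below assumes that technique, and the class name `EgressCap` and its parameters are hypothetical rather than anything from Google's network stack.

```python
from dataclasses import dataclass


@dataclass
class EgressCap:
    """Hypothetical per-VM egress limiter using a token bucket (an assumption)."""

    rate_bytes_per_s: float  # sustained traffic allowed per second
    burst_bytes: float       # largest burst the VM may send at once
    tokens: float = 0.0      # bytes the VM may currently send
    last_ts: float = 0.0     # timestamp of the previous decision

    def __post_init__(self) -> None:
        # Start with a full bucket so an idle VM can burst immediately.
        self.tokens = self.burst_bytes

    def allow(self, packet_bytes: int, now: float) -> bool:
        """Return True if the packet fits under the cap at time `now`."""
        # Refill for the elapsed time, never exceeding the burst size.
        elapsed = max(0.0, now - self.last_ts)
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        self.last_ts = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False
```

A cap set too low, or a refill computed incorrectly, would drop or delay legitimate traffic, which is consistent with the slowdowns and timeouts users reported.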

It was a partial outage. Some users weren’t affected, some saw slowdowns, and others experienced timeouts when trying to contact their cloud VMs.

Google engineers have changed the rollout protocol for network configuration in response to the latest outage. Future production changes will be applied incrementally to small fractions of VMs at a time, reducing the exposure if something unpredictable occurs.
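The incremental rollout described above can be sketched as a loop over growing fractions of the fleet that stops as soon as a batch looks unhealthy. This is a minimal illustration, not Google's procedure: the stage percentages and the `apply_change` and `is_healthy` callbacks are assumptions made for the example.

```python
import random
from typing import Callable, List, Sequence, Tuple


def staged_rollout(
    vms: Sequence[str],
    apply_change: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    stage_percents: Sequence[int] = (1, 5, 25, 100),  # assumed stages
    seed: int = 0,
) -> Tuple[List[str], bool]:
    """Apply a change in stages; return (updated VMs, completed flag)."""
    order = list(vms)
    random.Random(seed).shuffle(order)  # avoid always hitting the same VMs first
    updated: List[str] = []
    done = 0
    for percent in stage_percents:
        # Cumulative number of VMs that should be updated after this stage
        # (integer ceiling of len(order) * percent / 100).
        target = max(done, (len(order) * percent + 99) // 100)
        batch = order[done:target]
        for vm in batch:
            apply_change(vm)
            updated.append(vm)
        done = target
        # Halt before widening the blast radius if any VM in the batch fails.
        if not all(is_healthy(vm) for vm in batch):
            return updated, False
    return updated, True
```

With 100 VMs and the assumed stages, a change that breaks connectivity would reach one VM before the rollout halts, rather than the whole fleet.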

The test suite that gave the A-OK signal will also be modified in response to the incident.

“Future changes will not be applied to production until the test suite has been improved to demonstrate parity with behavior observed in production during this incident,” said the company in a statement.

Last month, a network issue led to loss of connectivity to multiple zones. That cloud outage lasted roughly an hour.

Google’s IaaS cloud had a total of 4.5 hours of downtime last year across more than 70 outages, according to CloudHarmony.
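To put those figures in perspective, the short calculation below converts them into an availability percentage and an average outage length. It assumes the 4.5 hours is a single aggregate measured against a 365-day year and uses 70 as the outage count, so the average is an upper bound given "more than 70"; CloudHarmony's own methodology may differ.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, assuming a non-leap year


def availability_percent(downtime_hours: float,
                         period_hours: float = HOURS_PER_YEAR) -> float:
    """Share of the period the service was up, as a percentage."""
    return 100.0 * (1.0 - downtime_hours / period_hours)


def mean_outage_minutes(downtime_hours: float, outage_count: int) -> float:
    """Average length of one outage in minutes."""
    return downtime_hours * 60.0 / outage_count


print(round(availability_percent(4.5), 3))     # about 99.949
print(round(mean_outage_minutes(4.5, 70), 1))  # about 3.9
```

Under these assumptions, the typical outage was short, a few minutes on average, which makes the 45-minute and hour-long incidents described here stand out.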


About the Author

Jason Verge is an Editor/Industry Analyst on the Data Center Knowledge team with a strong background in the data center and Web hosting industries. In the past he’s covered all things Internet Infrastructure, including cloud (IaaS, PaaS and SaaS), mass market hosting, managed hosting, enterprise IT spending trends and M&A. He writes about a range of topics at DCK, with an emphasis on cloud hosting.
