Technical Details of Facebook Outage

Facebook was down for more than two hours Thursday afternoon, marking its longest outage in about four years. The Facebook Engineering blog has posted a detailed explanation of what happened. “The key flaw that caused this outage to be so severe was an unfortunate handling of an error condition,” writes Facebook’s Robert Johnson. “An automated system for verifying configuration values ended up causing much more damage than it fixed.”

In short: A configuration change created a feedback loop that overwhelmed a database cluster. The only way to fix the problem was to take the whole cluster offline, which meant downtime for the web site. Read the Engineering blog for more details.
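
To make the idea of a feedback loop concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not a reconstruction of Facebook's systems: the function name `simulate_load`, its parameters, and all of the numbers are hypothetical. The one assumption that matters is `fanout`, the number of follow-up queries each failed database query triggers in the next round. When errors are treated as a reason to query again, load grows even after the initial spike has passed.

```python
def simulate_load(rounds, spike, baseline, capacity, fanout):
    """Return the query load seen by a database over several rounds.

    Hypothetical toy model (all parameters are illustrative assumptions):
      spike    - queries arriving in the first round
      baseline - normal queries arriving in every later round
      capacity - queries the database can serve per round
      fanout   - follow-up queries triggered by each failed query
                 (0 means a failure does not cause any retry)
    """
    history = []
    load = spike
    for _ in range(rounds):
        history.append(load)
        # Anything beyond capacity fails this round.
        failed = max(0, load - capacity)
        # Each failure feeds back into the next round's load.
        load = baseline + failed * fanout
    return history


# Errors trigger more queries: the load keeps climbing.
print(simulate_load(rounds=4, spike=500, baseline=50, capacity=100, fanout=2))
# Errors do not trigger retries: the load falls back to normal.
print(simulate_load(rounds=4, spike=500, baseline=50, capacity=100, fanout=0))
```

In this model, once failures feed the next round, the only way to bring load back under capacity is to stop the incoming queries entirely, which is consistent with the fix described above of taking the cluster offline.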


About the Author

Rich Miller is the founder and editor at large of Data Center Knowledge, and has been reporting on the data center sector since 2000. He has tracked the growing impact of high-density computing on the power and cooling of data centers, and the resulting push for improved energy efficiency in these facilities.
