• Outage in Dublin Knocks Amazon, Microsoft Data Centers Offline

    A lightning strike has caused power outages at the major cloud computing data hubs for Amazon and Microsoft in Dublin, Ireland. The incident has caused downtime for many sites using Amazon’s EC2 cloud computing platform, as well as users of Microsoft’s BPOS (Business Productivity Online Suite).

    Amazon said that lightning struck a transformer near its data center, causing an explosion and fire that knocked out utility service and left it unable to start its generators, resulting in a total power outage. While many sites were restored, Amazon said some sites that rely on one of its storage services may take 24-48 hours to fully recover. The company is bringing additional hardware online to try to speed the recovery process, but is advising customers whose sites are still offline to re-launch them in a different zone of its infrastructure.

    UPDATE: On Wednesday, Aug. 10, local utility ESB Networks said that lightning was not the cause of the transformer failure that cut utility power to both major data centers. The power company said the actual cause of the transformer failure remains under investigation.

    Amazon said the event affected one of the EC2 Availability Zones in its Dublin data center, which is the company’s primary European hub for its cloud computing platform.

    Generator Systems Disrupted

    “Normally, upon dropping the utility power provided by the transformer, electrical load would be seamlessly picked up by backup generators,” Amazon said in an update on its status dashboard. “The transient electric deviation caused by the explosion was large enough that it propagated to a portion of the phase control system that synchronizes the backup generator plant, disabling some of them.”

    “Power sources must be phase-synchronized before they can be brought online to load. Bringing these generators online required manual synchronization. We’ve now restored power to the Availability Zone and are bringing EC2 instances up.”
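    The synchronization requirement Amazon describes can be illustrated with a minimal sketch. It assumes a generator may only be connected to a live bus when its voltage, frequency and phase angle are all close to the bus values. The class name, function names and tolerance values below are hypothetical and chosen for illustration only; they do not come from Amazon's update or from any electrical standard.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Electrical state of a bus or generator (hypothetical model)."""
    voltage_v: float     # RMS voltage in volts
    frequency_hz: float  # frequency in hertz
    phase_deg: float     # phase angle relative to a shared reference


def phase_difference_deg(a: float, b: float) -> float:
    """Smallest absolute angle between two phases, in the range 0..180."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def safe_to_close(bus: Source, incoming: Source,
                  max_voltage_pct: float = 5.0,
                  max_freq_hz: float = 0.2,
                  max_phase_deg: float = 10.0) -> bool:
    """Return True only if all three quantities are within tolerance.

    The default tolerances are illustrative assumptions, not real limits.
    """
    voltage_pct = abs(incoming.voltage_v - bus.voltage_v) / bus.voltage_v * 100.0
    freq_delta = abs(incoming.frequency_hz - bus.frequency_hz)
    phase_delta = phase_difference_deg(incoming.phase_deg, bus.phase_deg)
    return (voltage_pct <= max_voltage_pct
            and freq_delta <= max_freq_hz
            and phase_delta <= max_phase_deg)


bus = Source(voltage_v=400.0, frequency_hz=50.0, phase_deg=0.0)
in_sync = Source(voltage_v=402.0, frequency_hz=50.1, phase_deg=355.0)
out_of_phase = Source(voltage_v=400.0, frequency_hz=50.0, phase_deg=180.0)

print(safe_to_close(bus, in_sync))       # within all tolerances
print(safe_to_close(bus, out_of_phase))  # phase is 180 degrees apart
```

    In this sketch an automatic controller would call a check like `safe_to_close` before closing a breaker. When that control system is disabled, as Amazon says happened here, an operator has to make the same judgment by hand.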

    Amazon said the power outage began at 10:41 a.m. Pacific time, with instances beginning to recover about three hours later at 1:47 p.m. Pacific time. Recovery is taking longer for some user instances, including those using Amazon Elastic Block Store (EBS), the company said.

    UPDATE: As of 10:45 p.m. Eastern, Amazon reported that 60% of the impacted instances had recovered and were available. “Stopping and starting impaired instances will not help you recover your instance,” AWS said. “For those looking for what you can do to recover more quickly, we recommend re-launching your instance in another Availability Zone.”

    UPDATE 2: Early Monday Amazon said that problems with EBS were slowing the recovery. “Restoring these volumes requires that we make an extra copy of all data, which has consumed most spare capacity and slowed our recovery process,” Amazon said in a status update. “We are in the process of installing additional capacity in order to support this process both by adding available capacity currently onsite and by moving capacity from other availability zones to the affected zone. While many volumes will be restored over the next several hours, we anticipate that it will take 24-48 hours until the process is completed.”

    Microsoft Outage

    The Twitter feed for Microsoft online services reported that a European data center power issue had affected access to its BPOS services. Microsoft reported that services were starting to come back online as of 7:30 p.m. Eastern/4:30 p.m. Pacific. A Microsoft statement said that a “widespread power outage in Dublin caused connectivity issues for European BPOS customers. Throughout the incident, we updated our customers regularly on the issue via our normal communication channels.”

    “We were informed the incident was the result of a lightning strike that caused an explosion on one of the substation’s transformers,” said a Microsoft spokeswoman. “The lightning strike created an electrical surge large enough that it affected a portion of our backup power systems. Our team is highly trained to respond to various types of issues, so they then began to manually transfer to generator power.”

    Microsoft said that the Dublin data center’s utility power was restored at 03:10 AM PST Monday and the data center “has returned to normal operating conditions.”

    Key European Cloud Computing Hub

    Dublin has become a key cloud computing gateway to Europe and beyond for U.S. companies due to several factors, including the city’s location, connectivity, climate and ready supply of IT workers. Dublin’s temperature is ideal for data center cooling, allowing companies to use fresh air to cool servers instead of using huge, power-hungry chillers to refrigerate cooling water.

    This allowed Microsoft to design and build one of the world’s most efficient data centers, a huge facility that hosts the company’s cloud services for Europe and operates entirely without chillers. At 550,000 square feet, it is also one of the world’s largest data centers.

    Amazon opened a data center in Dublin in December 2008 to house the European availability zones for its EC2 cloud computing services. The company recently acquired a 240,000-square-foot building in Dublin, which will be converted into an expansion data center.

    The company’s property moves reflect the rapid growth of its European cloud computing operation, which was chronicled by Netcraft in December.

    Lightning photo by Dagpeak via Flickr.

    About

    Rich Miller is the founder and editor-in-chief of Data Center Knowledge, and has been reporting on the data center sector since 2000. He has tracked the growing impact of high-density computing on the power and cooling of data centers, and the resulting push for improved energy efficiency in these facilities.

    Mundo hiperconectado | Blog PeruW

    Posted August 8th, 2011

    [...] a lightning bolt that struck on the outskirts of Dublin caused an incident in the transformers that suppl… and with which they serve, in Amazon’s case, its EC2 cloud computing platform and, in the [...]

    Mundo hiperconectado | La Isla Buscada

    Posted August 8th, 2011

    [...] Yesterday, a lightning bolt that struck on the outskirts of Dublin caused an incident in the transformers that suppl… and with which they serve, in Amazon’s case, its EC2 cloud computing platform and, in the [...]

    paulz

    Posted August 8th, 2011

    We use Amazon cloud-based order management software (linnworks). It went down yesterday (7 Aug 2011) around 6PM; it is now 7AM on 8 Aug 2011 and it is still not working (the software cannot find any available cloud server).

    Mundo hiperconectado

    Posted August 8th, 2011

    [...] a lightning bolt that struck on the outskirts of Dublin caused an incident in the transformers that suppl… and with which they serve, in Amazon’s case, its EC2 cloud computing platform and, in the [...]

    Memonic wieder verfügbar » memonic

    Posted August 8th, 2011

    [...] A lightning strike near Amazon’s Dublin data center led to a power outage. The lightning hit a transformer and caused an explosion and fire. As a result of the explosion, the backup generators were also damaged and had to be started manually. A complete power outage followed. [...]

    [...] (cloud) of Amazon. Microsoft’s equipment suffered as well. And all of this because of a storm in Dublin and an ill-fated lightning bolt, plus the fire it started nearby, which knocked out [...]

    [...] because of the city’s location, connectivity, climate and supply of IT workers, according to DataCenterKnowledge.com. The latest disruption in services is bound to raise questions over the reliability of cloud [...]

    person287

    Posted August 8th, 2011

    That’s pretty bad, but I didn’t actually notice it on any of the sites or services I use.

    Julian

    Posted August 8th, 2011

    This is quite worrying news as we move more and more of our data online. Hope cloud services can improve their contingency and redundancy plans!

    [...] Lightning in Dublin Knocks Amazon, Microsoft Data Centers Offline (Data Center Knowledge) [...]

    Daan

    Posted August 8th, 2011

    We have also been hit (www.kinderfee.de) and have been offline since last night.
    It seems Amazon will still be taking some time to get things running.
    A large number of European startups relying on the Amazon infrastructure will not generate any sales today…
    Amazon seems to have repaired its own sites first (e.g. amazon.de)…

    Brian

    Posted August 8th, 2011

    Round 1: “The Cloud” v Lightning….(winner Lightning)

    Soluweb

    Posted August 8th, 2011

    Those are some of the challenges Cloud computing is facing, many collateral damages.

    [...] Europe and the U.K. due to geography, connectivity, climate and available work force, according to DataCenterKnowledge.com. Dublin’s temperature is said to be ideal for data center cooling, enabling them to avoid [...]

    Joerg Steegmueller

    Posted August 8th, 2011

    There are definitely odd discrepancies between the apparent facts and the stories Amazon and Microsoft have released:

    There was no lightning strike yesterday in Dublin. The local power supply company had a short and very isolated outage in one small part of Dublin. There was NO explosion and no fire according to the local power supplier!

    Not sure where Amazon got their information from and surprising that they can claim that their backup generators didn’t start because of an explosion if no explosion did take place.

    Microsoft’s new and modern data centre is not in the area where the power outage happened, and due to the limited information available from Microsoft it can not be concluded that it happened at the same time!

    Yes, there was a very short (less than a second) power outage and the local electricity supplier says that the power source was immediately switched over to an alternate supply. But there seems to be some issues in Amazon! Otherwise their backup systems would have kicked in properly!

    KeepItHome

    Posted August 8th, 2011

    Never trust the online data world. Keep your data at home. If something goes wrong with the cloud, you have no way to save yourself. If it goes wrong at home, you know what to do, what it will take, how to do it, and so on.

    Skaperen

    Posted August 8th, 2011

    “Power sources must be phase-synchronized before they can be brought online to load.”

    That’s only true if you are trying to make a closed-transition switchover (whether mains is live or dead). A closed-transition system is useless for maintaining power from generators, since the mains outage can come as a surprise faster than even the fastest startup generators can kick in. The solution is to use open-transition switching to generators supplying a datacenter that is fully protected by a UPS/battery infrastructure (whether one per datacenter, per row, per rack, or per machine). Stagger the switchover in groups or section the generators for a smoother transition. Don’t even try to parallel generators in the design, just section everything apart by generator/switch.

    [...] according to a report by Datacenterknowledge.com, the data center operators were hit doubly hard. That is because the intensity of the lightning strike was [...]

    DG

    Posted August 8th, 2011

    Skaperen,

    I think they mean that they had to parallel the generators together onto one bus, not parallel the utility power and the generators. That is, bring up one generator, bring up the second, watch the synchroscope go around, and hit the breaker when it gets to 12 o’clock.

    They had UPS. They just didn’t have staff good enough to manually bring up the gensets before the UPS power died. Still, a better design would have prevented the downtime.

    [...] has indicated that lightning struck a transformer near its data center in Dublin, which has [...]

    [...] service and left it unable to start its generators, resulting in a total power outage, according to Data Center Knowledge, which originally reported the [...]

    [...] by the same lightning strikes in Dublin that affected Amazon’s datacenter was explained in a Data Center Knowledge story. Microsoft referred customers to private dashboards used by Business Productivity Online Service [...]

    [...] lightning strike, a fire broke out and Amazon was left without electrical power, notes DataCenterKnowledge. Amazon was not the only company affected. Microsoft also saw its data center [...]

    [...] Center Knowledge reported this week that lightning knocked out major cloud computing data centers in the Dublin [...]

    [...] this case, the data centers in Dublin run by Amazon and Microsoft got hit by lightning the other day causing them to go down. These [...]

    [...] It affected one Availability Zone (AZ) in AWS Europe. Rich Miller of Data Center Knowledge has detailed information on the incident.  Amazon said that lightning struck a transformer near its data center, [...]

    Brian Adler

    Posted August 9th, 2011

    A common misconception of the cloud is that it is a panacea for everything that ails IT, but it is important to remember that regardless of where your infrastructure is, “everything fails all the time”, and as such, you need to plan for it. Only one (of three) of the AWS AZs were affected by this outage, so if architectural best practices are followed, events such as this can be tolerated with little to no service disruption. Many RightScale customers come to us to discuss this very issue, including HA solutions and DR scenarios. Our best practices in this regard are summarized here: http://bit.ly/rightscalewp
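    The multi-zone point in the comment above can be illustrated with a short sketch. It assumes instances are spread round-robin across three zones and asks what fraction of capacity survives if one zone fails. The zone names and function names are hypothetical, and the code models only the arithmetic, not any real AWS API.

```python
from collections import Counter
from itertools import cycle


def spread_across_zones(instance_count: int, zones: list[str]) -> dict[str, int]:
    """Assign instances to zones round-robin and return the count per zone."""
    placement = Counter({zone: 0 for zone in zones})
    for _, zone in zip(range(instance_count), cycle(zones)):
        placement[zone] += 1
    return dict(placement)


def surviving_capacity(placement: dict[str, int], failed_zone: str) -> float:
    """Fraction of instances still running after one zone is lost."""
    total = sum(placement.values())
    if total == 0:
        raise ValueError("placement contains no instances")
    return (total - placement.get(failed_zone, 0)) / total


# Hypothetical zone names, used only for illustration.
zones = ["zone-a", "zone-b", "zone-c"]
placement = spread_across_zones(9, zones)
print(placement)
print(surviving_capacity(placement, "zone-a"))
```

    With everything in a single zone the surviving fraction in this model drops to zero, which matches the experience of the sites that stayed offline until they re-launched elsewhere.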

    [...] this article: Lightning in Dublin Knocks Amazon, Microsoft Data Centers Offline – Data Center Knowledge Related [...]

    Henrik

    Posted August 10th, 2011

    It is just too much; they don’t have enough control of backup procedures and so on.

    Deiric

    Posted August 10th, 2011

    There was a massive lightning strike and my neighborhood (in Dublin) was out for more than 30 minutes, which hasn’t happened in many years. Sometimes you get a one-second drop due to a local lightning strike, but this was definitely more serious.
    So, I’d question the source of the local utility statement.

    Rich Miller

    Posted August 10th, 2011

    The local utility has now confirmed that lightning was not the cause of the transformer failure that led to the outage. We’ve updated our story to reflect this. See this update for details:

    http://www.datacenterknowledge.com/archives/2011/08/10/dublin-utility-power-outage-not-caused-by-lightning-strike/

    När naturen rasar | Surftown

    Posted August 11th, 2011

    [...] Read more here (English). [...]

    Doug

    Posted August 12th, 2011

    It’s instances like these that have to remind all of us how vulnerable our industry can be. Something so small can have such a massive international effect.

    Asaf Meir

    Posted August 14th, 2011

    Perhaps the solution is atomic shelter – not less than that!

    [...] week’s possibly lightning-caused outages at Microsoft and Amazon Web Services reiterated a very important lesson in cloud computing: Stuff happens, and even the best-laid plans [...]

    [...] week’s possibly lightning-caused outages at Microsoft and Amazon Web Services reiterated a very important lesson in cloud computing: Stuff happens, and [...]

    [...] Tech news. News gadgets reviews and secrets. Last week’s possibly lightning-caused outages at Microsoft and Amazon Web Services reiterated a crucial lesson in cloud computing: Stuff happens, or even [...]

    [...] week’s presumably lightning-caused outages during Microsoft and Amazon Web Services reiterated a really critical doctrine in cloud computing: Stuff happens, [...]

    [...] August 7, power equipment failures took a portion of Amazon’s EC2 cloud computing platform in Europe offline for 24-48 [...]

    [...] and back ups for when EBS fails[5]. And cloud is made by humans. Trucks hit generators[6]. Lighting strikes[7]. And that means the cloud will fail and the customer may not have any recourse to have prevented [...]

    [...] snapshots, and back ups for when EBS fails. And cloud is made by humans. Trucks hit generators. Lighting strikes. And that means the cloud will fail and the customer may not have any recourse to have prevented [...]

    [...] and centralized, which not only drive control but, because of their centralized logic, are also not very robust. And in case any doubt remained, those who really know about this, the dictatorships, have no doubts: they want [...]

    [...] electric deviation” and disabling some of a energy synchronization systems, according to a report by record site, Data Center Knowledge. Tags: malpractice, malpractice insurance, medical malpractice, professional liability [...]

    [...] In April, AWS suffered a prejudiced disaster to a Elastic Compute Cloud (EC2) during a northern Virginia site, that brought down important sites such as Quora and Reddit. Its Dublin, Ireland, EC2 site also gifted downtime final month, following an “explosion that caused a transitory electric deviation” and disabling some of a energy synchronisation systems, according to a report by record site Data Center Knowledge. [...]

    [...] 212011   Selling Down The River is back online.  I lost everything due to a mixture of an issue at the Amazon Ireland Datacentre and my own incompetence about how EC2 works with snapshots and AMI’s.  It’s taken a [...]

    [...] configuration error. In August the European cloud operations of both Microsoft and Amazon were knocked offline by a power outage in [...]

    [...] those with their own agenda tend to pile on — even when it’s a complete freak of nature like the lightning strike in Ireland last [...]

    Data Center Availability « CCSK Guide

    Posted November 9th, 2011

    [...] have been recent reports of lightning in the Dublin, Ireland area that had knocked out major cloud computing data centers. [...]

    [...] DataCenterKnowledge report that this lightning strike has caused downtime for many sites using Amazon’s EC2 cloud computing platform, as well as users of Microsoft’s BPOS (Business Productivity Online Suite). In fact,  some sites that rely on one of its storage services took between 24 and 48 hours to be fully recover. [...]

    [...] few months ago, Amazon’s cloud went through repeated areas of turbulence (on April 21st and August 7th) causing outages for many online [...]

    [...] Some can be puzzling, like one in Dublin last year affecting both Amazon and Microsoft that Amazon initially said was caused by a lightning strike hitting a generator, leading to an explosion and fire. It turned [...]
