Wikipedia’s Data Center Overheats
March 25th, 2010 By: Rich Miller
Wikipedia was offline yesterday after a cooling problem in Wikimedia’s European data center caused servers to overheat and shut down. The initial problem affected European users, but the main English-language Wikipedia site also went down when the failover to Wikimedia’s Tampa data center didn’t go as planned.
“Due to an overheating problem in our European data center many of our servers turned off to protect themselves,” Wikimedia reported on its tech blog. “As this impacted all Wikipedia and other projects access from European users, we were forced to move all user traffic to our Florida cluster, for which we have a standard quick failover procedure in place, that changes our DNS entries. However, shortly after we did this failover switch, it turned out that this failover mechanism was now broken, causing the DNS resolution of Wikimedia sites to stop working globally.” The DNS problem was fixed quickly, but it took an hour or more for the changed DNS settings to propagate to ISPs around the globe.
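The delay between fixing a DNS record and users seeing the fix is usually explained by caching: resolvers keep an answer until its time to live (TTL) expires and only then ask again. The following is a minimal sketch of that behavior, not a description of Wikimedia's actual setup. The names `CachedRecord`, `resolve` and `seconds_until_all_updated`, the one-hour TTL, the cache ages and the example addresses are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CachedRecord:
    address: str     # IP address the resolver is currently handing out
    expires_at: int  # time in seconds at which the cached entry goes stale


def resolve(cache: CachedRecord, now: int, authoritative: str, ttl: int) -> CachedRecord:
    """Serve the cached record while it is fresh, otherwise refetch it."""
    if now < cache.expires_at:
        return cache
    return CachedRecord(address=authoritative, expires_at=now + ttl)


def seconds_until_all_updated(cache_ages: List[int], ttl: int) -> int:
    """Worst-case wait after a DNS change until every resolver has refetched.

    cache_ages[i] is how many seconds ago resolver i cached the old record.
    """
    return max((max(ttl - age, 0) for age in cache_ages), default=0)


# Hypothetical values: a one-hour TTL and three resolvers that cached the
# old record 0, 1200 and 3000 seconds before the record was corrected.
ttl = 3600
wait = seconds_until_all_updated([0, 1200, 3000], ttl)

# A resolver holding the old address (documentation-range IPs) until t=600.
old = CachedRecord(address="192.0.2.1", expires_at=600)
still_old = resolve(old, now=599, authoritative="198.51.100.1", ttl=ttl)
refreshed = resolve(old, now=600, authoritative="198.51.100.1", ttl=ttl)
```

Under these assumed numbers, the resolver that cached the old record just before the change is the one that sets the worst case: it keeps serving stale data for the full TTL. Resolvers that ignore the TTL can take longer still.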
Wikipedia is one of the world’s busiest web sites. But unlike commercial ad-supported sites, the cost of downtime for the user-managed encyclopedia is minimal. “Down time used to be our most profitable product,” joked Domas Mituzas, a performance engineer at Wikipedia, during a 2008 overview of Wikipedia’s infrastructure. The gag is that when Wikipedia is offline, the site often displays a page seeking donations for additional servers.
Wikipedia houses about 50 servers in the EvoSwitch data center in Amsterdam. EvoSwitch is a 100,000-square-foot data center supported by 20 megawatts of power capacity that is generated entirely from sustainable energy sources, including solar, wind and biomass. The facility uses free cooling (fresh air economization) to reduce its use of energy for air conditioning. It’s not clear why the cooling system for the Wikipedia servers encountered problems yesterday.
Wikipedia may soon be getting more infrastructure to support its operations. In February the Wikimedia Foundation received a $2 million grant from Google, which it will use to expand its data centers.