Hosts: Leap Second Caused Spike in Power Usage
July 3rd, 2012 By: Rich Miller
Several large web hosting providers say the “leap second” last weekend caused power usage to spike on their Linux servers, which continued drawing power at above-average levels until they were restarted. One large German host, Hetzner AG, says the higher level of server activity increased its IT power usage by 1 megawatt. Another large European hosting firm, OVH, also reported a spike in power usage as servers struggled with the leap second issue.
“The reason for this huge surge is the additional switched leap second, which can lead to permanent CPU load on Linux servers,” Hetzner said in an email. The company asked customers to perform a soft reboot (or, if that doesn’t address the issue, a hard reboot in which the server is turned off) to reduce server power usage. Hetzner runs more than 35,000 servers across its infrastructure.
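To put Hetzner's figures in perspective, the short Python sketch below divides the reported 1 megawatt increase across the reported 35,000 servers. The function name `per_server_extra_watts` and the `affected_fraction` parameter are hypothetical, and the article does not say how many servers were actually affected, so the result is only a rough average under that assumption.

```python
def per_server_extra_watts(total_extra_w: float,
                           server_count: int,
                           affected_fraction: float = 1.0) -> float:
    """Average extra draw per affected server, in watts.

    affected_fraction is an assumption: the share of servers that hit
    the bug. The article gives no such figure.
    """
    affected_servers = server_count * affected_fraction
    return total_extra_w / affected_servers


# Figures reported in the article: 1 MW extra, more than 35,000 servers.
avg_all = per_server_extra_watts(1_000_000, 35_000)
# Hypothetical case: only half of the servers were affected.
avg_half = per_server_extra_watts(1_000_000, 35_000, affected_fraction=0.5)

print(f"all servers affected:  {avg_all:.1f} W each")
print(f"half affected (guess): {avg_half:.1f} W each")
```

If every server had been affected, the average works out to a little under 30 watts of extra draw per machine. A smaller affected share would mean a proportionally larger increase on each affected server.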
A leap second is a one-second adjustment that is occasionally applied to Coordinated Universal Time (UTC) to account for variations in the Earth’s rotation speed. A number of web sites and IT systems reported service outages Saturday as servers experienced difficulty handling the time adjustment.
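One way to see why the extra second is awkward for software is that many time representations have no slot for a 60th second. As a small illustration using only the Python standard library (not the Linux kernel code involved in this incident), `datetime` rejects the leap second timestamp, while `time.strptime` accepts it:

```python
import time
from datetime import datetime

LEAP_STAMP = "2012-06-30 23:59:60"  # the inserted second, in UTC

# time.strptime allows a seconds field of 60, so the stamp parses.
parsed = time.strptime(LEAP_STAMP, "%Y-%m-%d %H:%M:%S")
print("struct_time seconds field:", parsed.tm_sec)

# datetime only allows seconds 0..59, so the same instant is rejected.
try:
    datetime(2012, 6, 30, 23, 59, 60)
    datetime_accepts = True
except ValueError:
    datetime_accepts = False
print("datetime accepts second=60:", datetime_accepts)
```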
The additional second caused particular problems for Linux systems that use the Network Time Protocol (NTP) to synchronize their clocks with atomic clocks. The leap second caused these systems to believe that time had “expired,” triggering a loop condition in which the system endlessly sought to check the date, spiking CPU usage and power draw. Hosting companies are major users of Linux systems, which are released as open source and normally do not require a paid license.
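The loop condition described above can be sketched with a toy model in Python. This is not kernel code and not a reproduction of the actual bug. It assumes, purely for illustration, that after the leap second the timer logic reads a clock that is a fixed offset ahead of true time, so short waits look as if they have already expired. The function name `count_wakeups` and all of its parameters are hypothetical.

```python
def count_wakeups(timeout_ms: int,
                  stale_offset_ms: int,
                  window_ms: int = 1000,
                  cap: int = 10_000) -> int:
    """Count how often a wait-in-a-loop program wakes up in a time window.

    Simulated time only; nothing here actually sleeps.
    stale_offset_ms is the assumed error in the timer's view of "now".
    cap stops the simulation once the loop is clearly spinning.
    """
    now = 0
    wakeups = 0
    while now < window_ms and wakeups < cap:
        deadline = now + timeout_ms
        timer_now = now + stale_offset_ms  # what the timer logic believes
        remaining = max(0, deadline - timer_now)
        now += remaining  # the program sleeps only for what seems left
        wakeups += 1
    return wakeups


healthy = count_wakeups(timeout_ms=100, stale_offset_ms=0)
stale = count_wakeups(timeout_ms=100, stale_offset_ms=1000)
print("wake-ups with a consistent clock:", healthy)
print("wake-ups with a one-second error:", stale, "(hit the cap)")
```

In this model a program that should wake ten times a second never sleeps at all once the offset exceeds its timeout, which is the kind of busy loop that keeps a CPU, and its power draw, pinned until something resets the state.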
Power usage is a major expense for web hosts and companies that maintain thousands of servers, which has prompted an increased focus on energy efficiency in recent years.
For a deeper technical explanation of the Linux leap second bug, see this post by John Stultz at LKML.org.
As a sidebar, this event is a useful data point on how much power draw results from server processor activity. In the “useful work” discussion, it would seem to be an example of excessive power draw due to “non-useful work.”
[...] A 61-second minute was added to clocks around the world on June 30, 2012 at 23:59:60 UTC in order to compensate for slight variations in the Earth’s rotation speed. This triggered a number of software bugs, one of which caused a spike in data center electrical power consumption. [...]