Did a common data center safety feature play a role in the Mubarak government’s move to cut off Egypt’s access to the Internet? No one seems to know precisely how the nation-wide Internet block was accomplished, but recent reports suggest that equipment at a key Internet exchange in Cairo may have been powered down by “throwing a breaker.”
Other analysts believe the Egyptian government either altered the routing tables that manage Internet traffic, or simply called ISPs and ordered them to halt services. This view would seem to be supported by routing data indicating that Egyptian networks lost service gradually, rather than all at once.
But the reference to a “breaker” being used to power down equipment raises the prospect that the incident may have involved an Emergency Power Off (EPO) button or a similar hardware shutdown. The EPO, often known as the “Big Red Button,” instantly shuts off power for the entire data center in the event of an emergency, and is a requirement of fire codes in many jurisdictions.
The Internet as Disruptor
In the wake of the Egyptian Internet shutdown, there has been an intensive effort to understand how it was accomplished, partly out of surprise that such a cutoff was possible, and partly to raise awareness of the potential for it to recur in other nations – a concern reinforced by sporadic national Internet outages in Libya this week amid anti-government protests.
Much of the analysis of Egypt’s net blackout has focused on 26 Ramses Street, the primary telecom building and Internet exchange in Cairo. The New York Times reported this week that the building was the “focal point” of the Net blackout.
“There has been intense debate both inside and outside Egypt on whether the cutoff at 26 Ramses Street was accomplished by surgically tampering with the software mechanism that defines how networks at the core of the Internet communicate with one another, or by a blunt approach: simply cutting off the power to the router computers that connect Egypt to the outside world,” The Times writes.
A report last week in Wired cited a presentation by Bill Woodcock of the Packet Clearing House, which stated that “most of the outage was effected through a breaker flipped in the Ramses exchange, and the rest was phone calls and arm-twisting.” A timeline compiled by Woodcock says breakers were thrown in the international transport and national IXP section of the Ramses Exchange at 12:28 a.m. on Friday, January 28, dropping the number of connected netblocks from 3,500 to about 300.
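To put the timeline's figures in proportion, the short Python sketch below computes the share of netblocks that dropped off. The two counts are the ones reported in Woodcock's timeline; the function name `fraction_withdrawn` is purely illustrative and not drawn from any report.

```python
def fraction_withdrawn(before: int, after: int) -> float:
    """Return the share of netblocks that disappeared, as a value from 0 to 1."""
    if before <= 0:
        raise ValueError("'before' must be a positive count")
    return (before - after) / before


# Counts reported in the timeline: 3,500 connected netblocks before, about 300 after.
share = fraction_withdrawn(3500, 300)
print(f"{share:.1%}")  # prints 91.4%
```

By these numbers, roughly nine in ten Egyptian netblocks went dark in that step, with the remainder left for what Woodcock described as phone calls and arm-twisting.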
Kill Switch ‘Not Realistic’
Some other analysts reject the notion of a physical shutdown of equipment, saying software and phone calls likely did the trick. Among them is Jim Cowie, the co-founder and chief technology officer of Renesys, which tracks Internet routing. The Renesys blog provided regular analysis of Egyptian networks during the crisis, and asserted that the shutdown “was not an instantaneous event on the front end; each service provider approached the task of shutting down its part of the Egyptian Internet separately.”
“People have talked about a ‘kill switch,’ but that is not realistic,” Cowie told the Wall Street Journal. “What is most likely is that somebody in the government gives a phone call to a small number of people and says, ‘Turn it off.’ And then one engineer at each service provider logs into the equipment and changes the configuration of how traffic should flow.”
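The disagreement comes down to timing: a single breaker would take every provider offline at nearly the same moment, while separate configuration changes would spread the withdrawals out. The Python sketch below shows one way that distinction could be tested. The provider names, the timestamps and the one-minute threshold are all hypothetical placeholders, not Renesys measurements.

```python
from datetime import datetime, timedelta


def withdrawal_spread(times: dict[str, datetime]) -> timedelta:
    """Time between the first and last provider dropping off the network."""
    if not times:
        raise ValueError("need at least one withdrawal time")
    return max(times.values()) - min(times.values())


def looks_staggered(times: dict[str, datetime],
                    threshold: timedelta = timedelta(minutes=1)) -> bool:
    """True when withdrawals are spread over more than the threshold.

    The one-minute default is an arbitrary assumption for illustration.
    """
    return withdrawal_spread(times) > threshold


# Hypothetical withdrawal times, invented for this example only.
example = {
    "provider_a": datetime(2011, 1, 27, 22, 12),
    "provider_b": datetime(2011, 1, 27, 22, 13),
    "provider_c": datetime(2011, 1, 27, 22, 25),
}
print(withdrawal_spread(example))  # prints 0:13:00
print(looks_staggered(example))    # prints True
```

A spread of several minutes across providers would fit the provider-by-provider account; a spread of seconds would be easier to square with a power cut at a shared facility.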
EPO as Show Stopper
The ability of a data center EPO system to quickly knock networks offline has been well documented over the years, often when an unsuspecting technician, vendor or cleaning crew presses the button by mistake.
The history of the emergency power off switch dates back to 1959, when a fire in the Air Force’s statistical division in the Pentagon caused $6.9 million in property damage and destroyed three IBM mainframe computers. The National Fire Protection Association (NFPA) was tasked with developing rules to address fire risks in IT environments.
Rick Sawyer, a strategist at HP Mission Critical Facilities, is a leading expert on EPO function and design. Sawyer says that although emergency power off systems are part of the U.S. fire code, many international data centers were designed and built by U.S. firms.
“It is entirely reasonable to assume that EPO’s are in a lot of world wide data centers where they are not required by local codes, even if codes don’t exist – and they don’t in some places,” said Sawyer.
Inelegant but Effective Shutdown Method
The EPO button has previously figured in attempts to disable critical infrastructure. In April 2007 a disgruntled technician hit the EPO button at the data center that controls the electrical grid for the state of California, with the FBI calling the incident an act of deliberate sabotage. Officials said that the outage could have disrupted the power grid for the Western U.S. if it occurred during normal business hours instead of late Sunday night. While sabotage is difficult to predict and prevent, the EPO button provided a mechanism for doing maximum damage in an instant.
Hitting the EPO button is a particularly disruptive method for interrupting network traffic, and unlikely to be the first choice for anyone familiar with IT equipment. Sudden power shutdowns are hard on equipment, especially power supplies. Sudden power losses in data centers can result in some servers being unable to start and requiring new power supplies.
Several Network Control Methods
Sawyer said there are several alternate methods that could be used to power down equipment, including turning off a breaker supplying power to the servers and routers supporting the network, or walking around the data center and physically powering down servers.
He said an admin could also send a “power off” command through network management software. “This is commonly used to reboot a network device remotely, but instead of restarting as in a reboot, you just leave the device off,” he said. Remote reboots are common in hosting environments, allowing a technician or customer to reboot a server without entering the data center.
Time for a Business Continuity Review?
Whatever the methodology, Sawyer says the recent events in Egypt and Libya drive home the need for multi-national corporations to review their business continuity procedures.
“It is assumed by most companies that the internet will always be there and that the business processes from sales to delivery can rely on it,” said Sawyer. “But the Internet can fail, and Egypt is a case in point. Companies should plan on being able to sustain their business activities for a period of time using back-up ‘low tech’ solutions – like paper – until function is restored.”