NEW YORK - As Superstorm Sandy came ashore on the evening of Monday, Oct. 29, the staff at Datagram believed they were as ready as they could be, and hunkered down for a busy night. They had no idea how busy.
"We had our NOC (network operations center) on call, and we had been been testing generators to make sure they were ready," said Alex Reppen, the CEO and founder of Datagram, a managd hosting provider. "We felt very confident that we would weather the storm.”
Mother Nature had other ideas. As the storm surge from Sandy pushed into the south end of Manhattan, water poured into the streets surrounding 33 Whitehall Street, home to Datagram's primary data center.
"It was apocalyptic," said Reppen. "It was like a tidal wave over lower Manhattan. Cars were picked up and swept away. We began to see these incredibly powerful surges of water into our basements. It was absolute chaos."
Water quickly filled the building's two basement levels, which house the diesel fuel tanks and pumps supporting Datagram's emergency backup generators, as well as key switchgear. By then, Con Edison had already shut down the local power grid. As several of its best-known customer sites went dark, Datagram began a week-long struggle to bring its storm-ravaged infrastructure back online.
On The Front Line of the Storm Surge
Sandy was a huge challenge for the entire New York/New Jersey data center industry, but the superstorm's greatest impact was felt by a handful of facilities in the "Zone A" flood zone in lower Manhattan, whose basements and lobbies were flooded by the brutal storm surge. These buildings - which included 33 Whitehall, 60 Broad Street, 121 Varick Street and several Verizon facilities - confronted unprecedented damage as the storm came ashore.
A flood that damages mission-critical equipment is among the worst scenarios a data center can face, offering little hope for a quick fix. The extent of the problems facing 33 Whitehall became apparent quickly in the Datagram NOC.
"It was a lot happening, all at once," said Reppen. "The water set off alarms on our building management system. It was like a Christmas tree. We systematically sorted through the alarms. We practice this, and we have procedures, but it was a lot of scrambling."
The first task was to assess whether the diesel pumps in the basement remained operational, and could continue to support Datagram's rooftop generator. The news wasn't good.
'Devastation Outside Our Windows'
"Our main priority was to keep our generator running," said Reppen. "One of our technicians got showered with diesel fuel removing a solenoid (a valve to improve the flow of fuel through the supply line). That helped for a little while. But we saw the devastation outside our windows, and began to concentrate on cutting customers over (to backup facilities)."
Datagram owns and operates two data centers. In addition to the 16,000 square foot facility on the 25th floor of 33 Whitehall, the company also has a facility in Bethel, Connecticut, as well as colocation space at major New York and New Jersey data hubs. Many of Datagram's customers, especially those in financial services, are "double-homed" and can operate their infrastructure from either location. The Datagram staff focused on helping those customers maintain their operations.
The news was less promising for customers with single-homed servers at 33 Whitehall, who were facing days of downtime. Water filled both basement levels and the building's ornate lobby. The neighborhood was underwater.
"It was desolation," said Reppen. "The streets were like a riverbed, with mud and garbage everywhere. I saw a filing cabinet and a typewriter lying in the street. It was 36 hours before we could walk out to the lobby without being up to our waist in water.
"At first, pumping was futile," as there was water everywhere and no place for it to go. "Once the sewers started working, we were pumping like crazy. It really took 2 to 3 days solid days of pumping (to empty the basements)."
With the fuel pumps badly damaged, Reppen and his team began searching for portable street-level diesel backup generators that could restore power to the data center.
"We ordered six generators, and only one arrived," said Reppen, who said one generator was sold out from under them when the generator owner received a better offer while the unit was en route. Other delivery attempts resulted in repeated delays from heavy traffic and restrictions on access to the flood zone in lower Manhattan. It wasn't until the afternoon of Friday, Nov. 2 that the 2-megawatt Caterpillar generator arrived outside 33 Whitehall. By the following day, services were restored.
A Street Full of Generators
"Once we got the physical generator on site, we were on our way," said Reppen, who said two other tenants, Verizon and Cogent, also have mobile generators on-site.”We have a whole street full of generators.”
The generators also allowed power to be restored to the building's elevators - which was key for Datagram, whose staff had been using the stairs to access the company's data center on the 25th floor. Once the backup generator was on-site, repairs continued on the switchgear and the basement pumps. Once those were fixed, the rooftop diesel fuel "day tank" needed to be cleaned, polished and refilled.
With those systems now back online, Datagram continues to operate on generator power. "We should have at least temporary feed from ConEd (this) week," said Reppen. The building is still not open to most tenants.
Datagram was founded in 1994 as an ISP and managed services provider. After a period of growth in colocation facilities, in 2004 the company opened its data center at 33 Whitehall, a 30-story building also known as the Broad Financial Center. The building was the original NASDAQ headquarters, and a major telco hub for PSINet and Verizon Business.
So what are the lessons learned from Datagram's experience with Sandy?
"The biggest lesson learned is redundancy," said Reppen. "A large number of customers buy services that are redundant. A lot of what we do is to guarantee 100% uptime, and many of our customers had a seamless experience. But I think a lot of complacency has built up over the years, and it left a lot of people at risk. A lot of customers opted out (of redundant hosting).
"In the future we're going to speak up with our customers that are single-homed," he added. "That's 'pre-Sandy thinking.' We’re not seeing this a sales opportunity, but we intend to be very direct in helping them understand the consequences."
While he's not without a stake in the issue, Reppen doesn't believe the flooding and downtime are likely to diminish the demand for data center space in lower Manhattan, as some have suggested.
Customers 'Need to be Downtown'
"We have customers that are downtown with us because they need to be downtown," said Reppen. "They have no option. If you need a lot of connectivity, the tip of New York is the best place for that. Jersey or uptown are not the answer for these customers. There are other customers that really didn't need to be downtown, and most have already moved to Connecticut or one of our other facilities.
"We haven't lost any customers," he added. "We have one or two that are unhappy with us and likely to move, but our larger customers have all been solid."
Reppen also said the outage provided an interesting test of the "cloudability" of complex infrastructures. "Some customers who are single-homed tried to go to Amazon and other clouds, and ran into a lot of trouble adapting their environments," he said. "Once they figured it out, their bills were 2 to 4 times what they were paying us. I think a lot of people discovered very quickly that thinking the cloud is going to save you is ludicrous."
Moving to the Mezzanine
One of the most common questions about the outages at data center providers in Lower Manhattan concerns the location of the diesel tanks and fuel pumps. Why would you put mission-critical equipment in a basement in a potential flood zone? According to Reppen, the answer is simple: because the city wouldn't let them put it on the roof.
New York's restrictions on rooftop diesel storage tanks arose from the Sept. 11 terrorist attacks, when an early engineering analysis suggested that leaking fuel from diesel tanks contributed to the collapse of 7 World Trade Center. The building wasn't struck by a plane, as was the case with the Twin Towers, but was damaged by debris from the collapse of 1 World Trade Center. The official report from FEMA gave credence to this theory, but a subsequent in-depth technical analysis from NIST found that the diesel fuel was not a significant contributor to the fires and subsequent collapse. The restrictions persist nonetheless.
"We had recently applied to move the fuel tanks to the roof and were rejected,” said Reppen. "We're hoping the city may change its view on fuel. We've talked to a lot of our competitors, and believe that this issue will be revisited."
Despite the ban on rooftop fuel storage, Reppen said that Datagram and the building management at 33 Whitehall realize that housing critical infrastructure in the basement is no longer viable. A project is underway to move the diesel fuel storage tanks, pumps and switchgear to the mezzanine level. Located just above a three-story atrium at 33 Whitehall, the mezzanine level is about 35 to 40 feet above street level - above flood risk, but low enough that the city will allow fuel storage. The project is expected to be completed sometime in the first quarter of 2013.
"We're prepared to make the investment," said Reppen. "It's investing in our customers and our future. Being downtown is critical for us. There's not a whole lot we can do about the location except fix it."
Datagram is also addressing the difficulties in procuring a mobile generator. "We're going to buy a street generator and park it in our Connecticut data center," said Reppen. "Then we just need a driver to get it to New York."
The lesson of Sandy is that in envisioning risks to uptime, you need to think beyond your previous experience. "We’re still thinking about what the next disaster could be," said Reppen. "We've had a terrorist attack. We've had a flood. Maybe a plague of locusts jamming the air intakes.
"This has been a a tremendous learning experience, which we’re not going to waste."