The business networking site LinkedIn announced plans yesterday for an initial public offering in which it hopes to raise at least $175 million. The company's SEC filing contained an interesting tidbit: LinkedIn said it currently has no real-time backup data center, meaning a failure of its primary data center would knock its LinkedIn.com site offline.
"We recently implemented a disaster recovery program, which allows us to move production to a back-up data center in the event of a catastrophe," the company said in its SEC filing. "Although this program is functional, it does not yet provide a real-time back-up data center, so if our primary data center shuts down, there will be a period of time that the website will remain shut down while the transition to the back-up data center takes place."
Data Backed Up, But Recovery Not Instant
Is this problematic? The company says some of its key infrastructure is located in San Francisco and southern California, which are both prone to earthquakes. "Despite any precautions we may take, the occurrence of a natural disaster or other unanticipated problems at our hosting facilities could result in lengthy interruptions in our services," the company said.
Despite the warning, LinkedIn has taken steps to protect its user data. In mid-2008 LinkedIn announced that it was deploying equipment to support a business continuity program in an Equinix data center in Chicago. The company said it already housed equipment in Equinix data centers in California.
Last month, LinkedIn opened a new data center in Los Angeles, saying that the expansion would provide "an additional, more robust data center that not only helps us handle the increasing traffic load on our servers, but to also provide more redundancy in case of an emergency."
It appears LinkedIn has its data backed up to a remote data center using a "cold" or "warm" backup configuration. These approaches don't provide an instant rollover in the event of a major downtime event, but allow a site owner to redeploy the site from the most recent backup. Servers in the backup data center are typically configured with the required software and applications, so they're ready to be deployed as needed. LinkedIn didn't indicate how long it might be offline in the event of a data center failure.
Larger Internet companies like Google, Microsoft, Yahoo and Facebook have multiple data centers and can use their network to quickly shift workloads between different facilities. LinkedIn's infrastructure has not yet reached that scale.
Real-Time Replication Challenges
Why has LinkedIn not arranged for a real-time backup setup? It's not a simple matter for database-driven sites, as discussed by the Facebook engineering team when Facebook expanded its infrastructure to add its first East Coast data center in Virginia. Setting up a second site serving real-time data created "two main application level challenges: cache consistency and traffic routing," noted Facebook's Jason Sobel in his recap of the effort.
LinkedIn isn't the first aspiring public company to have its redundancy scrutinized. In 2007 the IPO filing for software-as-a-service provider NetSuite revealed that the company had no backup data center. Companies are required to disclose potential business risks to investors, which for web sites typically include data protection policies. Such warnings don't necessarily portend downtime; in 2010 NetSuite had uptime of 99.97 percent.
Ironically, the LinkedIn web site appeared to be down for some users Thursday after it announced the filing of its IPO. But here's the nice thing about IPOs: they give you additional cash to invest in infrastructure.