Managing Megasites: ‘An Insane Amount of Will’

One runs a popular service on just 350 servers, while another likely has more than a million servers. The common denominator: major traffic. Executives from six of the web’s most popular properties – Google, Microsoft, Yahoo, Facebook, MySpace and LinkedIn – shared the stage at Structure 09 yesterday to discuss their infrastructure and innovations.

Managing a megasite requires plenty of hardware. But that’s not the secret sauce, according to Vijay Gill, the Senior Manager of Engineering and Architecture at Google (GOOG). “The key is not the data centers,” said Gill. “Those are just atoms. Any idiot can build atoms and have this vast infrastructure. How you optimize it – those are the hard parts. It takes an insane amount of will.”

The challenges faced by the six sites varied. “I’m taking a minimalist approach,” said Lloyd Taylor, the VP of Technical Operations for the LinkedIn social network. “How little infrastructure can we use to run this? The whole (LinkedIn) site runs on about 350 servers.” That’s due largely to the fact that much of the content served by LinkedIn consists of profiles and discussion groups, which are heavy on text. “We’re not a media-intensive site,” said Taylor.

Not so for Google, which operates the YouTube video portal. Google says YouTube users upload 10 hours of video content every minute. “We realized we couldn’t build the capacity as fast as we needed it,” said Gill, adding that Google engineers developed a sophisticated distributed caching system that instantly determines a user’s location and serves YouTube videos from a local cache. “You cannot outsource this,” said Gill. “We have to do this in-house because it’s our core competency.”
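
Gill did not describe how the system works internally. As a rough illustration of the general idea only (locate the user, then serve from the closest cache), here is a minimal sketch. The site names, coordinates and function names are hypothetical, and the great-circle distance rule is an assumption made for the example, not Google's method.

```python
import math

# Hypothetical cache sites (name -> (latitude, longitude)).
# These are illustrative values, not details from the panel.
CACHE_SITES = {
    "us-east": (39.0, -77.5),
    "eu-west": (53.3, -6.3),
    "asia-east": (35.7, 139.7),
}


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_cache(user_location, sites=CACHE_SITES):
    """Return the name of the cache site closest to the user's location."""
    return min(sites, key=lambda name: haversine_km(user_location, sites[name]))
```

A production system would likely also weigh network latency, cache load and whether the requested video is already stored at a given site. Geographic distance is only the simplest stand-in for "local."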

‘No Religion About It’
Yahoo takes a different approach to content delivery, according to VP of Global Networks Raj Patel. “We use a mix of approaches for CDN and caching,” said Patel. “It’s indispensable to what we do, but there is no religion about doing it ourselves. It’s based on the economics of what we do, as well as performance. There’s a very direct tie-in from performance to business revenue.”

Microsoft has invested heavily in building an Edge Computing Network, but continues to use major commercial content delivery providers, including Akamai Technologies (AKAM), Limelight Networks (LLNW) and Level 3 (LVLT). “The challenge we end up with is that we have all sorts of applications,” said Najam Ahmad, the General Manager of Global Networking Services for Microsoft. “You can’t handle all these applications in the same way. We end up with a varied mix of our own capabilities and CDNs.”

Ahmad said that shifting applications to its in-house Edge Computing Network has produced an 80 percent performance improvement in some applications. “That’s why it’s a competency we need to have,” he said.

Human Error … Still
Gill said the toughest challenges are not related to hardware. “The major problems are human error and software error,” said Gill. “They always have been, and I believe they always will be.”

What’s on the wish list for the megasite minders? “If I could have one thing I don’t have right now, it would be mass-scale, super-fast storage,” said Richard Buckingham, the VP of Technical Operations at MySpace. Buckingham said MySpace has been testing flash storage technology from Fusion-io. “It’s pretty ground-breaking and revolutionary,” he said.

Facebook VP of Technical Operations Jonathan Heiliger served as the moderator of the panel. In an earlier presentation, he noted the importance of investing in infrastructure to drive site performance.

Google’s Gill agreed. “We have a saying: speed costs money; how fast do you want to go?” said Gill. “And we want to go very fast indeed.”


About the Author

Rich Miller is the founder and editor at large of Data Center Knowledge, and has been reporting on the data center sector since 2000. He has tracked the growing impact of high-density computing on the power and cooling of data centers, and the resulting push for improved energy efficiency in these facilities.
