How does the social media site Digg manage more than 26 million visitors a month? Site administrator Ron Gorodetzky talks to System Management News about the architecture Digg uses on its web site, with a focus on database management techniques. An excerpt:
“The first pain point we hit was just database stuff. The first thing you’ll notice is when you start to grow these queries, the database can’t commit as much time to committing a certain query as it used to,” said Gorodetzky. “You’ll find the normal things that work, suddenly don’t. You’ll find that, one day, you’ll see a spike in your graphs telling you that something’s going slower. Once you do that, you get to the point where the database part is as fast as it can be, you cache things. You scale out your Web server so you have more resources there, generally caching and doing less work per request.”
Gorodetzky also talks about the challenges involved with image serving, especially the expanded use of thumbnails. The site runs on a LAMP stack (Linux-Apache-MySQL-PHP).
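To make the caching step Gorodetzky describes a bit more concrete, here is a minimal sketch of a read-through cache that does "less work per request." This is not Digg's code: Digg runs PHP, and the interview does not say how its cache is built. The class `ReadThroughCache`, the function `fetch_front_page_from_db`, and the 30-second expiry are all assumptions made up for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class ReadThroughCache:
    """Tiny in-process cache with per-entry expiry (hypothetical helper)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # Maps key -> (expiry timestamp, cached value).
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the database entirely
        value = loader()  # missing or expired: do the expensive work once
        self._entries[key] = (now + self.ttl_seconds, value)
        return value


query_count = 0


def fetch_front_page_from_db() -> list:
    """Stand-in for an expensive database query (hypothetical)."""
    global query_count
    query_count += 1
    return ["story-1", "story-2", "story-3"]


# Assumed expiry of 30 seconds; a real site would tune this per page.
cache = ReadThroughCache(ttl_seconds=30.0)

# Simulate 100 page requests arriving within the expiry window.
for _ in range(100):
    stories = cache.get_or_load("front_page", fetch_front_page_from_db)
```

In this sketch the 100 simulated requests trigger a single call to the stand-in query, because every request after the first is served from memory. A production setup would more likely use a shared cache service so that all web servers see the same entries.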
You can track what we’re up to at Digg here. If you’re interested in data center and cloud computing news, add us to your friend list. Also, if you haven’t yet seen System Management News, it’s definitely worth a read. One of its columnists is John Rath, whom many of you may know from his Data Center Links blog.