One of the key selling points for cloud computing is scalability: the ability to handle traffic spikes smoothly without the expense and hassle of adding more dedicated servers. But this week some users of Amazon EC2 are reporting that their apps on the cloud computing service are having problems scaling efficiently, and suggesting that this uneven performance could be due to capacity problems in Amazon’s data centers.
The reports emerge as rival services focus on Amazon’s performance in the battle for cloud computing mindshare and customers.
Amazon says that if customers are experiencing performance problems, it isn’t because EC2 is overloaded. “We do not have over-capacity issues,” said Amazon spokesperson Kay Kinton. “When customers report a problem they are having, we take it very seriously. Sometimes this means working with customers to tweak their configurations or it could mean making modifications in our services to assure maximum performance.”
Performance Issues Prompt Instance Upgrades
Kinton said Amazon has reached out to Alan Williamson of the cloud consultancy AW2.0, who is also editor of the Cloud Computing Journal and has been using EC2 for three years. In a blog post Tuesday, Williamson wrote that he has experienced growing performance problems running a sizable EC2 installation for a customer, which he believes are tied to growing load on Amazon’s servers. Williamson said he has needed to buy larger instances to maintain the same performance, increasing his client’s costs.
“The problems that we are starting to see from Amazon are more than just the overhead of a virtualized environment,” he wrote. “They are deep rooted scalability problems at their end that need to be addressed sooner rather than later.
“Has Amazon become over subscribed?” Williamson wondered. “Sure feels like it, as we are being ‘taxed’ by being forced to move up their offering stack to just get the same level of performance we are currently enjoying. It appears that even Amazon have a limit to what they can scale to.”
Cloudkick Weighs In
Williamson’s post prompted Cloudkick to publish charts showing periodic ping latency between EC2 nodes. “Alan Williamson’s post on EC2 oversubscription seems to make a lot of sense,” Cloudkick observed. “The network behind EC2 appears to be experiencing very sporadic latency issues.” UPDATE: Cloudkick has analyzed additional data over a longer time frame and says the latency issue started just after Christmas.
Cloudkick provides management tools for cloud infrastructure at both Amazon and Rackspace, and is hosted on the SliceHost service owned by Rackspace.
Amazon says the issues seen by Williamson and Cloudkick are not capacity-related, and has pledged to work with customers experiencing problems. “I have been contacted by Amazon regarding this issue now and hopefully the data we provide them can help them diagnose any problems that may exist,” Williamson reported in a follow-up post.
The cloud rivalry between Amazon and Rackspace was underscored by this week’s announcement that cloud startup Encoding.com had shifted its primary operations from EC2 to Rackspace Cloud Servers, while retaining some operations at Amazon.
In conjunction with the Encoding.com announcement, Rackspace released the results of a study by The Bitsource comparing the performance of EC2 and Rackspace Cloud Servers. The study, which was commissioned by Rackspace, favored Cloud Servers on most key metrics.
Amazon (AMZN) has forged a dominant position in the cloud computing market, and has reinforced its early leadership by consistently cutting prices and adding features. With the emergence of cloud performance monitoring tools and a growing field of competitors, it’s not surprising that the performance of Amazon Web Services will be closely watched and much discussed. A key point: Kinton says that thus far Amazon has met all of its service level agreements.
UPDATE: There’s been an active discussion of Amazon’s performance on Slashdot and in worthwhile blog posts by Reuven Cohen (Oversubscribing The Cloud) and Chris Hoff (Over Subscription vs. Over Capacity: Two Different Things). The distinction Chris is making is an important one, and will be familiar to those who’ve been involved in the historic debates about overselling and “unlimited” accounts in the shared hosting industry.