Rackspace reports that its cloud computing service is “degraded,” and many customers say their sites are unreachable. The company attributed the problem to an unusual load spike in the storage system supporting its cloud platform. The outage came several hours after the Rackspace Cloud disabled CRON, a scheduling utility commonly used to automate tasks on Unix and Linux systems. By early evening, the company said performance had improved.
“Starting yesterday we began experiencing very high loads on our storage devices for cluster WC1 in DFW,” Rackspace said on its status page. “In order to reduce load we have shut down processes like CRON to ensure core site content continue to load. While load spikes are common in our cloud infrastructure, we have not been able to fully identify the root cause of these unusual issues.
“We are working with engineers from inside and outside the company with the best expertise on these issues to resolve them and develop a plan of action to ensure we do not repeat this state. We have a series of changes that are being implemented in real time. We are being careful to minimize issues as we proceed.”
UPDATE: At about 6 p.m. Central time, Rackspace posted the following: “We have been seeing improved performance on our Cloud Sites WC1 storage cluster for the last few hours. Assuming stability continues we will resume CRON operations this evening. At this time, we cannot declare victory on this issue, but we have many plans in place to continue to increase headroom and ensure stability under all conditions.”