The Sidekick Failure and Cloud Culpability
There’s been a trend recently in which every outage of a web-enabled service is interpreted by the tech media as a reflection of the reliability (or lack thereof) of cloud computing. This in turn has prompted defensive reactions from some cloud technologists and bloggers, who counter that these incidents aren’t “cloud failures” at all and have nothing to do with whether cloud computing is safe.
This week’s data catastrophe with the Sidekick mobile device has prompted fresh debate about cloud culpability. InformationWeek dubbed the apparent loss of all user data a “code red cloud disaster.” ZDNet called it “one of the biggest cloud computing disasters so far.” Cnet wrote that the incident “threatens to put a dark cloud” over Microsoft’s cloud ambitions. Barron’s Tech Trader Daily writes that “you have to wonder if the high hopes about cloud computing just suffered a mammoth setback.”
Not a Cloud Failure?
But some analysts argue that the Sidekick snafu isn’t a “cloud failure” at all. “Every outage is not a frigging cloud outage,” Redmonk’s James Governor writes on Twitter. “The T-mobile failure is at a traditional data center – its not running on Force, AWS, or other cloud infrastructure is it? If someone doesn’t back up their data and loses it I don’t call that a cloud failure.”
Martin Glassborow at StorageBod agrees, saying the Sidekick incident “should not be seen as a failure of the Cloud; it’s not! It’s the failure of a centralised service which was apparently run by incompetents! It is yet another lesson that if you only have a single copy of your data; you might as well only have no copies of your data. So if you are archiving and deleting, you better make sure that you have two copies of the archive or at least the ability to recreate that data.”
But one leading cloud blogger thinks the cloud is fair game on this one.
“Many of the cloud pundits out there will try to tell you that the Sidekick service isn’t a cloud application,” writes Reuven Cohen at Elastic Vapor. “Let’s call it what it is, it’s a cloud app – your data when using a Sidekick is hosted in some else’s data center. In the most basic terms, if I choose a device such as a mobile phone that requires me to use some else’s data centers for storing my personal data, I expect it to be at the very least backed up automatically, and preferably I should have the ability to do so myself. It appears that neither was an option for T-Mobile Sidekick customers.”
Cohen says the Sidekick failure has implications for the type of cloud apps users should choose. “The best and easiest way to be prepared for the inevitable failures that will occur is to rely on services that allow for portability,” he writes. “Make sure you have a clear exit strategy before you choose a cloud service provider and avoid the ones that attempt to lock you in.”
What do you think? Will the Sidekick data loss affect the enthusiasm for cloud computing? Please share your thoughts in the comments.
As I was just saying (If it’s dangerous it’s NOT cloud computing), Reuven’s on his own on this one – I’m yet to find a “cloud pundit” who supports his opinion that the Sidekick epic fail was in any way related to cloud.
FWIW I also discussed How Open Cloud could have saved Sidekick users’ skins over the weekend…
Ultimately, perceptions (of the cloud) will determine reality (to cloud or not).
I believe when the power grid was being built we did have some “availability” issues that we ultimately overcame.
All the same, from a user standpoint, they lost their data. The data was somewhere in space (from their viewpoint). We seem to be using our IT view of the cloud to define the cloud. In reality, if we call it a space app, then indeed the space has some issues (yah, service management of the space dust was faulty).
Either way, it seems that all the vendors that were not cloud vendors before “cloud” was invented as a name are now magically cloud.
Heck, I used to be a grid computing person and now am a cloud computing person. So, am somewhat doing the same thing.
All the same, if it were not for us grid people some of the cool things in the cloud may indeed not be possible.
Obviously, if it were not for cloud some of us grid people may not really have a need to exist.
Cloud is great!
Proder Posted October 12th, 2009
Check out Wikipedia’s entry for “No true Scotsman.” Oh, the cloud lost your stuff? Wasn’t really cloud. If the cloud is defined as “everything great and nothing bad,” then whenever something bad happens it’s obviously not cloud. That means cloud can never fail, which means, of course, that it cannot exist in this world.
So what exactly were they selling? They were selling the idea that your data would be available from everywhere. That is the definition most people have of the cloud. When it fails, they can’t turn around and say, it’s only cloud on days when it works. Every other Tuesday, it’s not cloud.
Give me a break.
What is the problem here? How can anyone say this is not the cloud? First they tell us, let us take care of your data, it’s so much easier and safer, and we call it the cloud. So they grab your data and lose it. And now it’s not the cloud. A big duuuuuuuuuhhuuu
M Posted October 12th, 2009
Users have no way of knowing whether the data they entrust to a company is going to be properly protected. Without the ability to evaluate the merits of one cloud platform over another, any failure in any cloud platform is going to affect the public’s perception of EVERY cloud platform.
Nathan Posted October 12th, 2009
What this boils down to is poor disaster recovery practice, cloud or not. T-Mobile, Microsoft, and Danger all failed to implement a system that could be recovered, and when their data became corrupt, they could not correct or recover it. Sorry, but just because a system is implemented in a “cloud” does not mean that it is invulnerable to poor planning, coding, or implementation. Sorry, this is not a failure of a cloud.
Sure it does, and sure it is a failure of the cloud. The cloud proponents always had to row against the criticism that it would be unsafe to store your data there. You’d lose control over your data. This is exactly what happened and therefore very well a failure of a cloud. Cloud doesn’t work, will never work. Cloud is just that, Cloud.
Nathan Posted October 12th, 2009
First I have to say, I am not a fan of “Cloud” or “Grid” or any outsourced service where we do not have control of the system in part or whole. Never have been, most likely never will be.
That said, I still look at the cloud as nothing more than a set of systems running processes that manipulate data. These same systems could very well be running in your data center or mine. Just because it’s a cloud or running in a cloud doesn’t make it impervious to failure. Failures will always occur; it’s what happens after a failure that can make or break a system. Look at the Titanic. Look at Apollo 11.
Now look at the situation. A Failure occurred. What happened to the system? We don’t know. Something broke. But we know something ultimately corrupted data to the point it could not be recovered in its current form. And when it came time to implement recovery procedures, the recovery plans failed, the system they built broke.
This is not a failure of a cloud. It’s a failure of a system in a cloud. Did NASA point to space and say that’s what caused Apollo 11 to fail?
burris Posted October 12th, 2009
Whenever you give someone else responsibility for your data there is the risk that the other party is incompetent or will otherwise contribute to increasing the possibility of data loss. If you’re handing over your entire business IT function to an outsourcing company, you’re supposed to be diligent in assessing that risk. If you’re just a regular guy with a phone, do you have any other choice than to trust the phone company? T-Mobile apparently breached that trust.
The defenders of “cloud computing” are lying when they deny that this incident is a catastrophic example of the types of risks inherent in using “cloud” services. I hope that this incident will motivate vendors to be more open in showing how their services are designed and operated in a manner that minimizes these risks.
Sorry mate, I would say you have learned nothing. But you know what, you’re lucky: today I heard from trusted sources that eBay will have Danger handsets for next to nothing soon. Go get one. This will never happen again. Right? Microsoft will take care of it, maybe rename the service Microsoft/Titanic.
Jerry Posted October 12th, 2009
Keep in mind what a “cloud” really is: you are running an application on someone else’s hardware. If the application lost data or did not provide proper backup, then the “cloud” can do nothing about it. It’s a failure of an application in a cloud. The cloud is a service not unlike what used to be called a service bureau or the modern outsourced data center. They need to counter the perception that it was the cloud/hardware. Bet they did not have a DR site set up in the cloud!
Nathan Posted October 12th, 2009
Guess you had a Sidekick? Or you’re just really passionate?
You’re right, I didn’t learn anything… that I didn’t already know. It’s a straight analysis of a failure that I personally have nothing to do with.
But if you happen to have more information on the situation than what has been released, share it, please. Until more information is available, my opinion of this failure stands: poor disaster recovery practice.
Hope they have a back up, I wonder what kind of danger I will be in with my G1.
No Sidekick, but passionate, passionately anti-cloud. Cloud is just another stupid hype to rob people of their hard-earned money. Like everybody with a practical view on the cloud and its possibilities, I too advocate: “Give your data to someone else? Might as well throw it in the vertical archive.” Cloud is slow, impractical, and will never become much more than a place where app builders can roam your money for a service that you already have at home. They just want to control your data, and look what they do with it. Furthermore, nobody seems to realize that without global connectivity it’s pretty useless. Put it in the fridge and don’t open it for the next 10 years.
Scott Posted October 12th, 2009
The Sidekick system stores and retrieves the handheld’s contact information, notes, bookmarks and pictures to/from the backend system.
This data is (normally) available through a web portal hosted at T-Mobile.com, giving Sidekick users an additional way to access their content. The system also uses a backend web browsing application to process/compact and send the resulting streamlined web content back out to the devices.
To me this sounds like a cloud technology, and recent events are a reminder that cloud technologies can fail spectacularly, just like anything else, if improperly managed.
Stuart Posted October 12th, 2009
I agree that this is not the way to do cloud but the whole point is that you can’t argue it is not cloud at all.
Bigger story is that I posted a post on the T-Mobile forums about starting a “Revolt” by stating on the forum when we will cancel service and how much money T-Mobile will lose a month from us.
Apparently that is enough to get banned from the site. Our IP address is blocked. The problem is that the forum is the only place you can go right now to get updated information on what is going on.
Obviously both T-Mobile and Microsoft are more concerned about the perception of their “cloud” services than the services themselves…
It’s a failure in the design of process which is ultimately supported by technology. Backups apparently weren’t part of the process, so no technology will support something it is not aware of or told to do. As for failure of the cloud? I don’t see it that way. The same thing would happen in many environments – no backups=no restore point. No restore point=do over.
From what I’ve read, it’s not a cloud failure, but it is a failure in the cloud.
I’ve read a few of Sam Johnson’s posts, and I understand his recommendation for cloud to be geographically dispersed and redundant, but just because a particular instance doesn’t meet those criteria doesn’t mean it is not a “cloud.” Maybe there’s some kind of “Cloud” v. “cloud” debate going on where the “big-C” Cloud is the Utopian version he references and the “little-c” cloud covers any kind of platform that exists independently above specific pieces of hardware.
The problem (as I understand it) is an administration issue. It doesn’t matter what you call where the data was stored … it seems like it was poor change control management. Did I miss something? Is the fact that the data was in a “cloud” relevant to the fact that it appears this was a botched upgrade?
Forum.sidekickfail.com has recently been created as an open and neutral place sidekick customers can exchange ideas and vent without the fear of their valuable thoughts, ideas, and opinions being deleted and disrespected as T-Mobile has been doing on their forums.
If you lose data or can’t access data because of your provider, it’s a FAIL. Maybe not a cloud FAIL, but a FAIL just the same. Everyone needs to remember that your data is king AND YOUR RESPONSIBILITY! Completely trusting any outside vendor with your data is risky.
Storm Cloud Posted October 13th, 2009
Well, this just goes to show you that the fat cats, C-levels and such, in the tech industry should stop running around talking about the greatness of the cloud! The media blitz has consumed the technical reality that only a very small % of what is online is actually utilizing the future ‘cloud computing’ architecture. Cloud has now replaced ‘Internet’ as the word of the day. Get ready, techies and geeks: everything in ‘cyberspace’ is now ‘the cloud’.
Uh, and the answer is YES, the outage will impact the future of cloud acceptance, as the average dude walking down the street could not tell the difference between ‘cloud’, ‘internet’, and ‘cyberspace’, and has been sold the magic ‘cloud’ vision by our own tech industry talking heads….
Microsoft=Fail Posted October 13th, 2009
Sigh, Microsoft fails yet AGAIN!!! I only feel sorry for T-Mobile for having to suffer due to such an incompetent company, but then again why would they trust Microsoft with their cloud computing needs!?