Power Failure KOs Intuit Sites for 24 Hours
A major site outage that knocked Intuit web sites offline for more than 24 hours was caused by a data center power failure during routine maintenance, the company said tonight. The downtime affected the Intuit.com web site and thousands of small business customers who host their websites on Quickbooks.com.
“Our preliminary investigation indicates the outage occurred during a routine maintenance procedure Tuesday night,” Intuit CIO Ginny Lee said on the company’s customer forum. “An accidental power failure during that procedure affected both our primary and backup systems, taking a number of Intuit websites and services offline. While power was quickly restored, we’re working diligently to validate our systems and bring them back into full operation.”
As of 7:30 p.m. Pacific time Wednesday, Intuit said it was beginning to restore the web sites and services that had been offline since 7 p.m. Tuesday. The effort followed a long day of frustration for Intuit’s small business hosting customers, most of whom were unable to access their web sites or process transactions.
Anon Posted June 16th, 2010
Is it really a great idea to publish the locations of any company’s data centers? I was under the impression that most businesses prefer that the location of their data centers remain unadvertised and secret so as not to bring any unwanted attention. Irresponsible reporting here in my opinion.
Virus Posted June 17th, 2010
A power failure does not knock out primary and backup systems for this long!!! The site was attacked! Intuit does not want to admit it.
Such a large scale system that holds personal info has complete redundancy
built into it, unless a major security problem has taken place that enables someone to shut the system down.
IF it is actually true that the system went down from power failure, Intuit needs to find a new CIO! Someone who knows how to build redundant systems and have contingency rules in place. Ginny Lee…..you screwed up!
bks Posted June 17th, 2010
Ginny Lee knows nothing about computers and everything about
speaking management gobbledygook. Intuit *is* an IT company.
There is no excuse for this and heads should roll.
I guess we now know what Ginny meant by “Lean IT”!
Aaron Posted June 17th, 2010
There are a lot of possibilities and to speculate Intuit was attacked or that big name companies always build redundant systems or manage their systems properly is naive.
There are a lot of reasons why an outage can last longer with the primary reason being safety to personnel. People think you should just flip a switch and magically “restart” 480V gear. If something in the electrical system fails, if you value life and property you will be certain to identify what it is prior to just pushing that reset button. In the data center world, if something fails it usually did so for human or mechanical reasons.
Secondly, maybe the power problem was minor but the data center personnel had to wait on vendor resources to check the equipment and resume normal power operations. It’s also very possible that the facility folks did their job and then it was up to the IT resources to start equipment and databases back up and/or restore after a loss of power. But I’m just speculating – the point is you don’t need a new CIO because a data center has a 24 hour outage…. You probably need a new maintenance plan, procedures, or equipment.
Jeff Posted June 17th, 2010
They probably just need bigger “TURN LEFT FOR SOURCE A, RIGHT FOR SOURCE B” signs on their transfer switches… And maybe a remedial “Left vs Right” class for any operators of said switch.
In all likelihood they were doing single side maintenance, had it de-energized and accidentally switched their load to it thanks to a careless operator. After a few operators soiled themselves, they returned the switches to “auto”, opened the downstream breakers, then slowly re-energized the whole system. Even after closing all the breakers, there are still hours, if not days, of work to return every single server to operation. Most companies don’t plan for graceful shutdown/restart since they have many layers of redundancy (and 5 days of fuel for their generator) and never expect to have to go without power. Underestimating the human ability to screw something up is a very dangerous mistake!
I would tend to agree, a power outage does not knock out primary and backup systems. All professional web hosting systems have power backups so there is no interruption of service. I do not host my site with QuickBooks, but I do use them for credit card processing, which I could not do at all on Wednesday. When their system has had maintenance in the past, it is usually 20 to 30 minutes that it is down, not 36 hours.
And truthfully, Intuit does not have to tell us what caused the problem. They have the right to say “We are experiencing technical difficulties.” So we can conjecture all we want, and we know something is wrong, and we will go on living just fine not even knowing what happened. And a month from now, just about nobody will remember….
bks Posted June 17th, 2010
There is no excuse for the lack of transparency. Even if it’s nothing
more than saying “there there, everything will be okay” they should
communicate every few hours. Ginny Lee must go.
Cynthia Posted June 18th, 2010
I was patient through all of this. I have a store plus a website and couldn’t process any transactions. For the store customers wanting to use a credit card, I simply took their information and processed it when Intuit was up and running. My big issue is that I called Quickbooks at 2:15 Wednesday afternoon and asked when they would be back up. The man on the phone told me it would be up in “exactly 43 minutes”. I didn’t get his name but he should be let go for lying.
Friday data center tidbits: Intuit data center face plants, Google patents stacking, and more! « The Server Room Posted June 18th, 2010
[...] First up is the piece about the 36 hour failure of Intuit’s data center as the result of a power failure caused by “routi…. What is it with data centers that they can’t resist screwing with critical power facilities [...]
[...] I knew it was going to happen just the moment I read about Intuit face planting their data center and web sites for 36 hours. The anti cloud computing crowd are out in force with their mantra that this “proves” [...]
This was really unfortunate, and a friend’s site was affected. Whatever the reasons were, I hope they make sure that this never happens again, as it can be a huge blow to those who depend on their sites for their livelihood.