Network Outage at The Planet

When you’re one of the largest web hosting companies, even an outage on a Sunday night gets widely noticed. The Planet was offline for about 90 minutes overnight, affecting the operations of many customers housed in its Houston data centers.

UPDATE: As of 9:30 a.m. Eastern time, customers are reporting that they are offline again. Some are expressing frustration at the lack of status updates on The Planet’s Twitter feed. The only company update about the overnight outage appeared on the company forums, which are now offline.

UPDATE 2: As of 10:20 a.m. it appears most customers of The Planet are back online. “While many services have been restored, our network team continues to investigate and work on the remaining issues,” The Planet posted in a Twitter update. “We believe the network issues this morning are unrelated to connectivity problems customers in H1 & H2 may have experienced around 12am CDT. Our initial analysis shows that a circuit between Dallas and Houston caused the ~8 a.m. CDT network disruption, and it has been repaired.”

Here’s the incident report on the overnight outage:

“On May 2 at approximately 11:45 p.m. Central time, we experienced a network issue that affected connectivity within The Planet’s core network in Houston that may have prevented your servers from communicating to the Internet,” the company reported on its customer forum. “Network connectivity service was fully restored in less than 85 minutes; however, your servers may have been impacted by this incident.”

“We determined that one of four border routers in Houston failed to properly maintain standard routing protocols,” the company reported. “Traffic destined for this router may have experienced a failure to communicate within The Planet’s core network and to several Internet transit providers directly connected to this border router. We are working through a root cause analysis and we’ve isolated this router from our network to prevent further issues.”

About the Author

Rich Miller is the founder and editor at large of Data Center Knowledge, and has been reporting on the data center sector since 2000. He has tracked the growing impact of high-density computing on the power and cooling of data centers, and the resulting push for improved energy efficiency in these facilities.

27 Comments

  1. Scorch

    Looks like there is another problem with TP today, cannot access anything.

  2. Dan Kelley

    Planet appears to be down this morning again....at least at 9am eastern time.

  3. Well, they are offline again today -- prime time. They are not answering their phones and it is hit and miss getting into their control panel.

  4. Randy O

    They are still having issues as of 0946 EDT

  5. Pat

    I'm curious if this is related to the downtime I'm having since 9am EST. A Twitter search shows many upset clients at this time. Additionally, I'm unable to access www.theplanet.com.

  6. Andrew

    As of 10AM EST ... still down. Not all of our servers are down ... but, most of them are.

  7. I didn't know about the one last night. My server has been going down far too many times in the past month. I thought it was a problem with the hosting company, but now I wonder if every failure was caused by problems with theplanet. They are supposedly back up again, but none of my sites are working yet, and I can't even load theplanet.com.

  8. Johann

    They seem to be back up now. (At least our servers with them are back.)

  9. Same experience here from Chicago: theplanet.com and their support sites also appear to be offline...

  10. John from Buffalo

    Well, you can only imagine the systems guy(s) who have to deal with that mess. When you run a data center and something goes wank like this, you either have a heart attack or you shit your pants (or both). Just recall back in 2008, thePlanet had a major explosion that blew power out. Pretty messed situation, guys. They are updating on twitter, but just not through thePlanet - odd, right? http://www.twitter.com/hostgator/ and the Slashdot article from 2008 .. http://tech.slashdot.org/article.pl?sid=08/06/01/1715247

  11. Simplex Brasil

    Here in Brazil, it's ok now... But our websites were off for 40 minutes

  12. dan

    10:18am est our sites at planet are coming back up okay now but not before a lot of fallout and nastiness. monday morning is not an acceptable down time. pretty much this is the absolute least acceptable time ever you can have an outage.

  13. Looks like they are finally back up, all of my servers are responding again

  14. Randy O

    Yea, Support says follow "Twitter" and the last update was a week ago... My servers in Houston are coming back up now... buttholes need to get their act together when talking to customers as where to get updates...

  15. The good part is that there doesn't seem to be much data lost, except a few failed database write actions which can go unnoticed. But it's embarrassing, and then again it's natural. Such is life.

  16. Why do people get so angry when their site is down? Crap happens. Oh wait. I forgot. ThePlanet wanted everyone's websites to go down, they were feeling bored and found the best way to create work was to create a network issue @ the data center! LOL Relax everyone. Let them work :)

  17. Thanks for the heads up. A good reminder on the importance of mirroring for me

  18. dan

    @Krandall Relax? Prior to this event our income equals X. After the event and in the weeks following, our income drops to a number severely below X. Clients who are losing thousands of dollars an hour in sales DO NOT want to hear the word relax today, and especially at 9am monday morning. epic disaster.

  19. JB

    If you are losing thousands of dollars an hour in sales, why have you architected a design that can take you down with one failure?

  20. dan

    question is a little naive. we have to sell the next level of redundancy to the people that have those funds, and that doesn't work until after a disaster happens. how can you work in this industry and not already know how this process works? "maybe now they will listen..." and yeah, with all the money that is at stake, you would think some of it could come back to the people who make it all work. instead what we get is people screaming when the site stops working.

  21. Tom

    hmm... so if you have 4 border routers and are using BGP... wouldn't the other 3 just advertise the routes through other providers? Isn't that what BGP was made for? And a single circuit between two facilities? Shouldn't there be a fiber loop at the very least? Oh wait, all of that stuff actually costs money... Obviously, this is evidence that the reason The Planet offers such cut rate deals is because they aren't doing things right. Which wouldn't be a problem, if The Planet didn't market itself as "the highest quality and value for your hosting investment". Just say - "Hey! our datacenter is a server ghetto. if you want cheap, we'll give you cheap. but don't cry when it sucks!" It's a shame to over market and under deliver, but it's also partly the responsibility of the people who buy into it. Does it make sense that you are running your business with a service that costs less per month than your cell phone? Something doesn't add up. When you buy something cheap, you know it. That's fine for things that are disposable and you aren't basing your business around. Think people! At $99 per month, if they have to spend even 15 minutes dealing with you as a customer, they automatically lose money. Bottom line - hosting companies have some backbone, educate your marketing people so they don't make stupid claims like "100% uptime guaranteed" and "Unlimited Bandwidth" - That's just a straight lie. And customers... use your head, don't buy cheap services and complain about it when they fail.

  22. It's the same thing again... I thought they said they fixed the issue. Ooops, this time I took it for granted and suffered a major loss!

  23. Grant

    I also just experienced another outage by the look of it. I got an email from pingdom saying my site was down for less than 10 minutes.

  24. Paul

    Tom - dead right - TP has single points of failure - I've even seen times when TP houston AND TP dallas were down at the same time, so SPOFs are clearly present within TP's network and/or DNS. We were already in the process of moving to rackspace - they just expedited it for us and took themselves out of the equation for who we host with as a failover center.