Monday’s glitchy internet performance, which reportedly affected a whole range of popular sites and services – from Amazon’s infrastructure cloud to Reddit and Facebook’s WhatsApp – has been blamed on problems with Telia Carrier, the backbone network operator arm of the Swedish telco TeliaSonera.
It’s unclear what caused Telia’s global backbone to lose data packets traveling between five continents (North America, South America, Africa, Europe, and Asia). Some news reports, without citing a source, have tied the outage to an error made by a Telia engineer.
Multiple Telia customers said on Twitter the outage was caused by human error.
Packet loss on Telia’s backbone has been documented in detail by one of its major customers, CloudFlare, which operates a global content delivery network. This was the second major Telia outage CloudFlare had experienced in four days, and the CDN provider’s CEO took to social networks to vent his frustration over what he said was a 60-day period of subpar reliability.
Here’s CloudFlare’s visualization of Monday's high-packet-loss period on Telia’s global network:
Telia is one of the biggest global backbone operators. Its mesh of interconnected metro networks and PoPs is hosted in many data centers around the world, operated by a variety of data center providers, including Equinix, Digital Realty Trust and its subsidiary Telx, CyrusOne, and Interxion.
CloudFlare CEO Matthew Prince said on Twitter that Telia’s reliability over the last 60 days was unacceptable, and that CloudFlare would de-prioritize the carrier until it fixes its “systemic issues.” In a separate tweet, Prince said his company was spending “millions a year” with Telia.
Reliability of @TeliaCarrier over last 60 days unacceptable. Deprioritizing them until we are confident they've fixed their systemic issues.
— Matthew Prince (@eastdakota) June 20, 2016
The Importance of Transparency
That network and data center outages are unavoidable is the unfortunate reality for everybody who does business on the internet. All systems go down at some point, and while most customers accept this, service providers are judged during outages on the speed of their recovery and on their transparency. Prince and a representative of another Telia customer, Basecamp, which provides a web-based project management tool, both said they were curious to see how transparent the carrier would be about the root cause of the outage.
Telia has apologized and said it was working directly with customers to resolve the problems the outage caused, but it has not publicly revealed the cause of the incident.
Sorry for recent outages! Extra checks and balances in place. Now, full attention on working directly with customers to sort out.
— Telia Carrier (@TeliaCarrier) June 21, 2016