Alphabet subsidiary Google is so confident in its new approach to handling internet-scale network congestion problems that it is now bringing the technology to Google Cloud Platform, its infrastructure services for the enterprise.
Google’s new BBR networking algorithm is already used to accelerate its consumer services, such as YouTube and Google.com, and it could be the next step in improving performance of the public internet. The company says it’s seen significant improvements in those services as a result, and it is now making the technology available to GCP users.
BBR is a congestion control algorithm designed to deal with a common problem: traffic congestion in the complex networks that make up the modern internet, with its crowded high-speed international links, mobile devices each getting only a share of base-station backhaul, home users on shared connections from DSL or cable hubs, and businesses funneling thousands of devices through a handful of routers. All of this adds up to a network that doesn’t quite work to its full potential.
“Today’s internet is a far more unwieldy beast than that which confronted its progenitors,” Eric Hanselman, chief analyst at 451 Research, told Data Center Knowledge. “Google’s efforts with BBR are the latest effort to tackle one of the thorniest of the legacy protocol performance problems that plague the internet.”
See also: Google Reveals Espresso, Its Edge Data Center SDN
While much of the data organizations deliver from their data centers isn’t affected by congestion, its effects become noticeable when they stream data, transfer large files, or need near-real-time responses. The improvements Google has seen in its initial YouTube and Google.com deployments were good enough that it is now rolling BBR out on Google Cloud Platform, where customers can take advantage of it in their own applications and services.
So How Does BBR Work?
Packet loss has long been treated as a reliable sign of network congestion and a signal for senders to reduce their data rates. Recent changes to the internet’s architecture have made loss-based congestion control less effective: the last mile of broadband connectivity is configured with large buffers, while long-haul links use commodity switches with shallow buffers. The combination leaves an internet clogged by queuing delays in the large buffers and destabilized by traffic bursts in the backbone.
With all those buffers, how do you determine the best speed to send data? The answer is surprisingly simple, once you identify the slowest link in a TCP connection’s path. That bottleneck link defines the maximum data-delivery rate of the connection, and it is where queues form. Knowing the round-trip propagation time and the bandwidth of that bottleneck link, the algorithm can determine the best data rate to use, solving a problem that has long been considered nearly unsolvable.
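The arithmetic behind that insight is the bandwidth-delay product (BDP): multiply the bottleneck bandwidth by the round-trip propagation time to get the amount of data that can be in flight without building a queue. A minimal Python sketch (the figures are illustrative, not taken from Google’s implementation):

```python
def bdp_bytes(bottleneck_bw_bps: float, rtprop_s: float) -> float:
    """Bandwidth-delay product: the bytes that exactly fill the pipe.

    bottleneck_bw_bps -- bandwidth of the slowest link, in bits per second
    rtprop_s          -- round-trip propagation time, in seconds
    """
    return bottleneck_bw_bps / 8 * rtprop_s

# Example: a 10 Mbit/s bottleneck with a 40 ms round trip.
# Keeping roughly this much data in flight maximizes throughput;
# sending more only lengthens queues in the large last-mile buffers.
print(bdp_bytes(10e6, 0.040))  # 50000.0 bytes
```

Aiming for this operating point, rather than filling buffers until packets drop, is what keeps queuing delay to a minimum.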
That’s where the name BBR comes from: Bottleneck Bandwidth and Round-trip propagation. Using these calculations and recent developments in control systems, Google network engineers have come up with a way to dynamically manage the amount of data sent over a connection, so it doesn’t swamp the capacity of its bottleneck link, keeping queues to a minimum.
While TCP doesn’t track bottleneck bandwidth in a connection, it is possible to estimate it from the timestamps on packet responses. By understanding which connections are limited by the speed of the application generating the data and which are limited by the capacity of the network, and by knowing exactly which response packets to sample for those estimates, BBR can send data at the maximum possible rate. Network conditions on the internet aren’t static, so even when a connection is operating at a steady state, BBR occasionally increases the data rate to see whether the bottleneck has changed, which means it can respond rapidly to changes in the underlying network.
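That estimation logic can be sketched as two running filters: a windowed maximum over delivery-rate samples and a windowed minimum over round-trip samples. Here is a toy Python model of the idea (class name, window sizes, and units are illustrative; this is not the Linux kernel implementation):

```python
from collections import deque

class BbrEstimatorSketch:
    """Toy model of BBR's two core estimates:
    bottleneck bandwidth = max of recent delivery-rate samples,
    round-trip propagation = min of recent RTT samples."""

    def __init__(self, bw_window: int = 10, rtt_window: int = 100):
        self.bw_samples = deque(maxlen=bw_window)    # bytes/second samples
        self.rtt_samples = deque(maxlen=rtt_window)  # seconds

    def on_ack(self, delivered_bytes: int, interval_s: float, rtt_s: float) -> None:
        # Each acknowledgement yields a delivery-rate sample and an RTT sample.
        self.bw_samples.append(delivered_bytes / interval_s)
        self.rtt_samples.append(rtt_s)

    def bottleneck_bw(self) -> float:
        return max(self.bw_samples)   # best recent delivery rate seen

    def rtprop(self) -> float:
        return min(self.rtt_samples)  # quietest recent round trip seen

    def pacing_rate(self, gain: float = 1.0) -> float:
        # A gain above 1.0 briefly probes for newly available bandwidth;
        # a gain below 1.0 then drains any queue the probe created.
        return gain * self.bottleneck_bw()
```

The periodic rate increase described above corresponds to running with a pacing gain above 1.0 for a short interval: if the bottleneck has gained capacity, the delivery-rate samples rise and the bandwidth estimate follows.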
Thousands of Times Faster Across the Atlantic
The improvement can be significant; Google claims a transatlantic connection can run as much as 2,700 times faster. BBR may also be a better match for newer protocols like HTTP/2, which uses a single TCP connection for multiple requests to the server rather than a series of separate connections.
Implementing BBR as a sender-side algorithm means Google is able to improve end-user experience without having to upgrade all the networking devices and services between GCP and the user’s device. While it’s been a big win for YouTube, bringing the algorithm to GCP is a significant step, as it will be handling traffic for a much more diverse set of applications.
How BBR Accelerates Google’s Cloud Services
GCP customers can take advantage of BBR support in three ways: connecting to Google services that use it, using it as a front end to their applications through Google cloud networking services, or using it directly in their own IaaS applications.
As Google’s own services will be using BBR, latency to your cloud storage should be reduced, making applications that use services like Spanner or Bigtable more responsive. End users will see a bigger effect from BBR support in Google’s Cloud CDN (in the form of better media delivery) and in Cloud Load Balancing, which distributes traffic across different instances of an application.
If you want to use BBR in your IaaS applications running on Google Compute Engine, you’ll need a custom Linux kernel. While BBR has been contributed to the Linux kernel, it’s not yet in mainstream releases; you’ll need to add it from the networking development branch, configure it for GCE, and compile the kernel yourself.
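On a kernel that does include BBR, you can confirm it by reading the congestion-control lists the kernel exposes under /proc. A small helper along these lines (the parsing function is our own; the /proc paths are standard Linux):

```python
from pathlib import Path

# Standard Linux sysctl paths for TCP congestion control.
AVAILABLE = Path("/proc/sys/net/ipv4/tcp_available_congestion_control")
ACTIVE = Path("/proc/sys/net/ipv4/tcp_congestion_control")

def has_bbr(available: str) -> bool:
    """The kernel reports a space-separated list, e.g. 'cubic reno bbr'."""
    return "bbr" in available.split()

if AVAILABLE.exists():  # Linux only; skipped elsewhere
    print("BBR available:", has_bbr(AVAILABLE.read_text()))
    print("Active algorithm:", ACTIVE.read_text().strip())
```

Once the module is present, switching the active algorithm is a one-line sysctl change (net.ipv4.tcp_congestion_control=bbr).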
With BBR available to compile into the Linux kernel, you can also start using it on your own network, especially if you run Linux-powered networking equipment such as Open Compute switches. GCP’s switch to BBR may well attract interest beyond Google, from the Linux community as well as from other network operators and vendors.
451’s Hanselman sees this as a promising step forward for the internet. “There have been many efforts to adapt the inner logic of TCP to improve performance, and Google has taken a fair shot.” He also views Google’s cautious approach to rolling BBR out as sensible; “There are lingering questions on how well this version plays with others, and Google is clear that it doesn’t want to release a bully upon the unsuspecting.”