Google Will Lend You Its Own Engineers to Keep Your Cloud Apps Running Smoothly

Customer Reliability Engineers to work as part of customer teams to ensure reliability of critical cloud apps

Yevgeniy Sverdlik

September 30, 2016

A Raticate from Pokemon Go in front of the gates of Downing Street in London (Photo by Olivia Harris/Getty Images)

To critics who say Google lacks experience in selling to and supporting enterprise customers, the company says, “We’ll do you one better.”

Unlike at most other companies, the people who operate the global Google data center infrastructure are software engineers first and IT people second. The company’s philosophy is that software services run better if the infrastructure underneath is built and operated by those who know software.

“It turns out services run better when people who understand software also run it,” Melissa Binde, Google’s director of Site Reliability Engineering, said during a presentation at a company conference earlier this year.

These Googlers are called Site Reliability Engineers, and soon, enterprise customers of the company’s cloud services will be able to embed Googlers with similar credentials on their own infrastructure teams to ensure critical applications deployed in the Google cloud run smoothly.

Part of a bigger announcement about major changes and upgrades across Google’s entire cloud business – including the announcement of eight new cloud data center locations slated to come online next year – was a note about this unusual model for cloud customer support: Customer Reliability Engineering, or CRE.


The “Seriousness” of Google Cloud

Google, which has been slow to grow its enterprise cloud business in comparison to Amazon and Microsoft, is often criticized for not being “serious” about its cloud services. One of the criticisms was that the company didn’t really know how to work with enterprise customers, which is something other major cloud players – the likes of Microsoft, VMware, and IBM – have done for many years.

Since last year, the company has been on a mission to prove those critics wrong. The first big step was hiring VMware founder Diane Greene to lead the cloud unit. The steps that followed focused on investing heavily in a global Google data center expansion to offer more cloud availability regions and on improving the feature set around cloud services, including a major focus on enhancing services with machine learning.


“Designed to deepen our partnership with customers, CRE is comprised of Google engineers who integrate with a customer’s operations teams to share the reliability responsibilities for critical cloud applications,” Brian Stevens, VP of Google Cloud, wrote in a blog post. “This integration represents a new model in which we share and apply our nearly two decades of expertise in cloud computing as an embedded part of a customer's organization.”

Pokémon Go: a Trial by Fire

Google tested the CRE model on Niantic, the company behind the popular mobile “augmented-reality” game Pokémon Go. Niantic originated at Google but was spun out last year. While some past reports have suggested that the game runs on Google’s cloud, this is the first official confirmation by Google, for whom attracting and boasting high-profile cloud customers is another major way to prove its cloud’s worth.

Some might say the outage-ridden roll-out of Pokémon Go in July is not the best customer engagement to boast about, especially as part of the announcement about a new customer reliability team. The game was plagued by downtime throughout its first month on the market, and players around the world were frustrated by a frequently appearing message saying the game’s servers were overloaded.

A separate blog post on the launch of Pokémon Go, however, indicates that the rough start was due more than anything else to the game’s unexpected popularity. The worst-case estimate of Pokémon Go traffic on Google’s cloud datastore that the team prepared for was five times Niantic’s target traffic. But once the game launched, actual traffic quickly outstripped the target fifty-fold.


“Throughout my career as an engineer, I’ve had a hand in numerous product launches that grew to millions of users,” Luke Stone, director of Customer Reliability Engineering, wrote in the post. “User adoption typically happens gradually over several months, with new features and architectural changes scheduled over relatively long periods of time. Never have I taken part in anything close to the growth that Google Cloud customer Niantic experienced with the launch of Pokémon Go.”

Google has not provided much more detail about the new CRE program, saying only that it will have more to share about it “soon.”
