UPDATED 13:44 EDT / JULY 30 2018

Lessons from Google’s internal SRE methods for cloud efficiencies

As a new generation of corporations navigates the efficiencies of cloud computing, it faces a new challenge: running a business in a brand-new environment without the benefit of tried-and-true methods.

“The industry has done a really fabulous job of telling people how to get to cloud, but we’re awful about telling them how to live there,” said Dave Rensin (pictured), director of customer reliability engineering and network capacity at Google Cloud.

Rensin spoke with John Furrier (@furrier) and Jeff Frick (@JeffFrick), co-hosts of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, during the recently concluded Google Cloud Next event in San Francisco. They discussed Google site reliability engineering and how the concept is being turned outward to help businesses operate successfully in the cloud. (* Disclosure below.)

Parsing work for machines and human judgment

In 2004, Google LLC had just gone public, and internal calculations showed that in 10 years the company would need a million systems operators just for its popular search function. In its unorthodox way, Google reimagined its production systems by applying software engineering skills to operations problems and named the method Site Reliability Engineering, or SRE.

“The basic philosophy is simple, give to the machines all the things machines can do, and keep for the humans all the things that require human judgment. That’s how we get to a place where like, 2,500 SREs run all of Google,” Rensin said.

A primary principle of SRE is to forget about aiming for perfection. “Any system involving people is going to have errors. So any goal you have that assumes perfection, 100 percent uptime, 100 percent customer satisfaction, zero error, that kind of thing, is a lie,” Rensin said, going on to explain that there is a “magic line,” known as the service level objective (SLO), marking the boundary between satisfied and unsatisfied customers. Operate below the SLO line and customers are angry; operate above it and resources are being wasted on incremental improvements that customers don’t notice.

“The difference between perfection, 100 percent, and the line you need [the SLO], which is very business-specific, we say treat as a budget,” Rensin said. This “error budget” represents time and money that can be spent on innovation.
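To make the arithmetic concrete, here is a minimal sketch of how an error budget could be computed from an availability SLO. The function names, the 30-day window and the 99.9 percent target are illustrative assumptions, not figures from Rensin or from Google's tooling.

```python
from datetime import timedelta


def error_budget(slo: float, window: timedelta) -> timedelta:
    """Return the downtime allowed in `window` for an availability SLO.

    `slo` is a fraction such as 0.999 (hypothetical target). The budget is
    the gap between perfection (1.0) and the SLO, applied to the window.
    """
    if not 0.0 < slo <= 1.0:
        raise ValueError("slo must be in the range (0, 1]")
    return window * (1.0 - slo)


def budget_remaining(slo: float, window: timedelta, downtime: timedelta) -> timedelta:
    """Return how much of the budget is left after `downtime` has been spent.

    A negative result means the service has dropped below its SLO line.
    """
    return error_budget(slo, window) - downtime


if __name__ == "__main__":
    window = timedelta(days=30)          # assumed measurement window
    budget = error_budget(0.999, window)  # assumed 99.9 percent target
    left = budget_remaining(0.999, window, timedelta(minutes=10))
    print(f"budget: {budget.total_seconds() / 60:.1f} minutes")
    print(f"left after a 10-minute outage: {left.total_seconds() / 60:.1f} minutes")
```

Under these assumptions, a 99.9 percent target over 30 days leaves roughly 43 minutes of allowable downtime, which a team could choose to spend on riskier releases or experiments rather than on reliability gains customers would not notice.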

As director of customer reliability engineering, Rensin takes Google’s internal SRE methodology and turns it outwards to work with businesses of all sizes. Google has published a book on SRE, with an accompanying workbook to help guide companies through implementing SRE in their own operations.

“Our goal is that every firm from five to 50,000 can follow these principles. And they can. We know they can do it, and it’s not as hard as they think,” Rensin concluded.

(* Disclosure: Google Cloud sponsored this segment of theCUBE. Neither Google nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
