UPDATED 12:59 EDT / MAY 24 2019

BIG DATA

Let’s play with particle physics! Kubernetes and Google Cloud open CERN research to everyone 

Winning the Nobel Prize in physics isn’t a goal most people can reach. But thanks to Google Cloud and Kubernetes, performing the same experiments as award-winning scientists is now possible. Open access to data from the CERN Large Hadron Collider (LHC) experiments that led to the discovery of the Higgs boson elementary particle in 2012 means that anyone, anywhere, can now reproduce the proof that the Higgs boson exists.

“All this containerized infrastructure … is getting our soul together, because computing is getting much easier in terms of how to share pieces of software and even infrastructure,” said Ricardo Rocha (pictured, right), computing engineer at The European Organization for Nuclear Research, known as CERN.

Rocha and Lukas Heinrich (pictured, left), physicist at CERN, spoke with Stu Miniman (@stu), co-host of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, and guest host and cloud economist Corey Quinn (@QuinnyPig) during the KubeCon + CloudNativeCon event in Barcelona, Spain. They discussed how CERN manages the massive amounts of data generated by the LHC (see the full interview with transcript here). (* Disclosure below.)

Heinrich is a member of the ATLAS research team, which, along with CERN’s CMS experiment, discovered evidence of the Higgs boson. He and Rocha replicated the experiment that proved the existence of the Higgs boson during their keynote address at this week’s KubeCon event.

CERN science creates super-sized data

Scale, latency and performance are concerns for any enterprise, but at CERN they take on a much larger significance. Two high-energy particle beams travel at close to the speed of light inside the 27 km ring of the LHC, with 1.7 billion particle collisions occurring per second.

“The machines can generate something around a petabyte [of data] a second,” Rocha said.

Analyzing this data is the task of the ATLAS trigger and data acquisition system. “We cannot write out all the collision data to disk; we don’t have enough disk space,” Heinrich said. Instead, the trigger system analyzes the data in real time and selects only the most interesting collisions to channel into storage.
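The selection idea Heinrich describes can be pictured with a minimal Python sketch. This is not ATLAS code: the `Collision` record, its `energy_gev` field, and the single-threshold rule are hypothetical stand-ins chosen for illustration, and real trigger decisions are far more involved. The sketch only shows events being examined as they stream past and the rejected ones never being stored.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Collision:
    """Hypothetical, heavily simplified collision record."""
    event_id: int
    energy_gev: float  # illustrative quantity, not a real ATLAS variable


def select_interesting(events: Iterable[Collision],
                       threshold_gev: float) -> Iterator[Collision]:
    """Yield only events at or above the threshold, one at a time.

    Events are inspected as they arrive, so rejected ones are
    dropped without ever being collected into a list.
    """
    for event in events:
        if event.energy_gev >= threshold_gev:
            yield event


# Four made-up events; only two clear the made-up threshold.
sample = [
    Collision(1, 12.0),
    Collision(2, 140.5),
    Collision(3, 7.3),
    Collision(4, 98.0),
]
kept = list(select_interesting(sample, threshold_gev=90.0))
print([event.event_id for event in kept])  # prints [2, 4]
```

Using a generator keeps the filter streaming: nothing is buffered except the events that pass.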

The trigger system reduces this to around 10 gigabytes a second. “That’s what my side has to handle,” Rocha stated.

Businesses that think they have data storage issues may find them insignificant next to CERN’s massive data inflow. “We’re collecting something like 70 petabytes a year,” Rocha said. “Our challenge is to make sure that all the effort physicists put into building this large machine, that in the end it’s not the computing that is breaking the world system. We have to keep up.”
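To put the quoted figures side by side, here is a rough back-of-the-envelope calculation in Python. It assumes decimal units (1 petabyte = 10^15 bytes, 1 gigabyte = 10^9 bytes) and, more loosely, that the 10 gigabytes a second and the 70 petabytes a year describe the same data stream, which the interview does not confirm. The last figure is illustrative only.

```python
PB = 10**15  # bytes in a petabyte (decimal units, an assumption)
GB = 10**9   # bytes in a gigabyte (decimal units, an assumption)
SECONDS_PER_YEAR = 365 * 24 * 3600

raw_rate = 1 * PB        # "around a petabyte a second" off the detectors
stored_rate = 10 * GB    # "around 10 gigabytes a second" after the trigger
yearly_volume = 70 * PB  # "something like 70 petabytes a year"

# How much the trigger system shrinks the data flow.
reduction_factor = raw_rate // stored_rate
print(reduction_factor)  # prints 100000

# What a full year at the post-trigger rate would add up to, in petabytes.
continuous_year_pb = stored_rate * SECONDS_PER_YEAR / PB

# Share of the year at that rate that would produce the quoted yearly volume.
implied_fraction = yearly_volume / (stored_rate * SECONDS_PER_YEAR)
print(f"{continuous_year_pb:.1f} PB, fraction {implied_fraction:.2f}")
```

On these assumptions the trigger cuts the flow by a factor of about 100,000. Recording 10 gigabytes a second around the clock would come to roughly 315 petabytes a year, so a yearly total near 70 petabytes would be consistent with data being written at that rate for only part of the year.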

Currently, CERN has one giant data center with around 300,000 cores and a storage capacity of around 400 petabytes. “That’s not enough,” Rocha stated.

Linking institutes and research labs around the globe has doubled the storage capacity, but with a major upgrade to the LHC underway, the pressure is on to expand. “Very soon we’ll be talking about exabytes, so the amount of computing we will need there is just going to explode,” Rocha explained.

Kubernetes to the rescue

All options are on the table to solve the problem, as the engineers at CERN tend to be results-oriented, according to Rocha. “It’s a more open-minded community than traditional IT. So we don’t care so much about which technology we use as long as the job gets done,” he said.

CERN had distributed infrastructure years before cloud adoption became widespread, but in the past it had to write all of its own system software. Access to open-source communities means CERN teams can now focus on application development.

“If we start writing software using Kubernetes, then not only do we get this flexibility of choosing different public clouds or different infrastructures, but also we don’t have to care so much about the core infrastructure, all the monitoring. We can remove a lot of the software we were depending on for many years,” Rocha stated.

Heinrich agreed. “What’s kind of special about scientific applications is that we don’t usually just have our entire code base on one software stack. Sometimes you have a complete mix between C++, Python, Fortran, and all that stuff. So this idea that we can build the software stack as we want is pretty important.”

Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s coverage of the KubeCon + CloudNativeCon event. (* Disclosure: While this segment is unsponsored, Red Hat Inc. is the headline sponsor for theCUBE’s live broadcast at KubeCon + CloudNativeCon. Neither Red Hat nor any other sponsor has editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
