UPDATED 11:41 EST / MAY 20 2014

How climate researchers use high-performance computing + Big Data | #IBMEdge

This week theCUBE is at IBM Edge in Las Vegas, broadcasting live for SiliconANGLE. In this interview, Pamela Gillman, manager of the Data Analysis Services Group at the National Center for Atmospheric Research (NCAR), sat down with Dave Vellante and Jeff Frick to talk about the resources her team uses for conducting research on the past, present and future of our climate.

NCAR provides resources that allow atmospheric researchers to study the climate. It is federally funded and managed by the University Corporation for Atmospheric Research (UCAR), a consortium of universities that serves as its governing body.

Unlike weather forecasting, climate research is long-term. Gillman mentioned that some researchers run models covering a 1,000-year period, looking for cyclic patterns and correlations from year to year as well as across longer stretches of time.
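To make that concrete, here is a minimal Python sketch of the kind of year-to-year correlation analysis Gillman describes; the synthetic 1,000-year temperature series and its 11-year cycle are invented purely for illustration, not taken from NCAR's data:

    import numpy as np

    def autocorrelation(series, max_lag):
        """Lag-k autocorrelation of an annual series, used to spot cyclic patterns."""
        x = np.asarray(series, dtype=float)
        x = x - x.mean()
        var = np.dot(x, x)
        return [1.0 if k == 0 else np.dot(x[:-k], x[k:]) / var
                for k in range(max_lag + 1)]

    # Hypothetical 1,000-year annual temperature series with an 11-year cycle
    rng = np.random.default_rng(0)
    years = np.arange(1000)
    temps = 0.5 * np.sin(2 * np.pi * years / 11) + rng.normal(0.0, 0.3, years.size)

    acf = autocorrelation(temps, max_lag=30)
    print("lag-11 autocorrelation:", round(float(acf[11]), 3))  # peaks near the cycle length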

Regarding today’s climate, Gillman said that “the majority of scientists believe that we’re in a period of change,” citing melting ice sheets and increasingly erratic weather. Gillman described how one group she works with is studying the climate of the Paleolithic era, testing whether their predictive models can reproduce what happened in that time period.

“They’re trying to look at what they think can occur and what’s happening,” said Gillman, explaining that this research is largely about figuring out why the current changes in the climate are occurring, whether they represent normal variation, and building a consensus around the answer.

NCAR’s supercomputing center


To gain clarity on what NCAR offers to researchers, Frick asked Gillman what her organization specifically provides. Gillman said that NCAR is made up of five or six labs, all but one of which does the science. She works with two groups that run the models; one produces the Paleolithic-era and future climate data, while the other focuses on hurricane forecasting. The Computational and Information Systems Laboratory, where Gillman works, produces and manages those resources: a 25,000 sq. ft. supercomputing center that houses NCAR’s flagship iDataPlex system.

  • Data and Flash

Vellante then asked Gillman what NCAR’s data sources are. Gillman said that most of the data is produced either on their own computer or at other national centers. She added that one focus is on improving data-transfer protocols to their spinning storage, so that even when data is produced elsewhere, they can retrieve it effectively and efficiently. NCAR’s facility currently holds about 33 petabytes in its tape archive and about 18 petabytes of available spinning storage.

At this time, NCAR doesn’t use flash, but is very interested in moving forward with it, looking at flash as a burst buffer. Because their models step through time, they produce a lot of small-file output. Gillman said that having flash close to where the data is produced would let the system absorb those time steps and then trickle the data out to spinning storage.
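As a rough picture of the burst-buffer pattern Gillman outlines, and not NCAR's actual implementation, the Python sketch below absorbs small time-step writes into a fast tier and drains them to slower storage in the background; the directory names and file contents are hypothetical stand-ins:

    import queue
    import shutil
    import threading
    from pathlib import Path

    FLASH = Path("flash_tier")    # stand-in for the fast flash tier
    DISK = Path("spinning_tier")  # stand-in for slower spinning storage
    FLASH.mkdir(exist_ok=True)
    DISK.mkdir(exist_ok=True)
    drain_queue = queue.Queue()

    def write_timestep(step, data):
        """The model writes each small time-step file to flash at full speed."""
        path = FLASH / f"step_{step:06d}.dat"
        path.write_bytes(data)
        drain_queue.put(path)     # hand the file off for background draining

    def drain_worker():
        """Trickle completed files out from flash to spinning storage."""
        while True:
            path = drain_queue.get()
            shutil.move(str(path), str(DISK / path.name))
            drain_queue.task_done()

    threading.Thread(target=drain_worker, daemon=True).start()

    for step in range(3):         # toy model loop producing small outputs
        write_timestep(step, b"model fields for one time step")
    drain_queue.join()            # wait until everything has trickled out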

  • HPC, Big Data and the IPCC

Moving on to the topic of high-performance computing (HPC) and Big Data coming together for commercial applications, Vellante asked Gillman to share her observations on where analytics fit in and how it affects architectures.

Gillman responded by discussing the data-output work she has done for the Intergovernmental Panel on Climate Change (IPCC) runs, which occur every four to five years. She said that the total amount of data output during her first run (IPCC 4) was 100 terabytes, which wasn’t difficult to manage. Once the data is produced, it’s available to the community for about five years. The IPCC 5 run was completed a few years ago, and Gillman said that NCAR isn’t able to curate all of the data from it; they hold about 1 to 2 petabytes, and even that isn’t the full output.

Gillman then said that they bring that data in and host it in Science Gateways that provide access to the shared resources. This allows analytics to occur, so that the community can choose portions of data from a run. Gillman explained that they’ve coupled this functionality with their computational side, so that the group managing the data can use those resources to pull variables out, package data sets and deliver them to customers.
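A gateway-style extraction like the one Gillman describes might look roughly like the following, using the xarray library; the file, variable and time-range names are hypothetical stand-ins for real model output:

    import xarray as xr

    # Hypothetical model-output file hosted behind the gateway
    ds = xr.open_dataset("climate_run_output.nc")

    # Pull one variable and one time window out of the full run,
    # rather than shipping the entire multi-terabyte dataset to the user
    subset = ds["surface_temperature"].sel(time=slice("1900-01", "1999-12"))

    # Package the extracted slice for delivery to the requesting customer
    subset.to_netcdf("surface_temperature_1900-1999.nc")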

Information-centric model


Vellante then asked Gillman what architectural changes we should expect in the next few years. Gillman started off by saying that, in the past, the job of the supercomputer team was to have a fast machine that could put data out as quickly as possible; after that, it was someone else’s problem, and the data moved from resource to resource for each task.

Gillman then said that what NCAR did with the data center was pull all those resources together into a central pool. “So, what we’re trying to do is shift to where, as the data is produced, somebody can look at it, and they don’t have to move it,” she further explained. Gillman added that they’ve referred to this as “an information-centric model of trying to get the user to move what they’re doing to where the data is.”

Gillman would like to see a tighter coupling of that, and hopes analytics will become possible during computation. She also believes flash plays into this: if some data can be kept in memory, and post-processing or analysis can be done on it before it goes to spinning storage, that would speed up the workflow.
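One way to picture that in-memory, analyze-before-storage idea: the sketch below (illustrative only, with an invented grid size and stand-in physics) updates a running mean while the model steps forward, so an analysis product already exists before anything is written to spinning storage:

    import numpy as np

    def simulate_one_step(step, grid_shape):
        """Hypothetical stand-in for one model time step's output field."""
        rng = np.random.default_rng(step)
        return rng.normal(15.0, 2.0, grid_shape)

    def run_with_in_situ_analysis(n_steps, grid_shape=(180, 360)):
        """Accumulate a running mean in memory as the model runs,
        instead of writing every raw time step to disk first."""
        running_mean = np.zeros(grid_shape)
        for step in range(1, n_steps + 1):
            field = simulate_one_step(step, grid_shape)
            running_mean += (field - running_mean) / step  # incremental mean update
        return running_mean

    climatology = run_with_in_situ_analysis(n_steps=120)
    print("in-situ mean of the field:", round(float(climatology.mean()), 2))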

Vellante said that her concept sounds Hadoop-like. Gillman said that they don’t currently use Hadoop because their codes aren’t structured to work well with it at this time. The challenge is that climate codes are very large; in fact, NCAR’s setup is six pieces of code that talk to each other. They do, however, have an effort underway to examine their codes and see if they can incorporate newer technologies.
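Gillman’s description of six pieces of code talking to each other can be pictured with a toy coupler loop; the component names, exchanged field and update rule here are illustrative only, not NCAR’s actual model interfaces:

    # Toy coupler: several model components exchange a shared field each step.
    # Real coupled climate codes do this across thousands of cores via MPI;
    # this only shows the shape of the communication pattern.

    class Component:
        def __init__(self, name, state):
            self.name = name
            self.state = state

        def step(self, forcing):
            # Placeholder physics: relax toward the coupler-supplied forcing
            self.state += 0.1 * (forcing - self.state)

    components = [Component(name, state) for name, state in [
        ("atmosphere", 15.0), ("ocean", 10.0), ("land", 12.0),
        ("sea_ice", -2.0), ("land_ice", -10.0), ("runoff", 5.0)]]

    for _ in range(10):                    # coupled time-stepping loop
        shared = sum(c.state for c in components) / len(components)
        for c in components:
            c.step(shared)                 # coupler hands each component the field

    for c in components:
        print(f"{c.name}: {c.state:.2f}")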

