UPDATED 11:00 EDT / DECEMBER 16 2021


How the National Cancer Institute is using Google Cloud to support researchers around the globe

The U.S. National Cancer Institute is using Google Cloud to connect researchers all over the world to cancer data sets along with powerful analytical tools to study the data quickly and securely.

The NCI provides a cloud-based research commons through its Cloud Resources, which allow data scientists to analyze cancer data sets in a cloud environment. That means researchers don’t have to download the data or manage their own custom hardware.

According to the World Health Organization, cancer is a leading cause of death worldwide, accounting for nearly 10 million deaths in 2020. Breast cancer was the most commonly diagnosed type, with more than 2.26 million new cases in the same year. Cancer is also extremely data-intensive to research because the disease is so individual: It affects each patient differently depending on genetics, progression and numerous other circumstances.

Attempting to research something such as breast cancer entirely on-premises is slow, cumbersome and extremely expensive for researchers. It also puts a great deal of extra stress on patients who already suffer through constant tests.

In collaboration with Google, the NCI created the Institute for Systems Biology-Cancer Gateway in the Cloud. It allows scientists to interactively define and compare protein data from specific cancer genes and share insights with peers. It also provides application programming interfaces and on-demand Google Cloud Platform resources, such as BigQuery and Google Pipeline resources, for running complex queries using languages such as R and Python.

“We are spreading the message of the cost-effectiveness of the cloud,” said Dr. Kawther Abdilleh, lead bioinformatics scientist at General Dynamics Information Technology, a partner of ISB. “With Google Cloud’s BigQuery, we’ve successfully demonstrated that researchers can inexpensively analyze large amounts of data, and do so faster than ever before.”

Underlining the need for cloud-based sharing solutions, Abdilleh and Dr. Boris Aguilar, a senior research scientist at ISB, demonstrated in a paper published in September 2020 how cloud data analysis could save time and work.

“Google’s AI platform, for example, allows us to easily create notebooks to use R or Python in combination with BigQuery or machine learning to perform large-scale statistical analysis of genomic data, all in the cloud,” Aguilar wrote.

The researchers used Google Cloud Platform to develop a set of BigQuery user-defined functions, or UDFs, that perform statistical tests designed to provide a better picture of the breast cancer genome. By using the cloud-based analytical capabilities, they completed in minutes work that would have taken a supercomputer days to finish.
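To make the idea of a statistical test packaged as a reusable function more concrete, here is a minimal Python sketch. The article does not say which tests the researchers implemented, so this assumes Welch's t-test purely as an illustration. The function name and the sample values are hypothetical, and this is not the researchers' code. In BigQuery, arithmetic like this would be wrapped in a UDF so that it runs next to the data instead of on a downloaded copy.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return Welch's t statistic and approximate degrees of freedom.

    Illustrative stand-in for the kind of test a UDF might wrap;
    the article does not name the tests the researchers used.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two values")

    # Squared standard error of each group's mean (sample variance / n).
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both groups have zero variance")

    t_stat = (mean(group_a) - mean(group_b)) / math.sqrt(se2_a + se2_b)

    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


if __name__ == "__main__":
    # Made-up values standing in for one gene's measurements in two sample groups.
    tumor = [1.0, 2.0, 3.0, 4.0, 5.0]
    normal = [2.0, 3.0, 4.0, 5.0, 6.0]
    print(welch_t(tumor, normal))
```

A researcher would typically apply a function like this to every gene in a data set, which is where running it inside the data warehouse rather than on a local machine would be expected to save time.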

Abdilleh and Aguilar have also made their UDFs available to their peers via BigQuery. That will allow other researchers to replicate their results and build on their work.

