UPDATED 20:32 EDT / DECEMBER 10 2020


AWS hosts genomic data for curing COVID-19 and other health issues

Amazon Web Services Inc. recently announced that the AWS Open Data Sponsorship Program will begin hosting the National Institutes of Health Sequence Read Archive, or SRA, a large repository of genomic sequence data.

With the world in the middle of the COVID-19 pandemic, this is a huge deal, according to Brett McMillen (pictured), director of U.S. federal at AWS.

“It’s got not only human genomic data, but all life forms or all branches of life … to include viruses,” he said. “And that’s really important here during the pandemic. It’s one of the largest and oldest … sequence genomic data sets that are out there, and yet it’s very modern. It has been designed for next-generation sequencing, so it’s growing.”

McMillen spoke with Lisa Martin, host of theCUBE, SiliconANGLE Media’s livestreaming studio, during AWS re:Invent. They discussed why the SRA data is so important and why AWS is helping the scientists at NIH during this pivotal moment in time. (* Disclosure below.)

Using AWS to democratize scientific data

AWS has been working with NIH since 2012, according to McMillen. Making this SRA data available to scientists worldwide is extremely important because studying the genomic code is what helps scientists find cures for human health issues, including heart disease, diabetes and cancer, and even viruses that can cause pandemics. Within NIH, AWS is working with the National Center for Biotechnology Information to make the SRA an open data set.

“It’s all about increasing the speed for scientific discovery,” McMillen said. “I personally think that in the fullness of time, the scientists will come up with cures for just about all of the human ailments that are out there, and it’s our job at AWS to put into the hands of the scientists the tools they need to make things happen quickly or in our lifetime.”

The SRA is a very large data set at 45 petabytes, so large that if it were all human data, it would be equivalent to 90% of everybody living in New York City, according to McMillen. That scale is why keeping this data in the cloud is so important. It is available on Amazon S3, a popular storage solution in the scientific community.

“One of our goals here is [to] go back to a democratization of research,” McMillen said. “For example, the very first … vaccine that came out was … done by a rural country doctor using essentially test tubes and a microscope. It’s gotten hard to do that because data sets are so large you need so much compute. By using the power of the cloud, we’ve re-democratized it, and now anybody can do it.”

Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of AWS re:Invent. (* Disclosure: Amazon Web Services sponsored this segment of theCUBE. Neither AWS nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
