How to Manage 4.5 Million Human Gene Samples
You have a freezer. The ice cream has to stay cold or it will melt and go bad. You can look in the freezer and tell if everything is okay. There’s a thermometer. Pretty simple.
Now what if you had 10,000 ice cream samples in the freezer? Monitoring gets a bit more complicated. Now let’s say you have to keep track of what you did to each ice cream sample, and then put each new sample back in the freezer. Now you have potentially hundreds of thousands of samples in many varieties. Some just have cherries; others have hot fudge. Each has to be catalogued in case you need to use it again 10, 20 or 30 years from now.
That’s part of the challenge for the Coriell Institute. Except this New Jersey nonprofit is the leading biobank in the world, managing 4.5 million human samples such as DNA, RNA and blood for the purposes of research and a new personalized medicine project.
“We are the Amazon of blood,” said Scott Megill, CIO of the institute.
Coriell’s story helps put in perspective the solutions required to manage vast amounts of data that have to be catalogued and tracked at a granular level. The organization had a homegrown storage system, but it became outdated when the institute started the Coriell Institute Medicine Collaborative, a personalized medicine project that involved taking saliva samples from thousands of people. Each sample was entered into the Coriell system, which correlated the results to create a data set. Each person generated two million points of data.
To manage the project, Coriell turned to IBM for its storage and data tracking.
With such a comprehensive study, the amount of data Coriell had to store and track became overwhelming.
The technology for analyzing genes has advanced faster than computer technology. The cost of mapping the human genome has come down by a factor of hundreds of thousands. It took 13 years and $3 billion to produce the first map of the human genome. Today it takes a day and costs about $5,000 to map a person’s genetic makeup.
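For readers who want to check the scale of that drop, here is a quick back-of-the-envelope calculation using only the figures quoted above. The variable names are made up for illustration, and the day count ignores leap days.

```python
# Figures quoted in the article.
first_genome_cost_usd = 3_000_000_000  # $3 billion for the first map
first_genome_years = 13
current_cost_usd = 5_000               # about $5,000 today
current_days = 1

# How many times cheaper and faster sequencing has become.
cost_ratio = first_genome_cost_usd / current_cost_usd
time_ratio = (first_genome_years * 365) / current_days  # ignores leap days

print(f"Cost fell by a factor of about {cost_ratio:,.0f}")
print(f"Time fell by a factor of about {time_ratio:,.0f}")
```

On these numbers the cost has fallen roughly 600,000-fold and the time roughly 4,700-fold.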
The goal of the personalized medicine project is to create a data set that maps full genome sequences and correlates them with the people who provided them. With this data, electronic health records can be married with genetic information.
So you can see why managing the information is so critical. Coriell is using IBM technology to monitor, constantly and in real time, the temperature, the level of nitrogen and other conditions in the cryogenic chambers where the samples are stored. The data is fed to an IBM Tivoli control panel that provides the real-time view.
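To give a rough idea of what this kind of monitoring involves, here is a minimal sketch of a threshold check on chamber readings. It is not Coriell’s system and does not use any IBM Tivoli interface; the class, the function and both limit values are hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass

# Hypothetical limits for illustration only; real limits are not given in the article.
MAX_TEMPERATURE_C = -150.0
MIN_NITROGEN_PCT = 20.0


@dataclass(frozen=True)
class ChamberReading:
    """One sensor reading from a cryogenic chamber (hypothetical structure)."""
    chamber_id: str
    temperature_c: float
    nitrogen_pct: float


def check_reading(reading: ChamberReading) -> list[str]:
    """Return an alert message for each limit the reading violates."""
    alerts = []
    if reading.temperature_c > MAX_TEMPERATURE_C:
        alerts.append(
            f"{reading.chamber_id}: temperature {reading.temperature_c} C is above the limit"
        )
    if reading.nitrogen_pct < MIN_NITROGEN_PCT:
        alerts.append(
            f"{reading.chamber_id}: nitrogen level {reading.nitrogen_pct}% is below the limit"
        )
    return alerts


# A healthy chamber produces no alerts; a warm, low-nitrogen one produces two.
healthy = ChamberReading("chamber-A", temperature_c=-190.0, nitrogen_pct=80.0)
failing = ChamberReading("chamber-B", temperature_c=-120.0, nitrogen_pct=10.0)
for message in check_reading(healthy) + check_reading(failing):
    print(message)
```

In a real deployment the readings would arrive continuously from the freezer sensors, and the alerts would go to a control panel rather than to the screen.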
IBM Lombardi WebSphere is used to track the cells when they are taken out of the freezer. Lombardi is business process management software. It had once been too expensive for Coriell, but prices came down when IBM acquired the company in 2009.
Services Angle
Mainline Solutions did the integration for Coriell. It took less than a week. Megill said the integration required working with a separate vendor that provided the sensors for the freezer.
The solutions component matters to Coriell because the organization requires its monitoring and process management technology to run on premises.
The next step for Coriell is to explore the data. That’s where analytics enters the picture. Coriell expects to work with IBM’s Watson Labs to find the best ways to analyze its human genome information.
“I tend to think we are going to start to see bioinformatics data analytics tools,” Megill said.