BIG DATA
The science of genomics is a data-driven process that requires moving vast amounts of information from one place to another. The research effort is turning to the cloud and technologies such as machine learning and quantum computing to help fight the war on cancer and other diseases.
“Genomics has me totally fascinated. … The other thing I am interested in is eradicating cancer. … What they are finding is if [there is] enough biometric information from your genomes, from your proteomics, from your RNA, they can customize [precision medicine] a specific medicine track for you to fight the cancer successfully,” said Sam Greenblatt, chief technology officer of Nano Global Corp.
Greenblatt joined host Jeff Frick (@JeffFrick) and guest host Scott Raynovich (@rayno) of theCUBE, SiliconANGLE Media’s mobile live streaming studio, during the Open Networking Summit in Santa Clara, California.
Greenblatt discussed how genomics depends on networks that can scale to handle the volume of data the research generates.
Greenblatt explained why technology is necessary for processing DNA and why open source is a critical component.
“When you get swab or blood, DNA is then processed, and it gets cut into how many samples they need. 23andMe uses 30x [x is the piece used], which uses 80 gigs of data. When you take 50x, which is what you need for cancer, that takes you up to 150 gigs per person,” he said.
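To put those per-person figures in perspective, the following is a rough back-of-the-envelope sketch in Python. It assumes the two sizes quoted in the interview (80 GB at 30x, 150 GB at 50x) and decimal terabytes; the function name `cohort_storage_tb` and the cohort sizes are hypothetical, chosen only for illustration.

```python
# Per-person data sizes in gigabytes, as quoted in the interview.
GB_PER_GENOME = {30: 80, 50: 150}


def cohort_storage_tb(people: int, coverage: int) -> float:
    """Estimate raw storage in decimal terabytes for a group of people.

    Hypothetical helper: it only multiplies the quoted per-person size
    by the head count and ignores compression, replication and metadata.
    """
    return people * GB_PER_GENOME[coverage] / 1000


# Illustrative cohort sizes (assumptions, not figures from the interview).
small_cancer_cohort = cohort_storage_tb(1_000, 50)
large_consumer_cohort = cohort_storage_tb(1_000_000, 30)
```

Under these assumptions, 1,000 patients sequenced at 50x come to about 150 terabytes, and a million people at 30x come to about 80,000 terabytes, or 80 petabytes.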
The process for finding cancer includes capturing the DNA and RNA of the individual along with their biometrics, electronic medical records and radiology. Once that data is gathered, researchers can determine a course of treatment.
Oncologists, however, have reservations about the process, as they are used to the traditional methodology of radiation and chemotherapy, Greenblatt stated.
Correlating this data is a vast undertaking. “It is probably the largest number of big data outside of YouTube. It’s number two in number of bytes,” he said.
Believing that everyone needs to be sequenced and their information stored, Greenblatt noted that open source has the advanced technology to deal with the flood of data. He spoke about the Apache Kafka and Apache NiFi models. (NiFi was donated to the Apache Software Foundation by the National Security Agency, or NSA.)
Kafka uses a pull model built around producers, brokers and subscribers, and it can open multiple channels. Data flows become critical because the process mainly takes segments of DNA, puts them through a Markov chain and then reassembles them to establish a pattern and perform quality control, Greenblatt explained. What are they looking for in the process? The roughly one-tenth of 1 percent of DNA that is variant and exclusive to one person. It is this variant that may hold the key to curing cancer through the immune system.
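The producer, broker and subscriber roles described above can be sketched in a few lines of Python. This is a toy in-memory illustration of a pull model, not the Kafka client API: the `Broker` and `Consumer` classes, the `dna-segments` topic name and the sample segments are all hypothetical. The point it shows is that the broker only stores records in order, while each subscriber tracks its own position and asks for more when it is ready.

```python
from collections import defaultdict


class Broker:
    """Toy broker: keeps one append-only list of records per topic."""

    def __init__(self):
        self._logs = defaultdict(list)

    def append(self, topic, record):
        """Called by producers; returns the offset of the stored record."""
        self._logs[topic].append(record)
        return len(self._logs[topic]) - 1

    def fetch(self, topic, offset, max_records=10):
        """Return up to max_records starting at offset; never pushes data."""
        return self._logs[topic][offset:offset + max_records]


class Consumer:
    """Toy subscriber: pulls from the broker and remembers its own offset."""

    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.offset = 0

    def poll(self, max_records=10):
        records = self.broker.fetch(self.topic, self.offset, max_records)
        self.offset += len(records)  # advance only past what was received
        return records


# A producer writes three made-up DNA segments to one topic (channel).
broker = Broker()
for segment in ["ACGT", "TTGA", "GGCA"]:
    broker.append("dna-segments", segment)

# The subscriber pulls at its own pace, two records at a time.
consumer = Consumer(broker, "dna-segments")
first_batch = consumer.poll(max_records=2)
second_batch = consumer.poll(max_records=2)
```

Because the subscriber controls the offset, a slow analysis step is not flooded with data, and a second subscriber could read the same topic independently from its own position.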
IBM is also playing a role in genomics with its experimentation with quantum computing. Greenblatt thinks that this technology will be significant in the quest to find a cure for cancer.
Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s coverage of Open Networking Summit.