UPDATED 11:00 EST / AUGUST 18 2012

Storing Data as DNA: Harvard Researchers Say Test Tubes Can Contain “the whole Internet”

Photo Credit: Kelvin Ma, Wall Street Journal

Test tube data. It may sound strange now, but genomics researcher and serial entrepreneur George Church told the Wall Street Journal that encoding data in DNA “could be the wave of the future” for archives. Church was the senior researcher on a Harvard experiment to encode his forthcoming book, Regenesis, in DNA. According to Kyle Alspach, each of the nearly 55,000 strands of DNA used to store the text contained an indicator of where its portion of the text belongs in the sequence of the book. The ability to store data as DNA, in the form of a viscous liquid or solid salt, presents vast new possibilities for how much data can be stored and for how long. As Church notes, DNA data storage means “a device the size of your thumb could store as much information as the whole Internet.”

So, how does it work? DNA has its own language: a genetic code of four chemicals called bases, adenine (A), guanine (G), cytosine (C) and thymine (T). Robert Lee Hotz’s article in the Wall Street Journal explains how the researchers translated the digital version of the book, composed of the binary ones and zeros that computers read, into strands of DNA that each contained a section of the text.

“The Harvard researchers started with the digital version of the book…Next, on paper, they translated the zeros into either the A or C of the DNA base pairs, and changed the ones into either the G or T. Then, using now-standard laboratory techniques, they created short strands of actual DNA that held the coded sequence—almost 55,000 strands in all. Each strand contained a portion of the text and an address that indicated where it occurred in the flow of the book. In that form—a viscous liquid or solid salt—a billion copies of the book could fit easily into a test tube and, under normal conditions, last for centuries, the researchers said.”
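To make the scheme in that passage concrete, here is a minimal Python sketch of the idea: zeros become A or C, ones become G or T, and every strand carries an address plus a slice of the data. It is an illustration only, not the researchers' actual pipeline. The address width, the payload size per strand, the rule for choosing between the two candidate bases, and all function names are assumptions made for this example.

```python
from typing import List

ZERO_BASES = "AC"    # a 0 bit may be written as A or C (per the article)
ONE_BASES = "GT"     # a 1 bit may be written as G or T (per the article)
ADDRESS_BITS = 16    # assumed address width, not taken from the article
PAYLOAD_BYTES = 12   # assumed data bytes per strand, not taken from the article


def bits_to_bases(bits: str) -> str:
    """Translate a string of '0'/'1' characters into DNA bases."""
    bases = []
    for bit in bits:
        options = ZERO_BASES if bit == "0" else ONE_BASES
        # Assumed rule: use the first candidate unless it would repeat
        # the previous base, in which case use the second candidate.
        if bases and bases[-1] == options[0]:
            bases.append(options[1])
        else:
            bases.append(options[0])
    return "".join(bases)


def bases_to_bits(bases: str) -> str:
    """Translate DNA bases back into bits: A/C -> 0, G/T -> 1."""
    return "".join("0" if base in ZERO_BASES else "1" for base in bases)


def encode(data: bytes) -> List[str]:
    """Split data into strands, each holding an address and a payload."""
    strands = []
    for index, start in enumerate(range(0, len(data), PAYLOAD_BYTES)):
        if index >= 2 ** ADDRESS_BITS:
            raise ValueError("data too large for the assumed address width")
        chunk = data[start:start + PAYLOAD_BYTES]
        address = format(index, "0{}b".format(ADDRESS_BITS))
        payload = "".join(format(byte, "08b") for byte in chunk)
        strands.append(bits_to_bases(address + payload))
    return strands


def decode(strands: List[str]) -> bytes:
    """Rebuild the data from strands supplied in any order."""
    blocks = {}
    for strand in strands:
        bits = bases_to_bits(strand)
        index = int(bits[:ADDRESS_BITS], 2)
        payload = bits[ADDRESS_BITS:]
        blocks[index] = bytes(
            int(payload[i:i + 8], 2) for i in range(0, len(payload), 8)
        )
    # The address tells us where each block sits in the original data.
    return b"".join(blocks[index] for index in sorted(blocks))


if __name__ == "__main__":
    text = b"Regenesis, stored in a test tube"
    strands = encode(text)
    # Strands float freely in a tube, so order is not preserved;
    # reversing the list stands in for that here.
    print(decode(list(reversed(strands))) == text)
```

Because each strand carries its own address, the decoder does not need the strands in any particular order, which is what lets tens of thousands of short fragments be mixed together and still be read back as one book.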

The test tube version of the book retains all the original contents: 53,426 words, 11 illustrations and a JavaScript computer program. Bioengineer Sriram Kosuri, the project’s lead researcher, notes that harnessing DNA in this way means that the data is stored “[sequentially], like a magnetic tape.”

The innovation shows the promise of DNA as a reliable archive for various forms of data including “photographs, books, financial records, medical files and videos, all of which today are stored as computer code,” according to Hotz. Still, synthesizing and sequencing long DNA strands remains too costly for the commercial market. Church believes that the price will drop as technologies evolve. Drew Endy, a synthetic biologist at Stanford University who is unaffiliated with the research project, suggests: “This new work demonstrates that there is a whole new market for these technologies, to synthesize DNA for people who want to store information.” It will be interesting to see how scientific advancement and data continue to intertwine.

 

 

