UPDATED 15:09 EDT / AUGUST 04 2011

Erasure Coding – The New Approach to Data Archiving

Long-term data archiving is a major problem for businesses of all sizes and in all markets, but particularly for those operating under regulations that require long-term retention of large amounts of data such as business e-mail and other communications. Even companies not covered by such regulations must preserve basic business records, including their financials and information about employees and customers, for years.

Over the years companies have tried solutions ranging from tape backup to three-data-center topologies, but none has proven truly practical. Tape, the most common solution for the last 30 years or more, is relatively cheap but has several major drawbacks: data sitting in an off-site vault is effectively inaccessible to decision-support systems, the needed tape is often missing, searching data on tape is extremely slow, and older tapes are frequently recorded in obsolete formats that the tape systems still in the data center can no longer read. At the other end of the spectrum, three-data-center systems provide strong data survival and recovery combined with excellent data access, but at a price that makes them impractical for all but large financial houses.

Now, says Wikibon CTO David Floyer in a new article on Wikibon.org, an emerging data archiving approach called erasure coding, based on fairly sophisticated mathematics, promises very secure data archiving in the public cloud at an order of magnitude lower cost than the three-data-center approach. The trade-off is slower data access, although access is still much faster and easier than retrieval from tape. Erasure coding also avoids the media obsolescence and deterioration issues that plague older tape-based archives.

The basic idea behind erasure coding is that data is broken into a large number of fragments, expanded with redundant parity fragments, and spread across multiple locations, sometimes involving more than one Infrastructure-as-a-Service vendor. Because the original data can be rebuilt from a sufficiently large subset of the fragments, the system virtually guarantees that the data will remain recoverable for thousands of years.
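To make the idea concrete, the following is a minimal sketch, not drawn from Floyer's article, using single XOR parity, the simplest form of erasure coding: four data fragments plus one parity fragment are scattered across five locations, and the archive can be rebuilt even if any one fragment is lost. Production archives of the kind described here use stronger codes such as Reed-Solomon that tolerate the loss of several fragments at once; the names and data below are purely illustrative.

# Minimal sketch of erasure coding with single XOR parity (RAID-5 style):
# k data fragments plus one parity fragment, recoverable after the loss of
# any ONE fragment. Real archives use stronger codes (e.g. Reed-Solomon).

def encode(data: bytes, k: int) -> list:
    """Split data into k equal-size fragments and append one XOR parity fragment."""
    frag_len = -(-len(data) // k)                  # ceiling division
    padded = data.ljust(k * frag_len, b"\x00")     # pad so fragments are equal length
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = bytes(frag_len)
    for frag in fragments:
        parity = bytes(a ^ b for a, b in zip(parity, frag))
    return fragments + [parity]                    # k + 1 fragments to spread around

def decode(fragments: list, k: int, original_len: int) -> bytes:
    """Rebuild the original data even if any single fragment is missing (None)."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost fragment")
    if missing:
        frag_len = len(next(f for f in fragments if f is not None))
        rebuilt = bytes(frag_len)
        for i, f in enumerate(fragments):
            if i != missing[0]:
                rebuilt = bytes(a ^ b for a, b in zip(rebuilt, f))
        fragments[missing[0]] = rebuilt            # XOR of survivors restores the loss
    return b"".join(fragments[:k])[:original_len]  # drop parity and padding

# Usage: scatter five fragments across five locations, lose one, recover the data.
archive = b"seven years of regulated business e-mail"
frags = encode(archive, k=4)
frags[2] = None                                    # one location is unreachable
assert decode(frags, k=4, original_len=len(archive)) == archive

The appeal of the approach is that redundancy comes from the mathematics of the code rather than from keeping full extra copies, which is why the cost can fall so far below mirroring data across three data centers.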

Floyer’s article is one of several examining aspects of erasure coding archiving that came out of a Wikibon Peer Incite meeting on the subject with Justin Stottlemyer, director of storage architecture for consumer cloud service Shutterfly, which uses erasure coding to guarantee perpetual free archiving and availability of personal digital photographs to a huge population of consumers. These and other articles on the Wikibon site by Floyer and Wikibon CEO David Vellante examine the promise of this new approach to data archiving in detail.

