UPDATED 14:24 EDT / OCTOBER 22 2012

Records Management’s Failure Costs Companies Millions

In the excitement about Big Data, what is being missed is that large enterprises are actually drowning in petabytes of historical data, both structured and unstructured, generated by internal systems from ERP to e-mail, says consultant, lawyer and records management expert Randolph Kahn. And despite the commonly held myth that “storage is cheap”, this data is costing those companies millions of dollars annually in direct storage costs. It also imposes huge “soft costs”, ranging from the 20%+ of executives’ time wasted searching for significant information buried in largely useless data to the expense of finding relevant information when legal action arises.

“I started my career as a trial lawyer,” Kahn told Wikibon Co-Founder and Chief Analyst David Vellante in an interview in the Cube in Boston (full video below). “Litigation response tells us over & over & over again that companies don’t have their act together. So I would tell you that harnessing their data in a Big Data context is aspirational for most organizations. I would say ubiquitous mismanagement is much more the theme of the day.”

This data is supposed to be managed through records management, but that has failed completely. The rules are too complex, and they are applied unevenly, or not at all, to much company data. Nobody has clear ownership of any of the data. And while IT owns the storage systems, corporate counsel constantly warns IT to be very careful about what is deleted, which in practice means that all the data is saved.

“Most IT professionals don’t think of information as having a life-cycle,” he says. “Today everything is electronic, and it is all generating data. At the end of the information life-cycle, the idea of parking it somewhere in a repository, a shared drive, it’s an expense. It’s a liability. And I think IT professionals need to understand what’s in that life-cycle.”

Much of that old data is actually valueless, unneeded, and a source of legal risk to the organization. And with petabytes sitting in a wide variety of systems, in multiple formats, reviewing all that data manually is too massive and expensive a task for companies to undertake. “What is needed,” Kahn says, “is a technological solution.”

Kahn has devoted much of his career to this problem. Last year he founded Delve to provide large organizations with the technology and support they need to bring this data monster under control. This year he is completing a new book, Chucking the Daisies: How Companies Deal with Big Data, that presents a set of specific rules for managing data over its lifetime.

Storage is Expensive

Despite the idea that “storage is cheap”, Kahn’s company, Delve, cost-justifies itself to large corporate clients purely on the hard savings they will realize by eliminating unneeded data that is past its sell-by date. Many of those companies have not even bothered to dedupe, compress, and archive that data out of their production databases. And that data is growing at a rate of 20% to 50% per year. “When we tell the CIO that we can save you $20 million, $30 million, $40 million per year it is an easy sell,” he says.
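The growth range Kahn cites compounds quickly. The following is a minimal sketch of that arithmetic, assuming a constant annual growth rate and a hypothetical starting volume of 1 petabyte; the function names and the starting figure are illustrative and do not come from Kahn or Delve.

```python
import math


def doubling_time_years(annual_growth_rate: float) -> float:
    """Years for a data volume to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)


def projected_volume(start_pb: float, annual_growth_rate: float, years: int) -> float:
    """Volume in petabytes after `years` of compound growth."""
    return start_pb * (1 + annual_growth_rate) ** years


if __name__ == "__main__":
    start_pb = 1.0  # hypothetical starting volume, not a figure from the article
    for rate in (0.20, 0.50):  # the low and high ends of the range quoted above
        print(
            f"{rate:.0%} growth: doubles in {doubling_time_years(rate):.1f} years, "
            f"{projected_volume(start_pb, rate, 5):.2f} PB after 5 years"
        )
```

Under these assumptions, data that is never deleted roughly doubles in under four years at the low end of the range and in under two years at the high end.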

However, it is not as easy as just deleting every piece of data created before a specific date. Different kinds of data have different life-spans. Therefore, everything has to be reviewed to determine, for instance, whether it is still valuable to some process, whether it is covered by compliance or civil procedure requirements, and whether it is still in use. This is complex and difficult given the huge amount of data involved and the number of systems, formats, and physical and logical locations in which it resides.

And the selection of data to be deleted has to be made using a formal, rules-based process that makes corporate legal, compliance, and business executives comfortable “or in the end they will not pull the trigger and delete the data,” Kahn says.
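To make the idea of a formal, rules-based deletion decision concrete, here is a minimal sketch of how such a check might look. It is not Delve's system, which the article does not describe at this level. The record fields, the retention periods, and the two-year idle threshold are all hypothetical, and real retention periods depend on the organization's legal and compliance requirements.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple

# Hypothetical retention schedule: record category -> years to keep.
# These numbers are placeholders, not legal guidance.
RETENTION_YEARS = {"invoice": 7, "email": 3, "marketing_draft": 1}


@dataclass
class Record:
    record_id: str
    category: str
    created: date
    last_used: date
    legal_hold: bool = False


def deletion_decision(record: Record, today: date, idle_years: int = 2) -> Tuple[bool, str]:
    """Return (eligible, reason) so that every decision can be audited."""
    if record.legal_hold:
        return False, "under legal hold"
    keep_years = RETENTION_YEARS.get(record.category)
    if keep_years is None:
        # Unknown categories are never deleted automatically.
        return False, "no retention rule for category; needs human review"
    if (today - record.created).days <= keep_years * 365:
        return False, "within retention period"
    if (today - record.last_used).days <= idle_years * 365:
        return False, "still in use"
    return True, "past retention, not on hold, not in use"
```

Returning a reason alongside each decision is one way to give legal, compliance, and business executives a trail they can inspect before anything is deleted.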

Delve does this using a powerful machine-learning system. First it simplifies the organization’s retention schedule, rules, and procedures. It then teaches the computer system those rules along with the various formats the data is in, including unusual and custom formats associated with custom software. Then Delve’s automated records management system works its way through the petabytes of data in the enterprise, seeking out all the places where data hides. It reads all the records, both structured and unstructured, and identifies all the data that is no longer required for legal or business purposes and can be safely deleted from the system. Once that is done, the company can apply those rules to its data going forward to keep the expensive internal databases under control.

Records Management Rules

The book, which will be published before the end of the year, will present a set of clear rules to CIOs to manage their data. One of those, for instance, is to establish clear ownership and retention rules for the data from every new system before that system goes operational.

“For instance,” Kahn says, “One day the CIO wakes up and realizes that all the company salespeople are using Facebook to reach customers and sell products. This is good, but what happens to all that social media data on their Facebook pages? Suddenly the company has compliance issues, security issues, privacy issues, business analysis issues.” Without both a clear set of rules for handling this data that satisfies all those concerns and clear ownership of that data that makes someone responsible for managing it through its life-cycle and eventually deleting it, it will become another huge cost center.

“Moving that policy discussion up front forces you to deal with the business and legal issues,” Kahn says. That is Records Management 2.0: an effective strategy and the implementation of a management system that ensures organizations get maximum value from their records at minimal risk, without wasting millions on petabytes of useless, over-age data that only clogs their systems and creates extra risk for the corporation.

