UPDATED 12:41 EDT / JULY 08 2014

Drowning in data with no way forward?

Big Data is both a potential source of business advantage and a potential liability for companies, writes Wikibon CTO David Floyer in “The Growth and Management of Unstructured Data.” Over the next few years, the volume of Big Data that companies are expected to capture – including log files, unstructured office documents, and audio and video such as security footage – will grow astronomically.

Extracting value from this data requires good data management, first to control its growth and eliminate duplication, and second to make it possible to quickly identify the data relevant to each analysis project. This, Floyer writes, requires a step-by-step process.

Addressing Big Data management challenges

Big Data presents several management challenges. First, it is typically divided among multiple filers and systems rather than unified in a single place. Second, it lacks overall structure, and reading different kinds of Big Data requires different technologies. Searching across Big Data files to identify the subset useful to a specific analysis requires classification metadata that, among other things, indicates what each file contains. However, most file creation tools, Microsoft Office for instance, generate only the most elementary metadata, and they seldom provide tools that let a file’s creator add that metadata easily.

Expert recommendations

Floyer recommends that companies create a universal method for automatically generating system and user metadata at the time a file is created. Files should be stored in a de-duplicated global file system that avoids data replication, rather than in multiple fragmented systems. This file system should be integrated with modern extraction and analysis tools.
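To make the recommendation concrete, here is a minimal Python sketch of the two ideas together: storing only one copy of identical content, and generating metadata automatically when a file is ingested. It is an illustration only, not a method described in the Wikibon report. The `DedupStore` class, its field names, and the sample files are all hypothetical, and a real global file system would work at far larger scale and typically de-duplicate at the block level rather than on whole files.

```python
import hashlib
import mimetypes
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DedupStore:
    """Toy in-memory store (hypothetical): one copy per unique content."""
    blobs: dict = field(default_factory=dict)     # content hash -> bytes
    metadata: dict = field(default_factory=dict)  # file name -> metadata record

    def put(self, name: str, data: bytes, owner: str, tags=()) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Keep the bytes only if this exact content has not been seen before.
        self.blobs.setdefault(digest, data)
        # System metadata is generated automatically at ingestion time;
        # user metadata (owner, tags) is supplied by the caller.
        self.metadata[name] = {
            "sha256": digest,
            "size_bytes": len(data),
            "mime_type": mimetypes.guess_type(name)[0] or "application/octet-stream",
            "created_utc": datetime.now(timezone.utc).isoformat(),
            "owner": owner,
            "tags": sorted(tags),
        }
        return digest

    def find(self, tag: str) -> list:
        """Return the names of all files carrying the given tag."""
        return sorted(n for n, m in self.metadata.items() if tag in m["tags"])


# Made-up sample files: two share identical content, so only two blobs are kept.
store = DedupStore()
store.put("q2_report.txt", b"quarterly numbers", owner="alice", tags={"finance"})
store.put("q2_report_copy.txt", b"quarterly numbers", owner="bob", tags={"finance", "draft"})
store.put("lobby_cam.log", b"motion detected", owner="security", tags={"security"})

print(len(store.metadata), "files,", len(store.blobs), "unique blobs")
print(store.find("finance"))
```

Because every file receives a metadata record when it is stored, an analyst can later pull the subset relevant to a project by tag without opening the files themselves.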

[Figure: Savings from converged applications]

This will involve some up-front cost, but it will avoid large amounts of wasted and duplicated effort later, when the data is used. The figure above illustrates the savings that can be realized by migrating unstructured data to a global file system, based on Wikibon research.

This, Floyer writes, is a long journey that starts with quantifying the growth of the different components of unstructured data, consolidating that data, eliminating redundancy, and securing it. That groundwork allows IT to develop a pragmatic plan to add structure and functionality and so derive value from this data.
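The first step, quantifying growth by component, could be approximated with something as simple as the Python sketch below. It totals bytes per data category across a directory tree and compares two snapshots. The extension-to-category mapping, the function names, and the sample files are assumptions made for illustration and do not come from the Wikibon report.

```python
import os
import tempfile
from collections import Counter
from pathlib import Path

# Hypothetical mapping from file extension to data category.
CATEGORIES = {
    ".log": "log files",
    ".docx": "office documents",
    ".mp4": "video",
    ".wav": "audio",
}


def bytes_by_category(root):
    """Walk a directory tree and total the bytes stored per category."""
    totals = Counter()
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            path = Path(dirpath) / fname
            category = CATEGORIES.get(path.suffix.lower(), "other")
            totals[category] += path.stat().st_size
    return dict(totals)


def growth_rate(previous, current):
    """Fractional growth per category between two snapshots."""
    return {
        c: (current[c] - previous[c]) / previous[c]
        for c in current
        if previous.get(c)  # skip categories with no earlier baseline
    }


# Made-up sample data: take a snapshot, add another log file, snapshot again.
with tempfile.TemporaryDirectory() as root:
    Path(root, "app.log").write_bytes(b"x" * 100)
    Path(root, "memo.docx").write_bytes(b"x" * 50)
    before = bytes_by_category(root)
    Path(root, "app2.log").write_bytes(b"x" * 100)
    after = bytes_by_category(root)

print(before)
print(growth_rate(before, after))
```

Run periodically against real storage, a report like this would show which kinds of unstructured data are growing fastest and therefore where consolidation effort is likely to pay off first.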


About Wikibon research

As with all Wikibon written research, this complete report is available without charge on the Wikibon Web site. IT professionals are invited to register for free membership in the Wikibon community. This allows them to influence the direction of Wikibon research and participate in that research and to post their questions, comments and relevant research on the Wikibon site.

Graphic Courtesy Wikibon.org
feature image by Musebrarian via photopin cc
