UPDATED 12:41 EST / JULY 08 2014

Drowning in data with no way forward?

Big Data is both a potential source of business advantage and a potential liability for companies, writes Wikibon CTO David Floyer in “The Growth and Management of Unstructured Data.” Over the next few years, the volume of Big Data that companies are expected to capture (including log files, unstructured office documents, and audio and video such as security footage) will grow astronomically.

Extracting value from this data requires good data management, first to control its growth and eliminate duplication, and second to make it possible to quickly identify the data that is relevant to each analysis project. This, Floyer writes, requires a step-by-step process.

Addressing Big Data management challenges

Big Data presents several management challenges. First, it is typically divided among multiple filers and systems rather than unified in a single place. Second, it lacks overall structure, and reading different kinds of Big Data requires different technologies. Third, searching across Big Data files to identify the subset that is useful to a specific analysis requires classification metadata that, among other things, indicates what each file contains. However, most file creation tools, Microsoft Office for instance, generate only the most basic metadata, and they rarely give a file’s creators an easy way to add more.

Expert recommendations

 

Floyer recommends that companies create a universal method for automatically generating system and user metadata when each file is created. The files should be stored in a de-duplicated global file system that avoids data replication, rather than scattered across multiple fragmented systems. This file system should be integrated with modern extraction and analysis tools.
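As a rough illustration of these two ideas (metadata captured when a file is stored, and a single copy kept per unique piece of content), here is a minimal Python sketch. It is not taken from the Wikibon report: the `DedupStore` and `FileRecord` names, the in-memory storage and the tag-based search are all assumptions made for the example, and a real global file system would work very differently under the hood.

```python
import hashlib
import mimetypes
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class FileRecord:
    """Metadata captured once, at the moment a file is stored (hypothetical schema)."""
    name: str
    digest: str        # SHA-256 of the content; doubles as the de-duplication key
    size_bytes: int    # system metadata
    mime_type: str     # system metadata, guessed from the file name
    created_at: str    # system metadata, UTC timestamp
    tags: dict = field(default_factory=dict)  # user metadata


class DedupStore:
    """Toy in-memory stand-in for a de-duplicated store: one copy per unique content."""

    def __init__(self):
        self.blobs = {}    # digest -> content bytes
        self.records = []  # one FileRecord per stored file name

    def put(self, name, content, **tags):
        digest = hashlib.sha256(content).hexdigest()
        # Identical content hashes to the same key, so it is kept only once.
        self.blobs.setdefault(digest, content)
        record = FileRecord(
            name=name,
            digest=digest,
            size_bytes=len(content),
            mime_type=mimetypes.guess_type(name)[0] or "application/octet-stream",
            created_at=datetime.now(timezone.utc).isoformat(),
            tags=tags,
        )
        self.records.append(record)
        return record

    def find(self, **tags):
        """Return the records whose user metadata matches every given tag."""
        return [
            r for r in self.records
            if all(r.tags.get(key) == value for key, value in tags.items())
        ]


store = DedupStore()
store.put("q2-report.docx", b"quarterly numbers", department="finance")
store.put("q2-report-copy.docx", b"quarterly numbers", department="sales")
finance_files = store.find(department="finance")
```

In this sketch the two files share one stored copy because their content is identical, while each keeps its own metadata record, so a later analysis project can select files by tag without opening them.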

[Figure: Savings from converged applications]

Adopting these recommendations will involve some up-front cost, but it will avoid large amounts of wasted and duplicated effort later, when the data is used. The figure above, based on Wikibon research, illustrates the savings that can be realized by migrating unstructured data to a global file system.

This, Floyer writes, is a long journey that starts by quantifying the growth of the different components of unstructured data, consolidating that data and eliminating redundancy, and securing it. Those steps allow IT to develop a pragmatic plan for adding structure and functionality so that the company can derive value from this data.
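The first step, quantifying growth by component, can be approximated with a simple inventory comparison. The Python sketch below is only an illustration under stated assumptions: the extension-to-category mapping, the function names and the sample figures are hypothetical and do not come from the Wikibon report.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical grouping of file extensions into kinds of unstructured data.
CATEGORIES = {
    ".log": "logs",
    ".docx": "office",
    ".xlsx": "office",
    ".pptx": "office",
    ".mp4": "video",
    ".wav": "audio",
}


def bytes_by_category(inventory):
    """Sum sizes per category for an iterable of (path, size_bytes) pairs."""
    totals = Counter()
    for path, size_bytes in inventory:
        suffix = PurePath(path).suffix.lower()
        totals[CATEGORIES.get(suffix, "other")] += size_bytes
    return totals


def growth(earlier, later):
    """Fractional growth per category; categories absent from `earlier` are skipped."""
    return {
        category: (later[category] - earlier[category]) / earlier[category]
        for category in later
        if earlier.get(category)
    }


# Made-up sample inventories taken at two points in time.
january = bytes_by_category([("app/server.log", 100), ("docs/plan.docx", 50)])
june = bytes_by_category(
    [("app/server.log", 150), ("docs/plan.docx", 50), ("cctv/lobby.mp4", 500)]
)
rates = growth(january, june)
```

Comparing two snapshots this way shows which kinds of data are growing fastest, which is the information a consolidation plan would need first. Categories that appear only in the later snapshot, such as video here, have no baseline and are left out of the growth figures.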


About Wikibon research

As with all Wikibon written research, the complete report is available without charge on the Wikibon Web site. IT professionals are invited to register for free membership in the Wikibon community. Membership allows them to influence the direction of Wikibon research, participate in that research, and post their questions, comments and relevant research on the Wikibon site.

Graphic Courtesy Wikibon.org
feature image by Musebrarian via photopin cc
