UPDATED 15:47 EDT / MAY 01 2014

Invest more in metadata to make more of your data

Companies spend millions of dollars to get an edge from the data they own. However, all too often their efforts are out of balance. Data is hoarded without a clear purpose, and not nearly enough time is invested in capturing and analyzing the data about the data: the metadata. The fact of the matter is that metadata is a valuable tool that can answer as many questions as the data itself.

Metadata is essentially the lever that amplifies the value of data. It provides context such as the amount of data processed, read and written; the data’s source and destination; the algorithms used to analyze it; the number of versions in existence; and which of those versions are used most often.
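To make that list concrete, here is a minimal sketch of how such context might be rolled up from per-job records. The record layout, the field names and the sample values are all assumptions made for illustration; they do not come from any particular product.

```python
from collections import Counter

# Hypothetical per-job metadata records; field names and values are illustrative only.
runs = [
    {"source": "orders_raw", "destination": "orders_clean", "algorithm": "dedupe",
     "dataset_version": "v1", "bytes_read": 100, "bytes_written": 10},
    {"source": "orders_clean", "destination": "churn_scores", "algorithm": "logistic_regression",
     "dataset_version": "v2", "bytes_read": 200, "bytes_written": 20},
    {"source": "orders_clean", "destination": "revenue_report", "algorithm": "aggregate",
     "dataset_version": "v2", "bytes_read": 300, "bytes_written": 30},
]


def summarize_runs(records):
    """Roll job-level metadata up into the kind of context described above."""
    version_counts = Counter(r["dataset_version"] for r in records)
    most_used = version_counts.most_common(1)[0][0] if records else None
    return {
        "bytes_read": sum(r["bytes_read"] for r in records),
        "bytes_written": sum(r["bytes_written"] for r in records),
        "versions_seen": len(version_counts),
        "most_used_version": most_used,
        "algorithms": sorted({r["algorithm"] for r in records}),
    }


summary = summarize_runs(runs)
print(summary)
```

Nothing in this summary touches the underlying business data; it is built entirely from records about how the data was handled.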

As enterprises move from ad hoc development to operationalizing data, and then to building teams that create and maintain a continual flow of big data applications, recognizing the potential of metadata is crucial. Metadata can provide valuable business insights to everyone from individual team members to the CEO, so that each player can do their job better. This is particularly true because metadata is often derived from the context or manner in which the data is used, shedding further light on who or what used the data and in what way the data provided value.

With a rich set of metadata, you can zoom into the details of your data or zoom out to see the bigger picture—all to gain insight into how your business is running. Whether your role is in compliance, operations, or application development, metadata is critical to leveraging your data. Anyone looking to find the business value in data can refer to real-world examples of metadata’s value, such as those described below, and consider how to collect and exploit it in their own organizations.


Metadata: More Valuable than the Data Itself


In almost all conspicuous data victories (both popular and mundane), data has been used in conjunction with metadata.

Facebook is a great example of a data company that is deriving billions in revenue from use of metadata. While the company receives terabytes of data per minute, it certainly isn’t reading posts to find out if you like Coca-Cola or if you’re in the market to switch car insurance. Rather, Facebook leverages its data on a deeper level, looking at what you like (or stop liking), the brands you engage with, the quizzes you take, the social games you play and the apps you use. In turn, Facebook can create a profile based on user behavior: a metadata profile that it monetizes through eerily optimized ads.

Data is organic; it ebbs, flows and oozes through an organization. Capturing its navigation points, the details of its every stop, as well as details about the people and systems that manipulated it, will tell you as much or more about your business as the data itself. People navigate themselves to what’s most useful (or in the case of Facebook, what they consider most valuable or interesting). Metadata can capture that ebb and flow, and by analyzing it, you can gain insight into how your data is used, which is often more interesting than the data itself.

In turn, the imperative is to collect that metadata gold and use it to supplement other sources of user research, such as focus groups or polls. Metadata brings into focus how your data is being leveraged, where it’s being leveraged, whether or not your resources are being used efficiently, and ultimately what’s important.

Big data almost always represents some micro level of action—the phone call, the Facebook Like, the download, the click—but that micro-level data alone offers only an incomplete story. There’s nothing compelling about a record of four or five credit card transactions in isolation, but there’s something enormously telling about metadata when it shows that these transactions took place in five different states within the same two-hour period. Metadata moves data from a micro to a summary level, which can then become the raw materials for building a model to extract meaning.
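The credit card example can be sketched in a few lines. This is only an illustration under stated assumptions: the tuple layout (card, timestamp, state), the function name and the sample records are hypothetical, and a real fraud model would weigh far more signals.

```python
from datetime import datetime, timedelta


def flag_multi_state_cards(transactions, window=timedelta(hours=2), min_states=5):
    """Return sorted card ids seen in at least `min_states` distinct states within `window`.

    `transactions` is an iterable of (card_id, timestamp, state) tuples.
    """
    by_card = {}
    for card_id, ts, state in transactions:
        by_card.setdefault(card_id, []).append((ts, state))

    flagged = set()
    for card_id, events in by_card.items():
        events.sort()  # order each card's activity by time
        start = 0
        for end in range(len(events)):
            # Drop events from the left until the span fits inside the window.
            while events[end][0] - events[start][0] > window:
                start += 1
            states = {state for _, state in events[start:end + 1]}
            if len(states) >= min_states:
                flagged.add(card_id)
                break
    return sorted(flagged)


# Hypothetical sample: card-A hops across five states in under two hours,
# while card-B makes the same number of purchases in a single state.
base = datetime(2014, 5, 1, 9, 0)
sample = [("card-A", base + timedelta(minutes=25 * i), s)
          for i, s in enumerate(["CA", "NV", "AZ", "TX", "NY"])]
sample += [("card-B", base + timedelta(minutes=25 * i), "CA") for i in range(5)]

print(flag_multi_state_cards(sample))  # ['card-A']
```

Each individual record here is unremarkable; only the summary across records (distinct states per time window) carries the signal.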

Simply put, metadata enables you to gain broader and deeper insights by looking at the usage and summary of your data. As metadata surrounds raw data, it sheds light on a wider sphere of activity, thereby expanding the context of analysis. The result is that the model of the customer, the process or other interactions becomes richer and can tell us more about the past and the future.

One example of metadata in action is Amazon.com’s anticipatory shipping. By watching how customers interact with items in their carts, Amazon has a pretty good idea when someone is going to make a purchase. The signals in the metadata (viewing the item, reading reviews, going back to the page, interacting with the shopping cart) provide enough assurance to support moving the item in question to a warehouse near the customer. That practice is not exclusive to Amazon, and given web logs, ecommerce metadata is certainly there for the taking.


We’re All in the Metadata Business


In “Using Metadata to Find Paul Revere,” Kieran Healy, a professor of sociology at Duke University, showed how the British Crown could have used metadata available at the time to identify Paul Revere as a revolutionary. On a more amorous note, UCLA math student Chris McKinlay used metadata to find a compatible woman through OkCupid.

In financial services, there’s a governance, risk and compliance angle to metadata. At a recent banking tech conference, one of the speakers voiced the need for granular details about metadata. Banks are under tremendous pressure to comply with new and ever-changing regulations. In some cases, banks are required to explain exactly how they derived their analytical results, answering questions that include: Where did the source data come from? Did the processing use a join, a filter or a merge? What algorithm was used? Which predictive model? How many versions of the data exist? Which data set was ultimately used to derive the result?
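One way to picture the answers to those questions is as a lineage record that travels with each analytical result. The sketch below is an assumption-laden illustration: the class names, fields and sample values are hypothetical and are not drawn from any regulation or vendor tool.

```python
from dataclasses import dataclass, field


@dataclass
class LineageStep:
    operation: str  # e.g. "join", "filter" or "merge"
    detail: str     # free-text description of what the step did


@dataclass
class LineageRecord:
    result_name: str
    source: str              # where the source data came from
    dataset_version: str     # which data set was ultimately used
    versions_available: int  # how many versions of the data exist
    algorithm: str
    model: str
    steps: list = field(default_factory=list)

    def add_step(self, operation, detail):
        self.steps.append(LineageStep(operation, detail))

    def explain(self):
        """Render the record as a plain-text answer to an auditor's questions."""
        lines = [
            f"Result: {self.result_name}",
            f"Source: {self.source}",
            f"Data set used: {self.dataset_version} (of {self.versions_available} versions)",
            f"Algorithm: {self.algorithm}",
            f"Predictive model: {self.model}",
        ]
        lines += [f"Step {i}: {s.operation} ({s.detail})"
                  for i, s in enumerate(self.steps, start=1)]
        return "\n".join(lines)


# Hypothetical values, for illustration only.
record = LineageRecord(
    result_name="quarterly_credit_risk",
    source="loan_applications",
    dataset_version="v3",
    versions_available=4,
    algorithm="gradient boosting",
    model="risk_model_2014_q1",
)
record.add_step("filter", "drop applications with missing income")
record.add_step("join", "attach payment history by customer id")

print(record.explain())
```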

Here in Silicon Valley, new tech startups are building products to help organizations and consumers make sense of their metadata, and improve their businesses and lives through their use of data. The potential for metadata to support better operations, better personal well-being and better fidelity is unlimited.

Fitbit, Jawbone and Nike Fuel all track what we do, and also when and where we do it—expanding raw data from an accelerometer to generate reminders to exercise and offering analysis on the quality of our sleep. The Nest acquisition by Google and other investments in the Internet of Things movement are motivated not only by the value of such core businesses, but more importantly by the value and ability of sensors to provide a better understanding of how people live. In the manufacturing realm, ThingWorx was just acquired for its ability to create larger models and advanced automation systems out of metadata provided by sensors and industrial equipment.

As I’ve said before, we’re all in the data business. Of course Facebook and Twitter are in the data business. However, if you’re using data to gain insights and drive decisions, then you too are monetizing your data, and you too are in the business of data. Never mind about the elegance and effectiveness of your data repositories; the way to fully exploit and monetize that data is to build a smarter organization, to inform the models used to run your business and to increase the scope of what you know. That potential is powered by metadata.


About the Author

Gary Nakamura is the CEO of Concurrent, Inc. He joined Concurrent in January 2013 to lead the company through its next phase of growth. Gary has a highly successful track record, including significant contributions to the explosive growth of Terracotta, where he was SVP & General Manager.

