UPDATED 07:10 EDT / APRIL 22 2014

Dark days ahead for Big Data? Depends what you use it for…

According to Gartner’s famed hype cycle, technologies typically follow a similar pattern: they’re first discovered and immediately over-hyped, users quickly become disillusioned, and only later do realistic applications for the technology emerge.

It kicks off with the “trigger” phase, a stage of rabid excitement about a new technology that leads to huge public interest. This morphs into phase two, a “peak of inflated expectations”, before tumbling into a steep decline as people learn that the innovation falls short of the original, extravagant claims. Things bottom out into a “trough of disillusionment” (phase three), before we see a slow and steady rise in interest as people find ways to put the technology to good use. This “slope of enlightenment” (phase four) eventually evolves into a “plateau of productivity” (phase five), at which point the technology finally goes mainstream.

One of the greatest technological innovations of this decade is Big Data, so it’s interesting to ponder which phase it has reached.

Where is Big Data in the hype cycle?

There’s no simple answer to this question, as it rests entirely on the kind of application we’re talking about. If you’re talking about Big Data being used for commercial purposes, then a fair assessment is that companies like Google have already reached phase four, as this infographic demonstrates. This is also true of data-intensive sciences like astrophysics and genomics, where massive streams of data are helping scientists to carry out all kinds of new research.

But these aren’t the only applications for Big Data – some believe that we can use it for social means too, helping to boost our understanding of society and improve public policies.

So where do social applications of Big Data sit on Gartner’s hype cycle right now? The evidence suggests that we’re still stuck at phase one, with massive expectations that are in all likelihood impossible to achieve.

Let’s blame Google for this. Google is responsible for all the negative hullabaloo over Big Data in the press recently. That’s because four years ago it claimed that, by analyzing people’s web searches, it could report more timely and accurate data about the spread of flu than the US Centers for Disease Control and Prevention could. Google Flu Trends consequently led to an overdose of speculation about other possible social applications of Big Data.

But Google Flu Trends turned out to be dead wrong. Recently, Nature reported that Google had gotten things very, very wrong, wildly overestimating the spread of flu within the United States.

As Tim Harford in the Financial Times explains:

“Not only was Google Flu Trends quick, accurate and cheap, it was theory-free. Google’s engineers didn’t bother to develop a hypothesis about what search terms — ‘flu symptoms’ or ‘pharmacies near me’ — might be correlated with the spread of the disease itself. The Google team just took their top 50 million search terms and let the algorithms do the work.”

“After reliably providing a swift and accurate account of flu outbreaks for several winters, the theory-free, data-rich model had lost its nose for where flu was going. Google’s model pointed to a severe outbreak, but when the slow-and-steady data from the (U.S. government center) arrived, they showed that Google’s estimates of the spread of flu-like illnesses were overstated by almost a factor of two.”

And so it turns out that Google really doesn’t know anything about the spread of flu. All it knows is how to correlate search terms and estimate things, and those estimates are some way off the mark. What Google failed to acknowledge is that correlation and causation are two very different things, and it is the latter that remains our main basis for understanding.
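To see why a theory-free hunt for correlations can mislead, consider a rough sketch in Python. It is an illustration only and not Google’s method: the "search terms" here are pure random noise, and the function names and parameters (pearson, best_spurious_match, n_terms and so on) are invented for this example. The sketch picks whichever noise series best matches a random "flu" signal over a training period, then checks how well that same series tracks the signal afterwards.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_spurious_match(n_terms=2000, n_train=20, n_test=20, seed=0):
    """Pick the random series that best matches a random target in training.

    Returns (training correlation, later correlation) for the winning series.
    Every series is independent noise, so any match is a coincidence.
    """
    rng = random.Random(seed)
    total = n_train + n_test
    # Hypothetical "flu activity" signal: noise with no real structure.
    target = [rng.gauss(0, 1) for _ in range(total)]

    best_train, best_test = 0.0, 0.0
    for _ in range(n_terms):
        # Hypothetical "search term" volume: also pure noise.
        series = [rng.gauss(0, 1) for _ in range(total)]
        r_train = pearson(series[:n_train], target[:n_train])
        if abs(r_train) > abs(best_train):
            best_train = r_train
            best_test = pearson(series[n_train:], target[n_train:])
    return best_train, best_test


if __name__ == "__main__":
    r_train, r_test = best_spurious_match()
    print(f"best training correlation: {r_train:+.2f}")
    print(f"same series, later period: {r_test:+.2f}")
```

With thousands of candidates, at least one noise series will usually look strongly correlated during training purely by chance, while its correlation in the later period has no reason to hold up. The more candidate terms there are, the easier such coincidences become to find.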

Big Data advocates won’t be too troubled by this, and in many applications it really doesn’t matter. As far as commerce goes, correlation is all you need to know. For example, if you read SiliconANGLE every day, Google will most likely guess you’re interested in technology and IT, and show appropriate advertisements to you.

But for those who’re trying to use this kind of data science to better understand people and our society, it looks as though the “trough of disillusionment” is just around the corner.

photo credit: Nathan Marciniak via photopin cc
