UPDATED 18:33 EDT / SEPTEMBER 08 2011

What is Big Data? 4 Definitions

The talk about big data can be deafening. But what is it really? I was looking at a post by Wikibon co-founder David Floyer, who wrote a comprehensive overview of the topic earlier this summer, and thought it might be worth looking at definitions from various experts and executives to provide some perspective.

Floyer writes:

Big data has the following characteristics:

    • Very large distributed aggregations of loosely structured data – often incomplete and inaccessible:
      • Petabytes/exabytes of data,
      • Millions/billions of people,
      • Billions/trillions of records,
      • Loosely-structured and often distributed data,
      • Flat schemas with few complex interrelationships,
      • Often involving time-stamped events,
      • Often made up of incomplete data,
      • Often including connections between data elements that must be probabilistically inferred
    • Applications that involved Big-data can be:
      • Transactional (e.g., Facebook, PhotoBox), or,
      • Analytic (e.g., ClickFox, Merced Applications).

Tim O’Reilly

Tim O’Reilly says he is reminded of the PC revolution and how it commoditized hardware. Open source then commoditized software. And now it’s the presence of large databases over the Internet that is causing the most significant disruptions. Those large interconnected databases are what give us the ability to check in on a service such as Foursquare or use an online map when driving somewhere.

Consumers now expect this kind of information. The phone will only intensify this demand, which will force us to rethink such issues as privacy and identity.

In this excerpt from theCube, O’Reilly gives an example about Apple that illustrates the power of big data and how it makes the App Store Apple’s true killer app.


EMC CEO Joe Tucci

EMC CEO Joe Tucci talks about big data in the context of different industries, citing the geoseismic data collected by oil companies and the scale of information aggregated by health care companies.

Brian Hopkins, Forrester Analyst

Forrester’s Brian Hopkins describes big data as “techniques and technologies that make handling data at extreme scale economical.”

He uses the “four Vs” to give his simple definition some body, which is illustrated in the chart here on the right:

The point of this graphic is that if you just have high volume or velocity, then big data may not be appropriate. As characteristics accumulate, however, big data becomes attractive by way of cost. The two main drivers are volume and velocity, while variety and variability shift the curve. In other words, extreme scale is more economical, and more economical means more people do it, leading to more solutions, etc.

Services Angle

Big data is a term used so loosely that it’s imperative to get multiple perspectives on what it means. For me, I align most with O’Reilly and Floyer, who get to the heart of how it applies in our world. We are entering the era in which sensors are collecting data in our physical world and delivering it to networks that aggregate and analyze the information. Big data defines us and will increasingly dictate how we live in a fully interconnected world.

