UPDATED 08:15 EDT / MARCH 09 2013

“There is No Such Thing as Raw Data”

The days when you could view data as a static thing are over.  Kaput.  No mas. Peter Wang, Co-Founder & President of Continuum Analytics, was kind enough to join John Furrier and Dave Vellante on theCube last week at Strata to discuss predictive analytics, Big Data, scientific computing, and the movement of more and more analytical code to where the data is. Continuum Analytics, the premier provider of Python-based data analytics solutions and services, is a player we would bet big on in the Big Data space (full video below).

As Wang sees it, there has been a fundamental disruption in the storage and ETL end of the Big Data (business analytics) space. A push upmarket has caused a push back downmarket as all of the players jostle for position.  “The Big Data wave that’s coming is exceeding the disciplines for doing business analytics that most companies are used to,” Wang says.  Transformation (read: metadata) is turning Data Warehousing on its head.

Announced just prior to Strata 2013 was Continuum’s latest version of Anaconda, its premium collection of libraries for Python that includes NumbaPro, IOPro, and wiseRF all in one package. Anaconda enables big data management, analysis, and cross-platform visualization for business intelligence, scientific analysis, engineering, machine learning, and more. Here is a brief excerpt from that press release:

Available on Windows, Mac OS X and Linux, Anaconda includes more than 80 of the most popular numerical and scientific Python libraries used by scientists, engineers and data analysts, with a single integrated and flexible installer. It also allows for the mixing and matching of different versions of Python (including Python 3.3 on a 64-bit Linux installation), NumPy, SciPy, etc., and the ability to easily switch between environments.

Improvements to the latest version of Anaconda include:

  • The ability to build your own packages using conda
  • New versions of wiseRF Pine and NumbaPro
  • New, faster data adapters for Mongo database in IOPro
  • New versions of currently included packages, notably cython v0.17.4, pandas v0.10.1, llvm 3.2
  • New packages: cubes, ply, pyparsing, mpi4py (OSX), googlecl, gdata, biopython

Wang believes, and I would agree, that data at this point is a first-class concern.  “Data has hit mass now. When you have enough data, you can’t just willy-nilly move it around,” he says.  “You have to think about where did it come from, how am I going to view it, how do I want to transform it into those most useful views and do it in a way that doesn’t incur more data movement.”

The dilemma is a peculiar one: with data movement as a first-class concern, how do you best analyze data in memory?

Fluidity has found its way to Big Data.  Strike that. We’ve found that there is fluidity in Big Data, in all data. “The days you can view data as a static thing are over,” said Wang. He mentioned a quote he once heard, that there is no such thing as raw data. That is true by definition, and always will be: there is a sensor somewhere that collected the data in the first place.

So what does that mean? How closely do the worlds of Data Warehousing and Data Analysis need to merge? Is proprietary the new ‘last-year’ and open source the new ‘black’? Transformation, Co-Transformations … what is the first big step in Big Data?

I’d love to hear your thoughts in the comments. But mark this day in your calendar: the mobile revolution has centralized all of the data around our activities. Men lie, women lie, numbers don’t. Good luck to all of the men and women tackling the numbers.

See Wang’s entire segment below, and check here for our entire collection of exclusive interviews from Strata, Santa Clara 2013.

