UPDATED 11:32 EDT / MARCH 05 2012

NEWS

Why “Science” is an Important Word in the World of Big Data

Some observations from conference organizers Alistair Croll and Edd Dumbill, who appeared on theCube last week at the Strata Conference:

Once users start working with Hadoop, it is often the unexpected findings that create value, which is why the word “scientist” is important. Today’s big data tools require you to explore. That is true of Hadoop as an analytics engine with MapReduce, and it is just as true of data visualization and of harnessing data streams to create algorithms.



But this also poses a threat to people with domain expertise. In an interview with John Furrier, Croll said he had conversations at Strata about the idea that domain expertise gets trumped by data. He cited Moneyball, the book and film about the Oakland A’s and the data jock who helped the team discover a group of players who went on to give it a fabulous season. The baseball scouts represented the domain experts. They had an institutionalized system for picking players and had already made up their minds about what they wanted. The data jock had no such preconceptions. He used data to find the best players, and the results spoke for themselves.

Services Angle

The players in the big data game get this concept of discovery. I’d say every student of the Web gets it, too. Have you ever built a blog? It’s a discovery process. Feed readers? When they were introduced about ten years ago, they opened a window into a new world. Data became discoverable in new ways, and people began to use those feeds to make new apps. Today we have millions of available feeds, plus data streams from sensors and machines, that developers experiment with to create new apps. APIs have matured to the point that they are becoming gateways for commerce.

And now we have an ecosystem emerging. HBase is gaining acceptance as a database that runs on top of Hadoop and is modeled on Google’s Bigtable. Pig provides a high-level language, Pig Latin, for writing data flows that run on top of Hadoop.

These new tools help fuel new discovery. And that means lots of change for the services market as it adapts to the new world of data science. CIOs would do well to bring in people with fresh eyes to existing problems. Service providers that can bring new perspectives to old domains will be the ones to watch.

