UPDATED 06:29 EDT / OCTOBER 26 2012

Big Data, Big Insights. A Peek Into How IBM Sees Big Data

Wikibon analysts Dave Vellante and Jeff Kelly interviewed Jim Giles, a Distinguished Engineer for Big Data at IBM. Giles oversees development of two platforms, Big Insights and Streams. IBM is known for doing research and then turning it into products. Streams is an example: it was built in the IBM Research labs and then brought to market as a product with the company's software group. Analyzing data in motion, which is what Streams does, is a major part of IBM's Big Data platform. IBM started its Big Data platform initiative 10 years ago.

Dave Vellante sums it up best, “That’s what I love about IBM, you do some real research, that stuff actually hits the market. It’s not just research for research sake.”

Big Data is the copious amount of real-time and long-term data that we, as consumers, generate by using technology, anything from a cell phone to a television. But what is Big Insights? Jim Giles gave a very good description:

“One of the most important things to really talk about is the overall big data platform, and so whenever some people talk about big data they just think that means Hadoop or they just think it means whatever warehouse they happen to have. But really with IBM we’re taking all of these different pieces together. We always talk about volume, variety and velocity. And so we made sure we had the right components to solve each of those different types of problems. Whether it’s analyzing data in motion, whether it’s analyzing data at rest, and bringing all of those different pieces together. And so that’s really the strength of our IBM big data portfolio.”

This raises the question: where are customers in the big data process right now, in terms of maturity? Customers are coming in at different entry points.

Entry Point 1: Some clients have a business process they are working on right now that they simply can’t run the way they used to. These clients are looking for new technologies that they can bring to bear on that problem. They know what they want to do and are ready to use the technologies.

Entry Point 2: Other clients have a whole new problem. They are often just looking to kick the tires, familiarize themselves with the technologies, and see what new capabilities they can uncover.

Let’s dig into Big Insights more. It was started in the IBM Research labs in Almaden. At its root, Big Insights is a Hadoop stack with analytics and tools for business analysts, business executives, data scientists and developers built on top of it. With Big Insights you get Apache Hadoop, a whole set of analytics capabilities, workload optimizations, a rich set of developer tools, and a web console. It comes in two editions: Big Insights Basic and Big Insights Enterprise.

The value proposition for Big Insights is a combination of the simplicity of an integrated stack, and more importantly the deeper integration into the rest of IBM’s big data stack.


The security issues facing Hadoop and Big Insights are par for the course in big data, and the maturity of the technology has to evolve over time. One recurring worry is the number of ports that have to be opened to reach all of the different components. IBM’s strategy with Big Insights is to put a gateway in front of the cluster and expose a REST interface through the console. Another concern is auditing, meaning “who is accessing what data and when.” Big Insights has a tool that collects that activity and pushes it up to IBM’s Guardium tools for monitoring.

Big Insights has deep, two-way integration with all of IBM’s databases. That means moving data into and out of DB2 and Netezza using high-speed parallel readers and writers. Additionally, Cognos can now access data in a Big Insights cluster through a Hive connector.

Where does Jim see Big Data headed? “I think that ultimately what you’re going to see is the ability to execute standard SQL queries on top of a database, and you won’t even know that it happened, that some aspects of that will go off and be executed in the Hadoop environment. You’ll see extension where overflow data, instead of pushing data into a long-term archive, you might want to push it into something we’re calling a query-able archive. Still making it accessible, but not as accessible as it would have been if it was in your database. But not as inaccessible as if it’s been moved off to long-term storage. It’s going to become a complete seamless picture is where we think it’s headed.”

He also described a use case that is exciting him right now: ConocoPhillips is tracking icebergs in the Arctic to understand where they are relative to its oil production platforms, and whether it may need to move a platform or break up the ice. All of this is driven by satellite data in real time.
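To make the iceberg example a little more concrete, here is a minimal Python sketch of the kind of proximity check such a system might run on each incoming satellite sighting. It is an illustration only, not ConocoPhillips’ method or the Streams API: the names `Position`, `haversine_km` and `icebergs_near`, the coordinates, and the alert radius are all assumptions made up for this example. It uses the standard haversine formula for great-circle distance.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass(frozen=True)
class Position:
    """A latitude/longitude pair in decimal degrees (hypothetical type)."""
    lat: float
    lon: float


def haversine_km(a: Position, b: Position) -> float:
    """Great-circle distance between two positions, in kilometres."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlam = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def icebergs_near(
    platform: Position,
    sightings: Iterable[Tuple[str, Position]],
    alert_radius_km: float,
) -> Iterator[Tuple[str, float]]:
    """Yield (iceberg_id, distance_km) for each sighting inside the alert radius.

    Sightings are consumed one at a time, so the same function works on a
    list or on a generator that is fed from a live data stream.
    """
    for iceberg_id, position in sightings:
        distance = haversine_km(platform, position)
        if distance <= alert_radius_km:
            yield iceberg_id, distance


if __name__ == "__main__":
    # Made-up coordinates, purely for illustration.
    platform = Position(70.0, -150.0)
    sightings = [
        ("berg-A", Position(70.1, -150.0)),
        ("berg-B", Position(75.0, -150.0)),
    ]
    for iceberg_id, distance in icebergs_near(platform, sightings, alert_radius_km=50.0):
        print(f"{iceberg_id} is {distance:.1f} km from the platform")
```

Because the function handles one sighting at a time and keeps no state, it mirrors the data-in-motion idea: each record is evaluated as it arrives rather than being stored first and queried later.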

Big Data is going to be the buzzword, for those making serious moves, over the next 24 months. Accessing and interacting with data at the right place and at the right time is what is going to matter on the back end of every technology you use from this day forward.

