UPDATED 11:37 EDT / APRIL 14 2011

Open Source, Support Services Key to Big Data Deployments

Big data is all about storing, processing and analyzing large volumes of distributed, unstructured data. It is also, at this point at least, all about open source.

Hadoop, the open source distributed computing framework for big data storage and processing developed by Doug Cutting at Yahoo!, is now part of the Apache Software Foundation community. So are HBase, a database that provides real-time read/write access to big data, and Hive, a Hadoop data warehouse platform with ETL capabilities.

That big data lends itself to the open source model is not surprising. It’s a complex and to some extent unproven (at least in the mainstream enterprise) technology area that commercial vendors are reluctant to throw big dollars at … yet. It’s not surprising either, then, that open source business intelligence (BI) vendors like Jaspersoft and Pentaho are jockeying for position to be the go-to, user-friendly front end to Hadoop installations.

I recently met with Mike Boyarski, director of product marketing at San Francisco-based Jaspersoft, to chat about his company’s moves into big data. Among them, Jaspersoft announced in January that its core open source BI platform could now natively connect to big data sources, including Hadoop installations, NoSQL databases and MPP data warehouses. The big data connectors were a no-brainer for Jaspersoft, Boyarski told me, as they weren’t particularly costly to build thanks to the company’s tight integration with fellow open source vendor Talend, maker of data integration tools.

With the new connections in place, users can do ad hoc report design directly against big data engines like HBase or MPP data warehouses like Vertica and Greenplum, Boyarski said. He said Jaspersoft’s goal is to bring big data scale-out capabilities to self-service BI. “We want to make it really simple, make it cost effective, and make it flexible,” Boyarski said.

Jaspersoft claims to be the most widely used BI platform in the world, with over 13 million downloads to date. The number of paying customers that rely on Jaspersoft is significantly lower, of course, with many of those being ISVs that embed Jaspersoft’s reporting and analytic capabilities into their own software platforms.

One problem for Jaspersoft is the quality of the support services it provides to its paying customers. Support is critical for ISV customers working on complex application integration projects, as well as big data deployments. Support is also one of the major value-adds of the open source model. Jaspersoft has taken its lumps when it comes to support services, most recently receiving low marks in Gartner’s BI Magic Quadrant report in January.

Boyarski said Jaspersoft is working to address support issues. “We have a lot of [support] information that we’re not really good at sharing with our customers even though we share our code,” Boyarski told me. The company is “learning to surface our tricks and tips via search tools” to help users find the support information they need on their own, he said. Boyarski also rightly pointed out that ISV customers are working on “non-trivial” projects that tend to be more complex than straightforward BI deployments.

Support services are an important part of any IT project, but they are likely to play an even more significant role in complicated big data projects, at least in the short term. The technologies that support big data, such as Hadoop and MapReduce, are complex and still evolving, and the onus is on open source vendors like Jaspersoft, Cloudera and others to provide guidance to their customers.

