UPDATED 11:11 EDT / NOVEMBER 04 2011

5 Big Data Tools Built On Hadoop

Yesterday I looked at several of the alternatives to Apache Hadoop that are coming from companies like HPCC Systems, Twitter and Microsoft. These projects differentiate themselves from Hadoop by providing a more robust set of integrated tools and/or more accessible ways of performing analysis. But Hadoop has a large ecosystem of its own, with many projects built on top of it. These projects plug many of the same holes that the Hadoop alternatives try to fill.

Apache Mahout

Apache Mahout is a Java library of machine learning and data mining algorithms, many of which (but not all) are designed to run on Hadoop. The algorithms are designed to be highly scalable, a requirement for doing data mining on big data sets distributed across Hadoop clusters. They are categorized into four main use cases: recommendation mining, clustering, classification and frequent itemset mining.
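To make one of those use cases concrete, here is a toy sketch of frequent itemset mining in plain Python. It is not Mahout's API and it runs on a single machine; the function name `frequent_pairs`, the `min_support` parameter and the sample baskets are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return item pairs that occur together in at least min_support baskets.

    Hypothetical helper for illustration only; not part of Mahout.
    """
    pair_counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so ("bread", "milk") and ("milk", "bread")
        # are counted as the same pair.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Made-up shopping baskets.
baskets = [
    ["milk", "bread"],
    ["milk", "bread", "eggs"],
    ["eggs", "bread"],
]
print(frequent_pairs(baskets, min_support=2))
```

On a real cluster the counting step would be spread across many machines, which is the kind of scaling problem the Hadoop-based algorithms are meant to handle.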

GoldenOrb

GoldenOrb is an open source graph database built on Hadoop and based on Google’s Pregel paper. It’s a fitting extension to Hadoop, since Hadoop and related projects such as HBase are based on Google’s MapReduce, Google File System and BigTable papers. The project is sponsored by Ravel.

A graph database is designed to explore the network of relationships between items in a database, such as the relationships between people in a social network. GoldenOrb is in early development now, but could eventually be used for social graph analysis, data mining, fraud detection and more.
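The Pregel paper describes graph computation as a series of "supersteps" in which each vertex reads messages from its neighbors, updates its own value and sends new messages. A rough single-machine sketch of that idea in Python follows. It is not GoldenOrb's API; the function name `propagate_max` and the sample graph are hypothetical, and the task shown (spreading the largest value through the graph) is just a simple stand-in for real graph analysis.

```python
def propagate_max(graph, values):
    """Toy Pregel-style computation (illustrative only, not GoldenOrb code).

    graph  -- dict mapping each vertex to a list of neighbouring vertices;
              every neighbour must itself be a key in the dict
    values -- dict mapping each vertex to its starting number

    Returns a dict in which each vertex holds the largest value that
    reached it through the graph.
    """
    values = dict(values)  # work on a copy, leave the caller's dict alone

    # First superstep: every vertex sends its value to its neighbours.
    inbox = {v: [] for v in graph}
    for vertex, neighbours in graph.items():
        for n in neighbours:
            inbox[n].append(values[vertex])

    # Later supersteps: a vertex only speaks up again if a message
    # raised its value. The loop ends when no messages are in flight.
    while any(inbox.values()):
        outbox = {v: [] for v in graph}
        for vertex, messages in inbox.items():
            if messages and max(messages) > values[vertex]:
                values[vertex] = max(messages)
                for n in graph[vertex]:
                    outbox[n].append(values[vertex])
        inbox = outbox
    return values


# Made-up three-person network: a <-> b <-> c
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(propagate_max(graph, {"a": 3, "b": 6, "c": 2}))
```

In a distributed system, the vertices would be partitioned across the machines of a cluster and the messages would travel over the network between supersteps.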

Datameer Analytics Solution

Datameer Analytics Solution is a business intelligence and data visualization application built on Apache Hadoop. It’s one of several products that are attempting to make Hadoop more easily accessible to non-developers (see also Karmasphere). Datameer provides wizards for setting up data integrations and a spreadsheet-style interface for working with data and creating visualizations. It supports multiple Hadoop distributions, including those from Cloudera and MapR.

WibiData

I wrote about WibiData from Odiago yesterday. It’s a data management and analytics product from a new startup launched by a co-founder of Cloudera.

HStreaming

One of Hadoop’s noted weaknesses is its lack of support for real-time analytics. Hadoop is engineered to do finite batch jobs, not never-ending jobs on ever-changing data. HStreaming is one of a few projects that address this. HStreaming offers an on-premises Enterprise Edition and a Cloud Edition that runs on Amazon Web Services.
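The difference between a finite batch job and a never-ending one can be sketched in a few lines of Python. This is a toy contrast, not HStreaming or Hadoop code; the function names `batch_count` and `streaming_count` and the sample events are assumptions made for illustration.

```python
from collections import Counter


def batch_count(records):
    """Batch style: needs the complete, finite data set before it answers."""
    return Counter(records)


def streaming_count(stream):
    """Streaming style: yields an updated snapshot after every record,
    so it also works on input that never ends."""
    counts = Counter()
    for record in stream:
        counts[record] += 1
        yield dict(counts)


# Made-up event log.
events = ["login", "click", "login"]

# The batch job gives one answer, after all the data has been read.
print(batch_count(events))

# The streaming job gives a fresh answer as each event arrives.
for snapshot in streaming_count(events):
    print(snapshot)
```

The batch version cannot say anything until its input is exhausted, which is why it is a poor fit for data that keeps arriving.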

Services Angle

Doing big data analysis with Hadoop doesn’t end with the . The ecosystem of tools that build upon or extend Hadoop (such as Hive) and make it more accessible is Hadoop’s greatest strength, and something projects like HPCC Systems and Spark can’t yet match. Database, enterprise data warehouse and business intelligence companies are all tripping over themselves to provide integration with Hadoop, with even Microsoft and Oracle jumping in.

Next week the SiliconAngle team will be at the HadoopWorld event in New York City. It’s completely sold out, but we’ll be covering the action live on our online show theCube.

