UPDATED 16:45 EDT / MARCH 09 2012


The Week in Big Data: Citi Meets Watson, HBaseCon Announced, SAS Loves Hadoop

With all the media attention being directed towards Big Data these days, it can be hard to keep up with the latest news and developments. That’s especially true for you Big Data practitioners out there who have your heads down writing code and crunching data all day.

To help you keep pace with what’s going on in the Big Data community, I thought it would be helpful to provide a rundown of the week’s Big Data news in brief. So beginning today and continuing each Friday (or as close to it as I can manage), I’ll provide a summary of The Week in Big Data, complete with links to further reading and analysis. I hope you find these updates helpful.

Without further ado, here’s your first roundup of Big Data news, covering the second week of March 2012:

MapR and Informatica forged a technology partnership this week that the two claim brings real-time streaming capabilities to Hadoop. As part of the deal, MapR customers can use Informatica’s Ultra Messaging platform to load data into MapR’s Hadoop distribution in near real-time. If the partnership lives up to its claims, this could be a major differentiator for MapR over archrival Cloudera.

In other partnership news, Attivio and Tableau are teaming up to bring Tableau’s data visualization capabilities to the Attivio Active Intelligence Engine. Attivio AIE allows developers to build applications that can access both structured data and unstructured content like emails, documents and other text-based files for analysis.

Citigroup is tapping IBM Watson to explore “the use of deep content analysis and evidence based learning capabilities” to “help advance customer interactions, and improve and simplify the banking experience.” Watson is IBM’s artificial intelligence system designed to answer natural language queries. Hadoop is one of the underlying technologies supporting Watson.

Cloudera announced it is putting on a new conference focused on all things HBase, the open source NoSQL database popular with Hadoop practitioners for near real-time lookups. The goal of HBaseCon 2012, which will take place in San Francisco on May 22, is “to give the attendees a platform for connecting with their peers and sharing experiences with this technology and to ultimately advance the development of HBase.”

SAS added Hadoop to its roster of data sources. The Hadoop connector came in the form of an update to SAS Enterprise Data Integration Server, its ETL tool. SAS said it added the connector in response to demand from customers, more and more of whom are using Hadoop to crunch massive multi-structured data sets.

The HA NameNode project announced significant progress this week in addressing Hadoop’s Achilles heel – the NameNode as a single point of failure. The goal of the project is to deliver two fully functioning NameNodes – an active NameNode and a passive NameNode – within Hadoop clusters to provide hot backup capabilities. HA NameNode capabilities are now baked into the beta version of Cloudera’s CDH4, with more improvements coming soon.

