The new connector uses Hive to collect data across distributed nodes. That data is then pulled into Tableau for visualization.
Tableau’s data engine processes the information that is piped in from Hadoop. Data can be pulled in as an aggregate or as a subset for analysis.
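To make the aggregate-versus-subset distinction concrete, here is a minimal sketch of the two kinds of Hive queries a client might issue. It only builds query strings and does not connect to anything. The table name `web_logs`, the column `log_date`, and both helper functions are hypothetical names chosen for illustration; they are not taken from Tableau's connector or Cloudera's driver.

```python
from datetime import date

# Hypothetical table of Web log records stored in Hadoop and exposed through Hive.
TABLE = "web_logs"


def aggregate_query(table: str = TABLE) -> str:
    """Build a query that returns one summarized row per day (the aggregate case)."""
    return f"SELECT log_date, COUNT(*) AS hits FROM {table} GROUP BY log_date"


def subset_query(day: date, limit: int = 1000, table: str = TABLE) -> str:
    """Build a query that returns a limited slice of raw rows (the subset case)."""
    # isoformat() yields YYYY-MM-DD, and int() guards the LIMIT value.
    return (
        f"SELECT * FROM {table} "
        f"WHERE log_date = '{day.isoformat()}' LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    print(aggregate_query())
    print(subset_query(date(2011, 11, 1), limit=500))
```

In the aggregate case the cluster does the heavy summarization and only the totals travel to the client; in the subset case the client receives a bounded sample of detail rows to explore.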
Tableau’s native data connector will be generally available with Tableau 7.0, the next version of the product, which is due for release in winter 2012. Tableau customers working with Tableau 6.1.4 can get access to the beta Hadoop connector. Access is via Hive and the new Cloudera-developed Hive ODBC driver.
Vice President Dan Jewett says the solution is designed to abstract the complexity that comes with Hadoop. “We are trying to do rapid-fire business intelligence,” Jewett says.
Jewett says his team has focused on the front end of the application to make it as simple as possible for the business user to ask questions of the data.
Until now, experts have used the R programming language to turn big data into easy-to-understand visualizations, while data warehouse providers have used Tableau for its data visualization. The Tableau Hadoop connector will add to that capability. “This expands the footprint of who we reach,” Jewett says.
For more on Tableau’s vision, see my interview with chief scientist and co-founder Pat Hanrahan.
Tableau Software competes with companies that use the R programming language and with proprietary vendors such as Spotfire and QlikTech. Neither Spotfire nor QlikTech has a Hadoop connector yet, but this is fast becoming table stakes for data software providers of all types.
This is another sign of how Hadoop will go mainstream. It’s a different form of business intelligence, designed for large data sets that are analyzed over a period of time. It does not deliver real-time information; instead, it is designed for collecting large batches of data, such as Web logs.