Abstracting Data Science for the Everyday User


Unlike my cohorts who were in San Francisco for the circus that is Oracle OpenWorld, I spent two days last week in sunny San Diego, talking to the folks from Teradata, a company rooted in data analytics since its founding in 1979.

Teradata has a true data analytics offering. Oracle? I am still unsure about Oracle Exalytics. I don’t think anyone really knows what it can or will do. Larry Ellison covered the basics in his opening keynote last week, and observers at OpenWorld did a bit of myth busting.

The contrast between Teradata and Oracle highlights the growing competition in the data analytics market. Teradata also competes with IBM Netezza, HP Vertica, EMC Greenplum and SAP HANA.

But Teradata’s differentiator is Teradata Aster, which provides greater access to data analytics by abstracting the data science. This means a data analyst with basic database skills can perform work normally done by someone with much deeper technical skills.

Teradata Aster

Teradata Aster is not a consumer-oriented technology, but it does show how consumerization is having a cascading effect, making a wide spectrum of enterprise technologies more accessible. The trend has its roots in efforts to give greater access to technologies that had previously been the sole domain of programmers, statisticians and mathematicians.

Earlier this year, Teradata acquired Aster Data, which takes a different approach to big data than what we see from Oracle, EMC Greenplum, IBM Netezza or HP Vertica. Aster Data is an iterative engine that uses its processing power to drill down into data using MapReduce, with native SQL on the front end. People get access to data through SQL. That makes the discovery process accessible to a wider range of people, which is not necessarily the case with many data analytics technologies.

Teradata acquired Aster Data Systems last March for $263 million. Aster was founded in 2005 by a group of graduate students from Stanford University, who developed its technology on the premise that with such huge amounts of data, it’s really impossible to know what question to ask. That departed from the approach traditional data analytics technologies take to data mining. Instead, Aster used its MapReduce integration to break down data and then analyze it. SQL became the query language that any data analyst could use to look at the data quickly, discover patterns and draw insights from otherwise vast data blocks.

“Big data is about quick analytics,” said Teradata Aster CTO Tasso Argyros at the Teradata Partners User Conference. “It’s about fast fail. You want to quickly load data.”

Teradata has pre-configured MapReduce functions that can be queried through SQL, including modules for behavioral clickstream interpretation, marketing attribution and decision tree analysis, among others. It is deployed through software that scales analytics, not data. The data can be queried through robust analytics as opposed to pushing all the data into tables in a SQL database.
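To make the idea of a packaged MapReduce function more concrete, here is a minimal Python sketch of how one clickstream step, splitting each user's clicks into sessions, might be expressed as a map phase and a reduce phase. It is an illustration only, not Aster's code or API: the function names (map_clicks, reduce_sessions, sessionize), the 30-minute timeout and the sample data are all assumptions made for this example.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # assumed: 30 minutes of inactivity ends a session


def map_clicks(rows):
    """Map phase: emit (user_id, (timestamp, page)) pairs."""
    for user_id, timestamp, page in rows:
        yield user_id, (timestamp, page)


def reduce_sessions(user_id, clicks, timeout=SESSION_TIMEOUT):
    """Reduce phase: order one user's clicks by time and number the sessions."""
    session_id = 0
    previous = None
    for timestamp, page in sorted(clicks):
        if previous is not None and timestamp - previous > timeout:
            session_id += 1  # gap exceeded the timeout, so start a new session
        previous = timestamp
        yield user_id, session_id, timestamp, page


def sessionize(rows):
    """Group the mapped pairs by key (the shuffle step), then reduce each group."""
    groups = defaultdict(list)
    for key, value in map_clicks(rows):
        groups[key].append(value)
    result = []
    for user_id in sorted(groups):
        result.extend(reduce_sessions(user_id, groups[user_id]))
    return result


# Hypothetical sample data: (user_id, seconds since midnight, page)
clicks = [
    ("alice", 0, "/home"),
    ("bob", 100, "/home"),
    ("alice", 600, "/pricing"),
    ("alice", 4000, "/home"),
]
sessions = sessionize(clicks)
```

In a product of the kind described here, the analyst would presumably see only the SQL call that invokes such a function, not the map and reduce steps themselves.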

Teradata made a number of announcements at the Teradata Partners User Conference. Teradata Columnar is most relevant to our discussion. Users may query horizontal (row-based) or columnar data. It includes data compression, a hot topic these days. Here’s a link to a presentation.
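For readers unfamiliar with the row-versus-column distinction, the following Python sketch shows the same small table in both layouts and why a column often compresses well. It is a generic illustration using run-length encoding, chosen here for simplicity; nothing above says which compression scheme Teradata Columnar uses, and the table and function names are made up for the example.

```python
from itertools import groupby

# Hypothetical table stored row by row (the "horizontal" layout)
rows = [
    ("CA", "2011-10-01", 120),
    ("CA", "2011-10-01", 75),
    ("CA", "2011-10-02", 300),
    ("NY", "2011-10-02", 42),
]


def to_columns(rows):
    """Pivot a list of rows into one list per column."""
    return [list(column) for column in zip(*rows)]


def run_length_encode(values):
    """Collapse runs of repeated values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


states, dates, amounts = to_columns(rows)

# Repeated values sit next to each other in a column, so they collapse well.
encoded_states = run_length_encode(states)

# An aggregate only needs to read the one column it uses.
total = sum(amounts)
```

Values of the same type and similar content end up adjacent in a column, which is the general reason columnar layouts and compression are usually discussed together.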

Curt Monash does a better job than I ever could in explaining the technical differences between Teradata Columnar and competitors like EMC Greenplum. I recommend his review.

Services Angle

At the Strata conference last month, Teradata announced its Aster MapReduce appliance. It also announced Aster Database 5.0. Both are built using the Aster architecture.

Teradata Aster MapReduce is one of a growing number of data analytics appliances now in the market. The addition gives Teradata a potent combination to compete against Oracle, EMC and the rest.

What role do services play for Teradata Aster? Appliances and software alike still require consulting help. Teradata may be making things easier for data analysts, but the groundwork and setup are still real necessities that services companies can provide to enterprise customers.