UPDATED 11:38 EDT / SEPTEMBER 26 2012

Treasure Data Offers Big Data Analysis for the Rest of Us

SMBs, it is time to start thinking Big Data. Hadoop-based analysis of unstructured data, the Next Big Thing for large enterprises, has now been brought within reach of SMBs, as well as the lines of business (LOBs) and small divisions of large companies, by the Treasure Data Big Data Warehouse service. For those with the imagination to make good use of this opportunity, this could be a game-changer in a highly competitive marketplace.

So far Big Data has been the province of a few pioneers: first the big online service providers (Yahoo, Google, and Facebook) that developed Hadoop and the various Open Source tools surrounding it, and then some advanced services and companies. What makes Big Data revolutionary is that it is designed to analyze unstructured data, everything from e-mails, to information on whom you communicate with most on Facebook or Twitter, to machine-generated data in various types of logs.

Technologically astute companies are using this capability to answer new kinds of business questions. Retailers, for instance, are analyzing data from social media services to map the webs of connections among groups and determine which of their customers might influence others to buy their products. Big banks are using it to identify customers as they walk in the door, drawing in part on analysis of their tweets and Facebook postings. This analysis is sent wirelessly to tablets carried by service employees on the bank floor, who can then greet the customers by name and anticipate their needs. Cell phone carriers are using it to analyze their customer service systems and identify the problems that frustrate subscribers, so they can improve service, make users happier, and save money.

But so far Big Data has been the captive of companies with deep pockets. Building a Big Data cluster takes large amounts of cash, months of work, and expensive, rare skills in such areas as Hadoop and MapReduce.

The founders of Treasure Data, CEO Hiro Yoshikawa and CTO Kazuki Ohta, set out to fix these issues and bring Big Data to SMBs. Drawing on extensive experience, they identified several challenges. First they removed the basic cost issues by creating a Platform-as-a-Service (PaaS) that delivers access to the Hadoop cluster to customers' doorsteps over the Internet. Then they added an SQL layer through Hive, Hadoop's data warehouse system. This translates SQL queries into MapReduce jobs automatically, so users can use any standard business intelligence query tool for their analysis.
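To make the SQL-to-MapReduce idea concrete, here is a minimal Python sketch of how a simple aggregate query such as SELECT action, COUNT(*) FROM events GROUP BY action can be broken into map, shuffle, and reduce steps. It only illustrates the general technique; it is not Hive's or Treasure Data's actual implementation, and the record fields and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical event records; the field names are illustrative only.
RECORDS = [
    {"user": "alice", "action": "login"},
    {"user": "bob", "action": "purchase"},
    {"user": "alice", "action": "purchase"},
    {"user": "carol", "action": "login"},
]


def map_phase(record):
    """Emit one (key, 1) pair per record: the map side of GROUP BY action."""
    yield record["action"], 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the grouped values: the reduce side of COUNT(*)."""
    return key, sum(values)


def count_by_action(records):
    """Run the three phases in sequence on a single machine."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(count_by_action(RECORDS))
```

On a real cluster the map and reduce steps run in parallel across many machines; the point of an SQL layer is that the analyst writes only the one-line query and never sees these phases.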

They solved the complex issue of data capture by developing their own data capture and translation engine, TD-Agent, which automatically uploads any kind of data from any kind of data source into any data warehouse, including, of course, Treasure Data. All a user has to do is specify the data sources through a user interface designed for non-technical business people, then sit back and wait. Those sources can be internal or external and can hold structured as well as unstructured data. Because TD-Agent is designed for high-performance parallel batch loads to multiple concurrent targets, that wait is usually not long. It also maintains a continuous feed of new data to reduce subsequent load times, enabling near-real-time and event-based analysis.

To further streamline analysis, Treasure Data has developed its own columnar file format system to replace HDFS. This allows analysis tools to load only the columns that are relevant to the question rather than the entire data set, in many cases drastically cutting the time it takes to conduct an analysis.
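The benefit of a columnar layout can be sketched in a few lines of Python. The example below is a toy comparison, not Treasure Data's file format: the sample rows, field names, and functions are all hypothetical, and the "values read" counter simply stands in for the I/O a real system would save.

```python
# Hypothetical sample data; the field names are illustrative only.
ROWS = [
    {"user": "alice", "country": "US", "spend": 12.5},
    {"user": "bob", "country": "JP", "spend": 7.0},
    {"user": "carol", "country": "US", "spend": 3.25},
]


def to_columns(rows):
    """Pivot row-oriented records into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_spend_row_store(rows):
    """Sum 'spend' from a row layout; every field of every row is read."""
    values_read = 0
    total = 0.0
    for row in rows:
        values_read += len(row)
        total += row["spend"]
    return total, values_read


def total_spend_column_store(columns):
    """Sum 'spend' from a column layout; only that one column is read."""
    spend = columns["spend"]
    return sum(spend), len(spend)


if __name__ == "__main__":
    print(total_spend_row_store(ROWS))
    print(total_spend_column_store(to_columns(ROWS)))
```

Both functions return the same total, but the row layout touches nine values while the column layout touches three. The gap widens as tables gain more columns that a given question never uses.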

The critical measure of success, Yoshikawa says, is time-to-answer. Treasure Data has reduced that from months to days or even hours, so new users can reasonably expect to start receiving answers to their initial data queries in less than a week from the time they sign up and start loading data.

Treasure Data sells its services on the standard cloud model, using multi-tenancy and shared staffing to reduce the cost for users to a fraction of what even a traditional data warehouse such as Teradata would cost. It also allows users to combine the internal structured data that would be collected in that traditional RDBMS with unstructured data of any type, including logs of service calls and customer interactions with staff, customer e-mails and IMs, and relevant comments on social media sites.

And this is not just theory. Treasure Data is just coming out of stealth mode, but it already has 10 customers. Ironically these are mostly online game companies, for whom big data analysis is a core capability, and divisions of global enterprises. However, Yoshikawa says, Treasure Data is not working with the central IT groups of these large companies but rather directly with business users in LOBs or, he hints, individual retail outlets of at least one large retailer. Those customers are located in the United States, Japan, and the European Union, with a deal in progress with a company based in Turkey.

The good news for SMBs is that Treasure Data has removed the financial and technical barriers that have prevented any but the largest enterprises from using data warehousing and Big Data. The challenge is to determine how those new capabilities can best help your company to attain competitive advantage and prosper.

