UPDATED 06:02 EST / NOVEMBER 10 2011

Digital Reasoning Focuses on Pattern Recognition in Unstructured Data on Hadoop

Cloudera user Digital Reasoning is focused on developing ontologies for unstructured data, basically finding the patterns that allow that data to be analyzed, company CEO Tim Estes told SiliconAngle CEO and Founder John Furrier and Wikibon.org Chief Analyst David Vellante in an interview webcast live from HadoopWorld 2011 on SiliconAngle.tv.

The company, which has about 30 employees, started as a defense intelligence contractor and is now expanding into business analytics. Under the covers, Estes said, its technology is basically a clustering algorithm that establishes a context for a specific piece of data of interest. For instance, it will look at the context in which a particular word is used by finding similar usages, which it then arranges into a hierarchy. This carves a mass of unstructured data into specific blocks that can be counted and used as the basis for analysis. So if the word is “toothbrush,” it might look at how often that word appears with “morning” or “toothpaste” to establish patterns of when, or with what, a toothbrush might be used.
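The counting step in the toothbrush example can be illustrated with a few lines of Python. This is only a minimal sketch of sentence-level co-occurrence counting, not Digital Reasoning's actual algorithm, which the interview does not detail; the function name `cooccurrence_counts` and the toy corpus are invented for illustration, and the hierarchy-building step is left out.

```python
from collections import Counter
import re


def cooccurrence_counts(sentences, target):
    """Count how many sentences each other word shares with `target`."""
    counts = Counter()
    for sentence in sentences:
        # Lowercase and split into a set of distinct words per sentence.
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        if target in words:
            counts.update(words - {target})
    return counts


# Toy corpus, invented purely for illustration.
corpus = [
    "I use my toothbrush every morning",
    "Put toothpaste on the toothbrush",
    "The toothbrush and toothpaste are by the sink",
    "She reads the paper every morning",
]

counts = cooccurrence_counts(corpus, "toothbrush")
print(counts.most_common(3))
```

On this toy corpus, “toothpaste” shares two sentences with “toothbrush” and “morning” shares one. Counts like these are the kind of raw signal a clustering step could then group into related contexts.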

This, said Digital Reasoning President and COO Rob Metcalf, can be applied to finding patterns that can be useful to different kinds of businesses. “Customers have large amounts of clustered data, and they are trying to identify actors.” In government that might mean identifying potential bad actors based on word patterns in their communications. Financial traders might look for patterns that indicate when a stock or currency is likely to rise or fall in value. Law firms might use the technology to determine who knew what facts when by examining emails and other written communications. In public health it could be used to identify patterns of disease by examining huge numbers of medical records. This could be helpful, for instance, in identifying disease outbreaks early or in planning staffing for a hospital.

So far Digital Reasoning has been focused on developing the core technology. “The year ahead of us is the year of developing applications,” Estes said. But the company is already seeing strong interest in the technology, and it expects to have no problem finding ways to apply it to business needs.

“Basically the problem is that with the explosion of data we can no longer afford to apply humans to understanding all the material involved,” he said. “A decade ago you could search and read the top 1% of results on a subject. Now you may be able to look at the top 0.5% or 0.1% of the references a simple Google search may turn up. It gets really bad when you can only look at the top 0.01%. And the result is you have no confidence that the conclusions you reach are legitimate. We let you apply machine intelligence to analyze very large amounts of data to produce much more accurate conclusions.”

