UPDATED 22:01 EDT / OCTOBER 10 2011

Datafiniti Launches “Google for Data Sets”

No doubt, there’s an endless amount of information on the web.  Some is worthwhile, most isn’t.  But how to glean useful content from the web is a question nearly as old as the Internet itself.  Google revolutionized web content search with its algorithms and PageRank, and now Datafiniti is looking to revolutionize data in a similar manner: with search.

Launching one of the first data search engines, Datafiniti emerges from private beta to offer the public a tool for hunting down just the right data sets.  You can compose custom data sets on the fly using web-based structured search queries, with the process simplified around a search bar that’s nearly as minimalistic as Google’s.  The project comes from 80legs, the powerful web crawler that lets you write web-based apps to create data sets around specific content sources.  Datafiniti aims to close the gap between 80legs and the future of data markets, delving deep to contextualize data harvested from across the web.

An example would be a real estate agent creating a data set for properties below the average listing price, drilled down to a given neighborhood.  Datafiniti delivers real-time queries based on these price- and location-sensitive parameters, offering you a ready-made data set that begins to answer questions for you.
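To make the real estate example concrete, here is a minimal sketch of that kind of query run over a handful of records.  It does not use Datafiniti’s API or data; the listings, field names and the `below_average` helper are all hypothetical, and it assumes “average” means the mean listing price within the chosen neighborhood.

```python
from statistics import mean

# Hypothetical sample listings; not Datafiniti data or its API.
listings = [
    {"address": "12 Oak St", "neighborhood": "Montrose", "price": 310_000},
    {"address": "48 Elm Ave", "neighborhood": "Montrose", "price": 450_000},
    {"address": "7 Pine Ct", "neighborhood": "Montrose", "price": 275_000},
    {"address": "90 Main St", "neighborhood": "Midtown", "price": 200_000},
]


def below_average(listings, neighborhood):
    """Return listings in `neighborhood` priced below that neighborhood's mean."""
    # Drill down to the chosen neighborhood first.
    local = [item for item in listings if item["neighborhood"] == neighborhood]
    if not local:
        return []
    # Assumption: the benchmark is the mean price within the neighborhood.
    avg = mean(item["price"] for item in local)
    return [item for item in local if item["price"] < avg]


bargains = below_average(listings, "Montrose")
for item in bargains:
    print(item["address"], item["price"])
```

A search engine for data would presumably run the same price and location filter across records gathered from the web rather than over a hand-built list.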

“Datafiniti is a very ambitious project,” said Shion Deysarkar, founder and CEO of Datafiniti. “We’re challenging how people think about data access. Until now, a few providers had all the data, and all the fun. If you care about data, which generally means that you’re a developer or marketer, all you have to do now is search. It’s that simple. Whereas Google is a search engine that returns sets of web pages, we’re a search engine that returns sets of data.”

Ambitious may be an understatement, given the expanse Datafiniti is stretching across here.  The team has faced obstacles in gathering, processing and curating data sets and, in keeping with the mantra established with its 80legs effort, democratizing data tools means Datafiniti must also scale for a wide range of customers.  Datafiniti draws on its years of experience with 80legs to address several industry pain points, and it offers an API to extend the capabilities of its core technology.

The only hope Datafiniti has for such an ambitious project is its revolutionary stance on data models.  Where the current industry trend is to set up directories for data sets, Datafiniti starts with search, treating the entire web as a single interface.  “I think the [directory] mentality will slowly fade as people get more familiar with the data that’s available,” says Deysarkar.  “This provides a lot more power to the user to get the data they want.”  The resulting product lets market researchers start with their needs, instead of having to know where the data is before setting out on a directory search.

Datafiniti is also looking to scale to the enterprise, identifying these customers’ needs around emerging data trends.  So far Datafiniti is focusing on location, social and product as the top three areas in which customers will require data sets, with goals to journey into the long tail.  Built on a combination of in-house resources and open source Cassandra, Datafiniti takes advantage of what’s going on in the open source community while remaining competitive with its own proprietary platform.  With the 80legs initiative as a foundation for its take on IT consumerization, Datafiniti is launching with a good deal of support.

“We’ve been serving data through 80legs for a couple years now and some of the current data players are treating the market as a new thing, but it’s been going on for years,” says Deysarkar.  “We’re leveraging a lot of knowledge from our team’s past lives, use them as a crystal ball and sidestep some of the issues we might have faced otherwise.  We think that some of the newer players aren’t focusing on the high value customers out there.  Certain businesses aren’t being served and we’re in a better position to serve them because we have more valuable attributes in our data.”
