UPDATED 22:01 EDT / OCTOBER 10 2011

Datafiniti Launches “Google for Data Sets”

No doubt, there’s an endless amount of information on the web. Some is worthwhile, most isn’t. But how to glean useful content from the web is a question nearly as old as the Internet itself. Google revolutionized web content search with its algorithms and PageRank, and now Datafiniti is looking to revolutionize data in a similar manner: with search.

Launching one of the first data search engines of its kind, Datafiniti is emerging from private beta to offer the public a tool for hunting down just the right data sets. You can compose custom data sets on the fly using web-based structured search queries, with the whole process built around a search bar that’s nearly as minimalistic as Google’s. The project comes from 80legs, the web crawler that lets you write web-based apps to create data sets around specific content sources. Datafiniti aims to close the gap between 80legs and the future of data markets, delving deep to contextualize data harvested from across the web.

An example would be a real estate agent creating a data set of properties priced below the average listing price, drilled down to a given neighborhood. Datafiniti answers queries in real time based on these price- and location-sensitive parameters, offering you a ready-made data set that begins to answer questions for you.
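To make the real estate example concrete, here is a minimal Python sketch of the kind of filter such a query implies. It is not Datafiniti’s API or query syntax, which this article does not describe. The listings, field names and the `below_average` helper are all hypothetical, and the sketch assumes “average” means the average within the chosen neighborhood.

```python
from statistics import mean

# Hypothetical listings: every address, neighborhood and price here is
# made up for illustration and does not come from Datafiniti.
listings = [
    {"address": "12 Oak St", "neighborhood": "Midtown", "price": 250_000},
    {"address": "48 Elm Ave", "neighborhood": "Midtown", "price": 310_000},
    {"address": "7 Pine Ct", "neighborhood": "Midtown", "price": 400_000},
    {"address": "3 Lake Rd", "neighborhood": "Heights", "price": 180_000},
]


def below_average(records, neighborhood):
    """Return listings in `neighborhood` priced below that neighborhood's average.

    Assumption: the average is computed over the neighborhood only,
    not over the whole data set.
    """
    local = [r for r in records if r["neighborhood"] == neighborhood]
    if not local:
        return []  # no listings for this neighborhood
    average_price = mean(r["price"] for r in local)
    return [r for r in local if r["price"] < average_price]


result = below_average(listings, "Midtown")
for row in result:
    print(row["address"], row["price"])
```

The point of a data search engine is that the agent would express this as a search rather than write and run the filter by hand.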

“Datafiniti is a very ambitious project,” said Shion Deysarkar, founder and CEO of Datafiniti. “We’re challenging how people think about data access. Until now, a few providers had all the data, and all the fun. If you care about data, which generally means that you’re a developer or marketer, all you have to do now is search. It’s that simple. Whereas Google is a search engine that returns sets of web pages, we’re a search engine that returns sets of data.”

Ambitious may be an understatement, given how much ground Datafiniti is trying to cover. The team has faced obstacles in gathering, processing and curating data sets. And in keeping with the mantra established with its 80legs effort, democratizing data tools means Datafiniti must also scale for a wide range of customers. Datafiniti draws on its years of experience with 80legs to address several industry pain points, and it offers an API to extend the capabilities of its core technology.

The only hope Datafiniti has for such an ambitious project is its revolutionary stance on data models. Where the current industry trend is to set up directories of data sets, Datafiniti starts with search, treating the entire web as a single interface. “I think the [directory] mentality will slowly fade as people get more familiar with the data that’s available,” says Deysarkar. “This provides a lot more power to the user to get the data they want.” The resulting product lets market researchers start with their needs, instead of having to know where the data is before setting out on a directory search.

Datafiniti is also looking to scale to the enterprise, identifying these customers’ needs around emerging data trends. So far Datafiniti is focusing on location, social and product data as the top three areas in which customers will require data sets, with the goal of eventually moving into the long tail. Built on a combination of in-house and open source Cassandra resources, Datafiniti takes advantage of what’s going on in the open source community while remaining competitive with its own proprietary platform. With the 80legs initiative as a foundation for its take on IT consumerization, Datafiniti is launching with a good deal of support.

“We’ve been serving data through 80legs for a couple years now and some of the current data players are treating the market as a new thing, but it’s been going on for years,” says Deysarkar.  “We’re leveraging a lot of knowledge from our team’s past lives, use them as a crystal ball and sidestep some of the issues we might have faced otherwise.  We think that some of the newer players aren’t focusing on the high value customers out there.  Certain businesses aren’t being served and we’re in a better position to serve them because we have more valuable attributes in our data.”
