UPDATED 23:09 EDT / SEPTEMBER 05 2018

BIG DATA

Google has created a new search engine for finding useful data

Google LLC today launched a new kind of search engine for academic researchers and journalists that’s designed to help them find the data they need more easily.

Dataset Search provides an easy way to access “millions of datasets” held in thousands of repositories scattered across the internet. It’s currently in beta and is free for anyone to use, but Google research scientist Natasha Noy, who helped to build the tool, emphasized the particular benefits for data scientists and journalists.

“In today’s world, scientists in many disciplines and a growing number of journalists live and breathe data,” Noy wrote in a blog post. “To enable easy access to this data, we launched Dataset Search, so that scientists, data journalists, data geeks, or anyone else can find the data required for their work and their stories, or simply to satisfy their intellectual curiosity.”

Detailing the need for Dataset Search, Noy said data of this nature is often difficult to find because it’s spread across numerous individual research websites such as those of the National Oceanic and Atmospheric Administration and the National Aeronautics and Space Administration, as well as “data-driven” news sites such as ProPublica. Most people don’t even know where to begin looking for the data they need, so Google Dataset Search provides an easier way to search all of these resources from the same place.

Dataset Search surfaces results from publishers’ sites, digital libraries and authors’ personal web pages, Noy said.

The tool relies on a new schema markup for publishers of datasets that Google rolled out in July. Called “Dataset markup,” it lets publishers describe their data in a way that Google can understand and index properly, so the data can be found through the company’s search tools.
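As a rough illustration of what such markup can look like, the sketch below builds a minimal dataset description using the schema.org Dataset vocabulary, serialized as JSON-LD for embedding in a web page. The helper names, the example dataset, its URL and the publisher name are all hypothetical, and the handful of properties shown is an assumption about a sensible minimum rather than Google’s documented requirements.

```python
import json


def build_dataset_markup(name, description, url, license_url=None, creator=None):
    """Return a schema.org Dataset description as a JSON-LD string.

    Hypothetical helper for illustration; the properties chosen here are
    an assumed minimum, not an official list of required fields.
    """
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }
    # Optional properties are added only when the publisher supplies them.
    if license_url:
        record["license"] = license_url
    if creator:
        record["creator"] = {"@type": "Organization", "name": creator}
    return json.dumps(record, indent=2)


def wrap_in_script_tag(json_ld):
    """Wrap a JSON-LD string in the script tag used to embed it in HTML."""
    return '<script type="application/ld+json">\n' + json_ld + "\n</script>"


# Hypothetical dataset; none of these values come from a real publisher.
markup = build_dataset_markup(
    name="Example coastal temperature readings",
    description="Hypothetical daily sea-surface temperature measurements.",
    url="https://example.org/datasets/coastal-temperature",
    creator="Example Research Lab",
)
print(wrap_in_script_tag(markup))
```

A publisher would place the resulting script block in the HTML of the page that hosts the dataset, where a crawler can read it alongside the human-readable content.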

Google is encouraging dataset providers to adopt its new schema markup so their content appears in Dataset Search, and the response seems to have been positive. An article on Nature.com noted that numerous universities have already begun standardizing their metadata for inclusion in Google’s search results.

Noy said that for now, Dataset Search is best suited for finding datasets in disciplines such as environmental sciences and social sciences, as well as data provided by governments and news organizations.

Image: Google
