UPDATED 15:48 EDT / JANUARY 23 2020


Google launches Dataset Search out of beta with new capabilities

After more than a year of testing, Google LLC today launched its Dataset Search service out of beta test mode with new capabilities aimed at enabling users to find information faster.

Dataset Search is a version of the company’s search engine designed specifically for browsing collections of scientific and technical information. Google has to date indexed close to 25 million datasets that span topics ranging from volcano activity to the social behaviors of puppies. The information comes from governments, universities and other organizations engaged in research activities.

Open-source data is playing an increasingly important role in the technology landscape amid the rapid spread of artificial intelligence. The more sophisticated the AI, the more training data it needs to crunch to become production-ready. A portal such as Dataset Search where AI developers can search records in a centralized manner has the potential to be a valuable tool for machine learning projects. 

Google is marking Dataset Search’s launch from beta with the introduction of new features meant to make the service even more useful. To start, the company claims it has “significantly improved” the quality of the descriptions for information repositories. There are also new filters that allow users to narrow down search results based on what kind of dataset they require.

“You can now filter the results based on the desired types of dataset that you want (e.g., tables, images, text), or whether the dataset is available for free from the provider,” Google research scientist Natasha Noy wrote in a blog post. “If a dataset is about a geographic area, you can see the map.”

Finally, the service is now accessible on mobile devices. Noy told The Verge that Google plans to continue improving Dataset Search by adding features to let users explore datasets “when they don’t necessarily know what they are looking for.”


AI developers are far from the only knowledge workers who can take advantage of the service in their projects. Dataset Search is used by several hundred thousand people worldwide, including academic researchers, business analysts and students.

The groundwork for the service was laid all the way back in 2011, when Google LLC, Yahoo! and Microsoft Corp. launched a joint open-source project called Schema.org. The companies set out to create a universal standard for formatting web pages that contain structured data such as research files. Schema.org has since been adopted by the majority of the world’s governments, along with numerous academic institutions, and Dataset Search employs the standard to index the records it serves up to users. 
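To give a rough sense of what Schema.org markup for a dataset can look like, here is a minimal Python sketch that builds a JSON-LD description using the standard's Dataset and DataDownload types. The dataset name, the example.org URLs and the helper functions are hypothetical and purely illustrative. The sketch does not show how Google's indexing works internally, and the properties a publisher actually needs may differ.

```python
import json


def build_dataset_markup(name, description, url, file_format, download_url, is_free=True):
    """Return a schema.org Dataset description as a JSON-LD string.

    Hypothetical helper for illustration; the property names come from the
    schema.org vocabulary, while the values are supplied by the caller.
    """
    markup = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        # Whether the provider offers the dataset at no cost.
        "isAccessibleForFree": is_free,
        # One downloadable representation of the dataset.
        "distribution": [
            {
                "@type": "DataDownload",
                "encodingFormat": file_format,
                "contentUrl": download_url,
            }
        ],
    }
    return json.dumps(markup, indent=2)


def wrap_in_script_tag(json_ld):
    """Wrap a JSON-LD string in the script element used to embed it in a web page."""
    return '<script type="application/ld+json">\n' + json_ld + "\n</script>"


if __name__ == "__main__":
    # All values below are made up for the example.
    json_ld = build_dataset_markup(
        name="Example ski resort snowfall records",
        description="Hypothetical daily snowfall measurements for illustration only.",
        url="https://example.org/datasets/snowfall",
        file_format="text/csv",
        download_url="https://example.org/datasets/snowfall.csv",
    )
    print(wrap_in_script_tag(json_ld))
```

A publisher would place the resulting script element in the HTML of the page that describes the dataset, where a crawler that understands Schema.org can read it.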

Image: Unsplash
