Microsoft open-sources one of the core algorithms powering Bing
Microsoft Corp. today open-sourced one of the cornerstone algorithms powering its Bing search engine in an effort to help developers build faster, more easily navigable applications.
The Space Partition Tree And Graph algorithm, or SPTAG for short, is available under the permissive MIT License. Microsoft has bundled it into a library that includes tools to help developers incorporate the code into their projects.
SPTAG is what allows Bing to instantly display relevant search results even when a user enters a query that can’t be processed by simply matching keywords to web pages. Looking up the phrase “largest lake in the United States,” for instance, brings up a panel with information about Lake Superior even though the query and the answer share only one word.
SPTAG makes that possible by working with queries that have been transformed into data constructs known as vectors. A vector is essentially a long sequence of numbers that can encapsulate various kinds of information, from individual words to entire web pages.
Translating different records into a common numerical format has the benefit of allowing them to be compared more easily. The vector for the phrase “largest lake in the United States” will share similarities with, among others, the vector that Bing generates from the text of the Wikipedia page “List of largest lakes of the United States by area.” And that Wikipedia page has Lake Superior at the top of the ranking.
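To make the comparison concrete, here is a minimal Python sketch of one common way to score how similar two vectors are, cosine similarity. The article does not say which similarity measure Bing uses, so the choice of measure is an assumption, and the four-number vectors and page names are invented for illustration (real vectors are far longer).

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors, made up for this example only.
query = [0.9, 0.8, 0.1, 0.0]         # "largest lake in the United States"
lakes_page = [0.8, 0.9, 0.2, 0.1]    # a page about the largest U.S. lakes
recipes_page = [0.0, 0.1, 0.9, 0.8]  # an unrelated page

score_lakes = cosine_similarity(query, lakes_page)
score_recipes = cosine_similarity(query, recipes_page)

# The page whose vector points in nearly the same direction scores higher.
print(score_lakes > score_recipes)
```

A score near 1 means the two vectors point in almost the same direction, while a score near 0 means they have little in common.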
Bing groups the vectors representing web content based on similarity to speed up searches. “Once the numerical point has been assigned to a piece of data, vectors can be arranged, or mapped, with close numbers placed in proximity to one another to represent similarity. These proximal results get displayed to users, improving search outcomes,” Microsoft detailed in a blog post.
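The speedup from grouping can be sketched in a few lines of Python. This is a deliberately simplified illustration, not SPTAG's actual tree-and-graph method: vectors are assigned to the nearest of a few fixed group centers, and a query is compared only against the vectors in its closest group. The data, the group centers and the function names are all hypothetical.

```python
import math

def nearest_center(vec, centers):
    """Return the index of the group center closest to vec."""
    return min(range(len(centers)), key=lambda c: math.dist(vec, centers[c]))

def build_groups(vectors, centers):
    """Assign each vector's index to the group of its nearest center."""
    groups = {i: [] for i in range(len(centers))}
    for idx, vec in enumerate(vectors):
        groups[nearest_center(vec, centers)].append(idx)
    return groups

def search(query, vectors, centers, groups):
    """Return the index of the closest vector inside the query's nearest group."""
    # Fall back to scanning everything if the nearest group happens to be empty.
    candidates = groups[nearest_center(query, centers)] or range(len(vectors))
    return min(candidates, key=lambda i: math.dist(query, vectors[i]))

# Hypothetical two-dimensional data with two obvious clusters.
vectors = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
centers = [(0.0, 0.0), (1.0, 1.0)]

groups = build_groups(vectors, centers)
result = search((0.85, 0.8), vectors, centers, groups)
print(groups, result)
```

Only the vectors in one group are compared against the query, which is what saves time. The trade-off is that the answer is approximate: a slightly closer vector sitting just across a group boundary can be missed.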
According to the company, SPTAG enables Bing to sift through billions of pieces of data in just a few milliseconds. The search engine has access to a repository of more than 150 billion vectors that is continuously expanded with new content from the web.
One obvious application for SPTAG is improving the search experience for users of collaboration services, email clients and other text-heavy applications. But the algorithm is not limited to processing written content. SPTAG can also work with vectors generated from images and audio files, which means developers can use it to build advanced capabilities such as automated photo comparison.
SPTAG is available on GitHub.
Photo: Pixabay