

Microsoft Corp. has perfected the art of playing catch-up with Amazon.com Inc. in the public cloud over the last few years. Less than 24 hours after its archrival debuted two new services for developers, the software powerhouse is firing back with a fully managed discovery engine aimed at another key constituency in the enterprise: business analysts.
The aptly named Azure Data Catalog does exactly what the name suggests, serving as an index of the various systems and services from which an organization draws its information. Each source needs to be added individually, but the process is largely automated thanks to an integration wizard that extracts the details needed to register it.
Those details include the type or types of information stored in a given repository, the number and even the names of the individual objects inside, and any other insights that can be gleaned from the metadata in that system. Working under the assumption that the user doing the cataloging is most likely familiar with the source, Microsoft also provides the option to add manual input such as tags during the extraction to help colleagues find the source more easily.
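To make the idea concrete, here is a minimal sketch of what such a catalog record and a tag-aware lookup might look like. It is not the Azure Data Catalog API: the `CatalogEntry` class, its fields, the `search` helper and the sample sources are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class CatalogEntry:
    """Hypothetical record for one registered data source (not a real Azure type)."""
    name: str                 # name of the source, e.g. a database
    source_type: str          # kind of system the data lives in
    object_names: List[str]   # names of the objects found inside the source
    tags: Set[str] = field(default_factory=set)  # manual input from the person cataloging

    @property
    def object_count(self) -> int:
        # The number of objects is derived from the extracted names.
        return len(self.object_names)


def search(entries: List[CatalogEntry], term: str) -> List[CatalogEntry]:
    """Return entries whose name, type, object names or tags contain the term."""
    term = term.lower()
    hits = []
    for entry in entries:
        haystack = [entry.name, entry.source_type, *entry.object_names, *entry.tags]
        if any(term in text.lower() for text in haystack):
            hits.append(entry)
    return hits


# Illustrative data only; these sources are invented for the example.
catalog = [
    CatalogEntry("sales_db", "SQL Server", ["orders", "customers"], {"finance"}),
    CatalogEntry("clickstream", "Blob storage", ["events"], {"marketing"}),
]
finance_sources = search(catalog, "finance")
```

In this sketch the manually added tag is what lets a colleague find `sales_db` by searching for "finance", even though that word appears nowhere in the extracted metadata.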
That simplified discovery is essential in the large organizations that the company is targeting with its public cloud, where hundreds of systems can be scattered across different divisions and locations. Factoring in the growing number of cloud services finding their way into the workplace pushes that figure even higher, making the need for a straightforward way of finding data outright urgent.
But Microsoft isn’t the first to have identified the opportunity in that challenge. Numerous other vendors offer solutions for automating the discovery of information, perhaps most notably the recently funded Tamr Inc., which has developed a platform that goes several steps further than Azure Data Catalog and automatically correlates the contents of different systems to surface patterns.
There’s admittedly a very good reason why Microsoft didn’t go down that route. The indexing service is offered as part of the broader machine learning toolkit on its public cloud, which contains plenty of options to help organizations perform their own custom analyses of the data aggregated in the catalog.
Accordingly, Microsoft has included access controls for ensuring that the information is handled in compliance with security and regulatory requirements. The administrator in charge could use the functionality to provide a senior data scientist with broad access to sources while limiting a business analyst attached to a certain department to only using information that falls directly into their particular area of focus.
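The scenario described above can be sketched in a few lines. This is an illustrative model only, not how Azure Data Catalog implements its access controls: the `User` class, the `broad_access` flag, the `visible_sources` helper and the department labels are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class User:
    """Hypothetical catalog user (not a real Azure type)."""
    name: str
    department: Optional[str] = None  # None when the user is not tied to one department
    broad_access: bool = False        # granted by the administrator, e.g. to a senior data scientist


def visible_sources(user: User, source_departments: Dict[str, str]) -> List[str]:
    """Return the names of the sources this user may see, sorted alphabetically.

    source_departments maps each source name to the department that owns it.
    """
    if user.broad_access:
        return sorted(source_departments)
    return sorted(
        name for name, dept in source_departments.items()
        if dept == user.department
    )


# Illustrative data only; sources and users are invented for the example.
sources = {"sales_db": "finance", "clickstream": "marketing", "ledger": "finance"}
scientist = User("dana", broad_access=True)
analyst = User("lee", department="marketing")

scientist_view = visible_sources(scientist, sources)
analyst_view = visible_sources(analyst, sources)
```

Here the data scientist sees all three sources, while the marketing analyst sees only `clickstream`.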