Can artificial intelligence cataloging be the Google for enterprise big data?


Despite uncertainty about its usefulness, companies continue hoarding masses of data. Does this mean data scientists are doomed to shovel through dreck looking for rare nuggets, or is there an easier way?

Artificial intelligence and machine learning for cataloging data may be the answer, according to Amit Walia, executive vice president and chief product officer of Informatica Corp.

“It’s the Google of data for the enterprise,” he said at BigData SV 2017 in San Jose, CA.

Walia told John Furrier (@furrier) and George Gilbert (@ggilbert41), co-hosts of theCUBE, SiliconANGLE Media’s mobile live streaming studio, that enterprises need this because “the intelligence around data has to be at the metadata level.” (*Disclosure below.)

Data lakes have largely become data swamps — there may be valuable insight in there, but who can see it through the muck? Walia asked. It is difficult for data scientists to gauge data’s value at first sight, so into the data lake it goes.

Internet of Things takes big data to huge data

The many varied data streams coming from IoT devices certainly won’t be helping matters, Walia said. “IoT is the big game-changer of big data becoming big or huge data,” he explained.

Device feeds, social feeds and a multitude of other kinds of data are flooding enterprises. “Do you want the analyst to figure out where the data’s coming from, or the machine learning or AI to contextualize and tell you …?” Walia asked.

AI, machine learning and self-serve analytics can extract intelligence from data as a unified whole, even when that data is scattered across different clouds or databases, he added.

“Metadata becomes the organizing principle; that’s where it becomes real,” he said. This, in turn, will free data scientists to work up the stack on value-adds.
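
To make the idea concrete, here is a minimal Python sketch of a metadata catalog. It is an illustration under stated assumptions, not Informatica's product or API: the class names, fields, dataset names and storage locations are all hypothetical, and a plain keyword match stands in for the machine learning Walia describes. What it shows is the organizing principle itself. Datasets that live in different systems are described in one place, and a search runs over those descriptions rather than over the data.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """Metadata describing one dataset. All fields are hypothetical."""
    name: str
    location: str  # where the data lives; illustrative URIs only
    columns: list[str]
    tags: set[str] = field(default_factory=set)


class MetadataCatalog:
    """Toy catalog: indexes descriptions of datasets, never the data itself."""

    def __init__(self) -> None:
        self._entries: list[DatasetEntry] = []

    def register(self, entry: DatasetEntry) -> None:
        self._entries.append(entry)

    def search(self, term: str) -> list[DatasetEntry]:
        """Return entries whose name, columns or tags contain the term.

        A case-insensitive substring match is an assumption made for
        brevity; a real catalog would use richer matching.
        """
        needle = term.lower()
        hits = []
        for entry in self._entries:
            haystack = [entry.name, *entry.columns, *entry.tags]
            if any(needle in text.lower() for text in haystack):
                hits.append(entry)
        return hits


# Three datasets stored in three different (made-up) systems.
catalog = MetadataCatalog()
catalog.register(DatasetEntry(
    "sensor_readings", "s3://example-bucket/iot/",
    ["device_id", "temperature", "timestamp"], {"iot"}))
catalog.register(DatasetEntry(
    "customer_orders", "postgres://example-host/sales",
    ["customer_id", "order_total"], {"sales"}))
catalog.register(DatasetEntry(
    "device_registry", "hdfs://example-cluster/devices",
    ["device_id", "owner"], {"iot", "reference"}))

# One query finds related data regardless of where it is stored.
for match in catalog.search("device_id"):
    print(match.name, "->", match.location)
```

In this sketch an analyst searching for "device_id" gets back both the sensor feed and the device registry without knowing in advance which cloud or database holds them.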

Meta analysis for insight is what customers really want from data, Walia said. “The word ‘big data,’ I thought, got massively abused. A lot of Hadoop customers are not necessarily big data customers,” he said. What they want is intelligence, and that requires tools that go deep, regardless of the amount of data.

Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s coverage of BigData SV 2017. (*Disclosure: Some segments on SiliconANGLE Media’s theCUBE are sponsored. Sponsors have no editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE