UPDATED 08:00 EDT / JUNE 17 2025

BIG DATA

Diskover gets $7.5M from Snowflake, NetApp and others to facilitate unstructured data discovery for AI

Big-data discovery startup Diskover Data Inc. burst into the limelight today, announcing a $7.5 million seed funding round, its first acquisition and key partnerships with industry giants Snowflake Inc. and NetApp Inc.

The startup’s momentum stems from its development of a novel platform that enables enterprises to “structure their unstructured data” and make it more usable and secure for artificial intelligence models.

The round was led by Park Capital Partners and The Hive, and also saw the participation of Diskover’s new partners, Snowflake and NetApp. Diskover intends to use the funds to expand its go-to-market and engineering teams.

Diskover is on a mission to help enterprises untangle massive volumes of potentially extremely valuable unstructured data – things such as documents, PDF files, videos, audio recordings, text messages, scans of handwritten notes, receipts and so on. Such items account for as much as 80% of all data stored by the typical enterprise, yet the bulk of them do nothing except sit in storage arrays, gathering dust. That’s because their unstructured nature makes them almost impossible for companies to track, govern, secure and use in any meaningful way.

Yet unstructured data is also the most critical fuel for large language models, which have so much potential in terms of business automation. That’s why Diskover wants to help enterprises understand what their unstructured data is, where it is, and how it can be used.

The startup does this by continuously scanning and indexing millions or even billions of unstructured data files across an organization’s entire information technology environment, spanning clouds and on-premises servers. That allows it to generate metadata for each file that explains what the file is and where it’s stored, making it searchable. With that metadata, it can then create a real-time inventory with tools to facilitate discovery, classification and governance.
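To make the scan-and-index idea concrete, here is a minimal Python sketch of a metadata inventory. It is an illustration only and not Diskover’s implementation: the `FileRecord`, `scan` and `search` names are hypothetical, the index is a plain in-memory list rather than a search engine, and the scan runs once over a single local directory instead of continuously across clouds and servers.

```python
import os
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FileRecord:
    """Hypothetical metadata record: what a file is and where it lives."""
    path: str
    extension: str
    size_bytes: int
    modified: float  # last-modified time, seconds since the epoch


def scan(root):
    """Walk a directory tree and collect one metadata record per file."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            try:
                info = full.stat()
            except OSError:
                continue  # skip files that vanish or cannot be read
            records.append(
                FileRecord(str(full), full.suffix.lower(), info.st_size, info.st_mtime)
            )
    return records


def search(records, extension=None, min_size=0):
    """Filter the inventory by file extension and minimum size in bytes."""
    return [
        r for r in records
        if (extension is None or r.extension == extension) and r.size_bytes >= min_size
    ]


if __name__ == "__main__":
    # Build a tiny throwaway tree so the example runs anywhere.
    with tempfile.TemporaryDirectory() as root:
        Path(root, "report.pdf").write_bytes(b"x" * 10)
        Path(root, "notes.txt").write_text("hello")
        inventory = scan(root)
        for record in search(inventory, extension=".pdf"):
            print(record.path, record.size_bytes)
```

A production system would persist these records in a search index and rescan on a schedule; the sketch only shows how file metadata turns an opaque directory tree into something that can be queried.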

According to Diskover, its platform can play a critical role in curating information for AI data pipelines, identifying the most relevant files for training different models. Its technology verifies any data it surfaces against the company’s existing authentication systems, enabling it to honor existing permissions to ensure compliance.

Co-founder and Chief Executive Will Hall said his company has transformed what used to be a “dark, opaque data swamp” into a “structured, searchable and actionable resource” that’s just begging to be used by AI developers.

“If you want to build AI that works, it starts with knowing what you have and curating it in the most efficient manner, and that’s what we do,” Hall said. “With unstructured data comprising more than 80% of all enterprise data, and AI’s insatiable need for high-quality inputs growing daily, Diskover is the starting point for enterprise AI.”

Diskover is also making an acquisition, which is exceedingly rare for a startup that has only just announced its first funding round. Alongside the funding announcement, it said it’s buying an even smaller startup called CloudSoda Inc., which specializes in “AI-ready intelligent data management” and provides what seem to be many of the same capabilities.

“It’s an ideal coupling,” Hall said. “Our respective strengths are mutually reinforcing. We had scale, they had simplicity. Together, we’ve now got the most intuitive and enterprise-ready unstructured platform on the market.”

Constellation Research Inc. analyst Michael Ni told SiliconANGLE that Diskover isn’t all that different from existing data management software providers such as Komprise Inc., in that it is positioning itself to fill the gap between raw data lakes and the intelligence layers for unstructured content. But unlike those more established players, he said, it stands out somewhat due to the largely open-source nature of its platform.

“This gives Diskover a unique positioning, somewhere between the low-level command-line tools and the higher-cost commercial enterprise solutions,” Ni said.

According to Ni, the main advantages Diskover has are a lower barrier to entry in terms of cost-effectiveness, more transparency and flexibility, no vendor lock-in and a proven ability to scale.

“Diskover is built on Elasticsearch, which gives it credibility for large-scale file system analysis,” Ni explained. “This makes it best-suited for tech-forward teams that are comfortable managing infrastructure such as Elasticsearch, who want to avoid the overheads of expensive platforms.”

Though most startups’ claims of grandeur can be taken with a pinch of salt, the fact that Snowflake and NetApp are both partnering with and funding Diskover suggests that it really may be onto something. Snowflake said it’s making Diskover’s platform available through the Snowflake Marketplace, and will also connect Diskover’s on-premises data intelligence features with Snowflake’s Openflow data integration service to enable superior hybrid data orchestration.

Snowflake Ventures Director Harsha Kapre said more enterprises are adopting AI-first data strategies, which necessitate being able to access all of their data.

“Enterprises can’t unlock the full value of AI without knowing what unstructured data they have and how to use it,” he explained. “Our partnership with Diskover, in combination with Snowflake Openflow, makes that possible, acting as a super-connector to exabyte-scale unstructured data.”

NetApp is just as enthusiastic about Diskover, and is integrating the startup’s services within its own data pipeline infrastructure, which incorporates data sources from the network edge to the cloud. According to Gagan Gulati, NetApp’s senior vice president and general manager of data services, the partnership will ensure companies are better able to surface and activate unstructured data, regardless of where it lives. “This collaboration helps accelerate cyber resiliency, AI readiness, and storage efficiency to deliver outcomes that drive business value,” Gulati said.

Diskover says it was enjoying strong momentum even before today’s announcements. It claims it already has more than 130 enterprise customers across a range of industries, including the media and entertainment, life sciences, manufacturing, energy and semiconductor design sectors. It also established a commercial relationship with Dell Technologies Inc. in October 2024.

Neuralytix analyst Ben Woo said he has been sold on Diskover’s platform, because these days almost every company that’s able to do so is investing in AI due to the enormous advantage it can provide in terms of enterprise automation. But one of the biggest challenges they face is getting ahold of the data that’s needed to fuel their AI initiatives.

“AI requires relevant and accurate data, and Diskover helps enterprises to identify the data that will generate the greatest value,” Woo explained. “It will connect [the data] with the most critical enterprise applications and enable business leaders to make informed decisions to achieve their business objectives.”

Image: SiliconANGLE/Dreamina
