Companies are making aggressive bets on artificial intelligence, but many are discovering that AI success depends less on models and compute and more on whether they understand and govern the data beneath them.
For unstructured data management provider Congruity360, that disconnect has become one of the defining risks facing modern enterprises: Organizations are racing toward AI while still lacking visibility into the unstructured data that fuels AI success.
Data governance has emerged as one of the clearest fault lines separating AI ambition from AI readiness. Forty-one percent of organizations still report no data‑classification tooling, even as 37% plan tool purchases in the next two years. That gap leaves large volumes of unclassified, high-risk data spread across file servers, NAS environments and cloud repositories — environments Congruity360 regularly encounters during enterprise assessments — forcing IT and security teams into reactive cleanup efforts that drain time, budget and trust.
“AI is important to you as a workflow, it’s important to you as a workload and important to you as a business,” said theCUBE Research’s Christophe Bertrand. “For this reason, anything that is part of AI infrastructure has to be protected. I think that’s the baseline. Don’t make it a second thought. It is actually a design requirement.”
At theCUBE Research’s Cyber Resiliency Summit, industry experts, including Mark Ward, chief operating officer at Congruity360, examined how uncontrolled data growth quietly undermines security, compliance and resilience. The conversation focused heavily on redundant, obsolete and trivial (ROT) data as a systemic problem rather than a storage nuisance.
Experts at the summit framed the challenge through the “Big 4” drivers of data governance: cybersecurity exposure, compliance and legal obligations, operational efficiency and business risk reduction. Recent research underscored the scale of the issue: 70% of IT professionals reported some visibility into ROT in SaaS applications, 56% in on-prem systems and only 46% understood ROT in cloud-hosted environments.
These visibility gaps align directly with the risks Congruity360 is designed to surface — from breach exposure to stalled audits and runaway infrastructure costs.
This feature is part of SiliconANGLE Media’s ongoing coverage of data governance, AI success and cyber resilience in the enterprise. (* Disclosure below.)
When organizations lack a clear view of their data, risk multiplies quickly. Forgotten credentials buried in legacy file shares, compliance audits delayed by missing datasets and unclassified files expanding silently across environments all increase exposure. What begins as a visibility problem becomes a governance failure across each of the Big 4 drivers, according to Bertrand.
“I don’t think I’ve ever met an end user that told me, ‘I have a hundred percent visibility over all of my data,’” he said.
That pattern surfaced repeatedly during conversations at the Cyber Resiliency Summit, including during an interview with Congruity360’s Ward on why data governance continues to lag enterprise expectations. Ward pointed to the operational imbalance behind that reality: Data grows faster than organizations can classify, govern or retire it. Over time, this creates sprawling unstructured-data environments where risk remains hidden until a breach, audit or AI initiative exposes it. Without a reliable map of what exists and where it lives, security and IT teams struggle to turn awareness into action.
“Data security posture management is all about unraveling what isn’t seen or is hidden,” Ward said. “So, what we focus on is identifying in the first step all of the attributes associated with your unstructured data that could potentially give you or can give you very real actionable data to take remediative action on.”
The summit also revealed that ROT is not merely a storage inefficiency — it is a business risk with regulatory and security consequences. Stale spreadsheets, archived emails and abandoned file shares can still contain personally identifiable information or regulated records subject to GDPR, HIPAA and other frameworks.
Ward offered a practical illustration of that exposure: “Has PII data existed in an open share by an employee that left five years ago? Understanding what you currently have gives you very tangible items to act on.”
The Big 4 governance drivers continue to shape how enterprises are reassessing unstructured data. Closing visibility gaps requires classifying data by business value, actively reducing dark data and aligning storage decisions with measurable risk and cost. Organizations that address ROT strategically — rather than as a one-time cleanup — strengthen cyber-resilience, simplify compliance and improve operational efficiency.
Congruity360 positions data security posture management (DSPM) as a practical bridge between visibility and governance. By focusing on rapid discovery across cloud and on-prem environments, DSPM enables organizations to surface unstructured-data blind spots in days rather than months. This speed matters as security teams face mounting pressure to show measurable risk reduction tied to the Big 4 outcomes.
“What we’re able to do is give you a snapshot of your current state of security posture management in a week,” Ward said.
Cost pressure has intensified alongside security risk. Legacy storage systems can accumulate significant expense as inactive files, duplicates and abandoned archives pile up. Large enterprises routinely manage hundreds of terabytes of unstructured data, much of it unmanaged yet still subject to security and compliance requirements, according to Ward.
“For us, it’s all about performance and scale combined together,” he said. “We have clients that have literally 100 petabytes of data that they’re trying to find both that needle in a haystack and at the same time reducing their risk exposure.”
Without regular audits, these datasets remain invisible until a breach, audit failure or regulatory inquiry forces action.
Congruity360’s approach emphasizes continuous assessment rather than episodic cleanup. By pairing classification with remediation workflows, organizations can reclaim storage, reduce attack surface and align infrastructure spending with actual business value — reinforcing operational efficiency and business-risk reduction within the Big 4 framework.
Insider risk has also grown more acute. Social-engineering attacks increasingly exploit abandoned accounts and residual access tied to former employees. Human error remains a dominant breach vector, particularly in environments where sensitive files are poorly classified. Security teams are responding by silently tracking access patterns on high-risk data, limiting accidental exposure without disrupting legitimate workflows.
“We’re seeing this more and more often from our large customers, as they care not just about the third-party guys on the outside, but they care about … some of their users putting data that is very important at risk,” Ward said, adding that the risk could be intentional or unintentional.
Integration and scalability have become essential. Rather than treating ROT as a periodic project, enterprises are shifting toward rolling governance programs that classify new data automatically and feed results into security and compliance systems. DSPM platforms that combine machine-learning-driven classification, remediation workflows and reporting are gaining traction because they directly support the Big 4 governance objectives.
The steps organizations take to manage ROT and monitor sensitive files do more than reduce breaches — they establish the foundation for AI initiatives that depend on accuracy, traceability and trust. Clean, classified data becomes an operational prerequisite, not a downstream optimization.
“You need to identify data before you classify it, and then you need to hand it off to a smart system,” Ward said. “That’s exactly what we’re doing for the Fortune 1000 and the SMB marketplaces. We’re giving them ability, through our SaaS data delivery, to consume the values of data security posture management, the values of AI.”
Enterprise AI depends on governed data. Analysts increasingly warn that without strong information architecture, AI projects stall or generate unreliable outcomes. A Drexel University survey found that 62% of organizations believe poor data governance is the primary obstacle inhibiting AI initiatives. Feeding generative models ungoverned, redundant or outdated data amplifies risk and cost — from wasted compute cycles to regulatory exposure. For Congruity360, governance is about controlling what enters AI pipelines before risk propagates downstream.
“AI is all about taking risk out of the data before you feed it to your ChatGPT engine or your own private system,” Ward said. “That vast system’s pretty expensive… if you can feed it a third of the data that it’s currently being fed right now, you’re going to have an enormous economic impact.”
From this perspective, AI amplifies the urgency of the Big 4 rather than replacing them. Cybersecurity, compliance, efficiency and business risk all intensify when poorly governed data becomes fuel for automated systems.
Before organizations can optimize backups, automate compliance or govern AI pipelines, they need a clear understanding of what data exists and where risk resides. Congruity360’s framework is designed to provide that visibility across both cloud and on-prem environments, exposing ROT, unstructured-data blind spots and stale files that would otherwise persist unnoticed.
Once visibility is established, targeted action becomes possible. Ward outlined five core components of Congruity360’s ROT-control framework:
1. Conduct data audits regularly
Identify and eliminate non-critical and redundant data across environments to prevent ROT accumulation and reduce attack surface.
2. Invest in proactive classification
Deploy DSPM tools with clear rules governing data value and retention to surface high-risk content quickly.
3. Adopt a “less is more” philosophy
Remove inactive snapshots and stale backups to shrink cost, complexity and exposure.
4. Align storage with risk
Match data sensitivity to appropriate storage controls based on business relevance and regulatory requirements.
5. Implement data lifecycle management (DLM)
Define expiration and deletion policies for aging data while preserving audit trails and recovery options.
Each element reinforces the Big 4 governance drivers: audits and classification strengthen cybersecurity, lifecycle policies support compliance, streamlined storage improves efficiency, and risk-aligned retention reduces business exposure.
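To make the first step concrete, a regular data audit can be sketched in a few lines of Python. This is a minimal illustration, not Congruity360’s implementation: the `audit_rot` and `file_digest` helpers, the five-year staleness threshold and the report layout are all hypothetical choices made for the example. It treats files with identical SHA-256 digests as redundant and files not modified since a cutoff as candidates for the obsolete category; judging whether data is trivial, or whether it contains regulated content, would need classification rules this sketch does not attempt.

```python
import hashlib
import os
import time
from collections import defaultdict
from pathlib import Path
from typing import Optional


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_rot(root: str, stale_after_days: int = 5 * 365,
              now: Optional[float] = None) -> dict:
    """Walk `root` and report duplicate and stale files.

    Hypothetical helper: the threshold and report layout are
    assumptions for illustration, not a vendor-defined policy.
    """
    now = time.time() if now is None else now
    cutoff = now - stale_after_days * 86400  # seconds per day
    by_digest = defaultdict(list)
    stale = []

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # Skip symlinks and anything that is not a regular file.
            if path.is_symlink() or not path.is_file():
                continue
            by_digest[file_digest(path)].append(str(path))
            if path.stat().st_mtime < cutoff:
                stale.append(str(path))

    # Keep only groups where the same content appears more than once.
    duplicates = [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
    return {"duplicates": duplicates, "stale": sorted(stale)}
```

Calling `audit_rot("/path/to/share")` returns a dictionary with a `duplicates` list (groups of paths sharing the same content) and a `stale` list (paths last modified before the cutoff). The sketch only reports; deletion or archiving is left to a separate, reviewed step so that audit trails and recovery options are preserved.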
Managing ROT is not a static goal — it is an ongoing discipline. The market signals are clear: faster assessments, tighter governance expectations and AI-driven risk amplification.
Organizations that act on these principles are reducing costs, simplifying audits and closing breach pathways. Those that don’t face growing exposure as AI accelerates the consequences of poor data hygiene, according to Bertrand.
“It’s existential risk, because you need to be compliant and you need to be using AI if you want to survive in today’s enterprise world going forward,” he said.
(* Disclosure: Congruity360 InfoGov Inc. sponsored this segment of theCUBE. Neither Congruity360 nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)