

Data observability firm Monte Carlo Data Inc. is turning its attention to unstructured information, introducing a new capability that will allow enterprises to monitor the enormous volumes of text, images, video and audio files that are so vital for artificial intelligence workloads.
The company’s new unstructured data monitoring engine represents an effort to fix one of the biggest blind spots in most enterprises’ data estates. According to a report by International Data Corp., up to 90% of the information stored on the average enterprise’s servers is unstructured – a jumble of chat logs, Word and PDF documents, PowerPoints and so on. This presents a big problem in terms of reliability, as there’s no easy way to monitor it for data quality issues.
Monte Carlo is a leading player in the data observability market, providing tools that can help businesses to ensure the “quality” of their datasets meets the highest standards. Its platform works similarly to application monitoring tools such as Datadog and AppDynamics, only it’s applied to data pipelines rather than telemetry and other app metrics. It uses machine learning algorithms to understand the normal baseline of any given data stream, so it can alert users to any abnormal behavior. Much of what it does can now be automated by AI agents.
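To make the idea of learning a "normal baseline" concrete, here is a minimal sketch of one common approach: a z-score test against recent history. It is an illustration only, not Monte Carlo's actual method, which the company has not detailed here. The function name `is_anomalous`, the three-standard-deviation threshold and the sample row counts are all assumptions made for the example.

```python
from statistics import mean, stdev

def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it sits more than `threshold` standard
    deviations away from the mean of `history` (a simple z-score test).
    Hypothetical helper for illustration, not a vendor API."""
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu  # flat baseline: any change is unusual
    return abs(latest - mu) / sigma > threshold

# Made-up daily row counts for a table that normally lands near 1,000 rows.
daily_row_counts = [1000, 1020, 990, 1010, 1005, 995, 1015]

print(is_anomalous(daily_row_counts, 1008))  # a typical day
print(is_anomalous(daily_row_counts, 120))   # a sudden drop in volume
```

A production system would likely account for seasonality and trends rather than rely on a fixed threshold, but the principle is the same: learn what is typical, then alert on deviations.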
The problem is that, until now, Monte Carlo’s data monitoring tools were always aimed at structured data – the kind of information that’s stored neatly in rows and columns in databases such as Oracle. Unstructured information is an entirely different ballgame. Nevertheless, it’s important for organizations to be able to trust this information, as it’s the main fodder for most generative artificial intelligence applications and agents, which are becoming increasingly prevalent in enterprise computing environments.
Monte Carlo says it’s the first data observability company to focus on unstructured data, enabling companies to apply customizable, AI-powered checks for any quality attribute they consider relevant to their business-critical workloads. Use cases put forward by Monte Carlo include flagging customer reviews with negative sentiment before they reach dashboards and detecting personally identifiable information or other sensitive data in contract text fields. Crucially, its checks can also be used to validate AI model outputs for factual accuracy, consistency, tone and structure, the company said.
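As a rough illustration of what a check for sensitive data in a text field might look like, the sketch below scans free text with regular expressions. This swaps in a much simpler technique than the AI-powered checks Monte Carlo describes, and it does not use any Monte Carlo API. The pattern set, the `find_pii` helper and the sample contract snippets are hypothetical.

```python
import re

# Illustrative patterns only; real PII detection covers far more cases.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def find_pii(text):
    """Return a sorted list of the PII types found in `text`."""
    return sorted(
        name for name, pattern in PII_PATTERNS.items() if pattern.search(text)
    )

# Made-up contract text fields.
contract_fields = [
    "Contact jane.doe@example.com for renewal terms.",
    "Signatory SSN 123-45-6789 is on file.",
    "The term of this agreement is 12 months.",
]

for field in contract_fields:
    hits = find_pii(field)
    if hits:
        print(f"flagged ({', '.join(hits)}): {field}")
```

Pattern matching of this kind catches only well-formatted identifiers, which is part of why vendors are turning to language models for checks on sentiment, tone and other qualities that fixed rules cannot express.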
Monte Carlo co-founder and Chief Technology Officer Lior Gavish said it’s vital for businesses to be able to proactively detect data issues in order to build AI systems they can trust.
“High-quality unstructured data, like customer feedback, support tickets, or internal documentation, isn’t just important; it’s foundational to building powerful, reliable AI,” he said. “It can be the difference between a model that performs and one that fails.”
The focus on unstructured data quality points to the start of a new trend that will see consolidation across the AI and data observability markets, which have traditionally been separate, said analyst Michael Ni of Constellation Research Inc. He believes that chief data analytics officers will welcome this consolidation, because they’re not only drowning in data but also blind to the 90% that’s thought to be unstructured.
Because AI workloads are powered mostly by unstructured data, companies need visibility into their vector database stores and the data behind each prompt, the analyst said. Simply monitoring data pipelines and tables is no longer enough.
“Monte Carlo’s move finally puts documents, chat logs and transcripts under observability, and it’s a move that represents where trust in AI finally begins,” Ni said. “This marks the beginning of the end for siloed data observability, and the next platform battle will be around ‘decision observability,’ where AI signals come together in one trusted view.”
The company said its new unstructured data monitoring tool is compatible with platforms such as Snowflake, Databricks and Google BigQuery, and can natively integrate with those platforms’ AI function libraries and large language models. For instance, it’s fully compatible with Snowflake Inc.’s Cortex Agents, which are intelligent bots that aim to orchestrate structured and unstructured information together to guide more reliable AI decision-making. It can also provide observability for Databricks Inc.’s AI/BI tool, which is a hybrid AI system that helps to generate rich insights relating to data lineage, data pipelines and more.
“By enabling native support for Snowflake Cortex Agents and Databricks AI/BI, Monte Carlo helps data teams ensure their foundational data is reliable and trustworthy enough to support real-time business insights driven by AI,” said Monte Carlo’s head of AI Shane Murray.
Monte Carlo said its move into unstructured data monitoring represents a key milestone in its broader mission to provide visibility across the full data and AI application lifecycle.