Datadog announces LLM observability tools and its first generative AI assistant
Datadog Inc., one of the top dogs in the application monitoring software business, today announced the launch of new large language model observability features that aim to help customers troubleshoot problems with LLM-based artificial intelligence applications.
The new features were announced alongside the launch of its own generative AI assistant, which helps dig up useful insights from observability data.
Datadog is a provider of application monitoring and analytics tools that are used by developers and information technology teams to assess the health of their apps, plus the infrastructure they run on. The platform is especially popular with DevOps teams, which are usually composed of developers and information technology staff.
DevOps is a practice that involves building cloud-native applications and frequently updating them, using teams of application developers and IT staff. Using Datadog’s platform, DevOps teams can keep a lid on any problems that those frequent updates might cause and ensure the health of their applications.
The company clearly believes the same approach can be useful for generative AI applications and the LLMs that power them. Pointing out the obvious, Datadog notes generative AI is rapidly becoming ubiquitous across the enterprise as every company scrambles to jump on the hottest technology trend in years. As they do so, there’s a growing need to monitor the behavior of the LLMs that power generative AI applications.
At the same time, the tech stacks that support these models are also new, with companies implementing things like vector databases for the first time. Meanwhile, experts have been vocal about the danger of leaving LLMs to do their own thing without any monitoring in place, pointing to risks such as unpredictable behavior, AI hallucinations (where models fabricate responses) and bad customer experiences.
Datadog Vice President of Product Michael Gerstenhaber told SiliconANGLE that the new LLM observability tool provides a way for machine learning engineers and application developers to monitor how their models are performing on a continuous basis. That will enable them to optimize the models on the fly to maintain performance and accuracy, he said.
It works by analyzing request prompts and responses to detect and resolve model drift and hallucinations. At the same time, it can help to identify opportunities to fine-tune models and ensure a better experience for end users.
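Datadog has not published the details of how it detects drift, so the following is only a rough illustration of the general idea of comparing recent prompt-and-response traffic against a known-good baseline. It is a minimal sketch under stated assumptions: the record layout, the function names (`response_lengths`, `drift_score`, `has_drifted`), the choice of response word count as the monitored signal and the threshold of 3 are all hypothetical and are not taken from Datadog's product.

```python
from statistics import mean, stdev


def response_lengths(records):
    """Word count of each logged response (an assumed, deliberately simple signal)."""
    return [len(record["response"].split()) for record in records]


def drift_score(baseline, recent):
    """How many baseline standard deviations the recent mean has moved."""
    base_sd = stdev(baseline)
    shift = abs(mean(recent) - mean(baseline))
    if base_sd == 0:
        # A constant baseline: any shift at all counts as unbounded drift.
        return 0.0 if shift == 0 else float("inf")
    return shift / base_sd


def has_drifted(baseline_records, recent_records, threshold=3.0):
    """Flag drift when the recent window moves past the (hypothetical) threshold."""
    score = drift_score(
        response_lengths(baseline_records),
        response_lengths(recent_records),
    )
    return score > threshold


def make_record(prompt, word_count):
    """Build a fake prompt/response log entry for the demo below."""
    return {"prompt": prompt, "response": " ".join(["word"] * word_count)}


if __name__ == "__main__":
    baseline = [make_record("summarize", n) for n in (10, 12, 11, 9, 10)]
    recent = [make_record("summarize", n) for n in (30, 28, 31)]
    print("drift detected:", has_drifted(baseline, recent))
```

A real monitoring system would likely track richer signals than word count, such as embedding distance, refusal rate or latency, but the shape of the comparison between a baseline window and a recent window is the same.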
Datadog isn’t the first company to introduce observability tools for LLMs, but Gerstenhaber said his company’s offering goes much further than previous ones.
“A big differentiator is that we not only monitor the usage metrics for the OpenAI models, we provide insights into how the model itself is performing,” he said. “In doing so, our LLM monitoring enables efficient tracking of performance, identifying drift and establishing vital correlations and context to effectively and swiftly address any performance degradation and drifts. We do this while also providing a unified observability platform, and this combination is unique in the industry.”
Gerstenhaber also highlighted its versatility, saying the tool can integrate with AI platforms including Nvidia AI Enterprise, OpenAI and Amazon Bedrock, to name just a few.
The second aspect of today’s announcement is Bits AI, a new generative AI assistant available now in beta that helps customers to derive insights from their observability data and resolve application problems faster, the company said.
Gerstenhaber explained that, even with observability data in hand, it can take teams a great deal of time to sift through it all and determine the root cause of application issues. He said Bits AI helps by scanning the customer’s observability data and other sources of information, such as collaboration platforms. That enables it to answer questions quickly, provide recommendations and even build automated remediations for application problems.
“Once a problem is identified, Bits AI helps coordinate response by assembling on-call teams in Slack and keeping all stakeholders informed with automated status updates,” Gerstenhaber said. “It can surface institutional knowledge from runbooks and recommend Datadog Workflows to reduce the amount of time it takes to remediate. If it’s a problem at the code-level, it offers concise explanation of an error, suggested code fix and a unit test to validate the fix.”
When asked how Bits AI differs from similar generative AI assistants launched by rivals such as New Relic Inc. and Splunk Inc. earlier this year, Gerstenhaber said it’s all about the level of data it has access to. Its ability to join Datadog’s wealth of observability data with institutional knowledge from customers, he said, enables Bits AI to assist users in almost any kind of troubleshooting scenario. “We are differentiated not only in the breadth of products that integrate with the generative interface, but also our domain-specific responses,” he said.
Image: Freepik