UPDATED 16:05 EST / NOVEMBER 08 2023


Datadog integrates with Google Cloud’s Vertex AI to monitor health and performance of generative AI models

Application monitoring and security platform provider Datadog Inc. said today it’s expanding its partnership with Google Cloud, becoming one of the first artificial intelligence observability partners for Google Cloud’s Vertex AI platform.

The integration will enable AI operations teams and developers to easily monitor, analyze and optimize the performance of AI and machine learning models, the company said. The Vertex AI platform is a constellation of cloud services that can be used by companies to build machine learning models.

With the recent launch of new generative AI capabilities in Vertex AI, the platform can also be used to customize large language models that can perform tasks such as text and image generation. It provides access to a range of foundational generative AI models, and various tools for customizing them.

As developers start getting their AI models up and running in Vertex AI, Datadog says, they will need a way to monitor their performance, hence today’s integration. The company provides application monitoring and analytics tools that can help developers to assess the health of their apps and also their AI models, in addition to the infrastructure they run on. Its tools are especially popular with DevOps teams.

Datadog President Amit Agarwal said his company is already a key partner of Google Cloud when it comes to cloud application monitoring. “The new Vertex AI integration expands this partnership and gives AI and ML developers full observability into their production applications built on Vertex AI,” he said. “With out-of-the-box dashboards and real-time monitors, customers can get started quickly and ensure their models are performing at an optimal level while delivering predictions responsively at scale and without errors.”

The company first announced its generative AI observability capabilities in August, alongside the launch of its own generative AI assistant, which helps surface useful insights from the data it digs up.

Datadog said its integration with Vertex AI is available from today and will give developers full visibility into the prediction performance and resource utilization of their custom AI models. Users can access a new dashboard that provides insights into model prediction counts, latency and errors, as well as the memory, network and compute resources the models consume. The dashboard will also enable teams to compare different AI models side-by-side in production environments. In addition, the integration provides a way for Vertex AI users to detect any data anomalies that might affect the reliability of their models.
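To make the side-by-side comparison concrete, here is a minimal sketch in plain Python of the kind of per-model summary such a dashboard might present: prediction count, error rate and average latency. It is illustrative only and does not use the Datadog or Vertex AI APIs; the `Prediction` record, the `summarize` helper and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Prediction:
    """One hypothetical prediction record; not a Datadog or Vertex AI type."""
    model: str
    latency_ms: float
    ok: bool


def summarize(records):
    """Group records by model and compute count, error rate and mean latency."""
    by_model = {}
    for r in records:
        by_model.setdefault(r.model, []).append(r)

    summary = {}
    for model, rows in by_model.items():
        errors = sum(1 for r in rows if not r.ok)
        summary[model] = {
            "count": len(rows),
            "error_rate": errors / len(rows),
            "mean_latency_ms": mean(r.latency_ms for r in rows),
        }
    return summary


# Made-up sample data for two models serving predictions in production.
records = [
    Prediction("model-a", 120.0, True),
    Prediction("model-a", 180.0, False),
    Prediction("model-b", 90.0, True),
    Prediction("model-b", 110.0, True),
]

for model, stats in sorted(summarize(records).items()):
    print(model, stats)
```

In a real deployment these figures would come from the monitoring platform rather than being computed by hand; the sketch only shows what the comparison boils down to.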

“With this new integration, AI teams can improve how they monitor and analyze the performance of machine learning applications built on Vertex AI, ensuring they are functioning correctly and creating optimal value,” said Kevin Ichhpurani, Google Cloud’s corporate vice president of global partner ecosystem and channels.

Holger Mueller of Constellation Research Inc. said there is a valid need for monitoring and observability in the AI industry as many businesses are implementing the technology for the first time. “Datadog’s integration with a leading AI platform is exactly what these enterprises need, namely a single pane of glass through which they can see and monitor their IT processes, applications and AI models,” he said. “There is a strong value proposition to this partnership between Datadog and Google Cloud.”

AI might be the main focus of today’s update, but it’s not the only new capability Datadog has added for Google Cloud users. It’s also adding in-depth support for Google Cloud Run, which is a serverless computing technology available on Google Cloud. Datadog now offers native distributed tracing across all runtimes, with the ability to collect custom metrics and logs.

Moreover, joint customers can now send their Google Cloud Security Command Center findings on container and virtual machine vulnerabilities, threats and errors directly to Datadog, in order to see how these are impacting the performance and security of their applications.

There’s also a new quick setup experience for Datadog on Google Cloud, enabling customers to get started with its platform in a matter of seconds, the company said.

Finally, Datadog said it has earned the Google Cloud Ready designation for Google Cloud SQL, meaning it can now provide visibility into the performance and health of that service. According to Datadog, the integration is able to monitor throughput, memory and availability metrics for databases including MySQL, PostgreSQL and SQL Server.

Image: Google
