UPDATED 15:20 EST / JULY 11 2024

How generative AI, data-centric approaches and observability are revolutionizing AI data quality for better, more reliable results.

The crucial role of data in Scale AI’s gen AI platform: Transforming enterprises and public sector solutions

The importance of data in realizing enterprise artificial intelligence goals has put AI data quality at the forefront of discussions. Ensuring high-quality AI data is now vital for a wide range of enterprise applications, from innovative development projects to insightful business intelligence.

Scale AI’s Vijay Karunamurthy discusses the importance of quality AI data.

Scale AI Inc., which provides labeled data used to train AI applications, strongly advocates for a data-centric approach. By enhancing self-driving technology and empowering both enterprise and public sector solutions, the company showcases the transformative power of well-managed AI data, according to Vijay Karunamurthy (pictured), field chief technology officer of Scale AI.

“Some of the first use cases we tackled were self-driving and autonomy,” he said. “How do you ensure the self-driving car stops at a crosswalk if a pedestrian’s crossing the crosswalk? If you find the right data that can help you reach that safety milestone … that data is worth its weight in gold for training the AI model and hitting that safety milestone in three months or six months, instead of waiting nine or 12 months to get there.”

Karunamurthy spoke with theCUBE Research’s John Furrier at AWS Summit New York, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed AI data quality, how Scale AI collaborates with Amazon Web Services Inc., and Scale AI’s efforts to pave the way for a future where applications are advanced, reliable and secure.

AI data quality: The backbone of Scale AI’s success across sectors

Scale AI’s collaboration with AWS has proven instrumental in meeting the diverse needs of its enterprise clients, especially those requiring robust data security and compliance. This synergy allows the company to leverage AWS’ advanced infrastructure to ensure its AI solutions are both secure and scalable.

The company has made immense contributions to gen AI development for enterprises, according to Karunamurthy. Collaborations with research labs, including OpenAI, showcase its influence on cutting-edge AI developments. Scale AI contributed to reinforcement learning workflows that powered GPT-3 and subsequently ChatGPT, enabling reliable human-AI interactions.

“Today, we have three big buckets of customers,” Karunamurthy said. “One is the OpenAIs of the world, the research labs that are building the next cutting-edge models. We’re helping make those models more trustworthy across a range of new capabilities, like writing code or doing multimodal data.”

The second segment, enterprises, presents unique challenges, particularly concerning data privacy and security. Scale AI’s generative AI platform addresses these by ensuring data remains within strict boundaries, such as virtual private clouds in specific locations. This is crucial for companies with sensitive data that cannot be compromised, according to Karunamurthy.

“They have unique needs because they have private data,” he said. “There has to be really strict role-based access control around how that data is used, even if you have derived data from that data. So, making gen AI work in that environment has been a big challenge, and we’ve built this gen AI platform around that.”

The third segment, public sector users, benefits from Scale AI’s agent-driven workflows, such as Agent Donovan. This tool aids in mission-critical tasks by providing reliable, actionable information. For example, if an agency needs to solve a supply chain issue quickly, Agent Donovan identifies potential solutions and visualizes them, enabling more effective decision-making, Karunamurthy explained.

“What Agent Donovan does is it gets the breakdown answer, it verifies each one of those using multiple models in the loop,” he said. “Then, rather than just giving you a block of text, it’ll go and plot it out on a map and show you the links in the chain of that supply chain so that you can visually understand. Someone that’s the responsible individual for the supply chain, this is where I can make changes, this is something I can do in the next 10 minutes to drive an outcome.”

Data observability and AI deployment challenges

With AI data quality top of mind, data observability is becoming a critical area of focus. Ensuring that models can access and interpret the correct data reliably is essential for maintaining the trustworthiness of AI outputs. Scale AI advocates for comprehensive monitoring of data pipelines to prevent issues and maintain the integrity of AI applications, according to Karunamurthy.

“You need constant, 24/7 online evaluation of gen AI applications, which is a new mindset that people didn’t have before,” he said. “What you need is 24/7 observability. If people are asking legal questions and they’re your employees, you need to have a range of different scenarios that you’re testing and you’re observing and you’re seeing how the model’s changing over time.”

The pace of AI innovation, particularly from major players such as OpenAI, has been breakneck. Enterprises often struggle to keep up, facing bottlenecks in AI data quality and in integrating trustworthy solutions into their workflows. One significant challenge is keeping data fresh. AI models require constant updates with real-time information to avoid staleness, which can lead to outdated or inaccurate outputs, Karunamurthy explained.

“You can have a model that chats with customers in the right tone of voice,” he said. “But, ultimately, your real milestone of how you measure your performance is can you unlock your people? Can you unlock all this organizational expertise that I have and have them drive improvements over time? Ultimately, you can’t hold any advantage that you can’t observe.”

Here’s the complete video interview, part of SiliconANGLE’s and theCUBE Research’s coverage of AWS Summit New York.

Photo: SiliconANGLE
