UPDATED 10:14 EDT / NOVEMBER 29 2024

Safety and reliability concerns temper agentic AI mania

SPECIAL REPORT: AI AND THE CLOUD by Paul Gillin

When information services and technology firm Thomson Reuters Corp. acquired accounting software firm Materia in October, it paved the way for what it sees as the future of its business.

Materia, which is the business name of Credere Technologies Inc., builds software agents that can break down complex queries and consult numerous information sources to deliver a compound answer.

Such automation could be a godsend for the accounting profession, which is facing a shortage of up to 3.5 million accountants by 2025. One reason often cited is the volume of tedious information-gathering the job involves.

Thomson Reuters’ Wong sees applications of agentic AI across the board in professional services. Photo: X

Materia is tackling the problem with a new type of generative artificial intelligence that dispatches autonomous agents to identify the most important sources of information, look up the required material and deliver a consolidated summary.

Unlike generative AI, which sorts through large data corpora to produce outputs that mimic human creativity but do not independently set or pursue goals, agentic AI acts autonomously with decision-making capabilities, often guided by goals or objectives. Agentic systems can plan, reason and execute tasks across multiple steps, adapting to changing environments and contexts.

Thomson Reuters sees agents as a perfect match for its business of providing actionable information to business professionals.

“You’d think a question about the accounting treatment of sales and revenue is straightforward, but you need to consult standards information from a source like Thomson Reuters’ Checkpoint products, internal accounting policies and various online resources,” said David Wong, Thomson Reuters’ chief product officer. “You need to pull the information relevant to that particular question, reconcile it and produce a response to the user.”

Materia deconstructs such problems into individual steps that an agent can tackle. “The approach that they took was to think of every problem as being essentially a multistep one, even if the steps were really, really simple to begin with,” Wong said.
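The multistep approach Wong describes can be pictured with a short sketch. This is not Materia's implementation, which the article does not detail; the `Step` class, the `answer` function and the three placeholder sources are all hypothetical, chosen only to show a question being split into small steps whose findings are then combined.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One small unit of work, such as consulting a single source."""
    name: str
    run: Callable[[str], str]  # takes the question, returns a finding


def answer(question: str, steps: List[Step]) -> str:
    """Run every step in order, then combine the findings into one reply."""
    findings = [f"{step.name}: {step.run(question)}" for step in steps]
    return "\n".join(findings)


# Hypothetical stand-ins for the three kinds of sources Wong mentions.
steps = [
    Step("standards", lambda q: f"guidance located for '{q}'"),
    Step("internal policy", lambda q: "relevant policy section located"),
    Step("online resources", lambda q: "supporting articles located"),
]

report = answer("revenue recognition for bundled sales", steps)
print(report)
```

Treating even a simple lookup as a step means each one can be inspected, retried or swapped out without touching the rest.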

Beyond generative AI

Autonomous agents are potentially far more powerful than the current breed of information-gathering generative models. In theory, agents can work independently, tackling complex tasks with little need for supervision. OpenAI Chief Executive Sam Altman has called them “AI’s killer function.”

CausaLens’ Wall: “Autonomy is negatively correlated with the stake of the decision.” Photo: LinkedIn

The online dictionary Whatis.com defines agentic AI as “systems capable of autonomous action and decision-making [that] can pursue goals independently, without direct human intervention.” Agents have generated enormous enthusiasm recently because of their potential to pull information from multiple sources and use that intelligence to automate processes and deliver finished plans or products.

Large language models have solved the information retrieval problem. The next step is to apply automation to the results.

However, agentic AI has also raised questions about accountability, reliability and safety. At its worst, a generative AI model dispenses misleading or incorrect information. Agentic models can do far more damage if given too much decision-making latitude, potentially including financial damage or threats to human safety.

That’s why most experts believe AI agents will be limited to noncritical business decisions in clearly defined domains for the foreseeable future. “Autonomy is negatively correlated with the stake of the decision,” said James Wall, product manager at causal AI software firm causaLens, the business name of Impulse Innovations Ltd.

“Much of the talk about agentic AI today actually refers to rudimentary automations that are far from agentic systems that can dramatically transform workflows,” said David Vellante, chief analyst at theCUBE Research.

Many challenges

“Enterprise leaders should view agentic AI the same way they should view quantum computing,” said Kjell Carlsson, head of AI strategy at Domino Data Lab Inc. “Like quantum, agentic AI has the potential to be very powerful and disruptive, but there are too many fundamental challenges on a technological and governance level that need to be solved before leaders should spend any time thinking about it.”

Those risks have done little to quell a gold rush as software vendors scramble to add agentic capabilities to their products. Gartner Inc.’s 2024 Emerging Technology Hype Cycle has autonomous agents climbing the expectation curve just behind artificial general intelligence. The research firm predicts agents will be embedded in one-third of enterprise software applications by 2028. A recent survey of 100 information technology executives by Forum Ventures LLC found that 48% are beginning to adopt AI agents, and 33% are actively exploring them.

Domino Data Lab’s Carlsson: Too many fundamental challenges need to be solved. Photo: SiliconANGLE

“When you look at doing things much more efficiently, agentic AI is the way,” said Bhaskar Roy, chief of AI products and solutions at Workato Inc., which recently added agentic capabilities to its data integration platform.

In October, Salesforce Inc. launched Agentforce, a set of tools that enables customers to build and customize agents that can access company data and take actions on behalf of employees in areas like sales, service, marketing and commerce. Moveworks Inc., Cisco Systems Inc., Thoughtful Automation Inc. and UiPath Inc. have recently joined the agentic parade. And at Amazon Web Services Inc.’s re:Invent conference starting Monday, AI agents are expected to get top billing on multiple keynotes.

Just this past week, a team of prominent developers from Google, Meta Platforms Inc., Stripe Inc. and Dropbox Inc. said they’ve raised more than $50 million to build an operating system for agents. “Just as Android made mobile development accessible to virtually any developer, we’re building the platform that will help make AI agents mainstream,” said David Singleton, co-founder and chief executive of a startup called /dev/agents.

Startup Thoughtful Automation Inc. has raised more than $38 million for its AI agents that handle claims processing, patient eligibility verification and payment posting for healthcare companies, claiming its customers see between five and nine times return on investment. Construction management software firm Procore Technologies Inc. just rolled out a line of agents that are said to enforce project efficiency, improve safety, enhance decision-making and streamline workflows. Human resources software provider UKG Inc. just introduced agents that can monitor regulatory changes and alert HR professionals.

A July report by CB Information Services Inc. estimated that more than 50 companies are building agents, agentic workflows and agent infrastructure. The number is no doubt much larger now.

‘Subroutine in a tux’

The concept of autonomous agents is nothing new. “This type of architecture has been around for several decades; object-oriented software used many of these principles,” said Bern Elliot, vice president and distinguished analyst at Gartner. “Think of it as a subroutine dressed up in a tuxedo.”

Creating agents that can interact with each other reliably means solving unprecedented engineering problems, says Berkeley Professor Niloufar Salehi. Photo: Niloufar Salehi

Krishna Tammana, chief technology officer at conversational messaging platform GupShup, the business name of Webaroo Inc., sees many potential everyday uses for agents. They include monitoring data from wearables and medical records to identify potential health risks and recommend interventions, intervening to prevent fraud in financial transactions, automating routine customer service interactions and conducting initial job interviews.

“Instead of simply retrieving search results, agentic AI can actively guide the user, offering product options and explaining each recommendation,” he said.

Software development is primed for an agentic revolution, said Kevin Cochrane, CMO of Vultr, the business name of cloud computing provider The Constant Company LLC. “These models will not only generate functional code but also automate testing steps that adhere to organizational coding standards,” he said. “Autonomous coding agents can potentially manage entire coding cycles for tasks such as troubleshooting bugs or optimizing legacy systems.”

AI behind the wheel

One well-known example of agentic AI is autonomous vehicles. They integrate input from various sources and continually make decisions that adapt to changing road and traffic conditions. Their decision-making domain is well-defined, and humans can always override their choices.

Gartner’s Elliot: LLMs have opened huge new use cases for agentic AI. Photo: Gartner

However, autonomous vehicles also exemplify some of the ethical and moral dilemmas of assigning too much control to a machine. Despite excellent safety records, occasional mistakes have inhibited broader adoption. Agentic AI will encounter many of the same objections.

LLMs have rekindled interest in agents because of their striking success in delivering sophisticated responses to questions posed in everyday language. Generative AI has removed the need for users to master sophisticated programming languages to create agentic interactions. “You don’t have to be explicit with an LLM,” Elliot said. “It concludes and decides what information is needed for the task.” That potentially opens up huge new use cases.

The distinction between agentic and generative AI is fuzzy enough that the term is being applied loosely. Materia, for example, barely mentions agents on its website, calling itself instead the “generative AI platform for intelligent accounting.” Thomson Reuters’ Wong acknowledged that agentic capabilities are more of a roadmap than a current deliverable. “Eventually, we want these agents to take action,” he said.

‘Slapped on anything’

An LLM that culls responses from a defined set of sources and delivers a recommendation can appear to be acting independently but is actually only following instructions. “My experience is that the agent label will be slapped on anything,” said Matt McLarty, chief technology officer at integration specialist Boomi Inc.

Domino Data Lab’s Carlsson advises IT leaders to look out for what he called “agentic AI washing,” or exaggerated claims that make products appear more innovative, advanced or competitive than they are. “Since practical agentic AI is infeasible for enterprises, vendors will increasingly use these terms as fancier-sounding ways to refer to the real and important capabilities necessary for orchestrating AI workflows, which is called MLOps,” he said.

An example of how fine a line exists between programmatic and agentic behavior is Reserve with Google, an assistant Google LLC introduced in 2017 that can make restaurant, event and other reservations by filling out online reservation forms or by dialing the establishment and placing the order with a synthesized voice.

That’s an automation, but not an agent, said Nenshad Bardoliwalla, Google Cloud’s director of product management for Vertex AI and cloud AI industry solutions. “If you asked it to book a restaurant reservation but didn’t specify the restaurant, that would be agentic behavior,” he said.

Question of limits

The question that will loom over agents for a long time is how much autonomy to give them. The current crop mostly works within a defined set of applications, as with Salesforce’s Agentforce. That allows interactions to be strictly defined and carefully monitored.

Sonar’s Wang: “We’re not at the autonomous auto stage; we’re more at cruise control.” Photo: LinkedIn

“We’re not at the autonomous auto stage; we’re more at cruise control,” said Harry Wang, vice president of growth and new ventures for software quality firm Sonar, the business name of SonarSource SA. “You need people to have their hands on the steering wheel.”

Workato’s agentic orchestration platform, introduced in August, narrowly defines agents’ domains to competencies it calls skills and allows agents only to take actions with permissions granted by human users. “Use cases need to be deterministic; you don’t want an agent to go rogue, so autonomy needs to be controlled,” Roy said.
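The idea of confining an agent to named skills and human-granted permissions can be sketched in a few lines. This is an illustration only and does not reflect Workato's actual product or API; the `Agent` class, the `PermissionDenied` exception and the two invoice skills are hypothetical.

```python
class PermissionDenied(Exception):
    """Raised when an agent attempts a skill no human has approved."""


class Agent:
    def __init__(self, skills, granted):
        self.skills = dict(skills)   # skill name -> callable
        self.granted = set(granted)  # skill names a human has approved

    def act(self, skill, *args):
        """Run a skill only if it exists and has been explicitly granted."""
        if skill not in self.skills:
            raise KeyError(f"unknown skill: {skill}")
        if skill not in self.granted:
            raise PermissionDenied(skill)
        return self.skills[skill](*args)


# Hypothetical skills: the agent may look up invoices but not issue refunds.
agent = Agent(
    skills={
        "lookup_invoice": lambda invoice_id: f"invoice {invoice_id}: paid",
        "issue_refund": lambda invoice_id: f"refunded {invoice_id}",
    },
    granted={"lookup_invoice"},
)

print(agent.act("lookup_invoice", "INV-1"))

try:
    agent.act("issue_refund", "INV-1")
except PermissionDenied as denied:
    print(f"blocked: {denied}")
```

The check happens before the skill runs, so an ungranted action fails loudly rather than executing and being reviewed afterward.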

Agentforce exemplifies what agents will likely look like for at least the next few years. They can be programmed to access functions within the portfolio of Salesforce applications and selected partners but are limited in the actions they can take and how they interact with the outside world.

Interoperability issues

Interacting with the outside world is another major challenge of making agentic AI more broadly useful. Internet services are built with various technologies, protocols and application programming interfaces with little standardization. Public APIs may have inconsistent documentation, rate limits or updates that can break integrations. Services use diverse data formats that complicate communication. Scalability, predictable performance and security are other major concerns.

“To make agents work in concert, several pieces have to fall into place, not the least of which is the ability to harmonize disparate data types and formats,” said theCUBE Research’s Vellante. “This is nontrivial and something often glossed over.”

In a perfect world, a person would ask an agent to book a trip to a resort destination on certain days, with certain activities and within a stated budget and receive a full proposed itinerary in return. With one click, the agent could make all the necessary reservations.

But it’s one thing to instruct an agent to book a window seat on a particular flight on a specific day and another to let it choose the airline and reservation on its own.

“The question is the degree of agency and the authority you wish to delegate,” said Gartner’s Elliot. “If they can autonomously do things, then you have agents without accountability because they don’t understand the context.”

Google’s Bardoliwalla sees cross-agent communication as a major challenge. Photo: LinkedIn

The travel booking example would involve the agent conversing with multiple airlines and hotels, figuring out rates and discounts and achieving the optimal balance of adventure and relaxation. That could include interacting with dozens of other travel agents. Though most travel services expose APIs, the interoperability standards that permit full autonomy don’t exist yet.

“There is a technical challenge to translating what the models know into speaking to the APIs that glue the action systems to the model so they can talk to each other,” said Google’s Bardoliwalla. “You need the model to think and respond in a step-by-step manner that is reasonable and makes sense. Not only do they have to understand API signatures, but they have to adhere to a very specific format.”
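A rough sketch shows why the format requirement matters. It assumes, purely for illustration, that a model emits each action as a JSON object with a `tool` name and an `args` mapping; the `book_flight` tool, its parameters and the `parse_tool_call` function are hypothetical and do not describe any Google product.

```python
import json

# Hypothetical API signature: parameter names and the types they must have.
TOOL_SPECS = {
    "book_flight": {"origin": str, "destination": str, "passengers": int},
}


def parse_tool_call(raw: str):
    """Reject model output that does not match a known tool signature."""
    call = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    name, args = call.get("tool"), call.get("args")
    if not isinstance(name, str) or name not in TOOL_SPECS:
        raise ValueError(f"unknown tool: {name!r}")
    spec = TOOL_SPECS[name]
    if not isinstance(args, dict) or set(args) != set(spec):
        raise ValueError(f"{name} expects exactly {sorted(spec)}")
    for key, expected in spec.items():
        if not isinstance(args[key], expected):
            raise ValueError(f"{key} must be {expected.__name__}")
    return name, args


good = ('{"tool": "book_flight", "args": '
        '{"origin": "BOS", "destination": "SFO", "passengers": 2}}')
print(parse_tool_call(good))

bad = '{"tool": "book_flight", "args": {"origin": "BOS"}}'
try:
    parse_tool_call(bad)
except ValueError as err:
    print(f"rejected: {err}")
```

A model can name the right tool and still be rejected for a missing or mistyped argument, which is the kind of strictness Bardoliwalla describes.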

CB Insights’ report said reservations about agents’ ability to execute complex tasks across multiple services will limit broad enterprise adoption, at least in the short term. “Despite the rising momentum, agents remain limited in their ability to execute tasks reliably across the internet and software apps,” the report said.

Google, Sabre Corp., Alaska Air Group Inc., InterContinental Hotels Group PLC, Sonderbase Technologies PLT and Mindtrip Inc. are just a few of the established and emerging companies building AI-powered travel planners. Still, none allows its bots to book reservations on their own.

“The future of agentic AI depends on balancing its potential with responsible deployment, guided by ongoing research and evolving regulations,” said Unmesh Kulkarni, head of generative AI at data science firm Tredence Inc.

Lack of empathy

Even in scenarios that lend themselves to automation, such as customer service, placing too much trust in AI can backfire. Customer service interactions often involve emotion and conflict, necessitating a human touch. “Agents don’t have empathy,” said Workato’s Roy.

Jumio’s Kumar sees the potential for agents to create “a significant threat to consumer trust and market integrity.” Photo: X

Bala Kumar, chief product and technology officer at identity management firm Jumio Corp., sees darker possibilities. Unregulated activity on e-commerce and dating platforms risks “fundamentally disrupting user engagement and sales,” he wrote in an e-mail message to SiliconANGLE. “E-commerce bots will wreak havoc on online shopping, particularly during high-demand events like concert ticket sales, creating a significant threat to consumer trust and market integrity.”

Improving model transparency could be a big step toward increasing trust in agents, but that goal has proved elusive in generative AI because of the nature of the complex and nonlinear interactions between potentially billions of parameters. Agentic AI is even more complicated.

Black box problem

“I don’t think the industry talks enough about the black box problem,” said Daniel Avancini, chief data officer at data services company Indicium Tech Corp. “If you don’t have well-defined processes, there are a lot of ethical risks, such as hallucinations and unpredicted outcomes. Answers have to be well-defined.”

CausaLens’ Wall believes familiarity will breed acceptance. Although bad actors will undoubtedly co-opt agents in nefarious new ways, the vast majority will help knowledge workers be more productive.

He recalls the introduction of spreadsheets to the workplace. “People were skeptical about the ‘mean’ function; was it really taking an average?” he said. “Spreadsheets started in a low-trust environment, and we’ve slowly come to rely on them. Automation will be as good as people’s ability to understand it.”

Vellante sees agents improving over time to become ubiquitous in the workplace. “Agents will observe and learn from human reasoning,” he said. “The vast amount of work that is non-automated today will begin to be streamlined in ways that will drive dramatic improvements in productivity.”

Image: SiliconANGLE/DALL-E
