UPDATED 13:55 EDT / JULY 10 2024

AI

Anthropic expands developer console for creating and evaluating AI prompts

Generative artificial intelligence startup Anthropic PBC has opened up new features on its back-end console aimed at developers and technical teams that will allow them to generate, test and evaluate AI prompts.

The company announced Tuesday that developers can now access a built-in prompt generator, powered by Anthropic’s AI model Claude 3.5 Sonnet, which allows users to describe the task at hand and have the AI large language model create a high-quality prompt for them. Descriptions can be anything, such as “Write me a triage document for inbound customer support requests.”

The generated prompts include input variables that can include things such as the inbound customer support message or other information expected to be part of issue management. This feeds into the second feature Anthropic announced test suites for prompts.

Users can manually add or import test cases that provide data for the variables so that prompts can be tested to see how the LLM will react, or ask Claude to auto-generate test cases.

This is useful because coming up with realistic test data is time-consuming and difficult, especially if there’s not much of it already on hand. Even with a small amount of customer support messages, an LLM such as Claude can generate a large variety of convincing synthetic test data that will robustly cover many cases. Users can also manage the test cases as needed, so if the technical teams know of specific issues that might fall through the cracks, they can add them.

Once test cases are generated, users can view and adjust them to match their needs to iterate on how they were produced. This includes manual and generated test cases.

Model evaluation and refining prompts are now quicker as users can create new versions of their prompt and re-run tests to rapidly iterate. The test cases from previous prompts remain and can be regenerated on modified prompts quickly with the variable sets.

Anthropic also added the ability to compare the output of two or more prompts side by side so performance can be evaluated. As a result, how a modification or iteration on a prompt changed the output can be better understood. For example, if the triage prompt from above were adjusted to have the priority of certain types of customer calls elevated over others, but the wording in the elevation needed to be changed to get the attention of the LLM better, it could be seen happening in the comparison.

The company said that subject matter experts are also available to grade responses from AI models on a five-point scale to show if changes have improved response quality.

Anthropic said that the prompt generation and output comparison features are now available to all users on the console. Documentation and tutorials are available on the company’s website.

Image: Anthropic

A message from John Furrier, co-founder of SiliconANGLE:

Support our open free content by sharing and engaging with our content and community.

Join theCUBE Alumni Trust Network

Where Technology Leaders Connect, Share Intelligence & Create Opportunities

11.4k+  
CUBE Alumni Network
C-level and Technical
Domain Experts
15M+ 
theCUBE
Viewers
Connect with 11,413+ industry leaders from our network of tech and business leaders forming a unique trusted network effect.

SiliconANGLE Media is a recognized leader in digital media innovation serving innovative audiences and brands, bringing together cutting-edge technology, influential content, strategic insights and real-time audience engagement. As the parent company of SiliconANGLE, theCUBE Network, theCUBE Research, CUBE365, theCUBE AI and theCUBE SuperStudios — such as those established in Silicon Valley and the New York Stock Exchange (NYSE) — SiliconANGLE Media operates at the intersection of media, technology, and AI. .

Founded by tech visionaries John Furrier and Dave Vellante, SiliconANGLE Media has built a powerful ecosystem of industry-leading digital media brands, with a reach of 15+ million elite tech professionals. The company’s new, proprietary theCUBE AI Video cloud is breaking ground in audience interaction, leveraging theCUBEai.com neural network to help technology companies make data-driven decisions and stay at the forefront of industry conversations.