UPDATED 18:06 EDT / SEPTEMBER 29 2025

Anthropic sets AI coding record with new flagship Claude Sonnet 4.5 model

Anthropic PBC today debuted its newest large language model, Claude Sonnet 4.5, and a toolkit for building artificial intelligence agents.

The company describes the LLM as the world’s best coding model. Additionally, it says that Sonnet 4.5 has set a record on a benchmark designed to evaluate neural networks’ tool-use capabilities.

Sonnet 4.5 is a hybrid reasoning model, which means it has two modes. When users enter relatively simple queries, the LLM quickly generates a response using a limited amount of computing power. When it receives a more complicated question, Sonnet 4.5 can spend a significant amount of time working on an answer. That approach boosts output quality at the cost of higher hardware usage.

Anthropic evaluated the model’s programming capabilities using a benchmark called SWE-bench Verified. Sonnet 4.5 set a new industry record with an 82% score. The next two highest scores were also achieved by Anthropic models, while fourth place went to GPT-5 Codex, which resolved 74.5% of the benchmark’s tasks.

Sonnet 4.5 also set a record on a second benchmark called OSWorld. It’s used to measure how well neural networks interact with external applications such as databases. Sonnet 4.5 achieved a score of 61.4%, an improvement of nearly 20 percentage points over the Sonnet 4 model Anthropic released four months ago.
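
To see why the OSWorld gain is best expressed in percentage points, the arithmetic can be checked directly. The sketch below assumes Sonnet 4 scored 42.2% on the benchmark, a figure that does not appear in this article and should be treated as an assumption; the function name is illustrative.

```python
def benchmark_gain(new_score: float, old_score: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    points = new_score - old_score
    relative = points / old_score * 100
    return points, relative


# 61.4 is the score reported in the article; 42.2 is an assumed baseline.
points, relative = benchmark_gain(61.4, 42.2)
print(f"absolute gain: {points:.1f} percentage points")
print(f"relative gain: {relative:.1f}%")
```

Under that assumption the two readings differ widely, so stating which one is meant avoids ambiguity.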

The company claims that its latest LLM also outperformed the competition across more than a half-dozen other benchmarks. According to Anthropic, those tests evaluate AI models’ ability to perform tasks such as interpreting graphs and analyzing financial data.

Sonnet 4.5 is available through Anthropic’s Claude chatbot service, Claude Code programming assistant and its application programming interface. The latter two products received updates today in conjunction with the LLM launch.

Developers interact with Claude Code by entering instructions into a command-line interface. Anthropic has made several usability improvements to that interface as part of today’s update. Additionally, it’s rolling out an extension that embeds Claude Code in the popular Visual Studio Code editor. The extension is currently available in beta.

The other major addition to Claude Code is a feature that automatically saves the user’s code after every major change. If an error finds its way into the workflow, developers can rewind their code to an earlier, reliable version.
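
The save-and-rewind idea can be illustrated with a small snapshot store. This is a minimal sketch of the general technique, not Claude Code's actual implementation; the `CheckpointStore` class and its methods are hypothetical names introduced for illustration.

```python
class CheckpointStore:
    """Keeps an ordered history of code snapshots (hypothetical helper)."""

    def __init__(self, initial: str) -> None:
        self._snapshots = [initial]

    def save(self, code: str) -> int:
        """Record a new snapshot and return its index."""
        self._snapshots.append(code)
        return len(self._snapshots) - 1

    def rewind(self, index: int) -> str:
        """Discard snapshots after `index` and return the restored code."""
        if not 0 <= index < len(self._snapshots):
            raise IndexError(f"no checkpoint {index}")
        del self._snapshots[index + 1:]
        return self._snapshots[index]


store = CheckpointStore("def add(a, b): return a + b")
good = store.save("def add(a, b):\n    return a + b")
store.save("def add(a, b):\n    return a - b")  # a bad change slips in
restored = store.rewind(good)                    # back to the reliable version
```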

The upgrades are rolling out alongside a development toolkit called the Claude Agent SDK. According to Anthropic, its engineers originally built the toolkit to power Claude Code. Customers can use it to build AI agents.

Claude Agent SDK enables an agent to delegate work to so-called subagents that can perform multiple tasks in parallel, which speeds up processing. Additionally, the toolkit makes it easier to build AI applications that can interact with external systems. To reduce the risk of hallucinations, agents built with Claude Agent SDK can check their output for accuracy issues.
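
The parallel delegation pattern described above can be sketched with Python's standard `asyncio` module. This is not the Claude Agent SDK's API: `run_subagent` and `orchestrate` are hypothetical stand-ins, and the short sleep merely simulates the latency of a model call.

```python
import asyncio


async def run_subagent(name: str, task: str) -> str:
    # Hypothetical stand-in for a real subagent call.
    await asyncio.sleep(0.01)
    return f"{name} finished: {task}"


async def orchestrate(tasks: list[str]) -> list[str]:
    # Start one subagent per task and wait for all of them concurrently.
    # asyncio.gather returns results in the order the tasks were given.
    jobs = [run_subagent(f"subagent-{i}", t) for i, t in enumerate(tasks)]
    return await asyncio.gather(*jobs)


def run_demo() -> list[str]:
    return asyncio.run(orchestrate(["search docs", "run tests", "summarize diff"]))


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

Because the subagents wait concurrently, total time is close to that of the slowest task rather than the sum of all three.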

The toolkit can be used with the Claude API, which now provides access to Sonnet 4.5. The API is also receiving several other enhancements.

According to Anthropic, developers can now give its AI models access to a “dedicated memory directory” with information that can help them answer prompts. When the information is no longer needed, it can be removed from a model’s context window using a new context editing tool. Anthropic says that the enhancements will enable the Claude API to tackle more complicated tasks than before.
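
The two ideas above, a file-backed memory directory and the removal of stale entries from a context window, can be sketched generically. The code below does not use the Claude API; every function name and the message format are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def remember(memory_dir: Path, key: str, note: str) -> None:
    # Persist a note as a file in the memory directory.
    (memory_dir / f"{key}.txt").write_text(note, encoding="utf-8")


def recall(memory_dir: Path, key: str) -> str:
    # Read a previously stored note back.
    return (memory_dir / f"{key}.txt").read_text(encoding="utf-8")


def edit_context(context: list[dict], stale_ids: set[str]) -> list[dict]:
    # Return a copy of the context without the entries marked as stale.
    return [msg for msg in context if msg["id"] not in stale_ids]


def run_demo() -> tuple[str, list[dict]]:
    with tempfile.TemporaryDirectory() as tmp:
        memory_dir = Path(tmp)
        remember(memory_dir, "build", "tests run with pytest -q")
        note = recall(memory_dir, "build")
    context = [
        {"id": "m1", "text": "old tool output"},
        {"id": "m2", "text": "current question"},
    ]
    return note, edit_context(context, {"m1"})
```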

Image: Anthropic
