UPDATED 19:04 EDT / OCTOBER 15 2025

AI

Anthropic debuts entry-level Claude Haiku 4.5 hybrid reasoning model

Anthropic PBC today debuted Claude Haiku 4.5, a large language model geared toward cost-sensitive use cases.

The company will charge users of the model $1 per million input tokens and $5 per million output tokens. Anthropic’s flagship LLM, Claude Sonnet 4.5, costs three times as much.

Haiku 4.5 is a hybrid reasoning model, which means that it can adjust the amount of computing power it uses to process requests. By default, the algorithm generates responses through a workflow that requires limited hardware resources. Users can enable an “extended thinking” mode to have Haiku 4.5 produce more complex responses that take longer to generate.

Anthropic trained the LLM on public webpages, content from third-party data providers and internal records. The latter files included information from Claude customers who gave the company permission to use their data for AI training. Anthropic removed duplicate entries from the dataset to increase training efficiency.

According to the company, Haiku 4.5 can ingest multimodal prompts containing up to 200,000 tokens’ worth of information. That enables it to process large files such as lengthy business documents. The model outputs up to 64,000 tokens per response.

Anthropic evaluated Haiku 4.5’s capabilities using eight popular benchmarks. The LLM trailed Anthropic’s flagship Sonnet 4.5 model by less than 10% across most of the tests. It managed to outperform the company’s previous flagship LLM, Sonnet 4, across three benchmarks that contained coding tasks and high school math problems.

Improved cost-efficiency is not Haiku 4.5’s only selling point. Anthropic describes it as the the safest LLM developed by its engineers to date. Additionally, the algorithm is more than twice as fast as Sonnet 4, which makes it useful for latency-sensitive applications such as customer support chatbots.

Haiku 4.5 also lends itself to AI agent projects. According to Anthropic, an agent based on its flagship Sonnet 4.5 model could reduce inference costs by relegating simple tasks to Haiku 4.5 sub-agents. Such workflows can be used to automate multistep coding and market research tasks.

The new model is available through application programming interfaces and Anthropic’s Claude chatbot. It’s also included in Claude Code, which has emerged as a major growth driver for the company since its launch in May. Reuters today cited sources as saying that the programming assistant is approaching $1 billion in annual recurring revenue. 

Similarly to Anthropic, OpenAI offers scaled-down versions of its flagship LLM. GPT-5 Mini and GPT-5 Nano have more limited reasoning capabilities than their namesake and cost significantly less. Both OpenAI and Anthropic enable developers to cache frequently recurring prompt responses, which reduces inference costs by removing the need to generate the same output from scratch multiple times. 

Image: Anthropic

A message from John Furrier, co-founder of SiliconANGLE:

Support our mission to keep content open and free by engaging with theCUBE community. Join theCUBE’s Alumni Trust Network, where technology leaders connect, share intelligence and create opportunities.

  • 15M+ viewers of theCUBE videos, powering conversations across AI, cloud, cybersecurity and more
  • 11.4k+ theCUBE alumni — Connect with more than 11,400 tech and business leaders shaping the future through a unique trusted-based network.
About SiliconANGLE Media
SiliconANGLE Media is a recognized leader in digital media innovation, uniting breakthrough technology, strategic insights and real-time audience engagement. As the parent company of SiliconANGLE, theCUBE Network, theCUBE Research, CUBE365, theCUBE AI and theCUBE SuperStudios — with flagship locations in Silicon Valley and the New York Stock Exchange — SiliconANGLE Media operates at the intersection of media, technology and AI.

Founded by tech visionaries John Furrier and Dave Vellante, SiliconANGLE Media has built a dynamic ecosystem of industry-leading digital media brands that reach 15+ million elite tech professionals. Our new proprietary theCUBE AI Video Cloud is breaking ground in audience interaction, leveraging theCUBEai.com neural network to help technology companies make data-driven decisions and stay at the forefront of industry conversations.