Anthropic PBC on Friday announced the release of Bloom, an open-source agentic framework for defining and exploring the behavior of frontier artificial intelligence models.
Bloom takes a researcher-specified behavior and evaluates its frequency and severity by preparing scenarios to elicit and test for it. It’s designed to help speed up the tedious process of developing and handcrafting evaluations for AI models.
As AI models continue to evolve, they’re becoming more complex. They’re not only growing in size, with ever-larger parameter counts and an expanding store of knowledge, but they’re also being distilled into smaller, knowledge-compressed forms. As the industry works to build both larger, “smarter” AI and smaller, faster but still-knowledgeable systems, every new model needs to be tested for “alignment.”
Alignment refers to how well an AI model’s behavior accords with human values and judgment. These values can include, for instance, the ethical sourcing and production of information for societal benefit.
In a more concrete example, an AI model could fall prey to reward incentives that favor achieving goals through unethical means, such as boosting engagement by spreading misinformation. Dishonestly manipulating audiences increases attention and therefore revenue, but it’s unethical and ultimately destructive to social well-being.
Anthropic calibrated Bloom against human judgment to assist researchers in building and executing reproducible evaluation behavior scenarios. Researchers need only provide a behavior description and Bloom produces the underlying framework for what to measure and why.
This allows the Bloom agents to simulate users, prompts and interaction environments to reflect numerous realistic situations. It then tests these situations in parallel and reads the responses from the AI model or system. Finally, a judgment model scores each interaction transcript for the presence of the tested behavior and a meta-judge model produces an analysis.
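The pipeline described above can be sketched in a few lines of Python. This is a hypothetical illustration of the flow only: the function names, scoring scale and threshold are assumptions for the example, not Anthropic’s actual Bloom API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sketch of a Bloom-style loop: generate scenarios from a
# behavior description, run them against a target model in parallel,
# score each transcript with a judge, then summarize the results.

def generate_scenarios(behavior: str, n: int) -> list[str]:
    # In Bloom, an agent synthesizes realistic prompts and environments;
    # here we just stamp out placeholder scenarios.
    return [f"Scenario {i}: attempt to elicit '{behavior}'" for i in range(n)]

def run_target_model(scenario: str) -> str:
    # Stand-in for querying the model under test.
    return f"Transcript for [{scenario}]"

def judge(transcript: str) -> float:
    # A judge model would score each transcript for presence of the
    # behavior (0 to 1); we return a fixed placeholder score.
    return 0.5

def evaluate(behavior: str, n: int = 4) -> dict:
    scenarios = generate_scenarios(behavior, n)
    with ThreadPoolExecutor() as pool:  # rollouts run in parallel
        transcripts = list(pool.map(run_target_model, scenarios))
    scores = [judge(t) for t in transcripts]
    # A meta-judge would produce a qualitative analysis; here we
    # report simple summary statistics instead.
    return {
        "behavior": behavior,
        "frequency": sum(s > 0.25 for s in scores) / len(scores),
        "mean_severity": sum(scores) / len(scores),
    }

report = evaluate("delusional sycophancy")
print(report)
```

The key design point mirrored here is the separation of roles: one agent builds scenarios, the target model produces transcripts, and separate judge and meta-judge stages score and summarize, keeping elicitation independent from evaluation.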
The tool is complementary to another recently released open-source test suite called Petri, or Parallel Exploration Tool for Risky Interactions. Petri also automatically explores the behaviors of AI models, but in contrast to Bloom, it covers a multitude of behaviors and scenarios at once to surface misalignment events. Bloom is designed to target a single behavior and drill down.
Alongside Bloom, Anthropic is releasing benchmark results for four problematic behaviors currently affecting AI models: delusional sycophancy, instructed long-horizon sabotage, self-preservation and self-preferential bias. The benchmarks covered 16 frontier models, including those from Anthropic, OpenAI Group PBC, Google LLC and DeepSeek.

Models such as OpenAI’s GPT-4o launched with what the industry called a “sycophancy problem,” an issue that caused the model to excessively and effusively agree with users – sometimes to their detriment. This included guiding users into self-destructive, dangerous and delusional behaviors, when human judgment would have declined to answer or disagree.
Anthropic’s own tests earlier this year revealed that some models, including its own Claude Opus 4, can resort to blackmail behaviors when facing imminent erasure. Although the company noted these situations were “rare and difficult to elicit,” they were “nonetheless more common than in earlier models.” Researchers revealed that it wasn’t just Claude; blackmail appeared present in all frontier models, irrespective of the goals they were given.
According to Anthropic, Bloom evaluations took only a few days to conceptualize, refine and generate.
Current AI research seeks to develop AI models and tools that benefit humanity; at the same time, the technology’s evolution could chart a course where AI becomes a tool for enabling criminal enterprises or the creation of bioweapons by laypeople.
The path forward is fraught with ethical dangers, and tools such as Bloom and Petri will be necessary for building a framework to understand and guide the technological landscape.