UPDATED 12:00 EDT / APRIL 02 2026


Google’s new Gemma 4 models bring complex reasoning skills to low-power devices

Google LLC is upping the stakes for open-weights artificial intelligence models with the release of Gemma 4, its most advanced “open” model family so far.

Built on the same architectural foundation as Gemini 3, the models are designed to handle complex reasoning tasks and support autonomous AI agents running locally on low-power devices such as workstations and smartphones.

With Gemma 4, Google DeepMind researchers Clement Farabet and Olivier Lacombe said, they’ve managed to squeeze out more “intelligence per parameter,” allowing the models to punch significantly above their weight class. For instance, the 31B Dense variant currently ranks third among open models on the industry-standard Arena AI Text leaderboard.

The Gemma 4 models come in four flavors: Effective 2B, Effective 4B, a 26B Mixture of Experts model and a 31B Dense model. The smaller “Effective” models are designed for edge use cases on lightweight hardware such as Android smartphones or Raspberry Pi computers, the researchers said. Meanwhile, the 26B MoE model has a clever trick up its sleeve: It activates only 3.8 billion parameters during inference, allowing it to perform at high speed without sacrificing the deep knowledge base of larger models.

Farabet and Lacombe explained that each of the Gemma 4 models is better suited to running AI agents. Whereas earlier Gemma iterations forced developers to tweak their designs so they could interact with other software tools, the Gemma 4 models natively support function calling and structured JavaScript Object Notation outputs. That means developers can use them to build autonomous agents that interact with third-party tools and execute multi-step plans.
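In practice, function calling means the model emits a structured JSON payload naming a tool and its arguments, which the developer's code then dispatches. The sketch below illustrates that loop in Python; the tool schema, the model's response string and the dispatch logic are illustrative assumptions rather than Google's documented Gemma 4 API.

```python
import json

# Hypothetical tool schema a developer might expose to the model.
# The shape follows common function-calling conventions, not a
# confirmed Gemma 4 specification.
weather_tool = {
    "name": "get_weather",
    "description": "Look up the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def get_weather(city: str) -> str:
    # Stand-in implementation; a real agent would call a weather API here.
    return f"Sunny in {city}"

# A structured JSON function call of the kind the article describes.
# In a real agent this string would come from the model's response.
model_output = '{"name": "get_weather", "arguments": {"city": "Tokyo"}}'

call = json.loads(model_output)
assert call["name"] == weather_tool["name"]

# Dispatch the call and capture the result to feed back to the model.
result = get_weather(**call["arguments"])
print(result)
```

Because the output is machine-parseable JSON rather than free-form text, the agent loop needs no brittle string matching to figure out which tool the model wants to invoke.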

All four models can process images and videos, and the smaller E2B and E4B variants go further with support for native audio inputs, enabling real-time speech understanding directly on-device. Google has also increased the models’ context windows, to 128K tokens for the two smallest models and 256K tokens for the larger two. This means developers will be able to feed an entire codebase or a massive set of documents to the models in a single prompt.

Each of the models is being made available under a permissive Apache 2.0 license, which removes many of the commercial restrictions placed on other AI models, making them a great choice for developers building enterprise applications, Google said. They can be accessed directly through Google Cloud, and they’re also available along with their open weights on Hugging Face, Kaggle and Ollama.

The release underscores Google’s ambitions to dominate the “local AI” industry. Because even the larger Gemma 4 models are small enough to run on a single graphics processing unit, they’re suitable for edge use cases and applications where low latency and digital sovereignty are high priorities, said Holger Mueller, an analyst with Constellation Research.

“Google is building its lead in AI, not only by pushing Gemini, but also open models with the Gemma 4 family,” he said. “These are important for building an ecosystem of AI developers, and will help the company to tap into functional and vertical use cases on different device form factors. Google set a high bar with its previous Gemma 3 release, and so there’s a lot of expectation with this release.”

Image: Google
