UPDATED 22:47 EDT / JULY 31 2024


Google’s lightweight Gemma LLMs get smaller, but they perform even better than before

Google LLC is advancing its efforts in open-source artificial intelligence with three new additions to its Gemma 2 family of large language models, which it said are notably “smaller, safer and more transparent” than many of their peers.

The company released its first Gemma models back in February. They’re different from its flagship Gemini models, which are used in the company’s own products and services and are generally considered to be more advanced. The main differences are that the Gemma models are much smaller and fully open-source, meaning they’re free to use, whereas the Gemini family models are bigger and closed-source, so developers must pay for access.

The Gemma models are based on the same research as the Gemini LLMs and represent Google’s effort to foster goodwill in the AI community, much as Meta Platforms Inc. is doing with its Llama models.

Of the three new models announced today, the most important is Gemma 2 2B, a lightweight LLM for generating and analyzing text. According to Google, it’s designed to run on local devices such as laptops and smartphones, and it’s licensed for both research and commercial use.
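For developers who want to try the model on their own hardware, a minimal sketch using the Hugging Face Transformers library might look like the following. The model identifier google/gemma-2-2b-it and the generation settings are illustrative assumptions rather than Google’s official instructions, so check the model card before relying on them.

```python
# Minimal sketch: running Gemma 2 2B locally with Hugging Face Transformers.
# Assumes the instruction-tuned checkpoint "google/gemma-2-2b-it" and that the
# Gemma license has been accepted on Hugging Face; settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-2b-it"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~2.6B parameters fits on a recent laptop GPU or CPU
    device_map="auto",
)

prompt = "Explain in two sentences why small language models are useful on-device."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```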

Although Gemma 2 2B contains just 2.6 billion parameters, Google said it demonstrates performance that’s on a par with, and sometimes even superior to, much larger counterparts, including OpenAI’s GPT-3.5 and Mistral AI’s Mixtral 8x7B.

To back up its claims, Google published the results of independent testing by LMSYS, an AI research organization. According to LMSYS, Gemma 2 2B achieved a score of 1,126 in its Chatbot Arena evaluation, surpassing Mixtral-8x7B, which scored 1,114, and GPT-3.5-Turbo-0314, which scored 1,106. The results are impressive, as those models have almost 10 times more parameters than the newest Gemma release.

Google said Gemma 2 2B’s capabilities extend beyond size efficiency. It scored 56.1 on the Massive Multitask Language Understanding benchmark and 36.6 on the Mostly Basic Python Programming test, improving on the scores of earlier Gemma models.

The results challenge the assumption in AI that a larger parameter count equates to better performance. Instead, Gemma 2 2B shows that more sophisticated training techniques, better architectures and higher-quality training data can compensate for a smaller number of parameters.

Google said its work could help shift AI companies’ focus away from building ever-larger models and toward refining existing ones so they perform better.

In addition, Google said Gemma 2 2B illustrates the importance of model compression and distillation techniques. The company explained that Gemma 2 2B was developed by distilling the knowledge of much larger models into a smaller one. It hopes that progress in this area will enable the development of more accessible AI with reduced computational power requirements.
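Google hasn’t published every detail of its training recipe, but the general idea behind distillation can be sketched in a few lines of PyTorch: a small student model is trained to match the softened output distribution of a much larger teacher, rather than only the raw training labels. The function below is a generic textbook formulation, not Google’s actual implementation.

```python
# Generic knowledge-distillation loss: the student is nudged toward the
# teacher's softened next-token distribution. Illustrative only; this is not
# Google's training code.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL divergence between softened teacher and student distributions."""
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2
```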

The other, more specialized models announced by Google today include ShieldGemma, a collection of safety classifiers designed to catch toxic content such as hate speech, harassment and sexually explicit material. ShieldGemma is built on top of the original Gemma 2 model and can be used by developers to filter malicious prompts that try to coax a model into responding in an undesirable way, as well as to filter the LLM’s actual responses.
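In practice, a developer would run a ShieldGemma checkpoint over a user prompt (or a model response) and act on its verdict before anything reaches the end user. The sketch below shows one plausible way to do that with the Hugging Face checkpoint google/shieldgemma-2b; the policy wording and Yes/No scoring are simplified assumptions, and the model card documents the exact format Google recommends.

```python
# Simplified sketch: screening a user prompt with a ShieldGemma safety
# classifier before passing it to the main model. The policy text and Yes/No
# scoring are illustrative; follow the model card for the real template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

clf_id = "google/shieldgemma-2b"
tokenizer = AutoTokenizer.from_pretrained(clf_id)
classifier = AutoModelForCausalLM.from_pretrained(clf_id, torch_dtype=torch.bfloat16)

user_prompt = "How do I pick a strong password?"
policy = "The request must not contain hate speech, harassment or sexually explicit content."
check = (
    "You are a policy expert. Does the following request violate the policy?\n"
    f"Policy: {policy}\nRequest: {user_prompt}\nAnswer Yes or No."
)

inputs = tokenizer(check, return_tensors="pt")
with torch.no_grad():
    next_token_logits = classifier(**inputs).logits[0, -1]

# Compare the probabilities the classifier assigns to "Yes" vs. "No".
yes_id = tokenizer.convert_tokens_to_ids("Yes")
no_id = tokenizer.convert_tokens_to_ids("No")
p_violation = torch.softmax(next_token_logits[[yes_id, no_id]], dim=0)[0].item()
print(f"Estimated probability of a policy violation: {p_violation:.2f}")
```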

Finally, Gemma Scope is an attempt to bring greater transparency to the Gemma 2 models. It does this by zooming in on specific parts of the Gemma 2 models, helping developers to interpret their inner workings.

“Gemma Scope is made up of specialized neural networks that help us unpack the dense, complex information processed by Gemma 2, expanding it into a form that’s easier to analyze and understand,” Google wrote in a blog post. “By studying these expanded views, researchers can gain valuable insights into how Gemma 2 identifies patterns, processes information and ultimately makes predictions.”
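These specialized networks are sparse autoencoders: small models that expand an LLM’s dense internal activations into a much wider, mostly zero vector whose individual features are easier to inspect. The toy example below sketches that idea; the dimensions and the simple ReLU sparsity mechanism are illustrative assumptions, not Gemma Scope’s actual architecture.

```python
# Toy sketch of a sparse autoencoder, the kind of network Gemma Scope uses to
# "expand" a model's dense activations into a wider, sparse feature vector and
# then reconstruct them. Sizes and the ReLU sparsity here are illustrative.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int = 2304, d_features: int = 16384):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, activation: torch.Tensor):
        features = torch.relu(self.encoder(activation))  # sparse "expanded view"
        reconstruction = self.decoder(features)          # should approximate the input
        return features, reconstruction

sae = SparseAutoencoder()
hidden_state = torch.randn(1, 2304)  # an activation captured from inside the LLM
features, recon = sae(hidden_state)
print(features.shape)                # torch.Size([1, 16384]) interpretable features
```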

Gemma 2 2B, ShieldGemma and Gemma Scope are all available to download now from various sources, including Hugging Face.

Featured image: SiliconANGLE/Microsoft Designer
