UPDATED 15:54 EDT / NOVEMBER 01 2024


Report: Chinese researchers used Llama 13B to build chatbot optimized for military use

Researchers in China have reportedly used Meta Platforms Inc.’s Llama 13B artificial intelligence model to develop a chatbot optimized for military use.

Reuters detailed the project today, citing academic papers and analysts.

Llama is a family of open-source large language models that Meta first released in February 2023. Developers can use the algorithms at no charge in both research and commercial projects. Under Meta’s licensing terms, the Llama series may not be used for military applications.

According to Reuters, Llama was mentioned in a June AI paper authored by six researchers from three Chinese institutions. Two of those institutions operate under the Academy of Military Science, the People’s Liberation Army’s leading research body. The paper details a Llama-powered chatbot called ChatBIT that is “optimised for dialogue and question-answering tasks in the military field.”

The chatbot is reportedly based on Llama 13B, a model that rolled out as part of the LLM family’s initial release in February 2023. The model is based on a modified version of the industry-standard Transformer neural network architecture. Meta’s engineers added performance optimizations to the architecture and made other enhancements that improved its ability to understand lengthy prompts.

The creators of the ChatBIT chatbot reportedly modified Llama 13B by adding custom parameters. Those are configuration settings that manage how a neural network processes data. Additionally, the researchers gave the chatbot access to 100,000 military dialogue records.

Another paper detailed in today’s report was published by two researchers from an aviation company that has been linked to the People’s Liberation Army. The paper discussed using Llama 2 for “the training of airborne electronic warfare interference strategies.”

Llama 2 is an iteration of the LLM series that Meta released in July 2023, a few months after the original version. It was trained on 40% more data than the first-generation Llama models and can process prompts with twice as many tokens. A token is a unit of data that corresponds to a few characters.

Llama 2 implements an AI technique called grouped-query attention, or GQA, that was not supported by the earlier models. The technique reduces the hardware requirements of an LLM’s attention mechanism, a component used to interpret prompts. By lowering AI models’ infrastructure usage, GQA helps speed up inference and cut costs.
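To give a rough sense of where those savings come from: in GQA, several query heads share one set of key and value heads, so the model has to cache fewer keys and values while it generates text. The sketch below estimates that cache for a single prompt. The function name and the model shape (80 layers, 64 query heads, 8 key-value heads, 128 dimensions per head, 16-bit values) are illustrative assumptions that do not come from the Reuters report or the papers it cites.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Estimate the memory needed to cache keys and values for one prompt."""
    # The factor of 2 covers one key tensor and one value tensor per layer.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


# Illustrative model shape (assumed for this sketch, not taken from the article).
LAYERS, QUERY_HEADS, KV_HEADS, HEAD_DIM = 80, 64, 8, 128
SEQ_LEN = 4096  # prompt length in tokens

# Standard multi-head attention keeps one key-value head per query head.
mha = kv_cache_bytes(LAYERS, QUERY_HEADS, HEAD_DIM, SEQ_LEN)
# Grouped-query attention shares each key-value head across a group of query heads.
gqa = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN)

print(f"MHA: {mha / 2**30:.2f} GiB, GQA: {gqa / 2**30:.2f} GiB, ratio: {mha // gqa}x")
# prints "MHA: 10.00 GiB, GQA: 1.25 GiB, ratio: 8x"
```

Under these assumptions, the cache shrinks in proportion to the number of query heads that share each key-value head, which is eight in this example.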

Meta has introduced several new iterations of its LLM series since Llama 2 debuted last year. The most capable model released by the company to date, Llama 3.1 405B, made its debut this past July. It’s better at reasoning tasks and can process prompts with more than 60 times the amount of data supported by the first-generation Llama algorithms.
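The prompt-length comparisons can be checked with simple arithmetic. The sketch below assumes the commonly cited context windows of 2,048 tokens for the first-generation models, 4,096 for Llama 2 and 131,072 (128K) for Llama 3.1. Those figures, along with the dictionary and function names, are assumptions added for illustration and are not stated in the report.

```python
# Maximum prompt length in tokens (assumed values, not taken from the article).
CONTEXT_TOKENS = {
    "Llama 1": 2048,
    "Llama 2": 4096,
    "Llama 3.1": 131072,
}


def context_ratio(newer, older):
    """Return how many times longer a prompt the newer model can accept."""
    return CONTEXT_TOKENS[newer] / CONTEXT_TOKENS[older]


print(context_ratio("Llama 2", "Llama 1"))    # prints 2.0
print(context_ratio("Llama 3.1", "Llama 1"))  # prints 64.0
```

With these figures, Llama 2 accepts prompts twice as long as the original models and Llama 3.1 accepts prompts 64 times as long, in line with the "more than 60 times" figure above.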

Meta developed Llama 3.1 405B using 16,000 H100 graphics processing units. Earlier this week, Chief Executive Mark Zuckerberg revealed that the next iteration of the LLM series is being trained on an even larger AI cluster with more than 100,000 H100s. He detailed that work on Llama 4 is already “well underway,” with the first models from the upcoming series set to roll out early next year.

Image: Meta
