UPDATED 15:21 EST / AUGUST 14 2024

Elon Musk’s xAI debuts new Grok-2 and Grok-2 mini language models

Elon Musk’s xAI Corp. has debuted two new language models, Grok-2 and Grok-2 mini, that it claims can perform some tasks with similar accuracy to OpenAI’s GPT-4o.

The models rolled out to X on Tuesday. Later this month, they will also become available to developers through an application programming interface. The API will make it possible to integrate Grok-2 and Grok-2 mini into third-party services.

The debut didn’t go swimmingly, as an apparent lack of guardrails allowed some distasteful images to be produced.

Musk launched xAI early last year to develop large language models. The company released its first LLM, Grok-1, later in 2023 and subsequently raised $6 billion from investors to finance the development of additional models. Grok-2 and Grok-2 mini, the latest fruits of the engineering effort, are rolling out about four months after the previous addition to xAI’s LLM lineup.

Grok-2, the more advanced of the two models, can generate text, troubleshoot code and perform related tasks. It’s also capable of analyzing user-provided images. Grok-2 mini is a scaled-down version of the LLM that trades off some output quality for faster response times and lower inference costs.

In an internal test, xAI compared Grok-2 against several competing models to assess the quality of its output. The evaluation comprised eight benchmark datasets that researchers commonly use to measure LLMs’ accuracy. According to xAI, Grok-2 achieved “performance levels competitive” with the most advanced LLMs on the market.

One of the benchmark datasets that xAI used, GPQA, comprises 448 multiple-choice questions spanning several scientific fields. LLMs that complete the test receive a score reflective of how many questions they answered correctly. Grok-2 achieved a score of 56, which put it ahead of both GPT-4o and Meta’s newly released Llama 3 405B model.
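To make the scoring arithmetic concrete, here is a minimal sketch. It assumes the reported score of 56 is the percentage of questions answered correctly, which the description above suggests but does not state outright. The function name and the derived question count are illustrative and are not figures from xAI.

```python
def accuracy_score(num_correct: int, num_questions: int) -> float:
    """Return the percentage of multiple-choice questions answered correctly."""
    if num_questions <= 0:
        raise ValueError("num_questions must be positive")
    if not 0 <= num_correct <= num_questions:
        raise ValueError("num_correct must be between 0 and num_questions")
    return 100.0 * num_correct / num_questions


# Number of GPQA questions cited in the article.
GPQA_QUESTIONS = 448

# Assumption: the reported score of 56 is a percentage of correct answers.
REPORTED_SCORE = 56

# Hypothetical back-of-the-envelope estimate of how many answers were correct.
approx_correct = round(REPORTED_SCORE / 100 * GPQA_QUESTIONS)

print(f"Roughly {approx_correct} of {GPQA_QUESTIONS} questions answered correctly")
print(f"Score check: {accuracy_score(approx_correct, GPQA_QUESTIONS):.1f}")
```

Under that assumption, a score of 56 corresponds to roughly 251 correct answers out of 448.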

The only LLM that outperformed Grok-2 in the GPQA test was Anthropic PBC’s Claude 3.5 Sonnet. The latter model achieved higher scores across most of the benchmark datasets that xAI used in the evaluation, with the exception of two that comprised math questions. Grok-2 mini, in turn, achieved lower scores than the other LLMs across nearly all the benchmark datasets.

Both of xAI’s new models became available in X on Tuesday for users with paid Premium and Premium+ subscriptions. The LLMs are accessible through a ChatGPT-like chatbot interface.

X’s implementation of Grok-2 is integrated with a third-party AI model called FLUX.1. The latter model, which was developed by a startup called Black Forest Labs Inc., allows users to generate images with natural language prompts. The Verge reported that Grok-2’s image generation features currently appear to have few guardrails against harmful output.

Later this month, xAI plans to make Grok-2 and Grok-2 mini available through an API. The offering will enable developers to integrate the models into their own applications. The API includes cybersecurity controls, a traffic analytics tool and the option to deploy the models in data centers near end-users to reduce latency.

Image: xAI
