UPDATED 14:06 EDT / NOVEMBER 22 2023

Inflection AI debuts new flagship Inflection-2 LLM trained on 5,000 H100 chips

Inflection AI Inc. today debuted its new flagship large language model, Inflection-2, which it claims can outperform most major rivals with the exception of OpenAI’s GPT-4.

Inflection AI was founded in March 2022 by Mustafa Suleyman, a co-founder of Google LLC’s DeepMind artificial intelligence research group, and LinkedIn co-founder Reid Hoffman. The company closed a $225 million investment two months after launching. In June 2023, it raised an additional $1.3 billion from Microsoft Corp., Nvidia Corp. and other high-profile investors.

A week before its latest funding round, Inflection AI debuted its first LLM. Dubbed Inflection-1, the model can write and rewrite text, summarize documents, generate code and perform related tasks. Inflection-2, the new LLM the company detailed today, is the successor to Inflection-1.

The new model is significantly more capable than its predecessor. According to Inflection AI, it has access to an upgraded knowledge base that allows it to answer user queries more accurately. Inflection-2 also boasts “dramatically improved reasoning,” which makes the model better at tasks such as code generation, and it can adjust the style of the text it generates in a more fine-grained manner.

Inflection-2 likewise fares better in comparisons with competing companies’ LLMs.

Ahead of today’s launch announcement, Inflection ran a series of benchmark tests to assess how its new flagship model matches up against GPT-4, Google LLC’s PaLM-2, Claude-2, Llama-2 and Grok. According to the company, the evaluation determined that Inflection-2 is the most capable LLM in the world behind GPT-4 and the best in its “compute class.” OpenAI’s model is believed to be larger than Inflection-2, which means it likely required more compute capacity to train.

In a small subset of the benchmark tests, Google’s flagship PaLM-2 language model delivered better results. However, Inflection-2 outperformed a version of PaLM-2 optimized for code generation in a benchmark used to compare AI systems’ programming skills. Inflection-2 achieved that feat even though “code and mathematical reasoning were not an explicit focus in the training” of the model, according to the company.

Inflection AI trained the model using 5,000 of Nvidia’s H100 graphics processing units. The H100, which retails for upwards of $25,000, packs 80 billion transistors that allow it to run language models up to 30 times faster than the chipmaker’s previous flagship GPU.

Inflection AI is also using the H100 for inference, or the task of running Inflection-2 in production to process user requests. The company said the model performs inference faster and more cost-efficiently than its predecessor even though it’s “multiple times larger.”

Inflection AI has a total of 22,000 graphics cards at its disposal, or more than four times the number of chips it used to train Inflection-2. The company detailed today that it plans to use the hardware to build more advanced, larger LLMs. Inflection reportedly hopes to develop a model with 100 times the “scale” of Inflection-2 within one year. 
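For a rough sense of scale, the hardware figures quoted in this article can be checked with a few lines of Python. This is a back-of-the-envelope sketch only: the $25,000 unit price is the retail floor cited above, not what Inflection AI actually paid, and the function names are illustrative.

```python
# Figures quoted in the article.
TRAINING_GPUS = 5_000        # H100s used to train Inflection-2
TOTAL_GPUS = 22_000          # graphics cards Inflection AI has at its disposal
H100_MIN_PRICE_USD = 25_000  # "upwards of $25,000", so treat it as a lower bound


def cluster_ratio(total: int, used: int) -> float:
    """How many times larger the full fleet is than the training cluster."""
    return total / used


def list_price_floor(gpus: int, unit_price: int) -> int:
    """Lower bound on retail hardware value (assumes list price, no discounts)."""
    return gpus * unit_price


ratio = cluster_ratio(TOTAL_GPUS, TRAINING_GPUS)
training_floor = list_price_floor(TRAINING_GPUS, H100_MIN_PRICE_USD)

print(f"Fleet is {ratio:.1f}x the training cluster")
print(f"Training cluster retail value: at least ${training_floor:,}")
```

By this estimate, the full fleet is 4.4 times the size of the training cluster, and the 5,000 training chips alone would retail for at least $125 million.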

Image: Inflection AI
