UPDATED 14:06 EDT / NOVEMBER 22 2023

Inflection AI debuts new flagship Inflection-2 LLM trained on 5,000 H100 chips

Inflection AI Inc. today debuted its new flagship large language model, Inflection-2, which it claims can outperform most major rivals with the exception of OpenAI’s GPT-4.

Inflection AI was founded in March 2022 by Mustafa Suleyman, co-founder of Google LLC’s DeepMind artificial intelligence research group, and LinkedIn co-founder Reid Hoffman. The company closed a $225 million investment two months after launching. In June of this year, it raised an additional $1.3 billion from Microsoft Corp., Nvidia Corp. and other high-profile investors.

A week before its latest funding round, Inflection AI debuted its first LLM. Dubbed Inflection-1, the model can write and rewrite text, summarize documents, generate code and perform related tasks. Inflection-2, the new LLM the company detailed today, is the successor to Inflection-1.

The new model is significantly more capable than its predecessor. According to Inflection AI, it has access to an upgraded knowledge base that allows it to answer user queries more accurately. Inflection-2 also boasts “dramatically improved reasoning,” which makes the model better at tasks such as code generation, and it can adjust the style of the text it generates in a more fine-grained manner.

Inflection-2 likewise fares better in comparisons with competing companies’ LLMs.

Ahead of today’s launch announcement, Inflection ran a series of benchmark tests to assess how its new flagship model matches up against GPT-4, Google LLC’s PaLM-2, Claude-2, Llama-2 and Grok. According to the company, the evaluation determined that Inflection-2 is the most capable LLM in the world behind GPT-4 and the best in its “compute class.” OpenAI’s model is believed to be larger than Inflection-2, which means it likely required more compute capacity to train.

In a small subset of the benchmark tests, Google’s flagship PaLM-2 language model delivered better results. However, Inflection-2 outperformed a version of PaLM-2 optimized for code generation in a benchmark used to compare AI systems’ programming skills. Inflection-2 achieved that feat even though “code and mathematical reasoning were not an explicit focus in the training” of the model, according to the company.

Inflection AI trained the model using 5,000 of Nvidia’s H100 graphics processing units. The H100, which retails for upwards of $25,000, packs 80 billion transistors that allow it to run language models up to 30 times faster than the chipmaker’s previous flagship GPU.

Inflection AI is also using the H100 for inference, or the task of running Inflection-2 in production to process user requests. The company detailed that the model performs inference faster and more cost-efficiently than its predecessor even though it is “multiple times larger.”

Inflection AI has a total of 22,000 graphics cards at its disposal, or more than four times the number of chips it used to train Inflection-2. The company detailed today that it plans to use the hardware to build more advanced, larger LLMs. Inflection reportedly hopes to develop a model with 100 times the “scale” of Inflection-2 within one year.
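For a sense of scale, the hardware figures quoted above can be checked with some quick arithmetic. The sketch below is a rough illustration only: it treats the "upwards of $25,000" retail price as a per-chip floor, which is an assumption, and it ignores networking, power, volume discounts and cloud rental arrangements. The variable names are illustrative and the results are not figures reported by Inflection AI.

```python
# Figures quoted in the article
TRAINING_GPUS = 5_000    # H100s used to train Inflection-2
TOTAL_GPUS = 22_000      # graphics cards Inflection AI has at its disposal

# Assumption: the "upwards of $25,000" retail price is used as a lower bound
H100_PRICE_FLOOR_USD = 25_000

# How much larger the full fleet is than the training cluster
fleet_ratio = TOTAL_GPUS / TRAINING_GPUS

# Rough hardware-only price floors at retail (not actual spending)
training_cluster_floor_usd = TRAINING_GPUS * H100_PRICE_FLOOR_USD
full_fleet_floor_usd = TOTAL_GPUS * H100_PRICE_FLOOR_USD

print(f"Fleet is {fleet_ratio:.1f}x the training cluster")
print(f"Training cluster at retail floor: ${training_cluster_floor_usd:,}")
print(f"Full fleet at retail floor: ${full_fleet_floor_usd:,}")
```

The ratio works out to 4.4, consistent with the "more than four times" description, and at the assumed price floor the 5,000-chip training cluster alone would represent at least $125 million in GPUs.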

Image: Inflection AI
