UPDATED 12:20 EDT / MAY 05 2023

AI

Google and OpenAI struggling to keep up with open-source AI, senior engineer warns

Google LLC and ChatGPT developer OpenAI LP face increasing competition from open-source developers in the field of generative artificial intelligence, competition that threatens to overtake them, a senior Google engineer warned in a leaked document.

The engineer, identified as Luke Sernau by Bloomberg, posted the document, titled “We Have No Moat, and Neither Does OpenAI,” to Google’s internal servers in early April, where it was quickly picked up and shared by staff. It was then published by the consulting firm SemiAnalysis on Thursday.

In the document, Sernau noted that Google and OpenAI have been openly competing against one another for control of the generative AI market with large language models such as OpenAI’s GPT-4 and Google’s Bard. But there’s a third contender that is “quietly eating our lunch” – the open-source community.

Generative AI is a type of artificial intelligence that can generate new content, such as text, images and other media, from prompts given by users in plain language. The technology has become explosively popular since OpenAI’s ChatGPT captured the public’s imagination, drawing millions of users and generating a steady stream of headlines.

“While our models still hold a slight edge in terms of quality, the gap is closing astonishingly quickly,” Sernau said. “Open-source models are faster, more customizable, more private, and pound-for-pound more capable.”

In his analysis, Sernau said that although Google and OpenAI dominate the market with extremely large language models that deliver high accuracy – some weighing in at more than 540 billion parameters – those models are expensive and not very portable. They are difficult to run on smaller systems, require massive computing infrastructure and take months to train. By contrast, the open-source movement is producing smaller, more agile models that can be built in weeks rather than months, on smaller datasets and at far lower cost.

His contention is that Google doesn’t have any “secret sauce” that sets it apart from the open-source community, which is rapidly iterating on smaller models capable of doing much the same things as the giant ones. He also said people are unwilling to pay exorbitant prices for restrictive models when free, open-source models of comparable quality are available.

Examples of such open-source models have rapidly arisen and iterated since Meta Platforms Inc.’s LLaMA, or Large Language Model Meta AI, was leaked to the public in March. LLaMA is much smaller than other leading LLMs – the variant at the center of much of this work has 13 billion parameters – and it became a foundational model for a series of iterations the open-source community quickly created, including Alpaca-13B and Vicuna-13B.

With the leak of LLaMA, Sernau said, almost overnight the “barrier to entry for training and experimentation has dropped from the total output of a major research organization to one person, an evening, and a beefy laptop.”
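Much of that laptop-scale experimentation relies on parameter-efficient fine-tuning techniques such as low-rank adaptation, or LoRA, which trains small adapter matrices rather than a model’s full weights. The sketch below shows roughly what that workflow looks like using the Hugging Face transformers and peft libraries; the base checkpoint, dataset and hyperparameters are illustrative placeholders rather than details from the leaked memo, and the script assumes a machine with a capable GPU.

```python
# Minimal LoRA fine-tuning sketch with Hugging Face transformers + peft.
# The model name, dataset and hyperparameters are illustrative placeholders.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE_MODEL = "openlm-research/open_llama_3b_v2"  # placeholder open checkpoint

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.float16)

# LoRA trains small low-rank adapter matrices instead of the full weights,
# which is what makes "an evening and a beefy laptop" plausible.
lora_config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections only
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total weights

# A small instruction-style dataset; any text dataset works for this sketch.
dataset = load_dataset("tatsu-lab/alpaca", split="train[:1000]")

def tokenize(example):
    text = example["instruction"] + "\n" + example["output"]
    return tokenizer(text, truncation=True, max_length=512)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="lora-out",
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        fp16=True,               # assumes a GPU is available
        logging_steps=10,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("lora-out")  # saves only the small adapter weights
```

Because only the adapter weights are trained and saved, the resulting artifact is typically a few tens of megabytes, which is part of why the community can share and iterate on these fine-tunes so quickly.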

Most pointedly, Sernau argued that Google is missing a key inflection point: data quality scales better than data size. Smaller models work better because they can be trained for specific use cases, and their greater portability and lower training and deployment costs make them easier to use and less burdensome. Moreover, many of the datasets now in circulation are free, so nobody wants to pay for them anyway.

He also argued that it would be better for Google to work with the open-source community rather than against it – an approach that has worked extremely well for the company with Chrome and Android, and one that could work here.

“Paradoxically, the one clear winner in all of this is Meta,” Sernau added. “Because the leaked model was theirs, they have effectively garnered an entire planet’s worth of free labor. Since most open source innovation is happening on top of their architecture, there is nothing stopping them from directly incorporating it into their products.”

Ideally, Sernau said, Google should cooperate with the open-source community and establish itself as a leader by publishing foundational models for smaller LLMs in order to enter the conversation.

Image: Nepool
