UPDATED 12:19 EST / DECEMBER 18 2023

David versus Goliath reimagined: OpenAI’s approach to AI supervision

Artificial general intelligence, or AGI, has people both intrigued and fearful.

Last July, OpenAI, a leading researcher in the field, introduced the concept of superalignment via a team created to study the scientific and technical breakthroughs needed to guide and ultimately control AI systems that are much more capable than humans. OpenAI refers to this level of AI as superintelligence.

Last week, this team unveiled the first results of an effort to supervise more powerful AI with less powerful models. While promising, the effort produced mixed results and raises several more questions about the future of AI and the ability of humans to actually control such advanced machine intelligence.

In this Breaking Analysis, we share the results of OpenAI’s superalignment research and what it means for the future of AI. We further probe ongoing questions about OpenAI’s unconventional structure, which we continue to believe is misaligned with its conflicting objectives of both protecting humanity and making money. We’ll also poke at a nuanced change in OpenAI’s characterization of its relationship with Microsoft. Finally, we’ll share some data that shows the magnitude of OpenAI’s lead in the market and propose some possible solutions to the structural problem faced by the industry.

OpenAI’s superalignment team unveils its first public research

With little fanfare, OpenAI unveiled the results of new research that describes a technique to supervise more powerful AI models with a less capable large language model. The paper is called Weak-to-Strong Generalization: Eliciting Strong Capabilities with Weak Supervision.

The basic premise is that superintelligent AI will be so vastly superior to humans that traditional supervision techniques such as reinforcement learning from human feedback, or RLHF, won’t scale. This “super AI,” the thinking goes, will be so sophisticated that humans won’t be able to comprehend its output. So instead, the team set out to test whether less capable GPT-2 models can supervise more capable GPT-4 models, as a proxy for a supervision approach that could keep superintelligent systems from going rogue.

The superalignment team at OpenAI is led by Ilya Sutskever and Jan Leike. Sutskever’s name is highlighted in this graphic because much of the chatter on Twitter after this paper was released suggested that he was not cited as a contributor to the research. Perhaps his name was left off initially given the recent OpenAI board meltdown and then added later. Or perhaps the dozen or so commenters were mistaken, but that’s unlikely. At any rate he’s clearly involved.

Can David AI control a superintelligent Goliath?

This graphic has been circulated around the internet so perhaps you’ve seen it. We’ve annotated in red to add some additional color. The graphic shows that traditional machine learning models involve RLHF where the outputs of a query are presented to humans to rate them. That feedback is then pumped back into the training regimen to improve model results.

Superalignment, shown in the middle, would ostensibly involve a human trying, unsuccessfully, to supervise a far more intelligent AI, which presents a failure mode. For example, the super AI could generate millions of lines of code that mere humans wouldn’t be able to understand. The problem, of course, is that superintelligence doesn’t exist, so it can’t be tested. But as a proxy, the third scenario shown here is that a less capable model (GPT-2 in this case) was set up to supervise a more advanced GPT-4 model.

For reference, we pin AGI at the human level of intelligence, recognizing that definitions do vary depending on whom you speak with. Regardless, the team tested this concept to see if the smarter AI would learn bad habits from the less capable AI and become “dumber,” or would the results close the gap between the less capable AI’s capabilities and a known ground truth set of labels that represent correct answers?

The methodology was thoughtful. The team tested several scenarios across natural language processing, chess puzzles and reward modeling, which is a technique to score responses to a prompt as a reinforcement signal to iterate toward a desired outcome. The results were mixed, however. The team measured the degree to which the performance of a GPT-4 model supervised by GPT-2 closed the gap on known ground truth labels. They found that the more capable model supervised by the less capable AI performed 20% to 70% better than GPT-2 on the language tasks but did less well on other tests.
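To make “closing the gap” concrete, the paper describes a metric it calls performance gap recovered, or PGR: the share of the gap between the weak supervisor and the strong model’s ground-truth ceiling that the weakly supervised strong model manages to close. The sketch below is our own minimal illustration in Python. The function name and the accuracy figures are hypothetical and are not taken from the paper’s results.

```python
def performance_gap_recovered(weak: float, weak_to_strong: float, strong_ceiling: float) -> float:
    """Return the fraction of the weak-to-ceiling gap that was closed.

    weak            -- accuracy of the weak supervisor (e.g. a GPT-2-level model)
    weak_to_strong  -- accuracy of the strong model trained on the weak model's labels
    strong_ceiling  -- accuracy of the strong model trained on ground-truth labels
    """
    gap = strong_ceiling - weak
    if gap <= 0:
        raise ValueError("strong ceiling must exceed the weak supervisor's performance")
    return (weak_to_strong - weak) / gap


# Hypothetical accuracies, for illustration only (not figures from the paper).
weak_acc = 0.60
weak_to_strong_acc = 0.75
ceiling_acc = 0.90

pgr = performance_gap_recovered(weak_acc, weak_to_strong_acc, ceiling_acc)
print(f"PGR = {pgr:.2f}")
```

A PGR of 1 would mean the student fully matched what it could have achieved with perfect labels, while a PGR of 0 would mean it did no better than its weaker teacher.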

The researchers are encouraged that GPT-4 outdid its supervisor and believe this shows promising potential. But the smarter model had greater capabilities that weren’t unlocked by the teacher, calling into question the ability of a less capable AI to control a smarter model.

In thinking about this problem, one can’t help but recall the scene from the movie “Good Will Hunting”:

Is there a ‘supervision tax’ in AI safety?

Several threads on social media, and specifically on Reddit, voice frustration with GPT-4 getting “dumber.” A research paper by Stanford and UC Berkeley published this summer points out the drift in accuracy over time. Theories as to why range from architectural challenges to memory issues, with some of the most popular holding that the need for so-called guardrails has dumbed down GPT-4 over time.

Customers of ChatGPT’s for-pay service have been particularly vocal about paying for a service that is degrading in quality over time. However, many of these claims are anecdotal. It’s unclear to what extent the quality of GPT-4 is really degrading, as it’s difficult to track such a fast-moving target. Moreover, there are many examples where GPT-4 is improving, such as remembering prompts and producing fewer hallucinations.

Regardless, the point is that this controversy further underscores the many alignment challenges: between government and private industry, between for-profit and nonprofit objectives, and between AI safety and regulation on one side and innovation and progress on the other. Right now the market is like the Wild West, with lots of hype and diverging opinions.

OpenAI changes the language regarding Microsoft’s ownership

In a post last month covering the OpenAI governance failure, we showed this graphic from OpenAI’s Web site. As we discussed this past week with John Furrier on theCUBE Pod, the way in which OpenAI and Microsoft are characterizing their relationship has quietly changed.

To review briefly, the graphic shows the convoluted and, in our view, misaligned structure of OpenAI. It is controlled by a 501(c)(3) nonprofit public charity with a mission to do good AI for humanity. That board controls an LLC, which provides oversight and also controls a holding company owned by employees and investors such as Khosla Ventures, Sequoia and others. This holding company owns a majority of another LLC, which is a capped-profit company.

Previously on OpenAI’s Web site, Microsoft was cited as a “Minority owner.” That language has now changed to reflect Microsoft’s “Minority economic interest,” which we believe is a 49% stake in the capped profits of the LLC. Quite obviously, this change was precipitated by the U.K. and U.S. governments looking into the relationship between Microsoft and OpenAI, which is fraught with misalignment, as we saw with the firing and rehiring of Chief Executive Sam Altman and the subsequent board observer seat concession that OpenAI made to Microsoft.

The partial answer, in our view, is to create two separate boards and governance structures: one to govern the nonprofit and a separate board to manage the for-profit business of OpenAI. But that alone won’t solve the superalignment problem, assuming superhuman intelligence is a given, which is not necessarily the case.

The AI market is bifurcated

To underscore the wide schisms in the AI marketplace, let’s take a look at this Enterprise Technology Research data from the Emerging Technology Survey, or ETS, which measures market sentiment and mindshare among privately held companies. Here we’ve isolated the ML/AI sector, which comprises traditional AI plus LLM players, as cited in the annotations. We’ve also added the most recent market valuation data for each of the firms. The chart shows Net Sentiment on the vertical axis, which is a measure of intent to engage, and mindshare on the horizontal axis, which measures awareness of the company.

The first point is that OpenAI’s position is literally off the charts in both dimensions. Its lead with respect to these metrics is overwhelming, as is its $86 billion valuation. On paper it is more valuable than Snowflake Inc. (not shown here) and Databricks Inc., which has a reported $43 billion valuation. Both Snowflake and Databricks are extremely successful and established firms with thousands of customers.

Hugging Face is high up on the vertical axis – think of it as the GitHub for AI model engineers. As of this summer, its valuation was at $5 billion. Anthropic PBC is prominent and, with its investments from Amazon Web Services Inc. and Google LLC, it touts a recent $20 billion valuation, while Cohere this summer reportedly had a $3 billion valuation.

Jasper AI is a popular marketing platform that is seeing downward pressure on its valuation because ChatGPT is disruptive to its value proposition at a much lower cost. DataRobot Inc. at the peak of the tech bubble had a $6 billion valuation, but after some controversies around selling insider shares, its value has declined. You can also see here H2O.ai Inc. and Snorkel AI Inc. with unicorn-like valuations, and Character.ai, which is a chatbot generative AI platform that was recently reported to have a $5 billion valuation.

So you can see the gap between OpenAI and the pack. You can also clearly see that emergent competitors to OpenAI are commanding higher valuations than the traditional machine learning players. Our view is that AI generally and generative AI specifically are a tide that will lift all boats. But some boats will be able to ride the wave more successfully than others and so far, despite OpenAI’s governance challenges, OpenAI and Microsoft have been in the best position.

Key questions on superintelligence

There are many questions around AGI and now super AI as this new parlance of superintelligence and superalignment emerges. First, is this vision aspirational or is it truly technically feasible? Experts such as John Roese, chief technology officer of Dell, have said all the pieces are there for AGI to become a reality; there’s just not enough economically feasible compute today and the quality of data is still lacking. But from a technological standpoint, he agrees with OpenAI that it’s coming.

If that’s the case, how will the objectives of superalignment – aka control – affect innovation and what are the implications of the industry leader having a governance structure that is controlled by a nonprofit board? Can its objectives truly win out over the profit motives of an entire industry? We tend to doubt it and the reinstatement of Altman as CEO underscores who is going to win that battle. Altman was the big winner in all that drama — not Microsoft.

So to us, the structure of OpenAI has to change. The company should be split in two, with separate boards for the nonprofit and the commercial arm. And if the mission of OpenAI truly is to develop and direct artificial intelligence in ways that benefit humanity as a whole, then why not split the company in two and open up the governance structure of the nonprofit to other players, including OpenAI competitors and governments?

On the issue of superintelligence, beyond AGI, what happens when AI becomes autodidactic and becomes a true self-learning system? Can that really be controlled by less capable AI? The conclusion of OpenAI researchers is that humans clearly won’t be able to control it.

But before you get too scared, there are those skeptics who feel that we are still far away from AGI, let alone superintelligence. Hence point No. 5 here: Is this a case where Zeno’s paradox applies? Zeno’s paradox, you may remember from high school math classes, states that any moving object must reach halfway on a course before it reaches the end; and because there are an infinite number of halfway points, a moving object never reaches the end in a finite time.

Is superintelligence a fantasy?

This graphic sums up the opinions of the skeptics. It shows a super-complicated equation with a step in the math that says, “Then a Miracle Occurs.” It’s kind of where we are with AGI and superintelligence… like waiting for Godot.

We don’t often use the phrase “time will tell” in these segments. As analysts, we like to be more precise and opinionated with data to back those opinions. But in this case we simply don’t know.

But let’s leave you with a thought experiment that Arun Subramaniyan put forth at Supercloud 4 this past October. We asked him for his thoughts on AGI, and the same applies to superintelligence. His premise: Assume for a minute that AGI is here. Wouldn’t the AI know that we as humans would be wary of the AI and try to control it? So wouldn’t the smart AI act in such a way as to hide its true intentions? Ilya Sutskever has stated this is a concern.

The point being, if super AI is so much smarter than humans, then it will be able to outsmart us easily and control us rather than us controlling it. And that is the best case for creating structures that allow those concerned about AI safety to pursue a mission independent of a profit-driven agenda. Because a profit motive will almost always win over an agenda that sets out to simply do the right thing.

Keep in touch

Thanks to Alex Myerson and Ken Shifman on production, podcasts and media workflows for Breaking Analysis. Special thanks to Kristen Martin and Cheryl Knight, who help us keep our community informed and get the word out, and to Rob Hof, our editor in chief at SiliconANGLE.

Remember we publish each week on Wikibon and SiliconANGLE. These episodes are all available as podcasts wherever you listen.

Email david.vellante@siliconangle.com, DM @dvellante on Twitter and comment on our LinkedIn posts.

Also, check out this ETR Tutorial we created, which explains the spending methodology in more detail. Note: ETR is a separate company from Wikibon and SiliconANGLE. If you would like to cite or republish any of the company’s data, or inquire about its services, please contact ETR at legal@etr.ai.

All statements made regarding companies or securities are strictly beliefs, points of view and opinions held by SiliconANGLE Media, Enterprise Technology Research, other guests on theCUBE and guest writers. Such statements are not recommendations by these individuals to buy, sell or hold any security. The content presented does not constitute investment advice and should not be used as the basis for any investment decision. You and only you are responsible for your investment decisions.

Disclosure: Many of the companies cited in Breaking Analysis are sponsors of theCUBE and/or clients of Wikibon. None of these firms or other companies have any editorial control over or advanced viewing of what’s published in Breaking Analysis.

Image: InvisibleWizard/Adobe Stock
