UPDATED 22:38 EST / APRIL 18 2023


Reddit to charge for access to its API to counter free data scraping by AI companies

As big tech races to develop the most advanced generative artificial intelligence chatbots, Reddit Inc. today announced its treasure trove of data will no longer come free.

The company said it will now start charging for access to its application programming interface, or API, which AI companies have used to train chatbots such as Microsoft Corp.’s Bing AI and OpenAI LP’s ChatGPT. Reddit has been one of the most valuable resources in this regard, with its 57 million users chatting about almost every topic under the sun since it was established in 2005. For training large language models, or LLMs, Reddit’s data is priceless.

Reddit didn’t say what it will charge third parties for access, explaining in a post only that it is introducing a new “premium access point for third parties who require additional capabilities, higher usage limits, and broader usage rights.” It added that the API will remain open for “reasonable and appropriate use cases” so developers can help improve the user experience on the platform.

The rise of generative AI chatbots shouldn’t have taken anyone by surprise, but what has happened in just a few months has been nothing short of incredible. ChatGPT alone has more than 100 million active users and more than a billion visitors to its website each month. It’s said to have had the fastest-growing user base in history, with revenue predictions going through the roof.

There have been concerns, of course, mostly related to how such models may be used maliciously, produce misinformation or, like Microsoft’s Bing chatbot, seem to go off the rails and “hallucinate.” That has already led to probes into the possible dangers of using such powerful tools.

There has been much talk about pausing generative AI development, but the chances of that happening are slim. This is one reason why Reddit is trying to make money from what has become a feeding trough for such models.

“The Reddit corpus of data is really valuable,” Steve Huffman, co-founder and chief executive of Reddit, said today in an interview with The New York Times. “But we don’t need to give all of that value to some of the largest companies in the world for free.” As the article points out, not only has Reddit not been making hay while the sun shines on AI, but such systems may one day compete with it by duplicating answers that have appeared on Reddit.

Photo: Brett Jordan/Unsplash
