UPDATED 22:04 EDT / MARCH 26 2025

New York judge allows New York Times copyright lawsuit against OpenAI to proceed

A federal judge in New York today rejected OpenAI’s bid to dismiss a copyright lawsuit brought by The New York Times accusing the company of scraping data from its content to train its products.

U.S. District Judge Sidney Stein narrowed the scope of the case, but the core of the copyright infringement claim will remain. No opinion was given, but Stein said one would come “expeditiously.” The decision will delight the Times, The New York Daily News and the Center for Investigative Reporting, which have joined forces on the lawsuit.

“We appreciate Judge Stein’s careful consideration of these issues,” Times’ attorney Ian Crosby said in a statement. “As the order indicates, all of our copyright claims will continue against Microsoft and OpenAI for their widespread theft of millions of The Times’s works, and we look forward to continuing to pursue them.”

The Times’ lawyers claim the vast trove of content in the newspaper’s database has been one of the major sources of data used to train products such as ChatGPT. The lawsuit states that OpenAI has been “free-riding” on the Times’ “significant efforts and investment of human capital to gather this information,” without paying compensation.

On the other side of the argument, OpenAI’s attorneys state that the mass scraping of data amounts to “fair use,” a legal doctrine under which copyrighted material can, in certain circumstances, be used without seeking permission. Generative AI companies have built fortunes on the back of this doctrine, although what counts as fair is still up for debate.

If it turns out that OpenAI’s scraping doesn’t meet the criteria for fair use, it could get expensive, with the statutory maximum for each willful violation being $150,000. Such a ruling would also have a profound impact on other generative AI firms that have developed products on the back of content available online.

What constitutes fair use is the question at hand. The law allows a work to be reproduced in part, but the use must be “transformative,” adding something new or at least referring to the original work. The case will also revolve around “market substitution,” asking whether the chatbot is a substitute for reading the content produced by the newspaper. The Times’ lawyers have argued that ChatGPT has offered responses identical to articles in the Times, while OpenAI has argued that this was achieved only when the chatbot was manipulated.

Publishers now live with the threat that their content will be summarized by ChatGPT-like products, driving away their audience and collapsing their ad revenue. Meanwhile, a handful of other media giants have opted to be compensated for their work through licensing deals.

In a statement, OpenAI said it welcomed “the court’s dismissal of many of these claims and look forward to making it clear that we build our AI models using publicly available data, in a manner grounded in fair use, and supportive of innovation.”
