UPDATED 19:24 EDT / JULY 25 2024

AI

In latest AI training drama, Runway accused of using publicly available YouTube videos

In the latest drama surrounding the training of artificial intelligence models, video generation startup Runway AI Inc. is being accused of using publicly available YouTube videos to train its AI video generation model.

The company, which launched its Gen-3 Alpha model for generating 10-second videos in June to generally positive reviews, is alleged by 404 Media to have scraped “thousands of videos from popular YouTube creators and brands, as well as pirated films.” The claim is based on an internal spreadsheet obtained by the outlet.

The YouTube channels allegedly used to train Runway’s AI include those of The New Yorker, VICE News, Pixar, Disney, Netflix and Sony. Videos from YouTube creators including Casey Neistat, Sam Kolder, Benjamin Hardman and Marques Brownlee were also apparently used.

Notably, the leaked spreadsheet is said to show that the company was trying to obtain videos that had a specific type of subject matter, camera work and a diverse set of people in them. In some cases, the videos targeted included those showing rain, beaches and even doctors.

404 Media claims that using such material to help train AI models rips off YouTube creators, on the premise that reading or viewing publicly available material is somehow a massive crime. And yet, it isn’t.

Although there are arguably gray areas around AI that laws have yet to catch up with, if someone reads 100 books or watches 100 videos and then comes to a conclusion based on them, that’s not copyright theft unless the material, outside of facts, was copied verbatim. That said, some argue that the scale of AI models’ ingestion of copyrighted material constitutes a violation, something that hasn’t yet been decided legally.

The closest 404 Media can get is that a video of a man skiing generated by Runway is somewhat similar to a video from a YouTube creator. Another video, of a racing car, was also similar. Both examples used prompts specifically asking Runway to copy the original video, which is not regular user behavior, and the results were not identical. As 404 Media admits, that means they are likely not a breach of copyright.

The drama around Runway followed a similar storm in a teacup on July 16 when Anthropic PBC, Nvidia Corp., Apple Inc. and Salesforce Inc. were accused of using subtitles from YouTube videos to help train their AI models.

Legal action has also been taken over AI training: Microsoft Corp. and OpenAI were sued in November over their use of nonfiction authors’ work. The class-action lawsuit, led by a New York Times reporter, claimed that OpenAI scraped the content of hundreds of thousands of nonfiction books to train its AI models.

The Times also accused OpenAI, Google LLC and Meta Platforms Inc. in April of skirting legal boundaries for AI training data.

Image: SiliconANGLE/Ideogram
