UPDATED 17:14 EDT / AUGUST 20 2024


Hotshot debuts new AI model for generating video clips

Hotshot has debuted an artificial intelligence model that can generate 10-second clips with a resolution of 1280 by 720 pixels.

The startup launched the model, which is also called Hotshot, into public preview on Monday. It joins the growing list of AI video generators on the market. OpenAI, Runway ML Inc. and a number of other startups likewise offer neural networks capable of generating short clips based on user prompts.

Hotshot launched last year with an AI-powered image generation app for consumers. According to VentureBeat, the company appears to have shuttered the service to focus on its efforts in the nascent video generation segment. Hotshot is reportedly backed by SV Angel, angel investor Lachy Groom and Reddit Inc. co-founder Alexis Ohanian.

The company developed its latest video generation model over the course of several months. It created three different neural networks as part of the project: the core Hotshot model and two other neural networks that helped prepare the AI video generator’s training dataset.

In the first phase of the initiative, Hotshot put together a repository of 600 million clips with captions describing their contents. It combined those clips with one billion images to create its video generation model’s training dataset. “We knew we were going to want to train the model on images and videos jointly in order to take advantage of how much more abundant publicly accessible image data is than video,” Hotshot team members John Mullan, Duncan Crawbuck, Chaitu Aluru and Aakash Sastry explained in a blog post.

During the next phase of the project, the company developed an AI model to generate captions for the videos in its training dataset. Captions help a neural network better understand the clips on which it’s being trained, and that added context tends to improve the quality of its output.

Hotshot found that the existing caption generation models on the market didn’t meet its requirements. In response, the company took one of those existing models and customized it on a training dataset comprising 300,000 clips with manually created captions. “In a couple weeks, we had a video captioner we were quite happy to use to annotate our hundreds of millions of video samples,” the Hotshot team detailed.

The second auxiliary AI model that the company built to support the development of its video generator was an autoencoder. This is a type of neural network that compresses a piece of data, in this case a video, into a compact representation from which an approximation of the original can be reconstructed. The compressed version drops details that aren’t necessary for AI training, which lowers storage requirements and thereby decreases costs.
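As a rough illustration of the autoencoder idea, the toy sketch below squeezes two-number data points into a single number and then reconstructs them. It is not Hotshot’s model, which is a far larger neural network that operates on video. The function names, sample data, learning rate and step count are all assumptions made for this example.

```python
# Toy linear autoencoder: 2 values -> 1 latent value -> 2 reconstructed values.
# Illustrative only; all names and settings here are hypothetical.

def reconstruction_error(points, w, v):
    """Mean squared error between each point and its reconstruction."""
    total = 0.0
    for x, y in points:
        z = w[0] * x + w[1] * y          # encode: compress to one number
        total += (v[0] * z - x) ** 2 + (v[1] * z - y) ** 2  # decode and compare
    return total / len(points)


def train_linear_autoencoder(points, steps=2000, lr=0.05):
    """Fit encoder weights w and decoder weights v with plain gradient descent."""
    w = [0.5, 0.5]
    v = [0.5, 0.5]
    n = len(points)
    for _ in range(steps):
        gw = [0.0, 0.0]
        gv = [0.0, 0.0]
        for x, y in points:
            z = w[0] * x + w[1] * y
            ex = v[0] * z - x            # reconstruction error in x
            ey = v[1] * z - y            # reconstruction error in y
            gv[0] += 2 * ex * z / n
            gv[1] += 2 * ey * z / n
            dz = 2 * (ex * v[0] + ey * v[1])
            gw[0] += dz * x / n
            gw[1] += dz * y / n
        w = [w[i] - lr * gw[i] for i in range(2)]
        v = [v[i] - lr * gv[i] for i in range(2)]
    return w, v


# Sample data that lies on a line, so one number per point is enough to describe it.
points = [(t, 2 * t) for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
w, v = train_linear_autoencoder(points)
print("error after training:", reconstruction_error(points, w, v))
```

Because the sample points all sit on one line, the single latent number captures everything needed to rebuild them. Real video is much less regular, so a production autoencoder trades some fidelity for the storage savings.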

After preparing the autoencoder and the captioning model, Hotshot spent four months training its AI video generator. The company used several thousand H100 graphics processing units from Nvidia Corp. that racked up millions of processing hours during the project. 
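A back-of-the-envelope calculation shows how a cluster of that size reaches millions of processing hours. Hotshot has not published an exact chip count, so the 2,000-GPU figure and the 120-day approximation of four months below are assumptions chosen purely for illustration.

```python
def gpu_hours(num_gpus: int, days: float, utilization: float = 1.0) -> float:
    """GPU-hours consumed by num_gpus chips running for the given number of days."""
    return num_gpus * days * 24 * utilization


# Hypothetical inputs: 2,000 GPUs running around the clock for roughly four months.
print(gpu_hours(num_gpus=2000, days=120))  # 5760000.0
```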

Hotshot’s engineers applied several optimizations to reduce the training run’s infrastructure requirements. The company stored many of the files it used for the project in bfloat16, a 16-bit floating-point format that takes up half the space of standard 32-bit values at the cost of some precision. Additionally, it carried out some of the computations that would have normally been performed during the training process ahead of time to make better use of the Nvidia chips’ processing capacity.
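To make the bfloat16 trade-off concrete, the sketch below converts a 32-bit floating-point value to bfloat16 by keeping its top 16 bits, with round-to-nearest-even. This shows the number format itself rather than Hotshot’s storage pipeline. The helper names are made up for this example, and special values such as NaN are not handled.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x (NaN is not handled)."""
    bits32 = struct.unpack(">I", struct.pack(">f", x))[0]
    lsb = (bits32 >> 16) & 1             # lowest bit that will be kept
    rounded = bits32 + 0x7FFF + lsb      # round to nearest, ties to even
    return (rounded >> 16) & 0xFFFF      # keep sign, 8 exponent bits, 7 mantissa bits


def bfloat16_bits_to_float(bits16: int) -> float:
    """Expand a bfloat16 pattern back to a regular float by padding with zeros."""
    return struct.unpack(">f", struct.pack(">I", bits16 << 16))[0]


value = 3.14159
stored = float32_to_bfloat16_bits(value)
print(hex(stored), bfloat16_bits_to_float(stored))  # 0x4049 3.140625
```

The value comes back as 3.140625 instead of 3.14159. That small loss of precision is the price of halving the storage footprint from four bytes per value to two.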

Hotshot’s new AI is accessible as part of a free video generation service on its website. The company also plans to make the model available to developers through an application programming interface.

Image: Hotshot
