UPDATED 11:45 EDT / SEPTEMBER 08 2025

Vidu launches AI image generation update with reference-to-image for creating imaginative realism

Vidu, the flagship artificial intelligence product of Chinese firm ShengShu Technology, released a new update to its platform today that’s intended to “reinvent photography” by allowing users to upload multiple image references and have an AI model compose them into vivid, highly consistent generated pictures.

The company is best known for its generative AI video platform and foundation models, which let users write natural-language prompts and add reference images to produce short scenes. The model can then use the reference images to create elements and objects in the scene, maintaining high consistency from scene to scene.

Vidu said it has implemented a similar technology for image generation, called reference-to-image, that lets users upload up to seven reference images and gives them greater control over how consistently those references appear in generated content.

With the update, the company’s model applies what it calls “semantic understanding” to interpret the relationships among multiple images and produce more consistent results. This capability has been somewhat error-prone in AI models, and only recently have breakthroughs from models such as Google LLC’s Gemini 2.5 Flash Image, also known as “Nano Banana,” made it widely accessible.

For example, a user could use Vidu’s reference-to-image capability to generate a new image from scratch using a text prompt and multiple separate images. According to Vidu, the same capability allows for quick editing of existing photographs with extremely high consistency.

A photographer could take a picture from a wedding and add elements such as a bouquet, change the style of flowers on tables or adjust the lighting if it was a gloomy or rainy day. Users could use the function to modify a candid selfie that didn’t quite match their expectations, change out a logo on their shirt or put themselves in a different place. Marketers and advertisers will be able to quickly compose AI-generated “photographs” that include products or swap product models in already completed advertising shoots.

Vidu said it has significantly improved its instant image editing capabilities to rival current editing platforms. Users looking to use AI for generative image composition often need to rely on editing platforms or advanced workflow builders like the open-source tool ComfyUI to achieve consistency and control.

The company said the editing features in the update include remixing, partial and full object replacement, and adding objects. Users can freely composite multiple input images into a single image with what the company says is “high consistency,” including visual plausibility, compared with other models on the market. Through partial or full object replacement, users can modify an object’s appearance, such as changing the color of an outfit or umbrella, or swap the object for a different one entirely.

Vidu’s new feature competes with both Google’s Nano Banana and Black Forest Labs Inc.’s Flux Kontext in generative image editing and production capabilities. The company said its model stands out by providing what it calls “unmatched image and character consistency, along with natural image blending for richer and more realistic details,” including the capability of carrying over both the visuals and embedded text from reference images with clarity. Modern generative AI image models still struggle to accurately render text, even with a reference image.

Image: Vidu
