Google has launched a new Gemini AI feature that transforms photos into video clips using its Veo 3 video model. The feature generates eight-second videos complete with AI-generated audio, including background noises, environmental sounds, and speech that sync to the visuals, expanding Google’s AI capabilities beyond text and static images.
What you should know: The photo-to-video capability is now available to Google AI Ultra and Pro subscribers in select regions, rolling out on the web today and on mobile devices throughout the week.
- Users access the feature by clicking “tools” in the prompt bar, selecting “video,” and uploading their photo alongside a text description of desired movement.
- Audio descriptions can be included for dialogue, sound effects, and ambient noise, which Google says will be “perfectly synced with the visuals.”
- Finished videos are delivered as MP4 files at 720p resolution in 16:9 landscape format.
How it works: The feature leverages Google’s Veo 3 video model to animate static images based on user prompts and descriptions.
- “You can get creative by animating everyday objects, bringing your drawings and paintings to life, or adding movement to nature scenes,” Google said.
- All generated videos include both visible watermarks showing they are AI-generated and invisible SynthID digital watermarks for authenticity tracking.
Competitive landscape: This capability was previously available in Flow, Google’s generative AI filmmaking tool launched in May, and is now integrated directly into Gemini for easier access.
- Flow is also expanding to “an additional 75 countries” today alongside the Gemini video feature rollout.
- The integration eliminates the need for users to open separate applications to animate their photographs.