Google's Veo 3 now turns a photo into a short video, complete with sound. Here's how it works.

Google has introduced a new feature in its Gemini app: the ability to generate a short video from a single photograph.
What sets it apart is that the video does not just animate the image: it also includes an automatically generated audio track with ambient sounds, effects and even dialogue.
The technology behind the feature is Veo 3, the third generation of the model developed by Google DeepMind for creating videos from text or images.
What is Veo 3
Unveiled last May, Veo 3 can produce video clips about eight seconds long at 720p resolution, combining motion and sound in a single generation.
Veo 3 is available to Gemini Pro and Ultra subscribers in over 150 countries. The photo animation feature is currently rolling out and should appear in Gemini in the coming days.
The feature is currently available in the web version of the Gemini app, but Google plans to extend it to mobile devices soon.
How to turn a photo into a video
The process is simple: you log in to Gemini (this requires a Google account and a Pro or Ultra subscription), upload a photo, and briefly describe what you want to happen, including the kind of audio that should accompany it. After a short wait, the system returns an animated video, complete with sound.
One camera's trash can be Veo 3's treasure. Now, Gemini can bring photos to life by turning them into videos with sound.
— Google Gemini App (@GeminiApp) July 11, 2025
This kind of integration is a step beyond what other tools on the market offer, such as Runway Gen-2, Pika Labs or OpenAI's Sora.
All of these tools can generate videos from text or images, but without native audio, so the soundtrack has to be added separately.
Google's system also has built-in safety mechanisms: every video carries two watermarks, a visible one (the word "Veo" in the bottom right corner) and an invisible one (SynthID), to ensure traceability and discourage misuse of the generated content.
La Repubblica