Google has expanded its AI image and video generation tools with Veo 2, its second-generation video model, and an upgraded version of its existing Imagen 3 image model. Both updates aim to produce more refined and realistic visual output.
Veo 2 Model
Google’s Veo 2 model focuses on a better understanding of real-world physics, human movement, and facial expressions, which lets it generate more realistic videos. The model can handle complex requests that specify a genre, lens type, or cinematic effect. It can produce videos at up to 4K resolution and several minutes in length. Veo 2 is integrated into Google Labs’ VideoFX tool, and users can sign up for access through a waitlist. Google also expects to bring Veo 2 to YouTube Shorts and other Google products in the coming year.
Imagen 3
Google’s Imagen 3 model has been upgraded to render a broader range of art styles more accurately, from photorealism and impressionism to abstract and anime. The update also makes the model follow user prompts more closely and generate images with greater detail and texture. Imagen 3 will be available through Google Labs’ ImageFX tool.
Whisk
Google has also introduced an experimental tool called Whisk, which combines Imagen 3’s image generation with Gemini’s visual understanding and description abilities. Whisk lets users create or modify images to their preferences and remix them into unique outputs. When a user supplies an image, Gemini automatically generates a detailed caption of it, and Imagen 3 then uses that caption to generate new images in different styles based on the input and its description.
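The caption-then-generate flow described for Whisk can be illustrated with a minimal Python sketch. This is not Google's implementation and does not call any real Gemini or Imagen API: the names `remix`, `RemixResult`, `fake_caption`, and `fake_generate` are hypothetical, and the way the style is appended to the caption is an assumption made purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RemixResult:
    caption: str   # text description produced by the captioning step
    prompt: str    # caption combined with the requested style
    image: bytes   # output of the image-generation step


def remix(
    image: bytes,
    style: str,
    caption_fn: Callable[[bytes], str],
    generate_fn: Callable[[str], bytes],
) -> RemixResult:
    """Hypothetical two-stage pipeline: caption the input, then generate from the caption."""
    # Stage 1: a vision model (Gemini, in Whisk's case) describes the input image.
    caption = caption_fn(image)
    # Stage 2: the description plus a style hint becomes the text prompt
    # for an image model (Imagen 3, in Whisk's case).
    prompt = f"{caption}, rendered in {style} style"
    return RemixResult(caption=caption, prompt=prompt, image=generate_fn(prompt))


# Stand-in functions so the sketch runs without any external service.
def fake_caption(image: bytes) -> str:
    return f"a detailed description of a {len(image)}-byte image"


def fake_generate(prompt: str) -> bytes:
    return prompt.encode("utf-8")


result = remix(b"\x89PNG", "anime", fake_caption, fake_generate)
print(result.prompt)
```

Passing the two models in as plain callables keeps the sketch independent of any particular SDK; in a real system they would be replaced by calls to a captioning model and an image generator.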