

Google has introduced Gemini Omni, a next-generation multimodal AI model capable of generating and editing videos from text, image, audio, and video prompts. Announced during Google I/O 2026, the company described Gemini Omni as a major leap toward building a fully creative AI system. The first version, called Gemini Omni Flash, is now rolling out through the Gemini app, Google Flow, and YouTube Shorts. According to Google, the model combines Gemini’s advanced reasoning capabilities with AI-powered media generation, allowing users to create cinematic-quality videos from natural language instructions. One of its standout features is conversational video editing: users modify a video by describing the changes they want instead of working with traditional editing timelines and tools.
Google demonstrated examples in which users transformed sculptures into bubbles, turned mirrors into fluid visuals, changed environments, and applied animations while maintaining realistic physics and character continuity. Gemini Omni can simultaneously process text, images, videos, sketches, voice references, and audio prompts to create cohesive multimedia content. The company also introduced AI avatar features that let users create digital versions of themselves, based on their appearance and voice, for personalized video generation. To address deepfake concerns and deter misuse, all AI-generated videos will include Google’s invisible SynthID watermark. Gemini Omni Flash is launching globally for Google AI Plus, Pro, and Ultra subscribers, while integration with YouTube Create and developer APIs is expected in the coming weeks.













