Google appears to be preparing a new AI video generation model called Gemini Omni, according to early user reports and interface screenshots shared online. The feature surfaced inside Gemini with prompts inviting users to “Create with Gemini Omni,” suggesting Google may unveil the system more broadly at Google I/O 2026.
Google reportedly describes Omni as a “new video generation model” that supports video remixing, conversational editing, templates, and direct scene generation through chat prompts. While the company has not officially announced the model, metadata reportedly suggests Omni is connected to Google’s existing Veo video generation technology.
Early demonstrations indicate the system focuses on improving realism and consistency in generated video. One test generated a scene of a professor explaining trigonometric equations on a chalkboard while maintaining relatively coherent mathematical notation throughout the sequence. Text rendering remains one of the more difficult challenges for AI video systems because letters and equations often distort across moving frames.
Another example recreated the “spaghetti test,” an informal benchmark many AI developers use to evaluate hand movement, object interaction, and eating realism in generated video. The generated clip showed two men seated at an outdoor restaurant eating spaghetti and holding a natural conversation, with fewer visual inconsistencies than earlier-generation AI video models typically produce.
The leaked interface also included a dedicated usage tracker for video generation. One tester said two complex prompts consumed roughly 86% of a daily AI Pro usage allowance, indicating that video generation workloads may remain heavily restricted because of their high compute requirements.
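To put the tester's figure in perspective, the arithmetic can be sketched in a few lines of Python. This is a rough illustration only: it assumes both prompts cost about the same share of the allowance, which the report does not confirm, and the variable names are hypothetical rather than anything exposed by Gemini.

```python
import math

# Figures from the tester's report: two complex prompts used
# roughly 86% of the daily AI Pro allowance.
reported_prompts = 2
reported_share_used = 0.86  # fraction of the daily allowance

# Assumption (not confirmed by the report): each prompt costs about the same.
share_per_prompt = reported_share_used / reported_prompts

# Whole prompts that would fit in one day's allowance under that assumption.
prompts_per_day = math.floor(1.0 / share_per_prompt)

# Allowance left over after the two reported prompts.
remaining_share = 1.0 - reported_share_used

print(f"Average share per prompt: {share_per_prompt:.0%}")
print(f"Full prompts per day: {prompts_per_day}")
print(f"Allowance left after two prompts: {remaining_share:.0%}")
```

Under that equal-cost assumption, each complex prompt would use about 43% of the allowance, leaving room for only two such generations per day.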
Google Pushes Gemini Deeper Into Video Creation
The apparent Omni integration suggests Google is moving video generation directly into Gemini instead of keeping Veo as a separate experimental product. The addition of conversational editing tools points toward a workflow where users can iteratively modify videos through chat rather than repeatedly generating clips from scratch.
That approach could make Gemini more competitive as an end-to-end creative platform combining text, image, audio, and video generation in a single interface. The reported template support also suggests Google may target marketing, education, and content production workflows rather than only experimental consumer use cases.
The apparently stronger handling of written text and scene continuity is particularly notable because both remain major weaknesses across many current AI video systems.
Video AI Competition Accelerates Ahead Of I/O
The leaks arrive as AI companies increasingly compete in generative video infrastructure and creative tools. Video generation has become one of the fastest-growing areas of multimodal AI, though it also remains one of the most computationally expensive.
Google has continued investing heavily in video generation technology through Veo while expanding Gemini into a broader multimodal platform. The timing of the Omni leak, shortly before Google I/O 2026, suggests the company may be preparing a larger announcement around integrated AI media creation.
If launched publicly, Gemini Omni would place Google in more direct competition with other AI video platforms targeting professional content generation, conversational editing, and multimodal creative workflows.