Midjourney Launches V1, Its First Video Model: Images That Come to Life in Seconds

The image animation AI joins the global race for dominance in AI-generated video, competing with Sora, Veo, Kling, Runway, and Pika.

AI-generated video is advancing at a remarkable pace, and Midjourney, until now focused on AI-generated images, has just announced its entry into the field with V1, its first video model. With this new tool, the company lets any user turn static images into animated clips of up to 20 seconds, with no advanced technical knowledge required.

The process is simple: start with an image (either generated in Midjourney or uploaded from a computer), click the Animate button, and the system generates up to four animated versions of five seconds each. These can then be extended in blocks of four seconds, up to a maximum total of 20 seconds.
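To put those figures in perspective, here is a minimal sketch of the duration arithmetic described above (plain Python written for this article, not Midjourney code; the exact capping behavior at the 20-second ceiling is an assumption):

```python
# Illustrative arithmetic only, based on the figures quoted in this article:
# 5-second base clips, 4-second extension blocks, ~20-second maximum.
# Not Midjourney code; capping at exactly 20 s is an assumption.
BASE_SECONDS = 5
EXTENSION_SECONDS = 4
MAX_TOTAL_SECONDS = 20

def clip_length(extensions: int) -> int:
    """Approximate clip length after a given number of 4-second extensions."""
    return min(BASE_SECONDS + extensions * EXTENSION_SECONDS, MAX_TOTAL_SECONDS)

for n in range(5):
    print(f"{n} extension(s): ~{clip_length(n)} s")
# 0 -> 5 s, 1 -> 9 s, 2 -> 13 s, 3 -> 17 s, 4 -> ~20 s (ceiling reached)
```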

But beyond its simplicity, V1 represents the first step towards a more ambitious project for Midjourney: creating a unified generative platform that combines video, images, 3D models, and real-time simulation, with the promise that one day interactive “open worlds” can be created from scratch.


Expanded Comparison: The Most Relevant AI Video Models

The launch of V1 takes place against a backdrop of great enthusiasm in the field of generative video models. Here’s a comparative review of the most notable solutions available today:

| Model | Company | Main Input | Maximum Duration | Output Type | User Control | Current Access Status | Highlighted Features |
|---|---|---|---|---|---|---|---|
| V1 | Midjourney | Image + motion prompt | Up to 20 s | 4 animated clips | Low / medium | Available via web (June 2025) | Fast, visual, accessible |
| Sora | OpenAI | Text | Up to 60 s | Coherent videos | High | Restricted access (researchers) | Advanced storytelling, physics simulation, and camera |
| Veo | Google DeepMind | Text | Up to 60 s | High-quality video | Medium / high | Closed, in pre-release phase | Cinematic quality, natural language |
| Kling | Kuaishou | Text + image | 2–4 s | Realistic facial animations | Medium | Limited to users in China | Precise movement, facial expressiveness |
| Runway Gen-3 | Runway | Text + image | 15–30 s (variable) | Creative clips | Medium | Available for registered users | Artistic control, integration into creative workflow |
| Pika 1.0 | Pika Labs | Text + image | Up to 30 s | Creative video | Medium | Available (with registration) | Easy editing, quick cinematic effects |

What Does V1 Bring Compared to Others?

While models like Sora and Veo focus on generating long, narrative, and highly realistic videos from scratch based on text, V1 emphasizes immediacy and visual control, making it an excellent choice for artists, designers, and content creators already working with images.

Advantages of V1:

  • Simplicity and speed: does not require complex prompt knowledge.
  • Direct interface: image-based, no scripts needed.
  • Accessibility: available from a web browser without waiting or access requests.
  • Reasonable price: each video job costs about eight times as much as an image, the equivalent of roughly one image's worth of consumption per second of video.

Limitations:

  • Less narrative control.
  • Limited quality compared to more advanced models.
  • Risk of visual errors in “high-motion” scenes.

Two Modes, Two Levels of Movement

V1 offers two ways to generate animations:

  1. Automatic: the AI decides how to move the scene.
  2. Manual: the user can write a brief text describing the desired movement.

And two levels of dynamism (see the illustrative sketch after this list):

  • Low movement: useful for subtle or ambient scenes.
  • High movement: for more energetic and dynamic animations, although with a greater margin for error.
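Since Midjourney exposes these choices only through its web interface and has no public API, the following is purely a conceptual sketch: a hypothetical data structure (all names invented for illustration) that captures the two prompt modes and two motion levels described above.

```python
# Hypothetical representation of V1's animation options, for illustration only.
# Midjourney offers these choices in its web UI; this is not its API.
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class PromptMode(Enum):
    AUTOMATIC = "automatic"  # the AI decides how to move the scene
    MANUAL = "manual"        # the user describes the desired movement in text

class MotionLevel(Enum):
    LOW = "low"    # subtle or ambient movement
    HIGH = "high"  # energetic movement, with a greater margin for error

@dataclass
class AnimationSettings:
    mode: PromptMode = PromptMode.AUTOMATIC
    motion: MotionLevel = MotionLevel.LOW
    motion_prompt: Optional[str] = None  # only meaningful when mode is MANUAL

# Example: a manual, high-motion request
settings = AnimationSettings(
    mode=PromptMode.MANUAL,
    motion=MotionLevel.HIGH,
    motion_prompt="the camera pans slowly to the right while clouds drift past",
)
```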

In the Midst of Innovation, a Legal Warning

This launch comes at a delicate time for Midjourney. Disney and Universal have sued the company for alleged copyright infringement, claiming it trained its models on protected images. The evidence presented includes AI-generated content that reproduces characters such as Homer Simpson and Darth Vader with striking fidelity.

Midjourney has responded by urging its community to use the tool ethically and responsibly. “When used correctly, this technology can be fun, useful, and even profound,” the company noted.


A Long-Term Vision: Generative and Interactive Worlds

Midjourney has made it clear that V1 is just one piece of a much larger puzzle. Its vision is ambitious: to build a platform capable of generating images, videos, 3D scenes, and entire worlds that respond in real time, all through a visual, fast, and collaborative interface.

While it is still far from achieving the complexity of models like Sora or Veo, V1 demonstrates that Midjourney does not want to be left behind in the race for AI-generated video. On the contrary: it has chosen to do what it does best—provide powerful and accessible tools to visual creators—and make them available to everyone.

On the horizon, a future emerges where users will not only generate static images but complete universes that move, evolve, and respond instantly, all from a creative interface powered by artificial intelligence.

Source: AI News
