What Makes Wan 2.7 Stand Out
Wan 2.7 introduces major new capabilities that redefine AI video generation. These are not incremental upgrades — they are new paradigms that give you unprecedented control over your creative output.
First & Last Frame Control
Natural Language Video Editing
Subject & Voice Reference
What is Wan 2.7?
Wan 2.7 is Alibaba's latest AI video generation model, released April 3, 2026. It builds on the Wan series with a focus on control, consistency, and editability — letting creators direct AI video generation with a level of precision that previous models could not achieve.
- Text-to-Video with 5 Aspect Ratios: Generate 2–15 second videos from text prompts in 16:9, 9:16, 1:1, 4:3, or 3:4. Describe your scene in detail — subject, action, camera movement, lighting, style — and Wan 2.7 produces cinematic results at 720P or 1080P resolution.
- Image-to-Video with First & Last Frame: Convert images to video with endpoint control. Upload a first frame, a last frame, or both, and the model generates the motion in between. The 9-grid input mode accepts a 3x3 grid of nine reference images for dramatically better subject consistency.
- Video Editing & Extension: Edit existing videos with natural language instructions — change backgrounds, swap outfits, alter lighting — without full regeneration. Extend videos beyond their original duration with prompt-guided continuation that maintains visual coherence.
- Audio Sync & Voice Reference: Native audio-visual synchronization generates music, ambient sound, and vocals as part of the scene from the first frame. Subject & voice reference lets you upload a character image and voice sample to produce talking videos with consistent identity and synchronized lip movements.
How to Use Wan 2.7 on VidCella
Create AI videos with Wan 2.7 in four steps. Whether you use text-to-video, image-to-video, or video editing mode, VidCella gives you full access to Wan 2.7's capabilities:
Wan 2.7 Features on VidCella
Explore the full range of Wan 2.7 capabilities available on VidCella — from text-to-video generation to advanced editing and extension workflows:
Text-to-Video
Generate 2–15 second videos from text prompts at 720P or 1080P. Five aspect ratios (16:9, 9:16, 1:1, 4:3, 3:4) cover every use case from landscape cinema to vertical social content.
Image-to-Video
Animate still images with first-frame, last-frame, or dual-endpoint control. The 9-grid input mode accepts nine reference images for superior subject consistency across the generated video.
Video Editing
Edit existing videos with natural language instructions. Change backgrounds, swap clothing, alter lighting, or modify any visual element — without regenerating the entire clip from scratch.
Video Extend
Extend videos beyond their original duration. Provide a prompt to guide the continuation, maintaining visual and narrative coherence with the source material.
Audio Synchronization
Native audio-visual sync generates background music, ambient sound, and character vocals as part of the scene. Audio is produced alongside the video, not layered in post-processing.
Up to 1080P Resolution
Generate at 720P (30 credits/s) or 1080P (45 credits/s). Both resolutions support all workflows — text-to-video, image-to-video, video editing, and video extension.
Wan 2.7 Frequently Asked Questions
Everything you need to know about using Wan 2.7 on VidCella:
What is Wan 2.7?
Wan 2.7 is Alibaba's latest AI video generation model, released April 3, 2026. It supports text-to-video, image-to-video with first & last frame control, natural language video editing, video extension, and audio synchronization — all at up to 1080P resolution and 15 seconds duration.
What is first and last frame control?
First and last frame control lets you anchor both the opening and closing frames of your video. Upload a starting image, an ending image, or both, and Wan 2.7 generates the motion between them. This gives you precise control over narrative arcs, product reveals, transitions, and any shot where the start and end states matter.
What is subject & voice reference?
Subject & voice reference lets you upload a character image and a short voice audio clip together. Wan 2.7 generates a video where the character's appearance matches the reference image while lip movements and facial expressions synchronize to the provided voice — all in a single generation pass, without post-processing or separate dubbing.
How does natural language video editing work?
Upload an existing video and describe the changes you want in plain text — for example, "change the background to a sunset beach" or "swap the red shirt to blue." Wan 2.7 applies the edits to the video without regenerating it from scratch, preserving the original motion and composition while applying your changes.
What resolutions and durations does Wan 2.7 support?
Wan 2.7 generates video at 720P or 1080P resolution, with durations from 2 to 15 seconds. Five aspect ratios are available: 16:9 (landscape), 9:16 (portrait), 1:1 (square), 4:3, and 3:4. Video extension mode supports 5–15 second extensions.
How much does Wan 2.7 cost on VidCella?
Wan 2.7 costs 30 credits per second at 720P and 45 credits per second at 1080P. For example, a 5-second 720P video costs 150 credits, and a 5-second 1080P video costs 225 credits. No subscription required — pay only for what you generate.
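The pricing is linear in clip length, so budgeting is a one-line calculation. A minimal helper using the rates above (30 credits/s at 720P, 45 credits/s at 1080P):

```python
# Credit cost for a Wan 2.7 generation on VidCella.
RATES = {"720p": 30, "1080p": 45}  # credits per second

def wan_credits(seconds: int, resolution: str = "720p") -> int:
    """Return the credit cost for a clip of the given length."""
    if not 2 <= seconds <= 15:
        raise ValueError("Wan 2.7 clips run 2-15 seconds")
    return seconds * RATES[resolution.lower()]

print(wan_credits(5, "720p"))    # 150
print(wan_credits(5, "1080p"))   # 225
print(wan_credits(15, "1080p"))  # 675 (longest, highest-cost clip)
```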
What is 9-grid image input?
Instead of providing a single reference image for image-to-video, you can upload a 3x3 grid of nine images in a single call. Wan 2.7 reads across all nine to infer the subject's appearance, environment, and composition — dramatically reducing subject drift compared to single-image input.
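If you assemble the 3x3 grid yourself, the layout is simple row-major tiling. This sketch computes the paste offsets for each cell; the actual compositing library (e.g. Pillow) and the cell size are your choice — the model's exact expected grid dimensions are not specified here:

```python
# Row-major cell offsets for tiling nine reference images into a 3x3 grid.
def grid_positions(cell_w: int, cell_h: int, n: int = 9) -> list[tuple[int, int]]:
    """Top-left (x, y) pixel offset for each of n cells, 3 per row."""
    cols = 3
    return [((i % cols) * cell_w, (i // cols) * cell_h) for i in range(n)]

# Nine 512x512 references yield a 1536x1536 composite:
print(grid_positions(512, 512)[:4])
# [(0, 0), (512, 0), (1024, 0), (0, 512)]
```

Paste each reference at its offset (e.g. with Pillow's `Image.paste`) and upload the single composite as the 9-grid input.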
Is Wan 2.7 open source?
No. Wan 2.1 and Wan 2.2 were the last models in the Wan series with publicly released weights (Apache 2.0). From Wan 2.5 onward, Alibaba shifted to a commercial API model, so Wan 2.7 is accessible only through hosted platforms like VidCella. If you need to self-host, Wan 2.2 remains the most capable open-source option.
