Wan 2.7 Prompt Guide: Write Prompts That Actually Work
Wan 2.7, released on April 3, 2026 by Alibaba's Tongyi Lab, is built on a 27-billion-parameter Mixture-of-Experts architecture with a new Thinking Mode: a chain-of-thought reasoning layer that plans composition before generating. That extra reasoning step is great news for output quality, but it also means the model pays closer attention to what you write. Vague prompts still produce vague videos; precise prompts now produce remarkably cinematic results.
This guide gives you the complete framework for writing Wan 2.7 prompts that convert intent into footage.
The 7-Part Prompt Formula
Wan 2.7's official documentation and community testing both point to the same prompt structure: describe the what, then layer in how the camera, light, and style should render it.
[Subject] [Action] [Camera move] [Lighting] [Style/Medium] [Lens/Era] [Color grade & Mood]
Here's what each slot does:
| Slot | What to write | Example |
|---|---|---|
| Subject | Who or what is the focus | "A female astronaut in a worn EVA suit" |
| Action | What movement happens | "turns slowly, looking out a porthole" |
| Camera move | Shot framing and motion | "slow push-in, medium close-up" |
| Lighting | Light source and quality | "single cold key light, deep shadows" |
| Style/Medium | Aesthetic or film stock | "cinematic, 35mm film grain" |
| Lens/Era | Lens character and period look | "anamorphic 2.39:1, 1980s sci-fi" |
| Color grade & Mood | Color palette and emotional tone | "desaturated blues, quiet dread" |
Full example:
A female astronaut in a worn EVA suit turns slowly, looking out a porthole at a debris field. Slow push-in, medium close-up. Single cold key light from the porthole casting deep shadows. Cinematic, 35mm film grain, anamorphic 2.39:1, 1980s sci-fi aesthetics. Desaturated blues, teal shadows, quiet dread.
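If you generate prompts in bulk, the formula can be kept as data and assembled in slot order. The following is a minimal sketch in plain Python: the slot names and the `build_prompt` helper are illustrative assumptions, not part of any Wan 2.7 tooling.

```python
# Slot names are hypothetical labels for the seven parts of the formula.
SLOT_ORDER = [
    "subject", "action", "camera", "lighting", "style", "lens_era", "grade_mood",
]


def build_prompt(slots: dict[str, str]) -> str:
    """Join the seven slots, in formula order, into one prompt string."""
    missing = [name for name in SLOT_ORDER if not slots.get(name, "").strip()]
    if missing:
        raise ValueError(f"missing slots: {', '.join(missing)}")
    clean = {name: slots[name].strip().rstrip(".") for name in SLOT_ORDER}
    # Subject and action read best as a single sentence.
    parts = [f"{clean['subject']} {clean['action']}"]
    parts += [clean[name] for name in SLOT_ORDER[2:]]
    # Capitalize the first letter of each part and end it with a period.
    return " ".join(part[0].upper() + part[1:] + "." for part in parts)


example_slots = {
    "subject": "A female astronaut in a worn EVA suit",
    "action": "turns slowly, looking out a porthole",
    "camera": "slow push-in, medium close-up",
    "lighting": "single cold key light, deep shadows",
    "style": "cinematic, 35mm film grain",
    "lens_era": "anamorphic 2.39:1, 1980s sci-fi",
    "grade_mood": "desaturated blues, quiet dread",
}

print(build_prompt(example_slots))
```

Keeping the slots separate makes it easy to swap one element, such as the camera move, while holding the rest of the shot constant.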
Prompt Length and Negative Prompts
Target 80–120 words. Shorter prompts leave too much to chance; longer ones risk conflicting instructions that confuse the model.
Guidance scale: Start at 5–7. Higher values push the video closer to your text but can introduce artifacts. Lower values give the model more creative latitude.
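Both recommendations are easy to check before you submit a job. The sketch below is a plain Python pre-flight check; the function name and the idea of returning notes are assumptions for illustration, and the ranges are simply the ones suggested in this guide.

```python
# Ranges taken from this guide; treat them as starting points, not limits.
WORD_RANGE = (80, 120)
GUIDANCE_START_RANGE = (5.0, 7.0)


def check_settings(prompt: str, guidance_scale: float) -> list[str]:
    """Return human-readable notes when a prompt or setting is out of range."""
    notes = []
    word_count = len(prompt.split())
    low, high = WORD_RANGE
    if not low <= word_count <= high:
        notes.append(f"prompt has {word_count} words; the guide targets {low}-{high}")
    g_low, g_high = GUIDANCE_START_RANGE
    if not g_low <= guidance_scale <= g_high:
        notes.append(
            f"guidance scale {guidance_scale} is outside the suggested "
            f"starting range {g_low}-{g_high}"
        )
    return notes


for note in check_settings("A mountain with snow", 9.0):
    print(note)
```

An empty list means both values sit inside the suggested ranges. A note is only a prompt to look again, since a higher guidance scale can be a deliberate choice.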
Negative prompts tell Wan 2.7 what to avoid. Useful defaults:
blurry, low quality, watermark, text overlay, flickering, distorted faces,
duplicate limbs, oversaturated, static camera (if you want motion)
Use negative prompts sparingly: pick the one or two terms from this list that target the artifact you are actually seeing, rather than pasting the whole list.
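One way to keep negative prompts focused is to build them from an explicit, capped list of terms. This is a small sketch in plain Python; the `negative_prompt` helper and its default limit of two are assumptions that mirror the advice above, not a Wan 2.7 requirement.

```python
def negative_prompt(terms: list[str], limit: int = 2) -> str:
    """Deduplicate terms and join them into a comma-separated negative prompt."""
    unique = []
    for term in terms:
        cleaned = term.strip().lower()
        if cleaned and cleaned not in unique:
            unique.append(cleaned)
    if len(unique) > limit:
        # Force a deliberate choice instead of silently truncating.
        raise ValueError(f"{len(unique)} terms given; keep it to {limit} or fewer")
    return ", ".join(unique)


print(negative_prompt(["Flickering", " flickering ", "distorted faces"]))
```

Raising an error rather than trimming the list keeps the choice of which terms matter with the person writing the prompt.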
T2V vs I2V: Writing the Prompt Differently
Text-to-Video (T2V) prompts must establish the entire scene. Describe the location, character, and action fully — the model has no starting image to infer from.
Image-to-Video (I2V) prompts can be shorter. Your reference image already defines the subject and setting, so the prompt should focus on what changes: the camera move, the action that unfolds, or the mood shift.
| Mode | What the prompt should emphasize |
|---|---|
| T2V | Full scene: subject + environment + action + style |
| I2V | Motion + camera + emotional arc (subject already defined by image) |
I2V example (image: a lone lighthouse at dusk):
Slow aerial pullback reveals the full coastline. Waves crash in slow motion. Golden hour light shifts to deep violet. Melancholy, wide anamorphic.
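The split between the two modes can be made explicit in a template. The sketch below is plain Python with hypothetical field names taken from the table above. It refuses subject or environment text in I2V mode, on the assumption that the reference image already covers them.

```python
# Field names are illustrative; they mirror the emphasis table above.
MODE_FIELDS = {
    "t2v": ["subject", "environment", "action", "style"],
    "i2v": ["camera", "motion", "mood"],
}


def prompt_for_mode(mode: str, fields: dict[str, str]) -> str:
    """Assemble a prompt from only the fields that suit the given mode."""
    if mode not in MODE_FIELDS:
        raise ValueError(f"unknown mode: {mode!r}")
    wanted = MODE_FIELDS[mode]
    unexpected = sorted(set(fields) - set(wanted))
    if unexpected:
        raise ValueError(f"{mode} prompts should not set: {', '.join(unexpected)}")
    parts = [
        fields[name].strip().rstrip(".")
        for name in wanted
        if fields.get(name, "").strip()
    ]
    return " ".join(part + "." for part in parts)


print(prompt_for_mode("i2v", {
    "camera": "Slow aerial pullback reveals the full coastline",
    "motion": "Waves crash in slow motion",
    "mood": "Golden hour light shifts to deep violet",
}))
```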
First & Last Frame Control
Wan 2.7 introduced integrated first-and-last-frame control — you specify the opening and closing frames, and the model generates the motion between them. Your prompt in this mode describes the transition, not the full scene.
Think of it as directing the "action" rather than the "set":
- ✅ "The camera slowly tilts up from the cobblestones to reveal the illuminated cathedral facade"
- ✅ "Fog rolls in from the left, gradually obscuring the mountain"
- ❌ "A mountain in the fog" (describes a state, not a transition)
Checklist for first/last frame prompts:
- Name the direction of change (roll in, pull back, fade, tilt)
- Mention time or pace if relevant ("slowly", "over 8 seconds")
- Include the dominant light or color shift between the two frames
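The checklist above can be approximated with a few keyword checks. This is a rough sketch in plain Python: the word lists are assumptions chosen for illustration and will miss many valid phrasings, so treat a failed check as a reminder rather than a verdict.

```python
import re

# Keyword patterns are illustrative guesses, one per checklist item.
CHECKS = {
    "direction": re.compile(r"\b(roll|pull|fade|tilt|push)\w*\b", re.IGNORECASE),
    "pace": re.compile(r"\b(slowly|gradually|quickly|over \d+ seconds?)\b", re.IGNORECASE),
    "light_or_color": re.compile(
        r"\b(light|shadow|glow|dusk|dawn|golden|violet|dark|bright)\w*\b",
        re.IGNORECASE,
    ),
}


def check_transition_prompt(prompt: str) -> dict[str, bool]:
    """Report which checklist items a first/last frame prompt appears to cover."""
    return {name: bool(pattern.search(prompt)) for name, pattern in CHECKS.items()}


print(check_transition_prompt("Fog rolls in from the left, gradually obscuring the mountain"))
print(check_transition_prompt("A mountain in the fog"))
```

The first prompt passes the direction and pace checks but names no light or color shift, while the static description fails all three.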
5 Ready-to-Copy Example Prompts
1. Urban Timelapse Feel
A busy Tokyo intersection at rush hour. Hundreds of people cross simultaneously from all directions. Top-down drone shot, gradually pulling higher. Overcast daylight, cool blue-grey tones. Hyper-real, documentary style. Slight 16mm film grain, neutral color grade.
2. Fantasy Landscape
A lone traveler on horseback crosses a vast salt flat that mirrors a stormy purple sky. Slow wide tracking shot from the side, keeping the rider centered. Diffuse stormy light, no hard shadows. Painterly, reminiscent of a Moebius illustration. Desaturated purples and dusty ochres.
3. Product Shot (I2V)
Camera orbits slowly around the product in a full 360-degree arc. Soft studio lighting from above, subtle rim light. Clean white background with a faint shadow underneath. Elegant, commercial photography style. No movement in the product itself.
4. Emotional Portrait
A middle-aged man sits at a diner window, watching rain hit the glass. Tight over-the-shoulder shot, slowly dollying left to reveal his reflection. Warm interior tungsten light contrasting cold blue rain outside. Intimate, film noir influenced. Kodak Vision3 500T emulation.
5. Nature Macro
Extreme close-up of a single red poppy petal with a water droplet trembling near the edge. Shallow depth of field, soft bokeh background. Natural morning side-light. The droplet slowly falls. Slow-motion, 120fps feel. Lush, saturated reds against a blurred green field.
Common Mistakes and How to Fix Them
Mistake: Style adjectives with no context
- ❌ "Cinematic beautiful epic"
- ✅ "Cinematic — anamorphic lens, shallow depth of field, color graded in teal and orange"
Mistake: Multiple conflicting moods
- ❌ "Playful yet dark, happy yet melancholy"
- ✅ Pick one emotional direction and commit to it
Mistake: Static descriptions for T2V
- ❌ "A mountain with snow"
- ✅ "Snow falls from a heavy sky onto the peak; slow push-in from a wide establishing shot"
Mistake: Over-specifying the I2V prompt
- In I2V mode your image already locks the subject. Adding a full physical description of what's already in the image wastes word budget and can conflict with the reference.
Mistake: Ignoring aspect ratio cues
- Wan 2.7 respects anamorphic and aspect ratio language. "Anamorphic 2.39:1" and "vertical 9:16 social format" steer the composition noticeably.
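Two of the mistakes above, bare style adjectives and conflicting moods, are mechanical enough to catch with a simple lint pass. The sketch below is plain Python; the vague-word list, the 50% threshold, and the use of "yet" as a conflict signal are all assumptions for illustration.

```python
import re

# Hypothetical list of style words that say little on their own.
VAGUE_STYLE_WORDS = {"cinematic", "beautiful", "epic", "stunning", "amazing"}


def lint_prompt(prompt: str) -> list[str]:
    """Flag prompts dominated by vague adjectives or joining moods with 'yet'."""
    words = re.findall(r"[a-z']+", prompt.lower())
    issues = []
    vague = [word for word in words if word in VAGUE_STYLE_WORDS]
    # More than half the words being vague suggests there is no concrete context.
    if words and len(vague) / len(words) > 0.5:
        issues.append("style adjectives with no context")
    if "yet" in words:
        issues.append("conflicting moods")
    return issues


print(lint_prompt("Cinematic beautiful epic"))
print(lint_prompt("Playful yet dark, happy yet melancholy"))
print(lint_prompt("Cinematic, anamorphic lens, shallow depth of field, teal and orange grade"))
```

A pass like this only catches the obvious cases, so it complements a read-through of the prompt rather than replacing it.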
