Choose the right Wan model for text-to-video, image-to-video, video reference, video editing, and character animation.
Text-to-video
To generate videos with synchronized audio from text prompts, use wan2.7-t2v. It supports multi-shot narratives, 1080P resolution, and clips up to 15 seconds long.
Audio requirements
If your project requires synchronized narration, sound effects, or background music, use wan2.7-t2v. wan2.6-t2v also supports audio and offers better SDK compatibility, and wan2.5-t2v-preview is a further audio-capable option. For silent video, use a cost-effective model like wan2.2-t2v-plus.
Resolution and duration
For 1080P resolution and up to 15 seconds, use wan2.7-t2v. wan2.6-t2v offers the same specifications and is a good secondary option. For 480P to 720P resolution and a 5-second duration, use wan2.2-t2v-plus or the lower-cost wan2.1-t2v-turbo.
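The text-to-video guidance above can be condensed into a small selection helper. This is a minimal sketch rather than an SDK function: the name `pick_t2v_model` and its parameters are hypothetical, and the rules only restate the recommendations in this section without calling any API.

```python
def pick_t2v_model(needs_audio: bool, resolution: str = "720P",
                   duration_s: int = 5, lowest_cost: bool = False) -> str:
    """Return a text-to-video model ID based on the guidance in this section.

    Hypothetical helper: it encodes the recommendations as plain rules
    and does not contact any service.
    """
    if not 2 <= duration_s <= 15:
        raise ValueError("this guide lists clip durations of 2 to 15 seconds")
    # Audio, 1080P, or any duration other than 5 seconds points to wan2.7-t2v.
    if needs_audio or resolution == "1080P" or duration_s != 5:
        return "wan2.7-t2v"
    # Silent 5-second clips at lower resolution: pick a cost-effective model.
    return "wan2.1-t2v-turbo" if lowest_cost else "wan2.2-t2v-plus"


print(pick_t2v_model(needs_audio=True, resolution="1080P", duration_s=12))
print(pick_t2v_model(needs_audio=False, resolution="480P", lowest_cost=True))
```

If wan2.6-t2v is preferred for SDK compatibility, the first branch could return that ID instead, since this guide describes it as having the same specifications.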
Image-to-video
To create dynamic videos from static images, use wan2.7-i2v. wan2.6-i2v offers similar quality and better SDK compatibility. For lower cost, use wan2.6-i2v-flash.
Build long, coherent videos
Use a first/last frame model, such as wan2.7-i2v or wan2.2-kf2v-flash, to stitch multiple clips together. Setting the last frame of one clip as the first frame of the next creates seamless transitions, ideal for narratives, product demos, or tutorials.
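The chaining idea described above can be sketched as a planning step that runs before any generation call. The function name `plan_clips` and the image file names are hypothetical; the sketch only shows how each clip's last frame becomes the next clip's first frame, and leaves the actual generation request out.

```python
from typing import List, Tuple


def plan_clips(keyframes: List[str]) -> List[Tuple[str, str]]:
    """Pair consecutive keyframes into (first_frame, last_frame) clips."""
    if len(keyframes) < 2:
        raise ValueError("need at least two keyframes to define a clip")
    # zip pairs each keyframe with its successor, so clip i ends on the
    # same frame that clip i + 1 starts on.
    return list(zip(keyframes, keyframes[1:]))


# Hypothetical keyframe files for a four-beat product demo.
clips = plan_clips(["intro.png", "product.png", "closeup.png", "outro.png"])
for first, last in clips:
    print(f"first frame: {first}, last frame: {last}")
```

Each planned pair would then be submitted to a first/last frame model, and the resulting clips concatenated in order.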
Single image to video
Use wan2.7-i2v, which supports audio, 1080P resolution, and durations from 2 to 15 seconds. wan2.6-i2v has the same specifications and offers better SDK compatibility. If speed and low cost are priorities, use wan2.6-i2v-flash.
Video reference
To maintain character consistency across scenes, use wan2.7-r2v. wan2.6-r2v also supports multiple characters and audio synchronization, and offers better SDK compatibility. For a faster, lower-cost option, use wan2.6-r2v-flash.
Video editing
To edit existing videos using text instructions for tasks like style transfer or element replacement, use wan2.7-videoedit. For video inpainting, outpainting, or local editing, use wan2.1-vace-plus.
Character animation
Motion-driven character animation
To transfer motion from a reference video to a character in a static image, use wan2.2-animate-move. The background remains unchanged. The pro mode (wan-pro) produces results closer to live-action footage, while the standard mode (wan-std) is faster and more cost-effective.
Character replacement in video
To replace a character in a video with one from a source image, use wan2.2-animate-mix. It also supports pro and standard modes.
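The choice between the two character animation models and their modes can be expressed as a small lookup. This is an illustrative sketch: the function name `pick_animation_config`, the task labels `"move"` and `"mix"`, and the dictionary keys are hypothetical and are not the request schema of any API. Only the model IDs and the mode names come from this guide.

```python
def pick_animation_config(task: str, prefer_quality: bool = False) -> dict:
    """Map an animation task to a model ID and a mode named in this guide."""
    models = {
        "move": "wan2.2-animate-move",  # transfer motion to a static character
        "mix": "wan2.2-animate-mix",    # replace a character in a video
    }
    if task not in models:
        raise ValueError(f"unknown task {task!r}; expected 'move' or 'mix'")
    # wan-pro favors results closer to live-action footage;
    # wan-std favors speed and cost.
    mode = "wan-pro" if prefer_quality else "wan-std"
    return {"model": models[task], "mode": mode}


print(pick_animation_config("move", prefer_quality=True))
print(pick_animation_config("mix"))
```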
Recommended models
Model | Use case | Audio | Max resolution | Max duration |
| Highest quality text-to-video | Supported | 720P, 1080P | 2–15s |
| Text-to-video, good SDK compatibility | Supported | 720P, 1080P | 2–15s |
| Highest quality image-to-video | Supported | 720P, 1080P | 2–15s |
| Cost-effective image-to-video | Supported | 720P, 1080P | 2–15s |
| Stitching clips with first/last frames | -- | 720P, 1080P | 2–15s |
| Cross-scene character consistency | Supported | 720P, 1080P | 2–10s |
| Cost-effective character consistency | Optional | 720P, 1080P | 2–10s |
| Instruction-based editing, style transfer | -- | 720P, 1080P | 2–10s |
| Transfer motion to a static character | -- | 720P | 2–30s |
| Replace a character in a video | -- | 720P | 2–30s |
All models
Wan 2.7
The following models are available in international and Chinese mainland deployment scopes.
Model ID | Type | Features | Output specifications |
| text-to-video | Audio sync, multi-shot narrative | 720P, 1080P. 2–15s. 30 fps, MP4 |
| image-to-video | First frame, first/last frame, video continuation, audio-driven | 720P, 1080P. 2–15s. 30 fps, MP4 |
| video reference | Multi-character, ImageN/VideoN reference format | 720P, 1080P. 2–10s. 30 fps, MP4 |
| video editing | Instruction-based editing, style transfer | 720P, 1080P. 2–10s. 30 fps, MP4 |
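The output specifications in the Wan 2.7 table can be used to check a planned request before it is submitted. The following is a minimal sketch: `WAN27_SPECS` and `check_request` are hypothetical names, and the limits are copied from the table above and keyed by the Type column rather than by model ID.

```python
# Limits copied from the Wan 2.7 table above, keyed by the "Type" column.
WAN27_SPECS = {
    "text-to-video": {"resolutions": {"720P", "1080P"}, "duration_s": (2, 15)},
    "image-to-video": {"resolutions": {"720P", "1080P"}, "duration_s": (2, 15)},
    "video reference": {"resolutions": {"720P", "1080P"}, "duration_s": (2, 10)},
    "video editing": {"resolutions": {"720P", "1080P"}, "duration_s": (2, 10)},
}


def check_request(task_type: str, resolution: str, duration_s: int) -> list:
    """Return a list of problems; an empty list means the request fits."""
    spec = WAN27_SPECS.get(task_type)
    if spec is None:
        return [f"unknown task type: {task_type}"]
    problems = []
    if resolution not in spec["resolutions"]:
        problems.append(f"{resolution} is not listed for {task_type}")
    low, high = spec["duration_s"]
    if not low <= duration_s <= high:
        problems.append(f"duration must be {low} to {high} seconds")
    return problems


print(check_request("text-to-video", "1080P", 15))
print(check_request("video editing", "480P", 12))
```

Returning a list of problems instead of raising on the first one lets a caller report every mismatch at once.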
Wan 2.6
Unless a row notes otherwise, the following models are available in international and Chinese mainland deployment scopes.
Model ID | Type | Features | Output specifications |
| text-to-video | Audio sync, multi-shot narrative | 720P, 1080P. 2–15s. 30 fps, MP4 |
| image-to-video | Audio sync, multi-shot narrative | 720P, 1080P. 2–15s. 30 fps, MP4 |
| image-to-video | Audio, multi-shot, fast generation | 720P, 1080P. 2–15s. 30 fps, MP4 |
| video reference | Audio sync, multi-character, narrative | 720P, 1080P. 2–10s. 30 fps, MP4 |
| video reference | Multi-character, fast generation | 720P, 1080P. 2–10s. 30 fps, MP4 |
| text-to-video | Audio sync, multi-shot narrative; for US deployment scope | 720P, 1080P. 2–15s. 30 fps, MP4 |
| image-to-video | Audio sync, multi-shot narrative; for US deployment scope | 720P, 1080P. 2–15s. 30 fps, MP4 |
Wan 2.5
The following models are available in international and Chinese mainland deployment scopes.
Model ID | Type | Features | Output specifications |
| text-to-video | Audio sync | 480P, 720P, 1080P. 5s, 10s. 30 fps, MP4 |
| image-to-video | Audio sync | 480P, 720P, 1080P. 5s, 10s. 30 fps, MP4 |
Wan 2.2
The following models are available in international and Chinese mainland deployment scopes.
Model ID | Type | Features | Output specifications |
| text-to-video | No audio | 480P, 1080P. 5s. 30 fps, MP4 |
| image-to-video | No audio | 480P, 1080P. 5s. 30 fps, MP4 |
| image-to-video | No audio, 50% faster than 2.1 | 480P, 720P, 1080P. 5s. 30 fps, MP4 |
| first/last frame | No audio | 480P, 720P, 1080P. 5s. 30 fps, MP4 |
| character animation | wan-std / wan-pro modes | 720P. 2–30s. 15/25 fps, MP4 |
| character replacement | wan-std / wan-pro modes | 720P. 2–30s. 15/25 fps, MP4 |
Wan 2.1 (Wan 2.7 is recommended)
The following models are available in international and Chinese mainland deployment scopes.
Model ID | Type | Features | Output specifications |
| text-to-video | No audio | 720P. 5s. 30 fps, MP4 |
| text-to-video | No audio | 480P, 720P. 5s. 30 fps, MP4 |
| image-to-video | No audio | 720P. 5s. 30 fps, MP4 |
| image-to-video | No audio | 480P, 720P. 3–5s. 30 fps, MP4 |
| first/last frame | No audio | 720P. 5s. 30 fps, MP4 |
| video editing | No audio | 720P. Max 5s. 30 fps, MP4 |