Wan image-to-video API reference - Alibaba Cloud Model Studio

The Wan image-to-video model supports multi-modal input (text, images, audio, and video) and can perform three main tasks: video generation from the first frame, video generation from the first and last frames, and video continuation.

Note

The new image-to-video API (wan2.7-i2v model) supports these three tasks. Use this new API.

The original image-to-video from first frame API (wan2.6 and earlier models) supports only video generation from the first frame.

Availability

For successful API calls, use the same region for model, endpoint URL, and API key. Cross-region calls will fail.

Select a model: Confirm the model is available in your target region.
Select a URL: Choose the corresponding regional endpoint URL.
Configure an API key: Get an API key for the region, then configure the API key as an environment variable.
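
The three steps above can be sketched as a small lookup that keeps the endpoint and the API key together. This is a minimal sketch, not an official client: the helper name `resolve_region` and the region labels are hypothetical, the base URLs are the ones listed in this topic, and `DASHSCOPE_API_KEY` is the environment variable used by the curl samples.

```python
import os

# Base URLs copied from the endpoint listings in this topic.
BASE_URLS = {
    "singapore": "https://dashscope-intl.aliyuncs.com/api/v1",
    "beijing": "https://dashscope.aliyuncs.com/api/v1",
}


def resolve_region(region: str, env=os.environ) -> tuple[str, str]:
    """Return (base_url, api_key) for one region, or raise if either is missing.

    Hypothetical helper: it cannot verify that the key was issued for the
    chosen region; it only keeps the two settings in one place.
    """
    key = region.strip().lower()
    if key not in BASE_URLS:
        raise ValueError(f"unknown region: {region!r}")
    api_key = env.get("DASHSCOPE_API_KEY", "")
    if not api_key:
        raise RuntimeError("DASHSCOPE_API_KEY is not set")
    return BASE_URLS[key], api_key
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real environment.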

Note

The sample code in this topic applies to the Singapore region.

HTTP

Important

This API uses the new image-to-video protocol and supports only the wan2.7 model.

Image-to-video tasks take 1 to 5 minutes, so the API uses asynchronous invocation. The process has two steps:

Step 1: Create a task and get the task ID

Singapore

POST https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis

Beijing

POST https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis

Note

After the task is created, use the returned task_id to query the result. The task_id is valid for 24 hours. Do not create duplicate tasks. Instead, use polling to retrieve the result.
For a beginner's tutorial, see Postman.

Request parameters

Video generation from the first frame

Generate a video based on a first frame image and audio.

    curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
        -H 'X-DashScope-Async: enable' \
        -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
        -H 'Content-Type: application/json' \
        -d '{
        "model": "wan2.7-i2v",
        "input": {
            "prompt": "A scene of urban fantasy art. A dynamic graffiti art character. A boy made of spray paint comes to life on a concrete wall. He sings an English rap song at high speed while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The light comes from a single street lamp, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of the rap, with no other dialogue or noise.",
            "media": [
                {
                    "type": "first_frame",
                    "url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/wpimhv/rap.png"
                },
                {
                    "type": "driving_audio",
                    "url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/ozwpvi/rap.mp3"
                }
            ]
        },
        "parameters": {
            "resolution": "720P",
            "duration": 10,
            "prompt_extend": true,
            "watermark": true
        }
    }'

Video generation from the first and last frames

Pass a first frame and a last frame to generate a video.

    curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
        -H 'X-DashScope-Async: enable' \
        -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
        -H 'Content-Type: application/json' \
        -d '{
        "model": "wan2.7-i2v",
        "input": {
            "prompt": "Realistic style, a small black cat looks up at the sky curiously. The camera angle gradually rises from eye level, finally capturing its curious gaze from a top-down view.",
            "media": [
                {
                    "type": "first_frame",
                    "url": "https://wanx.alicdn.com/material/20250318/first_frame.png"
                },
                {
                    "type": "last_frame",
                    "url": "https://wanx.alicdn.com/material/20250318/last_frame.png"
                }
            ]
        },
        "parameters": {
            "resolution": "720P",
            "duration": 10,
            "prompt_extend": false,
            "watermark": true
        }
    }'

Video continuation

Generate subsequent content based on an initial video clip.

    curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
        -H 'X-DashScope-Async: enable' \
        -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
        -H 'Content-Type: application/json' \
        -d '{
        "model": "wan2.7-i2v",
        "input": {
            "prompt": "A dog wearing sunglasses skateboards on a street, 3D cartoon.",
            "media": [
                {
                    "type": "first_clip",
                    "url": "http://wanx.alicdn.com/material/20250318/video_extension_1.mp4"
                }
            ]
        },
        "parameters": {
            "resolution": "720P",
            "duration": 10,
            "prompt_extend": true,
            "watermark": true
        }
    }'

Headers
Content-Type `string` (Required) The content type of the request. Must be `application/json`.
Authorization `string` (Required) The authentication credentials using a Model Studio API key. Example: `Bearer sk-xxxx`
X-DashScope-Async `string` (Required) Enables asynchronous processing. Must be `enable`, because HTTP requests support only asynchronous processing. Important: If this header is omitted, the request fails with the error "current user api does not support synchronous calls".
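
As an illustration of the three required headers, the following sketch assembles the task-creation request with Python's standard library. The function name `build_create_request` is hypothetical; the URL is the Singapore endpoint from this topic. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Singapore endpoint from this topic; swap in the Beijing URL for that region.
CREATE_URL = (
    "https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/"
    "video-generation/video-synthesis"
)


def build_create_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a task."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
        # Without this header the service rejects the call as synchronous.
        "X-DashScope-Async": "enable",
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(CREATE_URL, data=data, headers=headers, method="POST")
```

To actually submit the task, pass the returned object to `urllib.request.urlopen` and read the JSON response.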
Request body
model `string` (Required) The model name. For a list of models and their pricing, see Model pricing. Example: wan2.7-i2v.
input `object` (Required) The basic input information, such as the prompt.

Properties

prompt `string` (Optional)
The text prompt. Describes the elements and visual characteristics of the generated video. Chinese and English are supported. Each Chinese character or letter counts as one character. Text that exceeds the limit is automatically truncated. The length limit varies by model version:
- wan2.7-i2v: up to 5,000 characters.
Example: A kitten runs on the grass.
For more information about how to use prompts, see Prompt guide for text-to-video and image-to-video.

negative_prompt `string` (Optional)
The negative prompt. Describes content you do not want in the video. Chinese and English are supported. The negative prompt can be up to 500 characters long. Text that exceeds the limit is automatically truncated.
Example: low resolution, error, worst quality, low quality, deformed, extra fingers, bad proportions.

media `array` (Required)
Specifies reference materials (images, audio, and video) for video generation. Each element in the array is a media object that contains the `type` and `url` fields.

Asset combinations
Only the following asset combinations are supported. Invalid combinations result in an error.
- Video generation from the first frame:
  - First frame: `first_frame`
  - First frame + audio: `first_frame+driving_audio`
- Video generation from the first and last frames:
  - First frame + last frame: `first_frame+last_frame`
  - First frame + last frame + audio: `first_frame+last_frame+driving_audio`
- Video continuation:
  - First video clip continuation: `first_clip`
  - First video clip + last frame continuation: `first_clip+last_frame`

Properties

type `string` (Required)
The type of media asset. Valid values:
- `first_frame`
- `last_frame`
- `driving_audio`
- `first_clip`
Limit: Each `type` can appear at most once in the `media` array. For example, you cannot pass two `first_frame` assets.

url `string` (Required)
The URL of the media asset.

Pass an image (type=first_frame or last_frame)
The URL of the first or last frame.
Image limits:
- Format: JPEG, JPG, PNG (alpha channel not supported), BMP, WEBP.
- Resolution: The width and height must be in the range of [240, 8000] pixels.
- Aspect ratio: 1:8 to 8:1.
- File size: up to 20 MB.
Supported input formats:
- Public URL: The HTTP and HTTPS protocols are supported. Example: https://xxx/xxx.png.

Pass audio (type=driving_audio)
The URL of the audio file.
- Pass audio: The model uses the audio as a driving source to generate the video, such as for lip-syncing and action timing.
- Do not pass audio: The model automatically generates matching background music or sound effects based on the video content.
Audio limits:
- Format: WAV, MP3.
- Duration: 2 s to 30 s.
- File size: up to 15 MB.
- Truncation: If the audio is longer than the `duration` value (for example, 5 s), only the first 5 s are used and the rest is discarded. If the audio is shorter than the video, the part of the video beyond the audio is silent. For example, if the audio is 3 s long and the video is 5 s long, the first 3 s of the output video have sound and the last 2 s are silent.
Supported input formats:
- Public URL: The HTTP and HTTPS protocols are supported. Example: https://xxx/xxx.mp3.

Pass a video (type=first_clip)
The URL of the video file. The model continues the video based on its content. The `duration` parameter caps the total length of the output video, input clip included, so the continuation is at most `duration` minus the input length. For example, if duration=15 and the input video is 3 s long, the model generates a 12-s continuation. The final output video is 15 s long and is billed for 15 s.
Video limits:
- Format: MP4, MOV.
- Duration: 2 s to 10 s.
- Resolution: The width and height must be in the range of [240, 4096] pixels.
- Aspect ratio: 1:8 to 8:1.
- File size: up to 100 MB.
Supported input formats:
- Public URL: The HTTP and HTTPS protocols are supported. Example: https://xxx/xxx.mp4.
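
The asset-combination rules and the continuation arithmetic described above can be checked on the client before a task is submitted. The sketch below is a hypothetical pre-flight check, not part of the API: `validate_media` and `continuation_seconds` are illustrative names, and the service remains the final authority on what it accepts.

```python
# The six combinations listed under "Asset combinations" in this topic.
ALLOWED_COMBINATIONS = {
    frozenset({"first_frame"}),
    frozenset({"first_frame", "driving_audio"}),
    frozenset({"first_frame", "last_frame"}),
    frozenset({"first_frame", "last_frame", "driving_audio"}),
    frozenset({"first_clip"}),
    frozenset({"first_clip", "last_frame"}),
}


def validate_media(media: list[dict]) -> None:
    """Raise ValueError if the media array breaks a documented rule."""
    types = [str(item.get("type")) for item in media]
    if len(types) != len(set(types)):
        raise ValueError("each media type may appear at most once")
    for item in media:
        if not item.get("url"):
            raise ValueError(f"media item of type {item.get('type')!r} has no url")
    if frozenset(types) not in ALLOWED_COMBINATIONS:
        raise ValueError(f"unsupported media combination: {sorted(types)}")


def continuation_seconds(duration: int, clip_seconds: float) -> float:
    """Upper bound on newly generated footage when continuing a clip.

    Follows the example in the text: duration=15 with a 3 s clip leaves 12 s.
    """
    return duration - clip_seconds
```

For example, `validate_media` accepts a `first_frame` plus `driving_audio` pair and rejects `first_clip` combined with `driving_audio`, which is not among the listed combinations.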
parameters `object` (Optional) Video processing parameters, such as resolution, duration, prompt rewriting, and watermarks.

Properties

resolution `string` (Optional)
Important: The resolution directly affects the cost. Before you make a call, confirm the Model pricing.
The resolution tier for the generated video. Controls the total pixel count. The model automatically scales the video to a total pixel count close to the selected resolution tier. The video's aspect ratio should be as consistent as possible with the input material (first frame or first video clip). For more information, see FAQ.
- wan2.7-i2v: Valid values are 720P and 1080P. Default: `1080P`.
Example: 1080P.

duration `integer` (Optional)
Important: The duration directly affects the cost. Billing is by the second. Before you make a call, confirm the Model pricing.
The duration of the generated video in seconds.
- wan2.7-i2v: an integer from 2 to 15. Default: 5.
Example: 5.

prompt_extend `boolean` (Optional)
Specifies whether to enable prompt rewriting. When enabled, a large language model rewrites the input prompt. This can improve results for short prompts but increases the running time.
- `true` (default)
- `false`
Example: true.

watermark `boolean` (Optional)
Specifies whether to add a watermark. The watermark is placed in the lower-right corner of the video and contains the fixed text "AI Generated".
- `false` (default)
- `true`
Example: false.

seed `integer` (Optional)
The random number seed. Must be an integer between `0` and `2147483647`. If not provided, a random seed is generated. Using a fixed seed improves reproducibility, though results may still vary due to model randomness.
Example: `12345`
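
Because resolution and duration both affect cost, it can help to validate them locally before submitting. The following is a minimal sketch that mirrors the documented ranges and defaults for wan2.7-i2v; the function name `build_parameters` is hypothetical and the service may apply further checks.

```python
def build_parameters(
    resolution: str = "1080P",
    duration: int = 5,
    prompt_extend: bool = True,
    watermark: bool = False,
    seed=None,
) -> dict:
    """Return a `parameters` object, using the documented defaults."""
    if resolution not in ("720P", "1080P"):
        raise ValueError("resolution must be 720P or 1080P")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, int):
        raise ValueError("duration must be an integer")
    if not 2 <= duration <= 15:
        raise ValueError("duration must be between 2 and 15 seconds")
    params = {
        "resolution": resolution,
        "duration": duration,
        "prompt_extend": prompt_extend,
        "watermark": watermark,
    }
    if seed is not None:
        if isinstance(seed, bool) or not isinstance(seed, int):
            raise ValueError("seed must be an integer")
        if not 0 <= seed <= 2147483647:
            raise ValueError("seed must be between 0 and 2147483647")
        params["seed"] = seed  # omitted entirely when no seed is requested
    return params
```

Leaving `seed` unset keeps the key out of the request, so the service generates a random seed as described above.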

Response parameters

Successful response

Save the `task_id` to query the task status and result.

    {
        "output": {
            "task_status": "PENDING",
            "task_id": "0385dc79-5ff8-4d82-bcb6-xxxxxx"
        },
        "request_id": "4909100c-7b5a-9f92-bfe5-xxxxxx"
    }

Error response

Task creation failed. See error codes to resolve the issue.

    {
        "code": "InvalidApiKey",
        "message": "No API-key provided.",
        "request_id": "7438d53d-6eb8-4596-8835-xxxxxx"
    }

output `object` The task output information.

Properties

task_id `string` The ID of the task. Can be used to query the task for up to 24 hours.

task_status `string` The status of the task.
Enumeration
- PENDING
- RUNNING
- SUCCEEDED
- FAILED
- CANCELED
- UNKNOWN: Task does not exist or status is unknown
request_id `string` Unique identifier for the request. Use for tracing and troubleshooting issues.
code `string` The error code. Returned only when the request fails. See error codes for details.
message `string` Detailed error message. Returned only when the request fails. See error codes for details.
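
The two response shapes above can be told apart by the presence of the top-level `code` field. The sketch below shows one way to extract the `task_id` or surface the error; the class `TaskCreationError` and the function `extract_task_id` are hypothetical names introduced for illustration.

```python
import json


class TaskCreationError(Exception):
    """Raised when the creation response carries an error code."""

    def __init__(self, code: str, message: str, request_id: str):
        super().__init__(f"{code}: {message} (request_id={request_id})")
        self.code = code
        self.request_id = request_id


def extract_task_id(response_text: str) -> str:
    """Return the task_id from a creation response, or raise on error."""
    payload = json.loads(response_text)
    if "code" in payload:  # error responses carry code and message at the top level
        raise TaskCreationError(
            payload["code"],
            payload.get("message", ""),
            payload.get("request_id", ""),
        )
    return payload["output"]["task_id"]
```

Keeping the `request_id` on the exception makes it available for the tracing and troubleshooting use described above.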

Step 2: Query the result by task ID

Singapore

GET https://dashscope-intl.aliyuncs.com/api/v1/tasks/{task_id}

Beijing

GET https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}

Note

Polling suggestion: Video generation can take several minutes. Use a polling mechanism with a reasonable query interval, such as 15 seconds, to retrieve the result.
Task status transition: PENDING → RUNNING → SUCCEEDED or FAILED.
Result URL: After the task is successful, a video URL is returned. The URL is valid for 24 hours. After you retrieve the URL, you must immediately download and save the video to a permanent storage service, such as Object Storage Service (OSS).
task_id validity: 24 hours. After this period, you cannot query the result, and the API returns a task status of UNKNOWN.
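
The polling advice above can be sketched as a loop that queries until the task reaches a final state. This is an illustrative sketch under stated assumptions: `poll_task` is a hypothetical name, `fetch` is any caller-supplied function that returns the parsed query response, the 600-second timeout is an arbitrary choice, and treating CANCELED and UNKNOWN as final states is an assumption rather than documented behavior.

```python
import time
from typing import Callable

# SUCCEEDED and FAILED are the documented end states; CANCELED and UNKNOWN
# are treated as final here by assumption, since polling further is unlikely to help.
FINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELED", "UNKNOWN"}


def poll_task(
    fetch: Callable[[], dict],
    interval: float = 15.0,   # query interval suggested in this topic
    timeout: float = 600.0,   # assumed upper bound, not from the API
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch() until the task reaches a final status or time runs out."""
    deadline = clock() + timeout
    while True:
        result = fetch()
        status = result.get("output", {}).get("task_status")
        if status in FINAL_STATUSES:
            return result
        if clock() + interval > deadline:
            raise TimeoutError(f"task still {status} after {timeout} seconds")
        sleep(interval)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting; in normal use the defaults are sufficient.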

Request parameters

Query task result

Replace `{task_id}` with the `task_id` value returned by the previous API call. `task_id` is valid for queries within 24 hours.

    curl -X GET https://dashscope-intl.aliyuncs.com/api/v1/tasks/{task_id} \
        --header "Authorization: Bearer $DASHSCOPE_API_KEY"
Headers
Authorization `string` (Required) The authentication credentials using a Model Studio API key. Example: `Bearer sk-xxxx`
Path parameters
task_id `string` (Required) The ID of the task to query.
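
For reference, the query request can be assembled in Python in the same way as the curl command. This is a sketch only: `build_query_request` is a hypothetical helper, and the default base URL is the Singapore endpoint from this topic.

```python
import urllib.parse
import urllib.request


def build_query_request(
    api_key: str,
    task_id: str,
    base_url: str = "https://dashscope-intl.aliyuncs.com/api/v1",
) -> urllib.request.Request:
    """Build (but do not send) the GET request that queries one task."""
    if not task_id:
        raise ValueError("task_id is required")
    # Percent-encode the ID so it is always a single path segment.
    path_id = urllib.parse.quote(task_id, safe="")
    return urllib.request.Request(
        f"{base_url}/tasks/{path_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
```

Sending the request with `urllib.request.urlopen` and decoding the body with `json.loads` yields the response described in the next section.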

Response parameters

Task successful

Video URLs are retained for only 24 hours and then automatically purged. Save generated videos promptly.

    {
        "request_id": "2ca1c497-f9e0-449d-9a3f-xxxxxx",
        "output": {
            "task_id": "af6efbc0-4bef-4194-8246-xxxxxx",
            "task_status": "SUCCEEDED",
            "submit_time": "2025-09-25 11:07:28.590",
            "scheduled_time": "2025-09-25 11:07:35.349",
            "end_time": "2025-09-25 11:17:11.650",
            "orig_prompt": "A scene of urban fantasy art. A dynamic graffiti art character. A boy made of spray paint comes to life on a concrete wall. He sings an English rap song at high speed while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The light comes from a single street lamp, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of his rap, with no other dialogue or noise.",
            "video_url": "https://dashscope-result-sh.oss-cn-shanghai.aliyuncs.com/xxx.mp4?Expires=xxx"
        },
        "usage": {
            "duration": 15,
            "input_video_duration": 0,
            "output_video_duration": 15,
            "video_count": 1,
            "SR": 720
        }
    }

Task failed

When a task fails, `task_status` is set to FAILED with an error code and message. See error codes to resolve the issue.

    {
        "request_id": "e5d70b02-ebd3-98ce-9fe8-759d7d7b107d",
        "output": {
            "task_id": "86ecf553-d340-4e21-af6e-a0c6a421c010",
            "task_status": "FAILED",
            "code": "InvalidParameter",
            "message": "The size is not match xxxxxx"
        }
    }

Task query expired

The `task_id` is valid for 24 hours. After this period, queries no longer return the result and the response reports a status of UNKNOWN.

    {
        "request_id": "a4de7c32-7057-9f82-8581-xxxxxx",
        "output": {
            "task_id": "502a00b1-19d9-4839-a82f-xxxxxx",
            "task_status": "UNKNOWN"
        }
    }

output `object` The task output information.

Properties

task_id `string` The ID of the task. Can be used to query the task for up to 24 hours.

task_status `string` The status of the task.
Enumeration
- PENDING
- RUNNING
- SUCCEEDED
- FAILED
- CANCELED
- UNKNOWN: Task does not exist or status is unknown
Status transitions during polling:
- PENDING → RUNNING → SUCCEEDED or FAILED
- The first query typically returns PENDING or RUNNING
- A SUCCEEDED status includes the generated video URL in the response
- A FAILED status requires checking the error message and retrying

submit_time `string` The time when the task was submitted. Time is in UTC+8. Format: `YYYY-MM-DD HH:mm:ss.SSS`.

scheduled_time `string` The time when the task started running. Time is in UTC+8. Format: `YYYY-MM-DD HH:mm:ss.SSS`.

end_time `string` The time when the task was completed. Time is in UTC+8. Format: `YYYY-MM-DD HH:mm:ss.SSS`.

video_url `string` The URL of the generated video. Returned only when `task_status` is SUCCEEDED. The URL is valid for 24 hours. Use it to download the video in MP4 format with H.264 encoding.

orig_prompt `string` The original input prompt. This is the value of the `prompt` request parameter.

code `string` The error code. Returned only when the request fails. See error codes for details.

message `string` Detailed error message. Returned only when the request fails. See error codes for details.

usage `object` Statistics for the output information. Only successful results are counted.

Properties

input_video_duration `integer` The duration of the input video in seconds.

output_video_duration `integer` The duration of the output video in seconds.

duration `integer` The total video duration, used for billing.

SR `integer` The resolution tier of the output video. Example: 720.

video_count `integer` The number of output videos. The value is fixed at 1.
request_id `string` Unique identifier for the request. Use for tracing and troubleshooting issues.
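
To show how the fields above fit together, the sketch below reads a query response, converts the UTC+8 timestamps into timezone-aware values, and pulls out the video URL or the error. The names `parse_task_time` and `summarize_result` are hypothetical, and the summary layout is an illustrative choice.

```python
from datetime import datetime, timedelta, timezone

# Task timestamps are documented as UTC+8.
UTC8 = timezone(timedelta(hours=8))


def parse_task_time(value: str) -> datetime:
    """Parse 'YYYY-MM-DD HH:mm:ss.SSS' into a timezone-aware datetime."""
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S.%f").replace(tzinfo=UTC8)


def summarize_result(result: dict) -> dict:
    """Reduce a query response to the fields most callers need."""
    output = result["output"]
    status = output["task_status"]
    summary = {"status": status}
    if status == "SUCCEEDED":
        summary["video_url"] = output["video_url"]
        summary["billed_seconds"] = result["usage"]["duration"]
        started = parse_task_time(output["scheduled_time"])
        ended = parse_task_time(output["end_time"])
        summary["run_seconds"] = (ended - started).total_seconds()
    elif status == "FAILED":
        summary["error"] = f'{output.get("code")}: {output.get("message")}'
    return summary
```

Attaching the UTC+8 offset at parse time avoids silent errors when the values are later compared with timestamps from other time zones.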

Limitations

Data validity: The task_id and video_url are retained for only 24 hours. After this period, you cannot query or download them.
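
Since the video URL stops working after 24 hours, one reasonable approach is to copy the file to your own storage as soon as the task succeeds. The following sketch streams the file to a local path with the standard library; `save_video` is a hypothetical helper, the 60-second timeout is an arbitrary choice, and uploading to a permanent store such as OSS is left out.

```python
import shutil
import urllib.request
from pathlib import Path


def save_video(video_url: str, destination: Path) -> int:
    """Stream the video to `destination` and return the number of bytes saved."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    # Copy in chunks so the whole video is never held in memory at once.
    with urllib.request.urlopen(video_url, timeout=60) as response, \
            open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
    return destination.stat().st_size
```

A call such as `save_video(video_url, Path("videos/result.mp4"))` would run right after the polling loop reports SUCCEEDED.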
Content moderation: The input content (such as prompts, images, and videos) and the output video are subject to content moderation. If the content violates the rules, the system returns an "IPInfringementSuspect" or "DataInspectionFailed" error. For more information, see Error messages.

Error codes

If a model call fails and returns an error message, see Error messages to resolve the issue.

FAQ

Q: How do I generate a video with a specific aspect ratio, such as 3:4?

A: The output video's aspect ratio is determined by the input material (first frame image or first video clip). However, the output aspect ratio is not guaranteed to be exactly the same as the input ratio. For example, it may not be exactly 3:4. A slight drift may occur.

The following explanation uses an input first frame image as the example:

Why does drift occur?
- Execution logic: The system uses the input image's aspect ratio as a baseline reference. It combines this with the target total pixels of the resolution tier. The video width and height must be multiples of 16 because of video encoding specifications. The system automatically adjusts to the closest valid resolution.
- Calculation example: An input first frame image measures 750 × 1000 pixels (aspect ratio 3:4 = 0.75). The resolution is set to "720P" (with a target of approximately 920,000 total pixels). The actual output video resolution is 816 × 1104 pixels (aspect ratio ≈ 0.739, approximately 900,000 total pixels).
Recommendations:
- Input control: Use a first frame or video clip that matches your target aspect ratio.
- Post-processing: If you have strict aspect ratio requirements, use an editing tool to crop the video or add black bars after generation.
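
The calculation example above can be approximated in a few lines. This is a rough sketch, not the service's actual algorithm: it assumes the ideal width and height are derived from the target pixel count and the input aspect ratio, then rounded down to a multiple of 16. That assumption reproduces the 816 × 1104 figure quoted above, but other inputs may be handled differently by the service. The function name `estimate_output_size` and the 920,000-pixel target for 720P (the approximate figure quoted in the text) are illustrative.

```python
import math

# Approximate pixel budget for the 720P tier, as quoted in the text above.
TARGET_PIXELS_720P = 920_000


def estimate_output_size(width: int, height: int,
                         target_pixels: int = TARGET_PIXELS_720P,
                         multiple: int = 16) -> tuple[int, int]:
    """Estimate the output size for an input of width x height.

    Assumption: each side is rounded DOWN to the nearest multiple of 16.
    """
    ratio = width / height
    ideal_width = math.sqrt(target_pixels * ratio)
    ideal_height = math.sqrt(target_pixels / ratio)
    out_width = int(ideal_width // multiple) * multiple
    out_height = int(ideal_height // multiple) * multiple
    return out_width, out_height


if __name__ == "__main__":
    w, h = estimate_output_size(750, 1000)
    print(w, h, round(w / h, 3), w * h)
```

For a 750 × 1000 input, the ideal size is roughly 830 × 1107 pixels; rounding each side down to a multiple of 16 gives 816 × 1104, an aspect ratio of about 0.739 instead of 0.75, which is the drift described above.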

Q: How do I get the whitelist of domain names for video storage access?

A: Videos generated by models are stored in OSS, and the API returns a temporary public URL. If you need to configure a firewall whitelist for this download URL, note that the underlying storage may change dynamically. To avoid access issues caused by outdated information, this topic does not provide a fixed OSS domain name whitelist. If you have security control requirements, contact your account manager to obtain the latest OSS domain name list.
