Wan wan2.2-s2v digital human lip-sync video generation API reference - Alibaba Cloud Model Studio

The wan2.2-s2v digital human model can generate videos of a person speaking, singing, or performing with natural movements from a single image and an audio file.

Audio-driven: An input audio file drives a character in a static image, synchronizing its lip movements, expressions, and actions with the audio.
Rich scenarios: Supports three lip-sync scenarios: "speaking", "singing", and "performing".
Diverse characters: Supports real people (portraits, half-body, and full-body) and cartoon characters.
Output video resolution: Supports 480p and 720p resolution options.

Important

This document applies only to the China (Beijing) region. To use the model, you must use an API key from the China (Beijing) region.

Model and pricing

Beijing region

Model	Unit price (480P)	Unit price (720P)
wan2.2-s2v	$0.071677/second	$0.129018/second

Rate limits (shared by an Alibaba Cloud account and its RAM users) are expressed as a task submission RPS limit and a number of concurrent tasks.

Billing example

The billing formula is: Total cost = Actual video duration (seconds) × Unit price of the selected resolution.

Assume you generate a video at the 480P resolution, and the `usage.duration` field returned for the successful task is 10.23 seconds.

Cost calculation: 10.23 seconds × $0.071677/second = $0.73325571

Note: The billable duration is the value of the `usage.duration` field returned for the successful task.
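The arithmetic above can be reproduced in a few lines of Python. This is a minimal sketch: `UNIT_PRICE_USD` and `estimate_cost` are hypothetical names introduced here for illustration, and the prices are copied from the pricing table in this section. `Decimal` is used so that the result matches the hand calculation digit for digit.

```python
from decimal import Decimal

# Unit prices per second of generated video, copied from the pricing table.
UNIT_PRICE_USD = {
    "480P": Decimal("0.071677"),
    "720P": Decimal("0.129018"),
}


def estimate_cost(duration_seconds, resolution="480P"):
    """Return duration x unit price (hypothetical helper, not part of the API)."""
    # str() keeps a float such as 10.23 from expanding into its binary approximation.
    return Decimal(str(duration_seconds)) * UNIT_PRICE_USD[resolution]


print(estimate_cost(10.23, "480P"))  # 0.73325571
```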

HTTP API

Prerequisites

You have activated the service and obtained an API key. For more information, see Preparations: Configure an API key.
You have configured the environment variable for your API key. For more information, see Configure an API key as an environment variable.

Step 1: Create a task and get the task ID

POST https://dashscope.aliyuncs.com/api/v1/services/aigc/image2video/video-synthesis

Note

Video generation takes a long time, so the task is created through an asynchronous invocation.
After the task is created, the system immediately returns a `task_id`. In the next step, use this `task_id` to query the task result within 24 hours.

Request parameters

Field	Type	Passing method	Required	Description	Example
Content-Type	String	Header	Yes	The request type. Set the value to application/json.	application/json
Authorization	String	Header	Yes	The API key. The format is Bearer sk-xxx.	Bearer sk-1a**2b
X-DashScope-Async	String	Header	Yes	A static field set to `enable`, which indicates that an asynchronous invocation is used.	enable
model	String	Body	Yes	The model to call.	wan2.2-s2v
input.image_url	String	Body	Yes	The URL of the uploaded image. Image format: JPG, JPEG, PNG, BMP, and WEBP are supported. Image resolution: The width and height of the image must be between 400 and 7,000 pixels. Only HTTP/HTTPS links accessible over the Internet are supported.	http://aaa/bbb.jpg
input.audio_url	String	Body	Yes	The URL of the uploaded audio file. Audio format: WAV and MP3 are supported. Audio limits: The file size must be less than 15 MB, and the duration must be less than 20 seconds. Audio content: The audio must contain clear and loud human speech. Remove interference such as ambient noise and background music. Only HTTP/HTTPS links accessible over the Internet are supported.	http://aaa/bbb.mp3
parameters.resolution	String	Body	No	The video resolution level. Valid values: 480P and 720P. Default value: 480P. The model tries to keep the aspect ratio of the output video the same as that of the input image, and scales the total pixel count to be close to the selected level. Reference sizes: 480P is typically 640 × 480 (about 310,000 pixels) with a 4:3 aspect ratio; 720P is typically 1280 × 720 (about 920,000 pixels) with a 16:9 aspect ratio. Example: If the input image has a 4:5 aspect ratio and you select 480P, the output video keeps the 4:5 aspect ratio at a resolution close to 310,000 pixels, such as 480 × 600 (288,000 pixels). These figures are for reference only. The actual output may vary.	480P
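A local pre-check of the documented limits can catch obvious problems before a task is submitted. The following is a sketch under stated assumptions, not part of the API: `check_inputs` is a hypothetical helper, the caller is assumed to have already measured the image dimensions, audio size, and audio duration, and "15 MB" is assumed to mean 15 × 1024 × 1024 bytes.

```python
from urllib.parse import urlparse

MIN_SIDE, MAX_SIDE = 400, 7000          # image width and height limits, in pixels
MAX_AUDIO_BYTES = 15 * 1024 * 1024      # assumption: "15 MB" means 15 MiB
MAX_AUDIO_SECONDS = 20
RESOLUTIONS = {"480P", "720P"}


def check_inputs(image_url, image_size, audio_url, audio_bytes,
                 audio_seconds, resolution="480P"):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for name, url in (("image_url", image_url), ("audio_url", audio_url)):
        if urlparse(url).scheme not in ("http", "https"):
            problems.append(f"{name} must be an HTTP or HTTPS link")
    width, height = image_size
    if not all(MIN_SIDE <= side <= MAX_SIDE for side in (width, height)):
        problems.append("image width and height must be between 400 and 7,000 pixels")
    if audio_bytes >= MAX_AUDIO_BYTES:
        problems.append("audio file must be smaller than 15 MB")
    if audio_seconds >= MAX_AUDIO_SECONDS:
        problems.append("audio must be shorter than 20 seconds")
    if resolution not in RESOLUTIONS:
        problems.append("resolution must be 480P or 720P")
    return problems
```

The check cannot tell whether a link is reachable over the Internet or whether the audio contains clear speech, so a task that passes it can still fail on the server side.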

Response parameters

Field	Type	Description	Example
output.task_id	String	The unique ID of the asynchronous task.	a8532587-fa8c-4ef8-82be-0c46b17950d1
output.task_status	String	The task status returned immediately after the asynchronous task is submitted.	PENDING
request_id	String	The unique ID of the request.	7574ee8f-38a3-4b1e-9280-11c33ab46e51

Sample request

curl 'https://dashscope.aliyuncs.com/api/v1/services/aigc/image2video/video-synthesis' \
 --header 'X-DashScope-Async: enable' \
 --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
 --header 'Content-Type: application/json' \
 --data '{
    "model": "wan2.2-s2v",
    "input": {
        "image_url": "https://img.alicdn.com/imgextra/i3/O1CN011FObkp1T7Ttowoq4F_!!6000000002335-0-tps-1440-1797.jpg",
        "audio_url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250825/iaqpio/input_audio.MP3"
    },
    "parameters": {
        "resolution": "480P"
    }
}'
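The same request can be built in Python with only the standard library. This is a minimal sketch: `build_submit_request` and `submit_task` are hypothetical helper names, while the endpoint, headers, and body fields are taken from the request parameters table above.

```python
import json
import urllib.request

SUBMIT_URL = ("https://dashscope.aliyuncs.com/api/v1/services/aigc/"
              "image2video/video-synthesis")


def build_submit_request(api_key, image_url, audio_url, resolution="480P"):
    """Build the POST request for task creation without sending it."""
    body = {
        "model": "wan2.2-s2v",
        "input": {"image_url": image_url, "audio_url": audio_url},
        "parameters": {"resolution": resolution},
    }
    return urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "X-DashScope-Async": "enable",  # required: asynchronous invocation
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_task(request, timeout=30):
    """Send the request and return the decoded JSON body (needs network access)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

To run it, read the key from the `DASHSCOPE_API_KEY` environment variable, pass it to `build_submit_request` together with the two URLs, and hand the result to `submit_task`. Building and sending are kept separate so that the request can be inspected or logged before any call is made.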

Sample response

{
    "output": {
        "task_id": "a8532587-fa8c-4ef8-82be-xxxxxx",
        "task_status": "PENDING"
    },
    "request_id": "7574ee8f-38a3-4b1e-9280-xxxxxx"
}
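The only value the next step needs from this response is `output.task_id`. The following sketch shows one way to extract it defensively. `extract_task_id` is a hypothetical helper, and the shape of an error body at submission time is not described in this section, so the function only reports that the ID is missing.

```python
import json


def extract_task_id(response_text):
    """Return output.task_id from a task-creation response body."""
    payload = json.loads(response_text)
    task_id = (payload.get("output") or {}).get("task_id")
    if not task_id:
        # Keep request_id in the message; it is the handle for troubleshooting.
        raise ValueError(
            f"no task_id in response (request_id={payload.get('request_id')})"
        )
    return task_id


sample = """
{
    "output": {"task_id": "a8532587-fa8c-4ef8-82be-xxxxxx", "task_status": "PENDING"},
    "request_id": "7574ee8f-38a3-4b1e-9280-xxxxxx"
}
"""
print(extract_task_id(sample))  # a8532587-fa8c-4ef8-82be-xxxxxx
```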

Step 2: Query the result by task ID

Use the task_id from the previous step to send a GET request to query the task status and result. Replace {task_id} in the URL with your actual task ID.

GET https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}

Note

Video generation tasks take a long time. Use a polling mechanism and set a reasonable query interval, such as 10 seconds, to retrieve the result.
The video_url returned for a successful task is valid for 24 hours. Download and save the video promptly.
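The polling advice above can be sketched as a small loop. Several things here are assumptions rather than documented behavior: the helper names `fetch_task` and `wait_for_task`, the cap of 90 polls (about 15 minutes at a 10-second interval), and the choice to stop on `UNKNOWN` as well as on `SUCCEEDED`, `FAILED`, and `CANCELED`. The `fetch` and `sleep` arguments are injectable so that the loop can be exercised without network access.

```python
import json
import time
import urllib.request

TASK_URL = "https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}"
# Assumption: UNKNOWN is treated as final so the loop does not spin on it.
FINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELED", "UNKNOWN"}


def fetch_task(task_id, api_key):
    """GET the task status and result once (needs network access)."""
    request = urllib.request.Request(
        TASK_URL.format(task_id=task_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_task(task_id, api_key, fetch=fetch_task, sleep=time.sleep,
                  interval=10.0, max_polls=90):
    """Poll until the task reaches a final status, then return the last response."""
    for _ in range(max_polls):
        result = fetch(task_id, api_key)
        if result["output"]["task_status"] in FINAL_STATUSES:
            return result
        sleep(interval)
    raise TimeoutError(f"task {task_id} still running after {max_polls} polls")
```

Because the returned `video_url` expires after 24 hours, a caller would normally download the file as soon as `wait_for_task` returns a `SUCCEEDED` result.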

Request parameters

Field	Type	Passing method	Required	Description	Example
Authorization	String	Header	Yes	The API key. Example: Bearer sk-xxx.	Bearer sk-xxx
task_id	String	URL path	Yes	The ID of the task to query.	a8532587-fa8c-4ef8-82be-0c46b17950d1

Response parameters

Field	Type	Description	Example
output.task_id	String	The ID of the queried task.	a8532587-fa8c-4ef8-82be-0c46b17950d1
output.task_status	String	The task status. Possible values: PENDING, RUNNING, SUCCEEDED, FAILED, UNKNOWN, and CANCELED.	SUCCEEDED
output.submit_time	String	The time the task was submitted.	2025-09-01 09:37:27.468
output.scheduled_time	String	The time the task started running.	2025-09-01 09:37:34.885
output.end_time	String	The time the task was completed.	2025-09-01 09:40:20.734
output.results.video_url	String	The generated video file. The video_url is valid for 24 hours. Download it promptly.	https://xxx/1.mp4?Expires=xxx
usage.duration	Float	The video duration in seconds. This is used for billing by the second.	10.23
usage.video_count	Integer	The number of videos generated.	1
usage.SR	Integer	The resolution level of the generated video.	480
output.code	String	The error code. This parameter is returned when the task fails.	InvalidParameter
output.message	String	The error details. This parameter is returned when the task fails.	The request is missing required parameters or in a wrong format
request_id	String	The unique ID of the request.	7574ee8f-38a3-4b1e-9280-11c33ab46e51
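The three timestamps make it possible to separate queueing time from generation time. The sketch below parses them with the format used in the examples in this table. `task_timings` is a hypothetical helper, and because the time zone of these fields is not stated here, only differences between them are computed.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"  # matches e.g. 2025-09-01 09:37:27.468


def task_timings(output):
    """Return seconds spent queued and seconds spent running for a finished task."""
    submitted = datetime.strptime(output["submit_time"], TIME_FORMAT)
    scheduled = datetime.strptime(output["scheduled_time"], TIME_FORMAT)
    ended = datetime.strptime(output["end_time"], TIME_FORMAT)
    return {
        "queued_seconds": (scheduled - submitted).total_seconds(),
        "running_seconds": (ended - scheduled).total_seconds(),
    }


example = {
    "submit_time": "2025-09-01 09:37:27.468",
    "scheduled_time": "2025-09-01 09:37:34.885",
    "end_time": "2025-09-01 09:40:20.734",
}
print(task_timings(example))  # {'queued_seconds': 7.417, 'running_seconds': 165.849}
```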

Sample request

Replace 86ecf553-d340-4e21-xxxxxxxxx with your actual task ID.

curl -X GET https://dashscope.aliyuncs.com/api/v1/tasks/86ecf553-d340-4e21-xxxxxxxxx \
--header "Authorization: Bearer $DASHSCOPE_API_KEY"

Note

You can query the task result using the `task_id` only within 24 hours. After that, the result is automatically purged by the system.

Sample responses

Successful response

Task data, such as the task status and video URL, is retained for only 24 hours. After that, the data is automatically purged. Save the results promptly.

{
    "output": {
        "task_id": "bcae8761-f242-4775-a11e-xxxxxx",
        "task_status": "SUCCEEDED",
        "submit_time": "2025-09-01 09:37:27.468",
        "scheduled_time": "2025-09-01 09:37:34.885",
        "end_time": "2025-09-01 09:40:20.734",
        "results": {
            "video_url": "http://dashscope-result-hz.oss-cn-hangzhou.aliyuncs.com/1d/xxx.mp4?Expires=xxxxxx"
        }
    },
    "usage": {
        "duration": 18.13,
        "video_count": 1,
        "SR": 480
    },
    "request_id": "28cfedb1-cd60-9e0c-b920-xxxxxx"
}

Failed response

{
    "request_id": "8d49f522-f6a4-9eed-b322-xxxxxx",
    "output": {
        "task_id": "101ad32f-7653-4ae9-8f22-xxxxxx",
        "task_status": "FAILED",
        "submit_time": "2025-09-01 11:43:41.174",
        "scheduled_time": "2025-09-01 11:43:48.937",
        "end_time": "2025-09-01 11:43:49.802",
        "code": "InvalidURL",
        "message": "Required URL is missing or invalid, please check the request URL."
    }
}
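The two response shapes can be handled by a single function that either returns the video URL or raises with the error code and message. This is a sketch: `TaskFailed` and `video_url_or_raise` are hypothetical names, and treating every status other than `SUCCEEDED`, `PENDING`, and `RUNNING` as a failure is an assumption.

```python
class TaskFailed(Exception):
    """Raised when a queried task ended without producing a video."""

    def __init__(self, status, code, message):
        super().__init__(f"{status}: {code}: {message}")
        self.status = status
        self.code = code
        self.message = message


def video_url_or_raise(result):
    """Return output.results.video_url for a succeeded task, otherwise raise."""
    output = result["output"]
    status = output["task_status"]
    if status == "SUCCEEDED":
        return output["results"]["video_url"]
    if status in ("PENDING", "RUNNING"):
        raise RuntimeError(f"task {output.get('task_id')} has not finished yet")
    # code and message are only present when the task fails.
    raise TaskFailed(status, output.get("code"), output.get("message"))
```

Keeping `code` and `message` as attributes on the exception lets the caller look them up in the error message reference without parsing a string.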

Billing and rate limits

Billing rules

Billable item: You are billed for the number of seconds of successfully generated video on a pay-as-you-go basis.
Billing formula: Fee = Unit price × Video duration (seconds).
Billing priority: Your free quota is consumed first. After your free quota is exhausted, the pay-as-you-go billing method is used by default.
- You can enable the "Free quota only" feature to prevent extra charges after your free quota is exhausted. For more information, see Free quota for new users.
No charge for failures: Failed model calls or processing errors do not incur fees or consume free quotas.
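These rules can be applied directly to a task query response. The sketch below is illustrative only: `billable_amount` is a hypothetical helper, the mapping from `usage.SR` value 720 to the 720P price is an assumption (the examples in this document only show 480), and free quota is not modeled, so the result is the list-price amount before any quota is deducted.

```python
from decimal import Decimal

# Price per second keyed by usage.SR; the 720 key is assumed by analogy with 480.
PRICE_BY_SR = {
    480: Decimal("0.071677"),
    720: Decimal("0.129018"),
}


def billable_amount(result):
    """Return unit price x video duration, or 0 for a task that did not succeed."""
    if result["output"]["task_status"] != "SUCCEEDED":
        return Decimal("0")  # failed calls incur no fee
    usage = result["usage"]
    return Decimal(str(usage["duration"])) * PRICE_BY_SR[usage["SR"]]


succeeded = {
    "output": {"task_status": "SUCCEEDED"},
    "usage": {"duration": 18.13, "video_count": 1, "SR": 480},
}
print(billable_amount(succeeded))  # 1.29950401
```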

Free quotas

For more information about how to claim, query, and use free quotas, see Free quota for new users.

Querying usage

Approximately one hour after a model call completes, you can view metrics such as usage, number of calls, and success rate on the Model Observation page.

This model runs only in the China (Beijing) region, so use the Model Observation page for that region rather than the Model Observation (Singapore) page.

Rate limiting

For model rate limiting rules and FAQs, see Rate limits.

Error codes

If a model call fails and an error message is returned, see Error messages.