Alibaba Cloud Model Studio: Wan - reference-to-video - API reference

Last Updated: Jan 28, 2026

The Wan reference-to-video model generates videos with consistent characters by combining a text prompt with a character's appearance taken from an input video or image. If the reference is a video, its voice can also be transferred.

  • Basic features: Supports video duration from 2 to 10 seconds in integer increments, video resolution (720P or 1080P), and watermarking.

  • Audio features: Generates sound from prompts and references the voice of the input video.

  • Multi-shot narrative: Generates videos with multiple shots while maintaining subject consistency.

Quick entry: Try Wan online

Note

The features available on the Wan official website may differ from those supported by the API. This document describes the API's current capabilities and will be updated as they change.

Prerequisites

You must obtain an API key and set the API key as an environment variable.

Important

The Beijing and Singapore regions have separate API keys and request endpoints. Do not use them interchangeably. Cross-region calls result in authentication failures or service errors.

HTTP

Because video generation tasks take a long time (usually 1 to 5 minutes), the API uses asynchronous invocation. The actual time required depends on the number of tasks in the queue and the service's execution status.

The process has two steps: create a task, then poll for the result.

Step 1: Create a task to get a task ID

Singapore: POST https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis

US (Virginia): POST https://dashscope-us.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis

China (Beijing): POST https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis

Note
  • After the task is created, use the returned task_id to query the result. The task_id is valid for 24 hours. Do not create duplicate tasks. Instead, use polling to retrieve the result.

  • For a beginner's tutorial, see Postman.
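The regional endpoints can be kept in one place so that a key and a URL from different regions are less likely to be mixed. The following is a minimal sketch: the base URLs are copied from this document, while the region labels and the helper names `create_task_url` and `query_task_url` are illustrative and not part of the API.

```python
# Base URLs copied from the endpoint lists in this document.
# The dictionary keys are illustrative labels, not official region IDs.
BASE_URLS = {
    "singapore": "https://dashscope-intl.aliyuncs.com/api/v1",
    "virginia": "https://dashscope-us.aliyuncs.com/api/v1",
    "beijing": "https://dashscope.aliyuncs.com/api/v1",
}

CREATE_PATH = "/services/aigc/video-generation/video-synthesis"


def create_task_url(region: str) -> str:
    """Return the POST endpoint used to create a task in the given region."""
    return BASE_URLS[region] + CREATE_PATH


def query_task_url(region: str, task_id: str) -> str:
    """Return the GET endpoint used to query a task in the given region."""
    return f"{BASE_URLS[region]}/tasks/{task_id}"


if __name__ == "__main__":
    print(create_task_url("singapore"))
    print(query_task_url("beijing", "your-task-id"))
```

Use the API key that belongs to the same region as the URL you select.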

Request parameters

Single-character reference

To generate a multi-shot video, reference the character's appearance from a video or image (and voice from a video), and set `shot_type` to `multi`.

# Note: If you use a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis
curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
    -H 'X-DashScope-Async: enable' \
    -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
    "model": "wan2.6-r2v",
    "input": {
        "prompt": "character1 drinks bubble tea while dancing impromptu to the music.",
        "reference_urls":["https://cdn.wanxai.com/static/demo-wan26/vace.mp4"]
    },
    "parameters": {
        "size": "1280*720",
        "duration": 5,
        "shot_type":"multi"
    }
}'

Multi-character reference

To generate a multi-shot video, use reference videos or images for characters and props, define their relationship in a prompt, and set `shot_type` to `multi`. You can reference the same character multiple times in the prompt.

# Note: If you use a model in the China (Beijing) region, replace the URL with: https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis
curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
    -H 'X-DashScope-Async: enable' \
    -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
    "model": "wan2.6-r2v",
    "input": {
        "prompt": "character1 says to character2: “I’ll rely on you tomorrow morning!” character2 replies: “You can count on me!”",
        "reference_urls": [
            "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20251217/dlrrly/%E5%B0%8F%E5%A5%B3%E5%AD%A91%E8%8B%B1%E6%96%872.mp4",
            "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20251217/fkxknn/%E9%93%83%E9%93%83.mp4"
        ]
    },
    "parameters": {
        "size": "1280*720",
        "duration": 10,
        "shot_type": "multi"
    }
}'
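The same request can be issued from Python. The sketch below mirrors the curl examples using only the standard library. The function names `build_create_request` and `create_task` are hypothetical helpers, the Singapore endpoint is assumed, and the 60-second timeout is an arbitrary choice.

```python
import json
import os
import urllib.request

# Singapore endpoint from this document; swap it for your region.
CREATE_URL = (
    "https://dashscope-intl.aliyuncs.com"
    "/api/v1/services/aigc/video-generation/video-synthesis"
)


def build_create_request(api_key, prompt, reference_urls,
                         size="1280*720", duration=5, shot_type="multi"):
    """Return (headers, body) matching the curl examples above."""
    headers = {
        "X-DashScope-Async": "enable",
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "model": "wan2.6-r2v",
        "input": {"prompt": prompt, "reference_urls": list(reference_urls)},
        "parameters": {"size": size, "duration": duration, "shot_type": shot_type},
    }
    return headers, body


def create_task(prompt, reference_urls, **params):
    """Send the request and return the task_id from the response."""
    headers, body = build_create_request(
        os.environ["DASHSCOPE_API_KEY"], prompt, reference_urls, **params
    )
    request = urllib.request.Request(
        CREATE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        payload = json.load(response)
    return payload["output"]["task_id"]
```

Calling `create_task("character1 waves at the camera.", ["https://example.com/ref.mp4"], duration=5)` would submit one task; the reference URL here is a placeholder.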

Request headers

Content-Type string (Required)

The content type of the request. Must be application/json.

Authorization string (Required)

The authentication credentials using a Model Studio API key.

Example: Bearer sk-xxxx

X-DashScope-Async string (Required)

Enables asynchronous processing. Must be set to enable, because HTTP requests support only asynchronous processing.

Important

If this header is omitted, the API returns the error "current user api does not support synchronous calls".

Request body

model string (Required)

The model name. For a list of models and their prices, see Model pricing.

Example: wan2.6-r2v.

input object (Required)

The basic input information, such as the prompt.

Properties

prompt string (Required)

The text prompt. It describes the elements and visual features you want in the generated video.

Supports Chinese and English. Each Chinese character, letter, and punctuation mark counts as one character. The text is truncated if it exceeds the limit.

  • wan2.6-r2v: The length cannot exceed 1,500 characters.

Character reference instructions: Use identifiers such as "character1" and "character2" to reference characters from the reference files. Each reference file (video or image) must contain only a single character. The model uses only this method to identify characters.

Example: character1 is happily watching a movie on the sofa.

For tips, see Text-to-video/image-to-video prompt guide.
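A client-side check can catch over-long prompts and dangling character identifiers before a task is submitted. This is a sketch only: `check_prompt` is a hypothetical helper, and it assumes that Python's `len()` (one unit per Unicode code point) matches the character counting described above.

```python
import re

MAX_PROMPT_CHARS = 1500  # wan2.6-r2v limit stated in this document


def check_prompt(prompt: str, reference_count: int) -> list:
    """Return a list of problems found in the prompt (empty if none)."""
    problems = []
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(
            f"prompt has {len(prompt)} characters; text beyond "
            f"{MAX_PROMPT_CHARS} is truncated"
        )
    # Collect the N in every "characterN" identifier used in the prompt.
    used = {int(n) for n in re.findall(r"character(\d+)", prompt)}
    for n in sorted(used):
        if n < 1 or n > reference_count:
            problems.append(f"character{n} has no matching reference file")
    return problems


if __name__ == "__main__":
    print(check_prompt("character1 hands a cup to character3.", 2))
```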

negative_prompt string (Optional)

The negative prompt. It describes content you do not want to see in the video frames, which helps constrain the video output.

Supports Chinese and English. The length cannot exceed 500 characters. The text is truncated if it exceeds the limit.

Example: low resolution, error, worst quality, low quality, deformed, extra fingers, bad proportions.

reference_urls array[string] (Required)

Important

The reference_urls parameter directly affects billing. For billing rules, see Billing and rate limits.

An array of URLs for the uploaded reference files. You can use videos or images. The model uses these files to extract a character's appearance and, if applicable, voice to generate a video with the referenced features.

  • Each URL can point to one image or one video:

    • Number of images: 0 to 5

    • Number of videos: 0 to 3

    • Total limit: Images + Videos ≤ 5

  • When you provide multiple reference files, their order in the array defines the character order. The first URL corresponds to character1, the second to character2, and so on.

  • Each reference file must contain only one main character. For example, character1 is a little girl, and character2 is an alarm clock.

  • The URLs must use the HTTP or HTTPS protocol.

Requirements for reference videos:

  • Format: MP4, MOV.

  • Duration: 1s to 30s.

  • Video size: Does not exceed 100 MB.

Requirements for reference images:

  • Format: JPEG, JPG, PNG (alpha channel not supported), BMP, WEBP.

  • Resolution: Both width and height must be between 240 and 5,000 pixels.

  • Image size: Does not exceed 10 MB.

Example: ["https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/xxx.mp4", "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/xxx.jpg"].
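The count limits and the order-based character mapping can be checked locally before a request is sent. The sketch below is an assumption-laden illustration: `classify` and `map_characters` are hypothetical helpers, file types are guessed from the URL path extension (the service may inspect the file itself), and a minimum of one file is inferred from the parameter being required. Duration, resolution, and file size limits are not checked here.

```python
import posixpath
from urllib.parse import urlparse

VIDEO_EXTENSIONS = {".mp4", ".mov"}
IMAGE_EXTENSIONS = {".jpeg", ".jpg", ".png", ".bmp", ".webp"}


def classify(url: str) -> str:
    """Return 'video' or 'image' based on the URL path extension (a heuristic)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"unsupported protocol: {url}")
    extension = posixpath.splitext(parsed.path.lower())[1]
    if extension in VIDEO_EXTENSIONS:
        return "video"
    if extension in IMAGE_EXTENSIONS:
        return "image"
    raise ValueError(f"unrecognized file type: {url}")


def map_characters(reference_urls):
    """Validate the count limits and return {'character1': (kind, url), ...}."""
    kinds = [classify(url) for url in reference_urls]
    if not 1 <= len(kinds) <= 5:
        raise ValueError("provide between 1 and 5 reference files")
    if kinds.count("video") > 3:
        raise ValueError("at most 3 reference videos are allowed")
    # Array order defines the character order: first URL is character1.
    return {
        f"character{i}": (kind, url)
        for i, (kind, url) in enumerate(zip(kinds, reference_urls), start=1)
    }
```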

Deprecated fields

reference_video_urls array[string]

Important

We recommend using reference_urls instead of reference_video_urls.

An array of URLs for uploaded reference videos. The model uses these videos to extract a character's appearance and, if applicable, voice to generate a video with the referenced features.

  • Up to 3 videos are supported.

  • When you provide multiple videos, their order in the array defines the character order. The first URL corresponds to character1, the second to character2, and so on.

  • Each reference video must contain only one character (for example, character1 is a little girl, and character2 is an alarm clock).

  • URLs support the HTTP or HTTPS protocol.

Requirements for each video:

  • Format: MP4, MOV.

  • Duration: 2 s to 30 s.

  • File size: The video cannot exceed 100 MB.

Example: ["https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/xxx.mp4"].

parameters object (Optional)

Video editing parameters. Use these to set the video resolution, enable prompt rewriting, add a watermark, and more.

Properties

size string (Optional)

Important
  • The size parameter directly affects billing. The cost is calculated as Unit price (based on resolution) × Duration (seconds). For the same model, 1080P costs more than 720P. Before you call the API, confirm the model pricing.

  • The size must be set to specific values (such as 1280*720), not ratios such as 1:1 or tiers such as 720P.

Specifies the resolution of the generated video in the width*height format. The default value and available enumerated values for this parameter depend on the model parameter. The rules are as follows:

  • wan2.6-r2v: The default value is 1920*1080 (1080P). Optional resolutions include all resolutions corresponding to 720P and 1080P.

720P tier: The optional video resolutions and their corresponding video aspect ratios are:

  • 1280*720: 16:9.

  • 720*1280: 9:16.

  • 960*960: 1:1.

  • 1088*832: 4:3.

  • 832*1088: 3:4.

1080P tier: The optional video resolutions and their corresponding video aspect ratios are:

  • 1920*1080: 16:9.

  • 1080*1920: 9:16.

  • 1440*1440: 1:1.

  • 1632*1248: 4:3.

  • 1248*1632: 3:4.
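Because `size` accepts only the exact values listed above, a lookup table is a simple way to validate it and to find the billing tier. The following sketch copies the table from this document; `describe_size` is a hypothetical helper name.

```python
# Supported wan2.6-r2v sizes, copied from the lists above:
# size -> (resolution tier, aspect ratio)
SIZES = {
    "1280*720": ("720P", "16:9"),
    "720*1280": ("720P", "9:16"),
    "960*960": ("720P", "1:1"),
    "1088*832": ("720P", "4:3"),
    "832*1088": ("720P", "3:4"),
    "1920*1080": ("1080P", "16:9"),
    "1080*1920": ("1080P", "9:16"),
    "1440*1440": ("1080P", "1:1"),
    "1632*1248": ("1080P", "4:3"),
    "1248*1632": ("1080P", "3:4"),
}


def describe_size(size: str = "1920*1080"):
    """Return (tier, aspect_ratio) for a size, or raise ValueError."""
    if size not in SIZES:
        raise ValueError(f"unsupported size {size!r}; use a width*height value")
    return SIZES[size]


if __name__ == "__main__":
    print(describe_size("960*960"))
```

Passing a tier such as "720P" or a ratio such as "1:1" raises `ValueError`, which matches the rule that only specific width*height values are accepted.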

duration integer (Optional)

Important

The duration parameter directly affects billing. Cost = Unit price (based on resolution) × Duration (seconds).

The duration of the generated video in seconds.

  • wan2.6-r2v: An integer between 2 and 10. The default value is 5.

Example: 5.

shot_type string (Optional)

Specifies the shot type of the generated video. This determines whether the video consists of a single continuous shot or multiple changing shots.

Parameter priority: shot_type > prompt. For example, if shot_type is set to "single", the model will output a single-shot video even if the prompt includes "generate a multi-shot video".

Optional values:

  • single (default): Outputs a single-shot video.

  • multi: Outputs a multi-shot video.

Example: single.

Note

Use this parameter when you need to strictly control the narrative structure of the video, such as using a single shot for a product display or multiple shots for a short story.

watermark boolean (Optional)

Specifies whether to add a watermark. The watermark is placed in the lower-right corner of the video and displays the text "AI Generated".

  • false (default)

  • true

Example: false.

seed integer (Optional)

The random number seed. Must be an integer between 0 and 2147483647.

If not provided, a random seed is generated. Using a fixed seed improves reproducibility, though results may still vary due to model randomness.

Example: 12345
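The ranges documented for `duration`, `shot_type`, `watermark`, and `seed` can be enforced when the `parameters` object is assembled. This is a sketch with a hypothetical helper, `build_parameters`; it omits any option that is not set so that the service defaults described above apply.

```python
def build_parameters(duration=None, shot_type=None, watermark=None, seed=None):
    """Return a parameters dict containing only the options that were set."""
    params = {}
    if duration is not None:
        if not isinstance(duration, int) or not 2 <= duration <= 10:
            raise ValueError("duration must be an integer between 2 and 10")
        params["duration"] = duration
    if shot_type is not None:
        if shot_type not in ("single", "multi"):
            raise ValueError('shot_type must be "single" or "multi"')
        params["shot_type"] = shot_type
    if watermark is not None:
        params["watermark"] = bool(watermark)
    if seed is not None:
        if not isinstance(seed, int) or not 0 <= seed <= 2147483647:
            raise ValueError("seed must be an integer between 0 and 2147483647")
        params["seed"] = seed
    return params


if __name__ == "__main__":
    print(build_parameters(duration=10, shot_type="multi", seed=12345))
```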

Response parameters

Successful response

Save the task_id to query the task status and result.

{
    "output": {
        "task_status": "PENDING",
        "task_id": "0385dc79-5ff8-4d82-bcb6-xxxxxx"
    },
    "request_id": "4909100c-7b5a-9f92-bfe5-xxxxxx"
}

Error response

Task creation failed. See error codes to resolve the issue.

{
    "code": "InvalidApiKey",
    "message": "No API-key provided.",
    "request_id": "7438d53d-6eb8-4596-8835-xxxxxx"
}

output object

The task output information.

Properties

task_id string

The ID of the task. Can be used to query the task for up to 24 hours.

task_status string

The status of the task.

Enumeration

  • PENDING

  • RUNNING

  • SUCCEEDED

  • FAILED

  • CANCELED

  • UNKNOWN: Task does not exist or status is unknown

request_id string

Unique identifier for the request. Use for tracing and troubleshooting issues.

code string

The error code. Returned only when the request fails. See error codes for details.

message string

Detailed error message. Returned only when the request fails. See error codes for details.
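The two response shapes above can be told apart by the presence of a top-level `code` field. The sketch below assumes the response body has already been parsed from JSON; `TaskCreationError` and `extract_task_id` are hypothetical names.

```python
class TaskCreationError(Exception):
    """Raised when the create-task response carries an error code."""

    def __init__(self, code, message, request_id):
        super().__init__(f"{code}: {message} (request_id={request_id})")
        self.code = code
        self.request_id = request_id


def extract_task_id(payload: dict) -> str:
    """Return output.task_id, or raise if the response is an error."""
    if "code" in payload:
        raise TaskCreationError(
            payload["code"],
            payload.get("message", ""),
            payload.get("request_id", ""),
        )
    return payload["output"]["task_id"]


if __name__ == "__main__":
    ok = {
        "output": {"task_status": "PENDING", "task_id": "0385dc79-5ff8-4d82-bcb6-xxxxxx"},
        "request_id": "4909100c-7b5a-9f92-bfe5-xxxxxx",
    }
    print(extract_task_id(ok))
```

Keeping the `request_id` on the exception makes it easier to quote when troubleshooting.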

Step 2: Query the result by task ID

Singapore: GET https://dashscope-intl.aliyuncs.com/api/v1/tasks/{task_id}

US (Virginia): GET https://dashscope-us.aliyuncs.com/api/v1/tasks/{task_id}

China (Beijing): GET https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}

Note
  • Polling suggestion: Video generation can take several minutes. We recommend that you use a polling mechanism with a reasonable query interval, such as 15 seconds, to retrieve the result.

  • Task status transition: PENDING → RUNNING → SUCCEEDED or FAILED.

  • Result URL: After the task is successful, a video URL is returned. The URL is valid for 24 hours. After you retrieve the URL, you must immediately download and save the video to a permanent storage service, such as Object Storage Service (OSS).

  • task_id validity: 24 hours. After this period, you cannot query the result, and the API returns a task status of UNKNOWN.
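The polling advice above can be sketched as a small loop. The code below is illustrative, not an official client: `wait_for_task` is a hypothetical helper, `fetch_output` is any callable you supply that returns the `output` object of a query response, the 900-second timeout is an arbitrary assumption, and treating CANCELED and UNKNOWN as terminal is a design choice of this sketch.

```python
import time

# Statuses after which further polling is pointless (assumption of this sketch).
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELED", "UNKNOWN"}


def wait_for_task(fetch_output, interval=15.0, timeout=900.0, sleep=time.sleep):
    """Poll until the task reaches a terminal status and return its output dict."""
    waited = 0.0
    while True:
        output = fetch_output()
        if output.get("task_status") in TERMINAL_STATUSES:
            return output
        if waited >= timeout:
            raise TimeoutError(f"task still {output.get('task_status')} after {waited:.0f} s")
        sleep(interval)
        waited += interval


if __name__ == "__main__":
    # Simulated status sequence; no network access is needed to run this.
    fake = iter([
        {"task_status": "PENDING"},
        {"task_status": "RUNNING"},
        {"task_status": "SUCCEEDED", "video_url": "https://example.com/video.mp4"},
    ])
    print(wait_for_task(lambda: next(fake), sleep=lambda seconds: None))
```

Injecting `sleep` keeps the loop testable; in production the default `time.sleep` applies the 15-second interval.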

Request parameters

Query task result

Replace 86ecf553-d340-4e21-xxxxxxxxx with the actual task ID.

# API keys are region-specific. See API key documentation for details.
# For models in the Beijing region, replace the URL with https://dashscope.aliyuncs.com/api/v1/tasks/86ecf553-d340-4e21-xxxxxxxxx
curl -X GET https://dashscope-intl.aliyuncs.com/api/v1/tasks/86ecf553-d340-4e21-xxxxxxxxx \
--header "Authorization: Bearer $DASHSCOPE_API_KEY"

Request headers

Authorization string (Required)

The authentication credentials using a Model Studio API key.

Example: Bearer sk-xxxx

URL path parameters

task_id string (Required)

The ID of the task to query.

Response parameters

Task successful

Video URLs are retained for only 24 hours and then automatically purged. Save generated videos promptly.

{
    "request_id": "caa62a12-8841-41a6-8af2-xxxxxx",
    "output": {
        "task_id": "eff1443c-ccab-4676-aad3-xxxxxx",
        "task_status": "SUCCEEDED",
        "submit_time": "2025-12-16 00:25:59.869",
        "scheduled_time": "2025-12-16 00:25:59.900",
        "end_time": "2025-12-16 00:30:35.396",
        "orig_prompt": "character1 is happily watching a movie on the sofa",
        "video_url": "https://dashscope-result-sh.oss-accelerate.aliyuncs.com/xxx.mp4?Expires=xxx"
    },
    "usage": {
        "duration": 10.0,
        "size": "1280*720",
        "input_video_duration": 5,
        "output_video_duration": 5,
        "video_count": 1,
        "SR": 720
    }
}

Task failed

When a task fails, task_status is set to FAILED with an error code and message. See error codes to resolve the issue.

{
    "request_id": "e5d70b02-ebd3-98ce-9fe8-759d7d7b107d",
    "output": {
        "task_id": "86ecf553-d340-4e21-af6e-a0c6a421c010",
        "task_status": "FAILED",
        "code": "InvalidParameter",
        "message": "The size is not match xxxxxx"
    }
}

Task query expired

The task_id is valid for 24 hours. After this period, the task can no longer be queried, and the API returns a task_status of UNKNOWN.

{
    "request_id": "a4de7c32-7057-9f82-8581-xxxxxx",
    "output": {
        "task_id": "502a00b1-19d9-4839-a82f-xxxxxx",
        "task_status": "UNKNOWN"
    }
}
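Note that a failed task reports `code` and `message` inside `output`, unlike a failed creation request, where they appear at the top level. The sketch below shows one way to branch on the three response shapes above; `summarize_result` is a hypothetical helper operating on an already parsed response.

```python
def summarize_result(payload: dict) -> str:
    """Turn a query response into a one-line, human-readable summary."""
    output = payload.get("output", {})
    status = output.get("task_status")
    if status == "SUCCEEDED":
        return f"done: {output['video_url']}"
    if status == "FAILED":
        return f"failed: {output.get('code')}: {output.get('message')}"
    if status == "UNKNOWN":
        return "task_id expired or does not exist"
    return f"in progress: {status}"


if __name__ == "__main__":
    failed = {
        "request_id": "e5d70b02-ebd3-98ce-9fe8-759d7d7b107d",
        "output": {
            "task_id": "86ecf553-d340-4e21-af6e-a0c6a421c010",
            "task_status": "FAILED",
            "code": "InvalidParameter",
            "message": "The size is not match xxxxxx",
        },
    }
    print(summarize_result(failed))
```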

output object

The task output information.

Properties

task_id string

The ID of the queried task.

task_status string

The task status.

Enumeration

  • PENDING

  • RUNNING

  • SUCCEEDED

  • FAILED

  • CANCELED

  • UNKNOWN: Task does not exist or status is unknown

submit_time string

The time when the task was submitted. Time is in UTC+8. Format: YYYY-MM-DD HH:mm:ss.SSS.

scheduled_time string

The time when the task started running. Time is in UTC+8. Format: YYYY-MM-DD HH:mm:ss.SSS.

end_time string

The time when the task was completed. Time is in UTC+8. Format: YYYY-MM-DD HH:mm:ss.SSS.
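Because the three timestamps carry no offset in the string itself, it helps to attach UTC+8 explicitly when parsing them. The following sketch uses only the standard library; `parse_task_time` and `run_seconds` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone

UTC_PLUS_8 = timezone(timedelta(hours=8))


def parse_task_time(value: str) -> datetime:
    """Parse 'YYYY-MM-DD HH:mm:ss.SSS' and attach the UTC+8 offset."""
    naive = datetime.strptime(value, "%Y-%m-%d %H:%M:%S.%f")
    return naive.replace(tzinfo=UTC_PLUS_8)


def run_seconds(output: dict) -> float:
    """Seconds between scheduled_time and end_time of a finished task."""
    started = parse_task_time(output["scheduled_time"])
    ended = parse_task_time(output["end_time"])
    return (ended - started).total_seconds()


if __name__ == "__main__":
    sample = {
        "submit_time": "2025-12-16 00:25:59.869",
        "scheduled_time": "2025-12-16 00:25:59.900",
        "end_time": "2025-12-16 00:30:35.396",
    }
    print(run_seconds(sample))
    print(parse_task_time(sample["submit_time"]).astimezone(timezone.utc))
```

For the sample values from the successful response in this document, the run time is about 275.5 seconds.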

video_url string

The URL of the generated video. Returned only when task_status is SUCCEEDED.

The URL is valid for 24 hours. Use it to download the video, which is in MP4 format with H.264 encoding.

orig_prompt string

The original input prompt. This is the value of the prompt request parameter.

actual_prompt string

When prompt_extend=true, the system rewrites the input prompt. This field returns the optimized prompt that was actually used for generation. If prompt_extend=false, this field is not returned.

Note: The wan2.6 model does not return this field, regardless of the value of prompt_extend.

code string

The error code. Returned only when the request fails. See error codes for details.

message string

Detailed error message. Returned only when the request fails. See error codes for details.

usage object

Statistics on the output information. Only successful results are counted.

Properties

input_video_duration integer

The duration of the input reference video in seconds.

output_video_duration integer

The duration of the output video in seconds.

duration float

The total video duration. Billing is based on the duration.

Formula: duration = input_video_duration + output_video_duration.

SR integer

The resolution tier of the generated video. Example: 720.

size string

The resolution of the generated video. The format is "width*height". Example: 1280*720.

video_count integer

The number of generated videos. This is fixed at 1.

request_id string

Unique identifier for the request. Use for tracing and troubleshooting issues.

Limitations

  • Data validity: The task_id and video URL are retained for only 24 hours. After this period, they cannot be queried or downloaded.

  • Content moderation: Both the input prompt and the output video are subject to content moderation. Requests with non-compliant content fail with an "IPInfringementSuspect" or "DataInspectionFailed" error. For details, see Error messages.

  • Network access configuration: Generated videos are stored in Object Storage Service (OSS). If your business system cannot access external OSS links because of security policies, add the following OSS domain names to your network access whitelist.

    # List of OSS domain names
    dashscope-result-bj.oss-cn-beijing.aliyuncs.com
    dashscope-result-hz.oss-cn-hangzhou.aliyuncs.com
    dashscope-result-sh.oss-cn-shanghai.aliyuncs.com
    dashscope-result-wlcb.oss-cn-wulanchabu.aliyuncs.com
    dashscope-result-zjk.oss-cn-zhangjiakou.aliyuncs.com
    dashscope-result-sz.oss-cn-shenzhen.aliyuncs.com
    dashscope-result-hy.oss-cn-heyuan.aliyuncs.com
    dashscope-result-cd.oss-cn-chengdu.aliyuncs.com
    dashscope-result-gz.oss-cn-guangzhou.aliyuncs.com
    dashscope-result-wlcb-acdr-1.oss-cn-wulanchabu-acdr-1.aliyuncs.com

Billing and rate limits

  • For free quota and unit prices, see Model list and pricing.

  • For rate limits, see Wan series.

  • Billing description:

    • Input images are not billed. Input videos are billed.

    • Billing is based on the combined duration of the input and output videos, measured in seconds. You are charged only when a query returns a task_status of SUCCEEDED and a video is successfully generated.

    • Failed model calls or processing errors do not incur any fees and do not consume the free quota for new users.

    Billing duration calculation rules

    Total billable video duration = Billable input video duration + Billable output video duration.

    Billable input video duration: The total billable duration for input videos does not exceed 5 seconds. This 5-second quota is divided equally among all reference files (images and videos) to set a truncation limit for each individual video. The final billable input duration is the sum of the truncated durations of all reference videos.

    • 1 reference file: The truncation limit for a single video is 5 s.

      • If it is a video: Billable input duration = min(video duration, 5 s).

      • If it is an image: Free of charge.

    • 2 reference files: The truncation limit for a single video is 2.5 s.

      • If 1 video + 1 image: Billable input duration = min(video 1 duration, 2.5 s).

      • If 2 videos: Billable input duration = min(video 1 duration, 2.5 s) + min(video 2 duration, 2.5 s).

    • 3 reference files: The truncation limit for a single video is 1.65 s.

      • If 1 video + 2 images: Billable input duration = min(video 1 duration, 1.65 s).

      • If 3 videos: Billable input duration = min(video 1 duration, 1.65 s) + min(video 2 duration, 1.65 s) + min(video 3 duration, 1.65 s).

    • 4 reference files: The truncation limit for a single video is 1.25 s.

      • If 2 videos + 2 images: Billable input duration = min(video 1 duration, 1.25 s) + min(video 2 duration, 1.25 s).

      • If 3 videos + 1 image: Billable input duration = min(video 1 duration, 1.25 s) + min(video 2 duration, 1.25 s) + min(video 3 duration, 1.25 s).

    • 5 reference files: The truncation limit for a single video is 1 s.

      • If 1 video + 4 images: Billable input duration = min(video 1 duration, 1 s).

      • If 3 videos + 2 images: Billable input duration = min(video 1 duration, 1 s) + min(video 2 duration, 1 s) + min(video 3 duration, 1 s).

    Billable output video duration: The duration in seconds of the successfully generated video.
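The billing rules above reduce to a per-file truncation limit followed by a sum. The sketch below copies the limits exactly as listed in this document (including 1.65 s for three files); `billable_input_seconds` and `billable_total_seconds` are hypothetical helper names, and the result is a duration in seconds, not a price.

```python
# Truncation limit per video, keyed by the total number of reference files.
# Values are copied from the rules above.
TRUNCATION_LIMITS = {1: 5.0, 2: 2.5, 3: 1.65, 4: 1.25, 5: 1.0}


def billable_input_seconds(video_durations, image_count=0):
    """Sum of each reference video's duration, capped at the per-file limit."""
    total_files = len(video_durations) + image_count
    if total_files not in TRUNCATION_LIMITS:
        raise ValueError("the number of reference files must be 1 to 5")
    if len(video_durations) > 3:
        raise ValueError("at most 3 reference videos are allowed")
    limit = TRUNCATION_LIMITS[total_files]
    # Images are free of charge, so only videos contribute.
    return sum(min(duration, limit) for duration in video_durations)


def billable_total_seconds(video_durations, image_count, output_duration):
    """Billable input video duration plus billable output video duration."""
    return billable_input_seconds(video_durations, image_count) + output_duration


if __name__ == "__main__":
    # One 8-second video plus one image, generating a 5-second output.
    print(billable_total_seconds([8.0], 1, 5))
```

For example, one 8-second reference video plus one image is capped at 2.5 s of billable input, so a 5-second output gives 7.5 billable seconds in total.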

Error codes

If a call fails, see Error messages for troubleshooting.

FAQ

Q: How can I convert a temporary video link to a permanent one?

A: You cannot convert the link directly. The correct procedure is for your backend service to retrieve the URL, download the video file programmatically, and then upload it to a permanent object storage service, such as Alibaba Cloud OSS. This generates a new, permanent access link.

Example code: Download video to a local file

import requests

def download_and_save_video(video_url, save_path):
    try:
        response = requests.get(video_url, stream=True, timeout=300) # Set timeout
        response.raise_for_status() # Raise an exception for HTTP 4xx/5xx error responses
        with open(save_path, 'wb') as f:
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)
        print(f"Video successfully downloaded to: {save_path}")
        # You can add the logic to upload to permanent storage here
    except requests.exceptions.RequestException as e:
        print(f"Failed to download video: {e}")

if __name__ == '__main__':
    video_url = "http://dashscope-result-sh.oss-cn-shanghai.aliyuncs.com/xxxx"
    save_path = "video.mp4"
    download_and_save_video(video_url, save_path)

Q: Can the returned video link be played directly in a browser?

A: This is not recommended because the link expires after 24 hours. The best practice is to have your backend download and store the video, and then use the permanent link for playback.