All Products
Search
Document Center

:Wan - image-to-video - first frame API reference

Last Updated:Apr 17, 2026

The Wan image-to-video model generates a smooth video from a first-frame image and a text prompt.

References: User guide

Note

The newly released Wan2.7 - image-to-video supports first-frame-to-video, first-and-last-frame-to-video, and video continuation. We recommend using it as your preferred choice.

The Wan image-to-video model in this topic (wan2.6 and earlier) only supports first-frame-to-video.

Availability

Model, endpoint URL, and API key must be in the same region -- cross-region calls fail.

Note

Sample codes in this topic apply to the Singapore region.

HTTP

Image-to-video tasks take 1-5 minutes. Use asynchronous invocation: create task, then poll for results.

Step 1: Create a task

Singapore

POST https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis

Virginia

POST https://dashscope-us.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis

Beijing

POST https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis

Frankfurt

POST https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis

Replace WorkspaceId with your actual Workspace ID when making API calls.

Note
  • After the task is created, use the returned task_id to query the result. The task_id is valid for 24 hours. Do not create duplicate tasks. Instead, use polling to retrieve the result.

  • For guidance for beginners, see Postman.

Request parameters

Multi-shot narrative

This feature is supported only by wan2.6 models.

Enable it by setting "prompt_extend": true and "shot_type":"multi".

curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
    -H 'X-DashScope-Async: enable' \
    -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
    "model": "wan2.6-i2v-flash",
    "input": {
        "prompt": "A scene of urban fantasy art. A dynamic graffiti art character. A boy made of spray paint comes to life from a concrete wall. He raps an English song at high speed while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single street lamp, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of his rap, with no other dialogue or noise.",
        "img_url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/wpimhv/rap.png",
        "audio_url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/ozwpvi/rap.mp3"
    },
    "parameters": {
        "resolution": "720P",
        "prompt_extend": true,
        "duration": 10,
        "shot_type":"multi"
    }
}'

Automatic dubbing

This feature is supported only by wan2.6 and wan2.5 models.

If you do not provide input.audio_url, the model automatically generates matching background music or sound effects based on the video content.

curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
    -H 'X-DashScope-Async: enable' \
    -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
    "model": "wan2.5-i2v-preview",
    "input": {
        "prompt": "A scene of urban fantasy art. A dynamic graffiti art character. A boy made of spray paint comes to life from a concrete wall. He raps an English song at high speed while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single street lamp, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of his rap, with no other dialogue or noise.",
        "img_url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/wpimhv/rap.png"
    },
    "parameters": {
        "resolution": "480P",
        "prompt_extend": true,
        "duration": 10
    }
}'

Provide an audio file

This feature is supported only by wan2.6 and wan2.5 models.

To specify background music or voiceover for your video, pass the URL of your custom audio file in the input.audio_url parameter.

curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
    -H 'X-DashScope-Async: enable' \
    -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
    "model": "wan2.5-i2v-preview",
    "input": {
        "prompt": "A scene of urban fantasy art. A dynamic graffiti art character. A boy made of spray paint comes to life from a concrete wall. He raps an English song at high speed while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single street lamp, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of his rap, with no other dialogue or noise.",
        "img_url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/wpimhv/rap.png",
        "audio_url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/ozwpvi/rap.mp3"
    },
    "parameters": {
        "resolution": "480P",
        "prompt_extend": true,
        "duration": 10
    }
}'

Generate a silent video

Only the following models support generating silent videos:

  • wan2.6-i2v-flash: To generate a silent video, you must explicitly set parameters.audio = false.

  • wan2.2 and wan2.1 models: Silent videos are generated by default. No extra parameters are needed.

curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
    -H 'X-DashScope-Async: enable' \
    -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
    "model": "wan2.2-i2v-plus",
    "input": {
        "prompt": "A cat running on the grass",
        "img_url": "https://cdn.translate.alibaba.com/r/wanx-demo-1.png"
    },
    "parameters": {
        "resolution": "480P",
        "prompt_extend": true
    }
}'

Use a negative prompt

Use the negative_prompt parameter to prevent “flowers” from appearing in the generated video.

curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
    -H 'X-DashScope-Async: enable' \
    -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
    "model": "wan2.2-i2v-plus",
    "input": {
        "prompt": "A cat running on the grass",
        "negative_prompt": "flowers",
        "img_url": "https://cdn.translate.alibaba.com/r/wanx-demo-1.png"
    },
    "parameters": {
        "resolution": "480P",
        "prompt_extend": true
    }
}'
Headers

Content-Type string (Required)

The content type of the request. Must be application/json.

Authorization string (Required)

Authenticates the request with a Model Studio API key. Example: Bearer sk-xxxx.

X-DashScope-Async string (Required)

Enables asynchronous processing. HTTP requests support only asynchronous calls. Must be enable.

Important

If this request header is missing, the error "current user api does not support synchronous calls" is returned.

Request body

model string (Required)

The model name. Check the model pricing.

Example: wan2.6-i2v-flash.

input object (Required)

Contains basic input information (e.g., prompt).

Properties

prompt string (Optional)

A text prompt describing desired video content and visual characteristics.

Supports Chinese and English (each character = 1 unit, exceeding text auto-truncated). Length limits by model:

  • wan2.6 and wan2.5 models: up to 1,500 characters.

  • wan2.2 and wan2.1 models: up to 800 characters.

Example: A small cat running on the grass.

For prompt tips, see Text-to-video/image-to-video prompt guide.

negative_prompt string (Optional)

Content to exclude from the video.

Supports Chinese and English. Max 500 characters. Exceeding text is auto-truncated.

Example: low resolution, error, worst quality, low quality, disfigured, extra fingers, bad proportions.

img_url string (Required)

URL or Base64 string of the first-frame image.

Image constraints:

  • Format: JPEG, JPG, PNG (no alpha channel), BMP, WEBP.

  • Resolution: Both width and height must be between 240 and 8,000 pixels.

  • File size:

    • wan2.6 and wan2.5 series models: no more than 20 MB.

    • wan2.2 and wan2.1 series models: no more than 10 MB.

Supported input formats:

  1. Public URL:

    • HTTP and HTTPS are supported.

    • Example: https://cdn.translate.alibaba.com/r/wanx-demo-1.png.

  2. Base64-encoded image string:

    • Data format: data:{MIME_type};base64,{base64_data}.

    • Example: data:image/png;base64,GDU7MtCZzEbTbmRZ...... (truncated due to length).

    • For details, see Input image.

audio_url string (Optional)

Supported models: wan2.6 and wan2.5.

URL of audio file. The model synchronizes video generation with this audio.

Supported input formats:

  1. Public URL:

    • HTTP and HTTPS are supported.

    • Example: https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/xxx.mp3.

Audio constraints:

  • Format: wav, mp3.

  • Duration: 3–30 seconds.

  • File size: Up to 15 MB.

  • Overflow handling: Audio exceeding duration (5 or 10s) is truncated to the first 5 or 10s. Shorter audio leaves remaining video silent (e.g., 3s audio + 5s video = 3s audio + 2s silence).

Example: https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/ozwpvi/rap.mp3.

parameters object (Optional)

Video processing parameters (resolution, duration, prompt rewriting, watermarking).

Properties

resolution string (Optional)

Important

Resolution directly affects cost. Check the model pricing

Resolution tier for generated video. Model scales output to match selected tier's pixel count. Aspect ratio closely matches input image (img_url). See FAQ.

Default values and options depend on the model parameter:

  • wan2.6-i2v-flash: Options: 720P, 1080P. Default: 1080P.

  • wan2.6-i2v: Options: 720P, 1080P. Default: 1080P.

  • wan2.6-i2v-us: Options: 720P, 1080P. Default: 1080P.

  • wan2.5-i2v-preview: Options: 480P, 720P, 1080P. Default: 1080P.

  • wan2.2-i2v-flash: Options: 480P, 720P. Default: 720P.

  • wan2.2-i2v-plus: Options: 480P, 1080P. Default: 1080P.

  • wan2.1-i2v-turbo: Options: 480P, 720P. Default: 720P.

  • wan2.1-i2v-plus: Options: 720P. Default: 720P.

Example: 1080P.

duration integer (Optional)

Important

Duration affects cost. Check the model pricing

The video duration in seconds. Valid values vary by model:

  • wan2.6-i2v-flash: Integer from 2 to 15. Default: 5.

  • wan2.6-i2v: Integer from 2 to 15. Default: 5.

  • wan2.6-i2v-us: Options: 5, 10, 15. Default: 5.

  • wan2.5-i2v-preview: Options: 5, 10. Default: 5.

  • wan2.2-i2v-plus: Fixed at 5 seconds (not configurable).

  • wan2.2-i2v-flash: Fixed at 5 seconds (not configurable).

  • wan2.1-i2v-plus: Fixed at 5 seconds (not configurable).

  • wan2.1-i2v-turbo: Options: 3, 4, or 5. Default: 5.

Example: 5.

prompt_extend boolean (Optional)

Whether to enable prompt rewriting. When enabled, an LLM rewrites the input prompt. This improves generation quality for shorter prompts but increases processing time.

  • true (default)

  • false

Example: true.

shot_type string (Optional)

Supported models: wan2.6 models.

Specifies whether video uses single continuous shot or multiple switching shots.

Takes effect only when "prompt_extend": true.

Priority: shot_type > prompt. E.g., shot_type="single" outputs single-shot video even if prompt requests multi-shot.

Valid values:

  • single (default)

  • multi

Example: single.

Note

Use this parameter to strictly control narrative structure—for example, single shot for product demos or multiple shots for short films.

audio boolean (Optional)

Important

Audio setting affects cost (sound vs silent). Check the pricing in the console before calling.

Supported model: wan2.6-i2v-flash.

Specifies whether to generate a video with sound.

Priority: audio > audio_url. audio=false produces silent video even if audio_url provided (billed at silent rate).

Valid values:

  • true (default)

  • false

Example: true.

watermark boolean (Optional)

Add “AI Generated” watermark to lower-right corner.

  • false (default)

  • true

Example: false.

seed integer (Optional)

The random number seed. Value range: [0, 2147483647].

If unspecified, system generates random seed. Fix seed for better reproducibility.

Note: Identical seeds don't guarantee identical results (probabilistic model).

Example: 12345.

Response parameters

Successful response

Save the task_id to query the task status and result.

{
    "output": {
        "task_status": "PENDING",
        "task_id": "0385dc79-5ff8-4d82-bcb6-xxxxxx"
    },
    "request_id": "4909100c-7b5a-9f92-bfe5-xxxxxx"
}

Error response

Task creation failed. See Error messages.

{
    "code": "InvalidApiKey",
    "message": "No API-key provided.",
    "request_id": "7438d53d-6eb8-4596-8835-xxxxxx"
}

output object

Task output information.

Properties

task_id string

The task ID. Valid for queries for 24 hours.

task_status string

The status of the task.

Enumeration values

  • PENDING

  • RUNNING

  • SUCCEEDED

  • FAILED

  • CANCELED

  • UNKNOWN: The task does not exist or its status is unknown.

request_id string

Unique request identifier for tracing and troubleshooting.

code string

Error code. Returned only for failed requests. See Error messages.

message string

Detailed error message. Returned only for failed requests. See Error messages.

Step 2: Query the result

Singapore

GET https://dashscope-intl.aliyuncs.com/api/v1/tasks/{task_id}

Virginia

GET https://dashscope-us.aliyuncs.com/api/v1/tasks/{task_id}

Beijing

GET https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}

Frankfurt

GET https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/api/v1/tasks/{task_id}

Replace WorkspaceId with your actual Workspace ID when making API calls.

Note
  • Polling recommendation: Video generation takes several minutes. Use a polling mechanism with a reasonable interval, such as 15 seconds.

  • Task state transition: PENDING → RUNNING → SUCCEEDED or FAILED.

  • Result link: After a task succeeds, a video URL valid for 24 hours is returned. Download and save the video to permanent storage, such as OSS.

  • task_id validity: 24 hours. After this period, queries return the task status as UNKNOWN.

Request parameters

Query task result

Replace {task_id} with the task_id value returned by the previous API call. The task_id is valid for queries for 24 hours.

curl -X GET https://dashscope-intl.aliyuncs.com/api/v1/tasks/{task_id} \
--header "Authorization: Bearer $DASHSCOPE_API_KEY"
Headers

Authorization string (Required)

Authenticates the request with a Model Studio API key. Example: Bearer sk-xxxx.

URL path parameters

task_id string (Required)

The ID of the task.

Response parameters

Task succeeded

Video URLs are valid for only 24 hours and then automatically purged. Save generated videos promptly.

{
    "request_id": "2ca1c497-f9e0-449d-9a3f-xxxxxx",
    "output": {
        "task_id": "af6efbc0-4bef-4194-8246-xxxxxx",
        "task_status": "SUCCEEDED",
        "submit_time": "2025-09-25 11:07:28.590",
        "scheduled_time": "2025-09-25 11:07:35.349",
        "end_time": "2025-09-25 11:17:11.650",
        "orig_prompt": "A scene of urban fantasy art. A dynamic graffiti art character. A boy made of spray paint comes to life from a concrete wall. He raps an English song at high speed while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single street lamp, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of his rap, with no other dialogue or noise.",
        "video_url": "https://dashscope-result-sh.oss-cn-shanghai.aliyuncs.com/xxx.mp4?Expires=xxx"
    },
    "usage": {
        "duration": 10,
        "input_video_duration": 0,
        "output_video_duration": 10,
        "video_count": 1,
        "SR": 720
    }
}

Task failed

When a task fails, task_status is FAILED with an error code and message. See Error messages.

{
    "request_id": "e5d70b02-ebd3-98ce-9fe8-759d7d7b107d",
    "output": {
        "task_id": "86ecf553-d340-4e21-af6e-a0c6a421c010",
        "task_status": "FAILED",
        "code": "InvalidParameter",
        "message": "The size is not match xxxxxx"
    }
}

Task query expired

The task_id is valid for 24 hours. After this period, queries return the following error.

{
    "request_id": "a4de7c32-7057-9f82-8581-xxxxxx",
    "output": {
        "task_id": "502a00b1-19d9-4839-a82f-xxxxxx",
        "task_status": "UNKNOWN"
    }
}

output object

Task output information.

Properties

task_id string

The task ID. Valid for queries for 24 hours.

task_status string

The status of the task.

Enumeration values

  • PENDING

  • RUNNING

  • SUCCEEDED

  • FAILED

  • CANCELED

  • UNKNOWN: The task does not exist or its status is unknown.

State transitions during polling:

  • PENDING → RUNNING → SUCCEEDED or FAILED.

  • The initial query status is usually PENDING or RUNNING.

  • When the status changes to SUCCEEDED, the response contains the generated video URL.

  • If the status is FAILED, check the error message and retry the task.

submit_time string

The time when the task was submitted. The time is in UTC+8 and the format is YYYY-MM-DD HH:mm:ss.SSS.

scheduled_time string

The time when the task was executed. The time is in UTC+8 and the format is YYYY-MM-DD HH:mm:ss.SSS.

end_time string

The time when the task was completed. The time is in UTC+8 and the format is YYYY-MM-DD HH:mm:ss.SSS.

video_url string

URL of the generated video. Returned only when task_status is SUCCEEDED.

Valid for 24 hours. The video is in MP4 format with H.264 encoding.

orig_prompt string

The original input prompt, corresponding to the request parameter prompt.

actual_prompt string

When prompt_extend=true, system rewrites prompt. This field returns the optimized version.

  • If prompt_extend=false, this field is not returned.

  • Note: The wan2.6 model never returns this field, regardless of the prompt_extend value.

code string

Error code. Returned only for failed requests. See Error messages.

message string

Detailed error message. Returned only for failed requests. See Error messages.

usage object

Output statistics. Counted only for successful results.

Properties

Parameters returned by wan2.6 models

input_video_duration integer

Input video duration in seconds. Fixed at 0 because input videos are not supported.

output_video_duration integer

Returned only for wan2.6 models.

Output video duration in seconds. Equals input.duration.

duration integer

The total video duration, used for billing.

Billing formula: duration=input_video_duration+output_video_duration.

SR integer

Returned only for wan2.6 models. The resolution tier of the generated video. Example: 720.

video_count integer

The number of generated videos. Fixed at 1.

audio boolean

Returned only for wan2.6-i2v-flash models. Indicates whether the output video has sound.

Parameters returned by wan2.2 and wan2.5 models

duration integer

The duration of the generated video in seconds. Valid values: 5, 10.

Billing formula: Cost = Video seconds × Unit price.

SR integer

The resolution of the generated video. Valid values: 480, 720, 1080.

video_count integer

The number of generated videos. Fixed at 1.

Parameters returned by wan2.1 models

video_duration integer

The duration of the generated video in seconds. Valid values: 3, 4, 5.

Billing formula: Cost = Video seconds × Unit price.

video_ratio string

The aspect ratio of the generated video. Fixed at standard.

video_count integer

The number of generated videos. Fixed at 1.

request_id string

Unique request identifier for tracing and troubleshooting.

DashScope SDK

SDK parameter names mostly match HTTP API (structures follow language conventions).

Image-to-video tasks take 1-5 minutes. The SDK wraps the async HTTP call flow and supports both sync and async invocation.

Actual processing time depends on queue length and service execution. Wait patiently for results.

Python SDK

null

Requires DashScope Python SDK ≥ 1.25.8. Older versions may error (“url error, please check url!”) -- see Install the SDK to update.

Set base_http_api_url based on the model’s region:

Singapore

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

Virginia

dashscope.base_http_api_url = 'https://dashscope-us.aliyuncs.com/api/v1'

Beijing

dashscope.base_http_api_url = 'https://dashscope.aliyuncs.com/api/v1'

Frankfurt

dashscope.base_http_api_url = 'https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/api/v1'

Replace WorkspaceId with your actual Workspace ID when making API calls.

Sample code

Synchronous call

Synchronous calls block until completion. Three image input methods: public URL, Base64, local file path.

Request example
import base64
import os
from http import HTTPStatus
from dashscope import VideoSynthesis
import mimetypes
import dashscope

# Singapore region URL. Get URL: https://www.alibabacloud.com/help/en/model-studio/image-to-video-api-reference
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'


# If you haven't configured the environment variable, replace the next line with your Model Studio API key: api_key="sk-xxx"
# Get your API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
api_key = os.getenv("DASHSCOPE_API_KEY")

# --- Helper function: For Base64 encoding ---
# Format: data:{MIME_type};base64,{base64_data}
def encode_file(file_path):
    mime_type, _ = mimetypes.guess_type(file_path)
    if not mime_type or not mime_type.startswith("image/"):
        raise ValueError("Unsupported or unrecognized image format")
    with open(file_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read()).decode('utf-8')
    return f"data:{mime_type};base64,{encoded_string}"

"""
Image input methods:
Choose one of the following three methods,

1. Use a public URL - Suitable for publicly accessible images
2. Use a local file - Suitable for local development and testing
3. Use Base64 encoding - Suitable for private images or scenarios requiring encrypted transmission
"""

# [Method 1] Use a publicly accessible image URL
# Example: Use a public image URL
img_url = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/wpimhv/rap.png"

# [Method 2] Use a local file (supports absolute and relative paths)
# Format requirement: file:// + file path
# Example (absolute path):
# img_url = "file://" + "/path/to/your/img.png"    # Linux/macOS
# img_url = "file://" + "/C:/path/to/your/img.png"  # Windows
# Example (relative path):
# img_url = "file://" + "./img.png"                # Relative to the current executable file's path

# [Method 3] Use a Base64-encoded image
# img_url = encode_file("./img.png")

# Set audio URL
audio_url = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/ozwpvi/rap.mp3"

def sample_call_i2v():
    # Synchronous call, returns result directly
    print('please wait...')
    rsp = VideoSynthesis.call(api_key=api_key,
                              model='wan2.6-i2v-flash',
                              prompt='A scene of urban fantasy art. A dynamic graffiti art character. A boy made of spray paint comes to life from a concrete wall. He raps an English song at high speed while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single street lamp, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of his rap, with no other dialogue or noise.',
                              img_url=img_url,
                              audio_url=audio_url,
                              resolution="720P",
                              duration=10,
                              prompt_extend=True,
                              watermark=False,
                              negative_prompt="",
                              seed=12345)
    print(rsp)
    if rsp.status_code == HTTPStatus.OK:
        print("video_url:", rsp.output.video_url)
    else:
        print('Failed, status_code: %s, code: %s, message: %s' %
              (rsp.status_code, rsp.code, rsp.message))


if __name__ == '__main__':
    sample_call_i2v()
Response example
The video_url is valid for 24 hours. Download promptly.
{
    "status_code": 200,
    "request_id": "2794c7a3-fe8c-4dd4-a1b7-xxxxxx",
    "code": null,
    "message": "",
    "output": {
        "task_id": "c15d5b14-07c4-4af5-b862-xxxxxx",
        "task_status": "SUCCEEDED",
        "video_url": "https://dashscope-result-bj.oss-cn-beijing.aliyuncs.com/xxx.mp4?Expires=xxx",
        "submit_time": "2026-01-22 23:24:46.527",
        "scheduled_time": "2026-01-22 23:24:46.565",
        "end_time": "2026-01-22 23:25:59.978",
        "orig_prompt": "A scene of urban fantasy art. A dynamic graffiti art character. A boy made of spray paint comes to life from a concrete wall. He raps an English song at high speed while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single street lamp, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of his rap, with no other dialogue or noise."
    },
    "usage": {
        "video_count": 1,
        "video_duration": 0,
        "video_ratio": "",
        "duration": 10,
        "input_video_duration": 0,
        "output_video_duration": 10,
        "audio": true,
        "SR": 720
    }
}

Asynchronous call

Async calls return task ID immediately. Poll for results or wait for completion.

Request example
import os
from http import HTTPStatus
from dashscope import VideoSynthesis
import dashscope

# Singapore region URL. Get URL: https://www.alibabacloud.com/help/en/model-studio/image-to-video-api-reference
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'


# If you haven't configured the environment variable, replace the next line with your Model Studio API key: api_key="sk-xxx"
# Get your API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
api_key = os.getenv("DASHSCOPE_API_KEY")

# Use a publicly accessible image URL
img_url = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/wpimhv/rap.png"

# Set audio URL
audio_url = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/ozwpvi/rap.mp3"


def sample_async_call_i2v():
    # Asynchronous call, returns a task_id
    rsp = VideoSynthesis.async_call(api_key=api_key,
                                    model='wan2.6-i2v-flash',
                                    prompt='A scene of urban fantasy art. A dynamic graffiti art character. A boy made of spray paint comes to life from a concrete wall. He raps an English song at high speed while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single street lamp, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of his rap, with no other dialogue or noise.',
                                    img_url=img_url,
                                    audio_url=audio_url,
                                    resolution="720P",
                                    duration=10,
                                    prompt_extend=True,
                                    watermark=False,
                                    negative_prompt="",
                                    seed=12345)
    print(rsp)
    if rsp.status_code == HTTPStatus.OK:
        print("task_id: %s" % rsp.output.task_id)
    else:
        print('Failed, status_code: %s, code: %s, message: %s' %
              (rsp.status_code, rsp.code, rsp.message))

    # Get asynchronous task information
    status = VideoSynthesis.fetch(task=rsp, api_key=api_key)
    if status.status_code == HTTPStatus.OK:
        print(status.output.task_status)
    else:
        print('Failed, status_code: %s, code: %s, message: %s' %
              (status.status_code, status.code, status.message))

    # Wait for asynchronous task to finish
    rsp = VideoSynthesis.wait(task=rsp, api_key=api_key)
    print(rsp)
    if rsp.status_code == HTTPStatus.OK:
        print(rsp.output.video_url)
    else:
        print('Failed, status_code: %s, code: %s, message: %s' %
              (rsp.status_code, rsp.code, rsp.message))


if __name__ == '__main__':
    sample_async_call_i2v()
Response example

1. Response example for task creation

{
    "status_code": 200,
    "request_id": "6dc3bf6c-be18-9268-9c27-xxxxxx",
    "code": "",
    "message": "",
    "output": {
        "task_id": "686391d9-7ecf-4290-a8e9-xxxxxx",
        "task_status": "PENDING",
        "video_url": ""
    },
    "usage": null
}

2. Response example for querying task result

The video_url is valid for 24 hours. Download promptly.
{
    "status_code": 200,
    "request_id": "2794c7a3-fe8c-4dd4-a1b7-xxxxxx",
    "code": null,
    "message": "",
    "output": {
        "task_id": "c15d5b14-07c4-4af5-b862-xxxxxx",
        "task_status": "SUCCEEDED",
        "video_url": "https://dashscope-result-bj.oss-cn-beijing.aliyuncs.com/xxx.mp4?Expires=xxx",
        "submit_time": "2026-01-22 23:24:46.527",
        "scheduled_time": "2026-01-22 23:24:46.565",
        "end_time": "2026-01-22 23:25:59.978",
        "orig_prompt": "A scene of urban fantasy art. A dynamic graffiti art character. A boy made of spray paint comes to life from a concrete wall. He raps an English song at high speed while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single street lamp, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of his rap, with no other dialogue or noise."
    },
    "usage": {
        "video_count": 1,
        "video_duration": 0,
        "video_ratio": "",
        "duration": 10,
        "input_video_duration": 0,
        "output_video_duration": 10,
        "audio": true,
        "SR": 720
    }
}

Java SDK

null

Requires DashScope Java SDK ≥ 2.22.6. Older versions may error (“url error, please check url!”) -- see Install the SDK to update.

Set baseHttpApiUrl based on the model’s region:

Singapore

Constants.baseHttpApiUrl = "https://dashscope-intl.aliyuncs.com/api/v1";

Virginia

Constants.baseHttpApiUrl = "https://dashscope-us.aliyuncs.com/api/v1";

Beijing

Constants.baseHttpApiUrl = "https://dashscope.aliyuncs.com/api/v1";

Frankfurt

Constants.baseHttpApiUrl = "https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/api/v1";

Replace WorkspaceId with your actual Workspace ID when making API calls.

Sample code

Synchronous call

Synchronous calls block until completion. Three image input methods: public URL, Base64, local file path.

Request example
// Copyright (c) Alibaba, Inc. and its affiliates.

import com.alibaba.dashscope.aigc.videosynthesis.VideoSynthesis;
import com.alibaba.dashscope.aigc.videosynthesis.VideoSynthesisParam;
import com.alibaba.dashscope.aigc.videosynthesis.VideoSynthesisResult;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.JsonUtils;
import com.alibaba.dashscope.utils.Constants;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

 
public class Image2Video {

    static {
        // Singapore region URL. Get URL: https://www.alibabacloud.com/help/en/model-studio/image-to-video-api-reference
        Constants.baseHttpApiUrl = "https://dashscope-intl.aliyuncs.com/api/v1";
    }

    // If you haven't configured the environment variable, replace the next line with your Model Studio API key: apiKey="sk-xxx"
    // Get your API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
    static String apiKey = System.getenv("DASHSCOPE_API_KEY");
    
    /**
     * Image input methods: Choose one of the following three
     *
     * 1. Use a public URL - Suitable for publicly accessible images
     * 2. Use a local file - Suitable for local development and testing
     * 3. Use Base64 encoding - Suitable for private images or scenarios requiring encrypted transmission
     */

    // [Method 1] Public URL
    static String imgUrl = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/wpimhv/rap.png";

    // [Method 2] Local file path (file://+absolute path)
    // static String imgUrl = "file://" + "/your/path/to/img.png";    // Linux/macOS
    // static String imgUrl = "file://" + "/C:/your/path/to/img.png";  // Windows

    // [Method 3] Base64 encoding
    // static String imgUrl = Image2Video.encodeFile("/your/path/to/img.png");
    
    // Set audio URL
    static String audioUrl = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/ozwpvi/rap.mp3";

    public static void image2video() throws ApiException, NoApiKeyException, InputRequiredException {
        // Set parameters
        Map<String, Object> parameters = new HashMap<>();
        parameters.put("prompt_extend", true);
        parameters.put("watermark", false);
        parameters.put("seed", 12345);

        VideoSynthesis vs = new VideoSynthesis();
        VideoSynthesisParam param =
                VideoSynthesisParam.builder()
                        .apiKey(apiKey)
                        .model("wan2.6-i2v-flash")
                        .prompt("A scene of urban fantasy art. A dynamic graffiti art character. A boy made of spray paint comes to life from a concrete wall. He raps an English song at high speed while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single street lamp, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of his rap, with no other dialogue or noise.")
                        .imgUrl(imgUrl)
                        .audioUrl(audioUrl)
                        .duration(10)
                        .parameters(parameters)
                        .resolution("720P")
                        .negativePrompt("")
                        .build();
        System.out.println("please wait...");
        VideoSynthesisResult result = vs.call(param);
        System.out.println(JsonUtils.toJson(result));
    }
    
     /**
     * Encode a file into a Base64 string
     * @param filePath File path
     * @return Base64 string in the format: data:{MIME_type};base64,{base64_data}
     */
    public static String encodeFile(String filePath) {
        Path path = Paths.get(filePath);
        if (!Files.exists(path)) {
            throw new IllegalArgumentException("File does not exist: " + filePath);
        }
        // Detect MIME type
        String mimeType = null;
        try {
            mimeType = Files.probeContentType(path);
        } catch (IOException e) {
            throw new IllegalArgumentException("Cannot detect file type: " + filePath);
        }
        if (mimeType == null || !mimeType.startsWith("image/")) {
            throw new IllegalArgumentException("Unsupported or unrecognized image format");
        }
        // Read file content and encode
        byte[] fileBytes = null;
        try{
            fileBytes = Files.readAllBytes(path);
        } catch (IOException e) {
            throw new IllegalArgumentException("Cannot read file content: " + filePath);
        }
    
        String encodedString = Base64.getEncoder().encodeToString(fileBytes);
        return "data:" + mimeType + ";base64," + encodedString;
    }
    

    public static void main(String[] args) {
        try {
            image2video();
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}
Response example
The video_url is valid for 24 hours. Download promptly.
{
    "request_id": "87c091bb-7a3c-4904-8501-xxxxxx",
    "output": {
        "task_id": "413ed6e4-5f3a-4f57-8d58-xxxxxx",
        "task_status": "SUCCEEDED",
        "video_url": "https://dashscope-result-bj.oss-cn-beijing.aliyuncs.com/xxx.mp4?Expires=xxx",
        "orig_prompt": "A scene of urban fantasy art. A dynamic graffiti art character. A boy made of spray paint comes to life from a concrete wall. He raps an English song at high speed while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single street lamp, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of his rap, with no other dialogue or noise.",
        "submit_time": "2026-01-22 23:25:45.729",
        "scheduled_time": "2026-01-22 23:25:45.771",
        "end_time": "2026-01-22 23:26:44.942"
    },
    "usage": {
        "video_count": 1,
        "duration": 10.0,
        "input_video_duration": 0.0,
        "output_video_duration": 10.0,
        "SR": "720"
    },
    "status_code": 200,
    "code": "",
    "message": ""
}

Asynchronous call

Async calls return task ID immediately. Poll for results or wait for completion.

Request example
// Copyright (c) Alibaba, Inc. and its affiliates.

import com.alibaba.dashscope.aigc.videosynthesis.VideoSynthesis;
import com.alibaba.dashscope.aigc.videosynthesis.VideoSynthesisListResult;
import com.alibaba.dashscope.aigc.videosynthesis.VideoSynthesisParam;
import com.alibaba.dashscope.aigc.videosynthesis.VideoSynthesisResult;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.task.AsyncTaskListParam;
import com.alibaba.dashscope.utils.JsonUtils;
import com.alibaba.dashscope.utils.Constants;

import java.util.HashMap;
import java.util.Map;

public class Image2Video {

    static {
        // Singapore region URL. Get URL: https://www.alibabacloud.com/help/en/model-studio/image-to-video-api-reference
        Constants.baseHttpApiUrl = "https://dashscope-intl.aliyuncs.com/api/v1";
    }

    // If you haven't configured the environment variable, replace the next line with your Model Studio API key: api_key="sk-xxx"
    // Get your API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
    static String apiKey = System.getenv("DASHSCOPE_API_KEY");
    // Set input image URL
    static String imgUrl = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/wpimhv/rap.png";

    // Set audio URL
    static String audioUrl = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/ozwpvi/rap.mp3";

    public static void image2video() throws ApiException, NoApiKeyException, InputRequiredException {
        // Set parameters
        Map<String, Object> parameters = new HashMap<>();
        parameters.put("prompt_extend", true);
        parameters.put("watermark", false);
        parameters.put("seed", 12345);

        VideoSynthesis vs = new VideoSynthesis();
        VideoSynthesisParam param =
                VideoSynthesisParam.builder()
                        .apiKey(apiKey)
                        .model("wan2.6-i2v-flash")
                        .prompt("A scene of urban fantasy art. A dynamic graffiti art character. A boy made of spray paint comes to life from a concrete wall. He raps an English song at high speed while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single street lamp, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of his rap, with no other dialogue or noise.")
                        .imgUrl(imgUrl)
                        .audioUrl(audioUrl)
                        .duration(10)
                        .parameters(parameters)
                        .resolution("720P")
                        .negativePrompt("")
                        .build();
        // Asynchronous call
        VideoSynthesisResult task = vs.asyncCall(param);
        System.out.println(JsonUtils.toJson(task));
        System.out.println("please wait...");

        // Get result
        VideoSynthesisResult result = vs.wait(task, apiKey);
        System.out.println(JsonUtils.toJson(result));
    }

    // Get task list
    public static void listTask() throws ApiException, NoApiKeyException {
        VideoSynthesis is = new VideoSynthesis();
        AsyncTaskListParam param = AsyncTaskListParam.builder().build();
        param.setApiKey(apiKey);
        VideoSynthesisListResult result = is.list(param);
        System.out.println(result);
    }

    // Get single task result
    public static void fetchTask(String taskId) throws ApiException, NoApiKeyException {
        VideoSynthesis is = new VideoSynthesis();
        // If DASHSCOPE_API_KEY is set as an environment variable, apiKey can be null
        VideoSynthesisResult result = is.fetch(taskId, apiKey);
        System.out.println(result.getOutput());
        System.out.println(result.getUsage());
    }

    public static void main(String[] args) {
        try {
            image2video();
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}
Response example

1. Response example for task creation

{
    "request_id": "5dbf9dc5-4f4c-9605-85ea-xxxxxxxx",
    "output": {
        "task_id": "7277e20e-aa01-4709-xxxxxxxx",
        "task_status": "PENDING"
    }
}

2. Response example for querying task result

The video_url is valid for 24 hours. Download promptly.
{
    "request_id": "87c091bb-7a3c-4904-8501-xxxxxx",
    "output": {
        "task_id": "413ed6e4-5f3a-4f57-8d58-xxxxxx",
        "task_status": "SUCCEEDED",
        "video_url": "https://dashscope-result-bj.oss-cn-beijing.aliyuncs.com/xxx.mp4?Expires=xxx",
        "orig_prompt": "A scene of urban fantasy art. A dynamic graffiti art character. A boy made of spray paint comes to life from a concrete wall. He raps an English song at high speed while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single street lamp, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of his rap, with no other dialogue or noise.",
        "submit_time": "2026-01-22 23:25:45.729",
        "scheduled_time": "2026-01-22 23:25:45.771",
        "end_time": "2026-01-22 23:26:44.942"
    },
    "usage": {
        "video_count": 1,
        "duration": 10.0,
        "input_video_duration": 0.0,
        "output_video_duration": 10.0,
        "SR": "720"
    },
    "status_code": 200,
    "code": "",
    "message": ""
}

Limitations

  • Data validity: task_id and video URL retained for 24 hours only. After expiry, cannot be queried or downloaded.

  • Content moderation: Inputs (prompts, images) and outputs undergo security review. Violations return errors (e.g., “IPInfringementSuspect”, “DataInspectionFailed”). See .

Error codes

For failed calls and errors, see Error messages for troubleshooting.

FAQ

Q: How do I generate a video with a specific aspect ratio (e.g., 3:4)?

A: Output aspect ratio matches input image (img_url), but exact ratios (e.g., strict 3:4) aren’t guaranteed -- minor deviations may occur.

  • Why deviation occurs:

    Model uses input image ratio as baseline, calculates nearest valid resolution based on target pixels (set by resolution). Width/height must be multiples of 16, requiring minor adjustments. Output ratio approximates target (e.g., 3:4) but not exact.

    • Example: Input image 750×1000 (aspect ratio 3:4 = 0.75), with resolution = "720P" (target ~920,000 pixels) produces output 816×1104 (aspect ratio ≈ 0.739, ~900,000 pixels).

  • Best practices:

    • Input control: Use an image with the target aspect ratio as the first frame.

    • Post-processing: For strict ratios, crop or pad the generated video with editing tools.

Q: How to get the domain name whitelist for video storage?

A: Videos generated by models are stored in OSS. The API returns a temporary public URL. To configure a firewall whitelist for this download URL, note the following: The underlying storage may change dynamically. This topic does not provide a fixed OSS domain name whitelist to prevent access issues caused by outdated information. If you have security control requirements, contact your account manager to obtain the latest OSS domain name list.