Alibaba Cloud Model Studio: Wan image-to-video API reference

Last Updated: Nov 10, 2025

The Wan image-to-video model generates a smooth video from a first frame image and a text prompt. The supported features include the following:

  • Basic features: You can select a video duration of 3, 4, 5, or 10 seconds (depending on the model), specify a video resolution of 480p, 720p, or 1080p, use prompt rewriting, and add watermarks.

  • Audio capabilities: You can use automatic audio generation or provide a custom audio file for audio-video synchronization. (Supported only by wan2.5)

Quick links: Try it online on the Wan official website | Video effect list

Note

The features available on the Wan official website may differ from those supported by the API. This document describes the API's capabilities and is updated promptly to reflect new features.

Overview

Example (wan2.5): input first frame image and input audio, output video with audio.

Input prompt: A scene of urban fantasy art. A dynamic graffiti art character. A boy painted with spray paint comes to life from a concrete wall. He sings an English rap song at a very fast pace while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single streetlight, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of the boy's rap, with no other dialogue or noise.

Model: wan2.5-i2v-preview (Recommended)
Description: Wan 2.5 preview (Video with audio). New audio capabilities: Supports automatic audio generation and the use of custom audio files.
Output video specifications: Resolution tiers: 480p, 720p, 1080p. Video duration: 5 or 10 seconds. Fixed specifications: 24 fps, MP4 (H.264 encoding).

Model: wan2.2-i2v-flash
Description: Wan 2.2 Flash Edition (Silent video). 50% faster than the 2.1 model.
Output video specifications: Resolution tiers: 480p, 720p, 1080p. Video duration: 5 seconds. Fixed specifications: 30 fps, MP4 (H.264 encoding).

Model: wan2.2-i2v-plus
Description: Wan 2.2 Professional Edition (Silent video). Improved stability and success rate compared to the 2.1 model.
Output video specifications: Resolution tiers: 480p, 1080p. Video duration: 5 seconds. Fixed specifications: 30 fps, MP4 (H.264 encoding).

Model: wanx2.1-i2v-plus
Description: Wan 2.1 Professional Edition (Silent video).
Output video specifications: Resolution tiers: 720p. Video duration: 5 seconds. Fixed specifications: 30 fps, MP4 (H.264 encoding).

Model: wanx2.1-i2v-turbo
Description: Wan 2.1 Flash Edition (Silent video).
Output video specifications: Resolution tiers: 480p, 720p. Video duration: 3, 4, or 5 seconds. Fixed specifications: 30 fps, MP4 (H.264 encoding).

Note

Before making a call, check the models and pricing supported in each region.

Prerequisites

Before making a call, you must create an API key and then export the API key as an environment variable. To make calls using an SDK, install the DashScope SDK.

Important

The Beijing and Singapore regions have separate API keys and request endpoints. Do not use them interchangeably. Cross-region calls cause authentication failures or service errors.

HTTP

Image-to-video tasks typically take 1 to 5 minutes to complete, so the API uses asynchronous invocation. The process has two steps: create a task, then poll for the result.

The actual time required depends on the number of tasks in the queue and the service execution status.

Step 1: Create a task to get a task ID

Singapore: POST https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis

Beijing: POST https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis

Note
  • After the task is created, use the returned task_id to query the result. The task_id is valid for 24 hours. Do not create duplicate tasks. Use polling to retrieve the result.
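As a rough sketch of Step 1 using only the Python standard library, the request could be assembled as follows. The helper name build_create_task_request and the region keys "singapore" and "beijing" are illustrative assumptions, not part of the API; the endpoints and headers are the ones listed in this section. The function only builds the request and does not send it.

```python
import json
import urllib.request

# Endpoints copied from this section; the dictionary keys are illustrative.
CREATE_TASK_ENDPOINTS = {
    "singapore": "https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis",
    "beijing": "https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis",
}


def build_create_task_request(api_key, model, img_url, prompt=None,
                              region="singapore", **parameters):
    """Build (but do not send) the POST request that creates a task."""
    payload = {"model": model, "input": {"img_url": img_url}}
    if prompt:
        payload["input"]["prompt"] = prompt
    if parameters:
        # Keyword arguments such as resolution="480P" go into "parameters".
        payload["parameters"] = parameters
    return urllib.request.Request(
        CREATE_TASK_ENDPOINTS[region],
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-DashScope-Async": "enable",  # Required: HTTP calls are async only.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen and read output.task_id from the JSON response. Use an API key that belongs to the same region as the endpoint.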

Request parameters

Automatic audio generation

This feature is supported only by the wan2.5-i2v-preview model. Automatic audio generation is enabled by default and requires no configuration. To explicitly enable this feature, set the parameters.audio parameter to true.

The API keys for the Singapore and Beijing regions are different. For more information, see Obtain an API key.
The following example uses the base URL for the Singapore region. If you use a model in the Beijing region, replace the base URL with https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis.
curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
    -H 'X-DashScope-Async: enable' \
    -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
    "model": "wan2.5-i2v-preview",
    "input": {
        "prompt": "A scene of urban fantasy art. A dynamic graffiti art character. A boy painted with spray paint comes to life from a concrete wall. He sings an English rap song at a very fast pace while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single streetlight, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of the boy'\''s rap, with no other dialogue or noise.",
        "img_url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/wpimhv/rap.png"
    },
    "parameters": {
        "resolution": "480P",
        "prompt_extend": true,
        "duration": 10,
        "audio": true
    }
}'

Provide an audio file

This feature is supported only by the wan2.5-i2v-preview model. Provide the audio link in the input.audio_url parameter.

The API keys for the Singapore and Beijing regions are different. For more information, see Obtain an API key.
The following example uses the base URL for the Singapore region. If you use a model in the Beijing region, replace the base URL with https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis.
curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
    -H 'X-DashScope-Async: enable' \
    -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
    "model": "wan2.5-i2v-preview",
    "input": {
        "prompt": "A scene of urban fantasy art. A dynamic graffiti art character. A boy painted with spray paint comes to life from a concrete wall. He sings an English rap song at a very fast pace while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single streetlight, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of the boy'\''s rap, with no other dialogue or noise.",
        "img_url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/wpimhv/rap.png",
        "audio_url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/ozwpvi/rap.mp3"
    },
    "parameters": {
        "resolution": "480P",
        "prompt_extend": true,
        "duration": 10
    }
}'

Generate a silent video

The method for generating a silent video varies by model version:

  • For the wan2.5-i2v-preview model, you must explicitly set the parameters.audio parameter to false.

  • For wan2.2 and earlier versions, the model generates silent videos by default and no parameters are required. For more information, see the following code example.

The API keys for the Singapore and Beijing regions are different. For more information, see Obtain an API key.
The following example uses the base URL for the Singapore region. If you use a model in the Beijing region, replace the base URL with https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis.
curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
    -H 'X-DashScope-Async: enable' \
    -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
    "model": "wan2.2-i2v-plus",
    "input": {
        "prompt": "A cat running on the grass",
        "img_url": "https://cdn.translate.alibaba.com/r/wanx-demo-1.png"
    },
    "parameters": {
        "resolution": "1080P",
        "prompt_extend": true
    }
}'

Use a negative prompt

Use negative_prompt to prevent the generated video from including "flowers".

The API keys for the Singapore and Beijing regions are different. For more information, see Obtain an API key.
The following example uses the base URL for the Singapore region. If you use a model in the Beijing region, replace the base URL with https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis.
curl --location 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
    -H 'X-DashScope-Async: enable' \
    -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
    "model": "wan2.1-i2v-turbo",
    "input": {
        "prompt": "A cat running on the grass",
        "negative_prompt": "flowers",
        "img_url": "https://cdn.translate.alibaba.com/r/wanx-demo-1.png"
    },
    "parameters": {
        "resolution": "720P",
        "prompt_extend": true
    }
}'
Request headers

Content-Type string (Required)

The content type of the request. Set this parameter to application/json.

Authorization string (Required)

The identity authentication credentials for the request. This API uses a Model Studio API key for identity authentication. Example: Bearer sk-xxxx.

X-DashScope-Async string (Required)

The asynchronous processing configuration parameter. HTTP requests support only asynchronous processing. You must set this parameter to enable.

Important

If this request header is missing, the error message "current user api does not support synchronous calls" is returned.

Request body

model string (Required)

The model name. Example: wan2.2-i2v-plus.

For a list of models and their prices, see Models and pricing.

input object (Required)

Basic input information, such as the prompt.

Properties

prompt string (Optional)

The text prompt that describes the desired elements and visual features of the generated video.

This parameter supports both Chinese and English. Each Chinese character or letter is counted as one character. Excess characters are automatically truncated. The length limit varies by model version:

  • wan2.5-i2v-preview: Up to 2,000 characters.

  • wan2.2 and earlier models: Up to 800 characters.

Example: A kitten running on the grass.

For prompt usage tips, see Text-to-video/image-to-video prompt guide.

negative_prompt string (Optional)

The negative prompt, which describes content that you do not want to appear in the video. This can be used to constrain the video content.

This parameter supports both Chinese and English. The length is limited to 500 characters. Excess characters are automatically truncated.

Example: low resolution, error, worst quality, low quality, deformed, extra fingers, bad proportions.

img_url string (Required)

The URL or Base64-encoded data of the first frame image.

Image limits:

  • Image format: JPEG, JPG, PNG (alpha channels are not supported), BMP, or WEBP.

  • Image resolution: The width and height must be between 360 and 2,000 pixels.

  • File size: No more than 10 MB.

Input image instructions:

  1. Publicly accessible URL

    • Supports HTTP or HTTPS protocols.

    • Example: https://cdn.translate.alibaba.com/r/wanx-demo-1.png.

  2. Base64-encoded image string

    • Data format: data:{MIME_type};base64,{base64_data}.

    • Example: data:image/png;base64,GDU7MtCZzEbTbmRZ....... (The encoded string is too long and only a snippet is shown.)

    • For more information, see Input image.
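The Base64 format and two of the image limits above can be checked before a request is sent. The following is a minimal sketch: the function name to_data_url is hypothetical, "10 MB" is assumed to mean 10 × 1024 × 1024 bytes, and the width and height limits are not checked because that would require decoding the image.

```python
import base64
import os

# MIME types for the image formats listed above.
IMAGE_MIME_TYPES = {
    ".jpeg": "image/jpeg",
    ".jpg": "image/jpeg",
    ".png": "image/png",
    ".bmp": "image/bmp",
    ".webp": "image/webp",
}
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # Assumption: "10 MB" is read as 10 MiB.


def to_data_url(raw, filename):
    """Return a data:{MIME_type};base64,{base64_data} string for image bytes."""
    extension = os.path.splitext(filename)[1].lower()
    mime_type = IMAGE_MIME_TYPES.get(extension)
    if mime_type is None:
        raise ValueError(f"Unsupported image format: {extension!r}")
    if len(raw) > MAX_IMAGE_BYTES:
        raise ValueError("Image exceeds the 10 MB limit")
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

For a local file, read the bytes first (for example, with open(path, "rb")) and pass them together with the file name. The returned string can be used as the img_url value.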

audio_url string (Optional)

Supported only by wan2.5-i2v-preview. The URL of the audio file. The model uses this audio to generate the video. For more information about how to use this parameter, see Audio settings.

This parameter supports HTTP or HTTPS protocols.

Audio limits:

  • Format: WAV, MP3.

  • Duration: 3 to 30 s.

  • File size: No more than 15 MB.

  • Handling of duration mismatches: If the audio is longer than the video duration (5 or 10 seconds), only the first 5 or 10 seconds of the audio are used and the rest is discarded. If the audio is shorter than the video, the remaining part of the video is silent. For example, if the audio is 3 s long and the video is 5 s long, the first 3 s of the output video have sound, and the last 2 s are silent.

Example: https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/ozwpvi/rap.mp3.
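The truncation and silence rule for audio_url comes down to simple arithmetic. The sketch below works it through; the function name split_audio_coverage is hypothetical, and the range checks repeat the limits stated above.

```python
def split_audio_coverage(audio_seconds, video_seconds):
    """Return (seconds_with_sound, seconds_silent) for the output video."""
    if not 3 <= audio_seconds <= 30:
        raise ValueError("Audio duration must be between 3 and 30 seconds")
    if video_seconds not in (5, 10):
        raise ValueError("wan2.5-i2v-preview supports a duration of 5 or 10 seconds")
    # Audio beyond the video length is discarded; any shortfall is silent.
    with_sound = min(audio_seconds, video_seconds)
    return with_sound, video_seconds - with_sound


print(split_audio_coverage(3, 5))    # (3, 2)
print(split_audio_coverage(30, 10))  # (10, 0)
```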

parameters object (Optional)

Video processing parameters, such as the video resolution, video duration, prompt rewriting, and watermark.

Properties

resolution string (Optional)

Important

The resolution parameter directly affects the cost. For the same model, the cost is as follows: 1080p > 720p > 480p. Before making a call, confirm the Models and pricing.

Specifies the resolution tier for the generated video. This setting adjusts the video's definition, which is measured in total pixels. The model automatically scales the video to a similar total pixel count based on the selected tier. The aspect ratio of the video is kept as consistent as possible with the aspect ratio of the input `img_url` image. For more information, see the FAQ section.

The default value and valid values for this parameter depend on the `model` parameter, as described in the following list:

  • wan2.5-i2v-preview: Valid values: 480P, 720P, 1080P. Default value: 1080P.

  • wan2.2-i2v-flash: Valid values: 480P, 720P. Default value: 720P.

  • wan2.2-i2v-plus: Valid values: 480P, 1080P. Default value: 1080P.

  • wan2.1-i2v-turbo: Valid values: 480P, 720P. Default value: 720P.

  • wan2.1-i2v-plus: Valid values: 720P. Default value: 720P.

Example: 1080P.

duration integer (Optional)

Important

The duration directly affects the cost. Billing is per second, so a longer duration results in a higher cost. Before making a call, confirm the Models and pricing.

The duration of the generated video in seconds. The valid values for this parameter depend on the `model` parameter:

  • wan2.5-i2v-preview: Optional values: 5, 10. Default value: 5.

  • wan2.2-i2v-plus: Fixed at 5 seconds and cannot be modified.

  • wan2.2-i2v-flash: Fixed at 5 seconds and cannot be modified.

  • wan2.1-i2v-plus: Fixed at 5 seconds and cannot be modified.

  • wan2.1-i2v-turbo: Optional values: 3, 4, or 5. Default value: 5.

Example: 5.
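Because the valid resolution and duration values depend on the model, a client-side check can catch mismatches before a billable call is made. The following sketch encodes the two lists above in a lookup table. The names MODEL_LIMITS and check_parameters are hypothetical. The table follows the parameter lists in this section, although the model overview additionally lists 1080p for wan2.2-i2v-flash, so verify that entry against the service. Resolution values are compared case-insensitively because the request examples write them in uppercase, such as 480P.

```python
# Valid values taken from the resolution and duration lists in this section.
MODEL_LIMITS = {
    "wan2.5-i2v-preview": {"resolutions": ("480P", "720P", "1080P"), "durations": (5, 10)},
    "wan2.2-i2v-flash": {"resolutions": ("480P", "720P"), "durations": (5,)},
    "wan2.2-i2v-plus": {"resolutions": ("480P", "1080P"), "durations": (5,)},
    "wan2.1-i2v-turbo": {"resolutions": ("480P", "720P"), "durations": (3, 4, 5)},
    "wan2.1-i2v-plus": {"resolutions": ("720P",), "durations": (5,)},
}


def check_parameters(model, resolution=None, duration=None):
    """Raise ValueError if resolution or duration is not listed for the model."""
    limits = MODEL_LIMITS.get(model)
    if limits is None:
        raise ValueError(f"Unknown model: {model}")
    if resolution is not None and resolution.upper() not in limits["resolutions"]:
        raise ValueError(f"{model} does not list resolution {resolution}")
    if duration is not None and duration not in limits["durations"]:
        raise ValueError(f"{model} does not list a duration of {duration} seconds")


check_parameters("wan2.5-i2v-preview", resolution="1080P", duration=10)  # passes
```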

prompt_extend boolean (Optional)

Specifies whether to enable prompt rewriting. If enabled, a large language model (LLM) rewrites the input prompt. This can significantly improve the generation quality for shorter prompts but increases the time required.

  • true (default)

  • false

Example: true.

watermark boolean (Optional)

Specifies whether to add a watermark. The watermark, which contains the text "AI-generated", is placed in the lower-right corner of the video.

  • false (default)

  • true: Adds a watermark.

Example: false.

audio boolean (Optional)

Supported only by wan2.5-i2v-preview. Controls whether to add audio.

`audio_url` takes priority over `audio`: this parameter takes effect only when audio_url is empty. For more information about how to use this parameter, see Audio settings.

  • true: (default) Automatically adds audio to the video.

  • false: Does not add audio. Outputs a silent video.

Example: true.
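The interaction between audio_url, audio, and the model version can be summarized as a small decision function. This is a sketch of the rules stated in this document, not of the service implementation; the function name resolve_audio_mode and the three return labels are hypothetical.

```python
def resolve_audio_mode(model, audio_url=None, audio=None):
    """Return "custom_audio", "auto_audio", or "silent" for a request."""
    if model != "wan2.5-i2v-preview":
        # wan2.2 and earlier models generate silent videos.
        return "silent"
    if audio_url:
        # audio_url takes priority; audio is ignored when it is set.
        return "custom_audio"
    if audio is None or audio:
        # audio defaults to true when it is omitted.
        return "auto_audio"
    return "silent"


print(resolve_audio_mode("wan2.5-i2v-preview"))               # auto_audio
print(resolve_audio_mode("wan2.5-i2v-preview", audio=False))  # silent
```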

seed integer (Optional)

The random seed. The value must be in the range of [0, 2147483647].

If this parameter is not specified, the system automatically generates a random seed. To improve the reproducibility of the generated results, set a fixed seed value.

Note that because model generation is probabilistic, using the same seed value does not guarantee that the generated results are identical for every call.

Example: 12345.

Response parameters

Successful response

Save the task_id to query the task status and result.

{
    "output": {
        "task_status": "PENDING",
        "task_id": "0385dc79-5ff8-4d82-bcb6-xxxxxx"
    },
    "request_id": "4909100c-7b5a-9f92-bfe5-xxxxxx"
}

Error response

The task creation failed. For more information about how to resolve the issue, see Error messages.

{
    "code":"InvalidApiKey",
    "message":"Invalid API-key provided.",
    "request_id":"fb53c4ec-1c12-4fc4-a580-xxxxxx"
}

output object

The task output information.

Properties

task_id string

The task ID. You can use it to query the task for 24 hours.

task_status string

The task status.

Enumeration

  • PENDING

  • RUNNING

  • SUCCEEDED

  • FAILED

  • CANCELED

  • UNKNOWN

request_id string

The unique request ID. You can use this ID to trace and troubleshoot issues.

code string

The error code for a failed request. This parameter is not returned if the request is successful. For more information, see Error messages.

message string

The detailed information about a failed request. This parameter is not returned if the request is successful. For more information, see Error messages.

Step 2: Query the result by task ID

Singapore region: GET https://dashscope-intl.aliyuncs.com/api/v1/tasks/{task_id}

Beijing region: GET https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}

Note
  • Polling suggestion: Video generation takes several minutes. Use a polling mechanism and set a reasonable query interval, such as 15 seconds, to retrieve the result.

  • Task status transition: PENDING (In queue) → RUNNING (Processing) → SUCCEEDED (Successful) or FAILED (Failed).

  • Result link: After the task is successful, a video link is returned. The link is valid for 24 hours. After you retrieve the link, immediately download and save the video to a permanent storage service, such as Alibaba Cloud OSS.

  • task_id validity: 24 hours. After this period, you cannot query the result, and the API returns a task status of UNKNOWN.
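The polling advice above can be sketched as a loop that stops on a terminal status. In this sketch, fetch_status is a hypothetical callable that performs the GET request and returns the task_status string. The 15-second interval comes from the note above, while the 900-second timeout and the choice to treat CANCELED and UNKNOWN as terminal are assumptions. The sleep function is injectable so that the loop can be exercised without waiting.

```python
import time

# Assumption: every status other than PENDING and RUNNING ends the polling.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELED", "UNKNOWN"}


def poll_task(fetch_status, interval_seconds=15, timeout_seconds=900, sleep=time.sleep):
    """Call fetch_status() until it returns a terminal status or time runs out."""
    waited = 0
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"Task still {status} after {waited} seconds")
        sleep(interval_seconds)
        waited += interval_seconds


# Simulated run: PENDING -> RUNNING -> SUCCEEDED, without real waiting.
statuses = iter(["PENDING", "RUNNING", "SUCCEEDED"])
print(poll_task(lambda: next(statuses), sleep=lambda seconds: None))  # SUCCEEDED
```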

Request parameters

Query task results

Replace 86ecf553-d340-4e21-xxxxxxxxx with the actual task ID.

The API keys for the Singapore and Beijing regions are different. For more information, see Obtain an API key.
The following example uses the base URL for the Singapore region. If you use a model in the Beijing region, replace the base URL with https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}.
curl -X GET https://dashscope-intl.aliyuncs.com/api/v1/tasks/86ecf553-d340-4e21-xxxxxxxxx \
--header "Authorization: Bearer $DASHSCOPE_API_KEY"
Request headers

Authorization string (Required)

The identity authentication credentials for the request. This API uses a Model Studio API key for identity authentication. Example: Bearer sk-xxxx.

URL path parameters

task_id string (Required)

The task ID.

Response parameters

Task succeeded

Video URLs are retained for only 24 hours and are automatically purged after this period. You must save the generated videos promptly.

{
    "request_id": "2ca1c497-f9e0-449d-9a3f-xxxxxx",
    "output": {
        "task_id": "af6efbc0-4bef-4194-8246-xxxxxx",
        "task_status": "SUCCEEDED",
        "submit_time": "2025-09-25 11:07:28.590",
        "scheduled_time": "2025-09-25 11:07:35.349",
        "end_time": "2025-09-25 11:17:11.650",
        "orig_prompt": "A scene of urban fantasy art. A dynamic graffiti art character. A boy painted with spray paint comes to life from a concrete wall. He sings an English rap song at a very fast pace while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single streetlight, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of the boy's rap, with no other dialogue or noise.",
        "video_url": "https://dashscope-result-sh.oss-cn-shanghai.aliyuncs.com/xxx.mp4?Expires=xxx",
        "actual_prompt": "A boy made of spray paint emerges from a concrete wall and begins to sing an English rap at a very fast pace, his lips moving quickly, his head nodding slightly, and his eyes looking directly at the camera. He gives a thumbs-up to the wall with his right hand and puts his left hand on his hip, moving his body back and forth to the rhythm. The audio is the boy's continuous English rap, with the lyrics: 'Skyscrapers loom, shadows kiss the pavement. Dreams stack high, but the soul's in the basement. Pocket full of lint, chasing gold like it's sacred. Every breath a gamble, the odds never patient.'"
    },
    "usage": {
        "duration": 10,
        "video_count": 1,
        "SR": 480
    }
}

Task failed

If a task fails, task_status is set to FAILED, and an error code and message are provided. For more information about how to resolve the issue, see Error messages.

{
    "request_id": "e5d70b02-ebd3-98ce-9fe8-759d7d7b107d",
    "output": {
        "task_id": "86ecf553-d340-4e21-af6e-a0c6a421c010",
        "task_status": "FAILED",
        "code": "InvalidParameter",
        "message": "The size is not match xxxxxx"
    }
}

Task query expired

The task_id is valid for 24 hours. After this period, the query fails and the following error message is returned.

{
    "request_id": "a4de7c32-7057-9f82-8581-xxxxxx",
    "output": {
        "task_id": "502a00b1-19d9-4839-a82f-xxxxxx",
        "task_status": "UNKNOWN"
    }
}
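The three response shapes above (succeeded, failed, expired) can be handled in one place. The sketch below takes the parsed JSON response as a dictionary; the function name extract_video_url and the choice of exception types are illustrative assumptions.

```python
def extract_video_url(response):
    """Return video_url when the task succeeded, None while it is in progress."""
    output = response.get("output", {})
    status = output.get("task_status")
    if status == "SUCCEEDED":
        return output["video_url"]
    if status in ("PENDING", "RUNNING"):
        return None  # Not finished yet; query again later.
    if status == "FAILED":
        raise RuntimeError(f"{output.get('code')}: {output.get('message')}")
    # UNKNOWN is returned once the 24-hour task_id validity has passed.
    raise LookupError(f"No result available (task_status: {status})")


failed = {
    "request_id": "e5d70b02-ebd3-98ce-9fe8-759d7d7b107d",
    "output": {
        "task_id": "86ecf553-d340-4e21-af6e-a0c6a421c010",
        "task_status": "FAILED",
        "code": "InvalidParameter",
        "message": "The size is not match xxxxxx",
    },
}
try:
    extract_video_url(failed)
except RuntimeError as error:
    print(error)  # InvalidParameter: The size is not match xxxxxx
```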

output object

The task output information.

Properties

task_id string

The task ID. You can use it to query the task for 24 hours.

task_status string

The task status.

Enumeration

  • PENDING

  • RUNNING

  • SUCCEEDED

  • FAILED

  • CANCELED

  • UNKNOWN

Status transitions during polling:

  • PENDING (In queue) → RUNNING (Processing) → SUCCEEDED (Successful) or FAILED (Failed).

  • The status of the first query is usually PENDING (In queue) or RUNNING (Processing).

  • If the status changes to SUCCEEDED, the response contains the generated video URL.

  • If the status is FAILED, check the error message and retry.

submit_time string

The time when the task was submitted. The format is YYYY-MM-DD HH:mm:ss.SSS.

scheduled_time string

The time when the task started running. The format is YYYY-MM-DD HH:mm:ss.SSS.

end_time string

The time when the task was completed. The format is YYYY-MM-DD HH:mm:ss.SSS.
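The three timestamps share the format YYYY-MM-DD HH:mm:ss.SSS, so queue time and processing time can be derived from them. The following sketch uses the values from the successful response example; the function name task_timings is hypothetical.

```python
from datetime import datetime

# Python equivalent of YYYY-MM-DD HH:mm:ss.SSS (%f accepts the 3-digit milliseconds).
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def task_timings(output):
    """Return (seconds_in_queue, seconds_processing) for a finished task."""
    submitted = datetime.strptime(output["submit_time"], TIME_FORMAT)
    scheduled = datetime.strptime(output["scheduled_time"], TIME_FORMAT)
    ended = datetime.strptime(output["end_time"], TIME_FORMAT)
    return (
        (scheduled - submitted).total_seconds(),
        (ended - scheduled).total_seconds(),
    )


example = {
    "submit_time": "2025-09-25 11:07:28.590",
    "scheduled_time": "2025-09-25 11:07:35.349",
    "end_time": "2025-09-25 11:17:11.650",
}
print(task_timings(example))  # (6.759, 576.301)
```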

video_url string

The video URL. This parameter is returned only if task_status is SUCCEEDED.

The link is valid for 24 hours. You can use this URL to download the video. The video is in MP4 format with H.264 encoding.

orig_prompt string

The original input prompt. This corresponds to the prompt request parameter.

actual_prompt string

If prompt rewriting is enabled, this parameter returns the actual optimized prompt that is used. If this feature is disabled, this parameter is not returned.

code string

The error code for a failed request. This parameter is not returned if the request is successful. For more information, see Error messages.

message string

The detailed information about a failed request. This parameter is not returned if the request is successful. For more information, see Error messages.

usage object

Statistics for the output information. Only successful results are counted.

Properties

video_duration integer

This field is currently returned only by the 2.1 model. The duration of the generated video in seconds. Enumeration values: 3, 4, 5.

Billing formula: Cost = Video duration in seconds × Unit price.

video_ratio string

This field is currently returned only by the 2.1 model. The aspect ratio of the generated video. The value is fixed at "standard".

duration integer

This field is currently returned only by the 2.2 and later models. The duration of the generated video in seconds. Enumeration values: 5, 10.

Billing formula: Cost = Video duration in seconds × Unit price.

SR integer

This field is currently returned only by the 2.2 and later models. The resolution of the generated video. Enumeration values: 480, 720, 1080.

video_count integer

The number of generated videos. The value is fixed at 1.
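The billing formula can be applied to the usage object regardless of which duration field a model returns. In the sketch below, the function name estimate_cost is hypothetical and the unit price of 0.5 is a placeholder, not a real price; look up the actual unit price for your model and resolution in Models and pricing.

```python
def estimate_cost(usage, unit_price_per_second):
    """Apply Cost = Video duration in seconds x Unit price to a usage object."""
    # 2.2 and later models report "duration"; the 2.1 model reports "video_duration".
    seconds = usage.get("duration") or usage.get("video_duration") or 0
    return seconds * unit_price_per_second


# Placeholder unit price of 0.5 per second, for illustration only.
print(estimate_cost({"duration": 10, "video_count": 1, "SR": 480}, 0.5))  # 5.0
```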

request_id string

The unique request ID. You can use this ID to trace and troubleshoot issues.

DashScope SDK

The parameter names in the SDK are mostly consistent with the HTTP API. The parameter structure is adapted to the conventions of each programming language.

Because image-to-video tasks typically take 1 to 5 minutes to complete, the SDK wraps the asynchronous HTTP call process and supports both synchronous and asynchronous call methods.

The actual time required depends on the number of tasks in the queue and the service execution status.

Python SDK

The Python SDK supports three image input methods: public URL, Base64-encoded string, and local file path (absolute or relative). You can choose one of these methods. For more information, see Input image.

Note

We recommend that you install the latest version of the DashScope Python SDK to avoid potential runtime errors. For more information, see Install or upgrade the SDK.

Sample code

Synchronous

A synchronous call blocks and waits until the video generation is complete and the result is returned. This example shows three image input methods: public URL, Base64 encoding, and local file path.

Request example
import base64
import os
from http import HTTPStatus
from dashscope import VideoSynthesis
import mimetypes
import dashscope

# The following is the URL for the Singapore region. If you use a model in the Beijing region, replace the URL with: https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'


# If you have not configured an environment variable, replace the following line with your Model Studio API key: api_key="sk-xxx"
# The API keys for the Singapore and Beijing regions are different. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
api_key = os.getenv("DASHSCOPE_API_KEY")

# --- Helper function for Base64 encoding ---
# Format: data:{MIME_type};base64,{base64_data}
def encode_file(file_path):
    mime_type, _ = mimetypes.guess_type(file_path)
    if not mime_type or not mime_type.startswith("image/"):
        raise ValueError("Unsupported or unknown image format")
    with open(file_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read()).decode('utf-8')
    return f"data:{mime_type};base64,{encoded_string}"

"""
Image input methods:
The following are three image input methods.

1. Use a public URL - Suitable for publicly accessible images.
2. Use a local file - Suitable for local development and testing.
3. Use Base64 encoding - Suitable for private images or scenarios requiring encrypted transmission.
"""

# [Method 1] Use a publicly accessible image URL
# Example: Use a public image URL
img_url = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/wpimhv/rap.png"

# [Method 2] Use a local file (supports absolute and relative paths)
# Format: file:// + file path
# Example (absolute path):
# img_url = "file://" + "/path/to/your/img.png"    # Linux/macOS
# img_url = "file://" + "C:/path/to/your/img.png"  # Windows
# Example (relative path):
# img_url = "file://" + "./img.png"                # Path relative to the current execution file

# [Method 3] Use a Base64-encoded image
# img_url = encode_file("./img.png")

# Set the audio URL
audio_url = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/ozwpvi/rap.mp3"

def sample_call_i2v():
    # Synchronous call, returns the result directly
    print('Please wait...')
    rsp = VideoSynthesis.call(api_key=api_key,
                              model='wan2.5-i2v-preview',
                              prompt="A scene of urban fantasy art. A dynamic graffiti art character. A boy painted with spray paint comes to life from a concrete wall. He sings an English rap song at a very fast pace while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single streetlight, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of the boy's rap, with no other dialogue or noise.",
                              img_url=img_url,
                              audio_url=audio_url,
                              resolution="480P",
                              duration=10,
                              # audio=True,
                              prompt_extend=True,
                              watermark=False,
                              negative_prompt="",
                              seed=12345)
    print(rsp)
    if rsp.status_code == HTTPStatus.OK:
        print("video_url:", rsp.output.video_url)
    else:
        print('Failed, status_code: %s, code: %s, message: %s' %
              (rsp.status_code, rsp.code, rsp.message))


if __name__ == '__main__':
    sample_call_i2v()
Response example
The `video_url` is valid for 24 hours. Download the video promptly.
{
    "status_code": 200,
    "request_id": "55194b9a-d281-4565-8ef6-xxxxxx",
    "code": null,
    "message": "",
    "output": {
        "task_id": "e2bb35a2-0218-4969-8c0d-xxxxxx",
        "task_status": "SUCCEEDED",
        "video_url": "https://dashscope-result-sh.oss-cn-shanghai.aliyuncs.com/xxx.mp4?Expires=xxx",
        "submit_time": "2025-10-28 13:45:48.620",
        "scheduled_time": "2025-10-28 13:45:57.378",
        "end_time": "2025-10-28 13:48:05.361",
        "orig_prompt": "A scene of urban fantasy art. A dynamic graffiti art character. A boy painted with spray paint comes to life from a concrete wall. He sings an English rap song at a very fast pace while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single streetlight, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of the boy's rap, with no other dialogue or noise.",
        "actual_prompt": "A boy made of spray paint emerges from a concrete wall, stands still, and begins to sing an English rap, his mouth opening and closing, his head nodding to the rhythm, and his eyes focused. He gives a thumbs-up with his right hand, puts his left hand on his hip, and moves his body rhythmically in place. The background is a night scene under a railway bridge, lit by a single streetlight. The audio is the boy's rap performance, with the lyrics: 'Skyscrapers loom, shadows kiss the pavement. Dreams stack high, but the soul's in the basement. Pocket full of lint, chasing gold like it's sacred. Every breath a gamble, the odds never patient.'"
    },
    "usage": {
        "video_count": 1,
        "video_duration": 0,
        "video_ratio": "",
        "duration": 10,
        "SR": 480
    }
}

Asynchronous

This example shows an asynchronous call. This method immediately returns a task ID, and you must poll for or wait for the task to complete.

Request example
import os
from http import HTTPStatus
from dashscope import VideoSynthesis
import dashscope

# The following is the URL for the Singapore region. If you use a model in the Beijing region, replace the URL with: https://dashscope.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'


# If you have not configured an environment variable, replace the following line with your Model Studio API key: api_key="sk-xxx"
# The API keys for the Singapore and Beijing regions are different. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
api_key = os.getenv("DASHSCOPE_API_KEY")

# Use a publicly accessible image URL
img_url = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/wpimhv/rap.png"

# Set the audio URL
audio_url = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/ozwpvi/rap.mp3"


def sample_async_call_i2v():
    # Asynchronous call, returns a task_id
    rsp = VideoSynthesis.async_call(api_key=api_key,
                                    model='wan2.5-i2v-preview',
                                    prompt="A scene of urban fantasy art. A dynamic graffiti art character. A boy painted with spray paint comes to life from a concrete wall. He sings an English rap song at a very fast pace while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single streetlight, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of the boy's rap, with no other dialogue or noise.",
                                    img_url=img_url,
                                    audio_url=audio_url,
                                    resolution="480P",
                                    duration=10,
                                    # audio=True,
                                    prompt_extend=True,
                                    watermark=False,
                                    negative_prompt="",
                                    seed=12345)
    print(rsp)
    if rsp.status_code == HTTPStatus.OK:
        print("task_id: %s" % rsp.output.task_id)
    else:
        print('Failed, status_code: %s, code: %s, message: %s' %
              (rsp.status_code, rsp.code, rsp.message))
        # Stop here: without a task_id there is nothing to fetch or wait for
        return

    # Get asynchronous task information
    status = VideoSynthesis.fetch(task=rsp, api_key=api_key)
    if status.status_code == HTTPStatus.OK:
        print(status.output.task_status)
    else:
        print('Failed, status_code: %s, code: %s, message: %s' %
              (status.status_code, status.code, status.message))

    # Wait for the asynchronous task to finish
    rsp = VideoSynthesis.wait(task=rsp, api_key=api_key)
    print(rsp)
    if rsp.status_code == HTTPStatus.OK:
        print(rsp.output.video_url)
    else:
        print('Failed, status_code: %s, code: %s, message: %s' %
              (rsp.status_code, rsp.code, rsp.message))


if __name__ == '__main__':
    sample_async_call_i2v()
Response example

1. Response example for creating a task

{
    "status_code": 200,
    "request_id": "6dc3bf6c-be18-9268-9c27-xxxxxx",
    "code": "",
    "message": "",
    "output": {
        "task_id": "686391d9-7ecf-4290-a8e9-xxxxxx",
        "task_status": "PENDING",
        "video_url": ""
    },
    "usage": null
}

2. Response example for querying a task result

The `video_url` is valid for 24 hours. Download the video promptly.
{
    "status_code": 200,
    "request_id": "55194b9a-d281-4565-8ef6-xxxxxx",
    "code": null,
    "message": "",
    "output": {
        "task_id": "e2bb35a2-0218-4969-8c0d-xxxxxx",
        "task_status": "SUCCEEDED",
        "video_url": "https://dashscope-result-sh.oss-cn-shanghai.aliyuncs.com/xxx.mp4?Expires=xxx",
        "submit_time": "2025-10-28 13:45:48.620",
        "scheduled_time": "2025-10-28 13:45:57.378",
        "end_time": "2025-10-28 13:48:05.361",
        "orig_prompt": "A scene of urban fantasy art. A dynamic graffiti art character. A boy painted with spray paint comes to life from a concrete wall. He sings an English rap song at a very fast pace while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single streetlight, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of the boy'\''s rap, with no other dialogue or noise.",
        "actual_prompt": "A boy made of spray paint emerges from a concrete wall, stands still, and begins to sing an English rap, his mouth opening and closing, his head nodding to the rhythm, and his eyes focused. He gives a thumbs-up with his right hand, puts his left hand on his hip, and moves his body rhythmically in place. The background is a night scene under a railway bridge, lit by a single streetlight. The audio is the boy'\''s rap performance, with the lyrics: 'Skyscrapers loom, shadows kiss the pavement. Dreams stack high, but the soul'\''s in the basement. Pocket full of lint, chasing gold like it'\''s sacred. Every breath a gamble, the odds never patient.'"
    },
    "usage": {
        "video_count": 1,
        "video_duration": 0,
        "video_ratio": "",
        "duration": 10,
        "SR": 480
    }
}
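
The polling step described for asynchronous calls can be sketched as a small loop. This is a minimal illustration, not the SDK's own implementation: `fetch_status` is a hypothetical caller-supplied function (for example, a thin wrapper around `VideoSynthesis.fetch` that returns `output.task_status`), the 15-second interval and 10-minute timeout are arbitrary defaults, and treating only `PENDING` and `RUNNING` as in-progress states is an assumption.

```python
import time


def wait_for_task(fetch_status, task_id, interval=15.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_status(task_id) until the task leaves the in-progress states.

    fetch_status is a hypothetical callable that returns the task_status string.
    """
    in_progress = {"PENDING", "RUNNING"}  # assumption: every other value is terminal
    waited = 0.0
    while True:
        status = fetch_status(task_id)
        if status not in in_progress:
            return status
        if waited >= timeout:
            raise TimeoutError(f"task {task_id} still {status} after {waited:.0f}s")
        sleep(interval)
        waited += interval


if __name__ == "__main__":
    # Simulated status sequence standing in for real API responses.
    responses = iter(["PENDING", "RUNNING", "SUCCEEDED"])
    final = wait_for_task(lambda _id: next(responses), "demo-task", sleep=lambda s: None)
    print(final)
```

Passing `sleep` as a parameter keeps the loop testable without real delays. In production code, `VideoSynthesis.wait` shown in the request example already blocks until the task finishes.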

Java SDK

The Java SDK supports three image input methods: public URL, Base64-encoded string, and local file path (absolute path only). You can choose one of these methods. For more information, see Input image.

Note

We recommend that you install the latest version of the DashScope Java SDK to avoid potential runtime errors. For more information, see Install or upgrade the SDK.

Sample code

Synchronous

A synchronous call blocks and waits until the video generation is complete and the result is returned. This example shows three image input methods: public URL, Base64 encoding, and local file path.

Request example
// Copyright (c) Alibaba, Inc. and its affiliates.

import com.alibaba.dashscope.aigc.videosynthesis.VideoSynthesis;
import com.alibaba.dashscope.aigc.videosynthesis.VideoSynthesisParam;
import com.alibaba.dashscope.aigc.videosynthesis.VideoSynthesisResult;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.utils.Constants;
import com.alibaba.dashscope.utils.JsonUtils;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

 
public class Image2Video {

    static {
        Constants.baseHttpApiUrl = "https://dashscope-intl.aliyuncs.com/api/v1";
        // The preceding is the URL for the Singapore region. If you use a model in the Beijing region, replace the URL with: https://dashscope.aliyuncs.com/api/v1
    }

    // If you have not configured an environment variable, replace the following line with your Model Studio API key: apiKey="sk-xxx"
    // The API keys for the Singapore and Beijing regions are different. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
    static String apiKey = System.getenv("DASHSCOPE_API_KEY");
    
    /**
     * Image input methods: Choose one of the following three.
     *
     * 1. Use a public URL - Suitable for publicly accessible images.
     * 2. Use a local file - Suitable for local development and testing.
     * 3. Use Base64 encoding - Suitable for private images or scenarios requiring encrypted transmission.
     */

    // [Method 1] Public URL
    static String imgUrl = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/wpimhv/rap.png";

    // [Method 2] Local file path (file:// + absolute path)
    // static String imgUrl = "file://" + "/your/path/to/img.png";    // Linux/macOS
    // static String imgUrl = "file://" + "C:/your/path/to/img.png";  // Windows

    // [Method 3] Base64 encoding
    // static String imgUrl = Image2Video.encodeFile("/your/path/to/img.png");
    
    // Set the audio URL
    static String audioUrl = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/ozwpvi/rap.mp3";

    public static void image2video() throws ApiException, NoApiKeyException, InputRequiredException {
        // Set the parameters
        Map<String, Object> parameters = new HashMap<>();
        parameters.put("prompt_extend", true);
        parameters.put("watermark", false);
        parameters.put("seed", 12345);

        VideoSynthesis vs = new VideoSynthesis();
        VideoSynthesisParam param =
                VideoSynthesisParam.builder()
                        .apiKey(apiKey)
                        .model("wan2.5-i2v-preview")
                        .prompt("A scene of urban fantasy art. A dynamic graffiti art character. A boy painted with spray paint comes to life from a concrete wall. He sings an English rap song at a very fast pace while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single streetlight, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of the boy's rap, with no other dialogue or noise.")
                        .imgUrl(imgUrl)
                        .audioUrl(audioUrl)
                        //.audio(true)
                        .duration(10)
                        .parameters(parameters)
                        .resolution("480P")
                        .negativePrompt("")
                        .build();
        System.out.println("Please wait...");
        VideoSynthesisResult result = vs.call(param);
        System.out.println(JsonUtils.toJson(result));
    }
    
    /**
     * Encodes a file into a Base64 string.
     * @param filePath The file path.
     * @return A Base64 string in the format: data:{MIME_type};base64,{base64_data}
     */
    public static String encodeFile(String filePath) {
        Path path = Paths.get(filePath);
        if (!Files.exists(path)) {
            throw new IllegalArgumentException("File does not exist: " + filePath);
        }
        // Detect MIME type
        String mimeType = null;
        try {
            mimeType = Files.probeContentType(path);
        } catch (IOException e) {
            throw new IllegalArgumentException("Cannot detect file type: " + filePath);
        }
        if (mimeType == null || !mimeType.startsWith("image/")) {
            throw new IllegalArgumentException("Unsupported or unknown image format");
        }
        // Read file content and encode
        byte[] fileBytes = null;
        try {
            fileBytes = Files.readAllBytes(path);
        } catch (IOException e) {
            throw new IllegalArgumentException("Cannot read file content: " + filePath);
        }
    
        String encodedString = Base64.getEncoder().encodeToString(fileBytes);
        return "data:" + mimeType + ";base64," + encodedString;
    }
    

    public static void main(String[] args) {
        try {
            image2video();
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}
Response example
The `video_url` is valid for 24 hours. Download the video promptly.
{
    "request_id": "f1bfb531-6e13-4e17-8e93-xxxxxx",
    "output": {
        "task_id": "9ddebba6-f784-4f55-b845-xxxxxx",
        "task_status": "SUCCEEDED",
        "video_url": "https://dashscope-result-sh.oss-cn-shanghai.aliyuncs.com/xxx.mp4?Expires=xxx"
    },
    "usage": {
        "video_count": 1
    }
}

Asynchronous

This example shows an asynchronous call. This method immediately returns a task ID, and you must poll for or wait for the task to complete.

Request example
// Copyright (c) Alibaba, Inc. and its affiliates.

// dashscope sdk >= 2.20.1
import com.alibaba.dashscope.aigc.videosynthesis.VideoSynthesis;
import com.alibaba.dashscope.aigc.videosynthesis.VideoSynthesisListResult;
import com.alibaba.dashscope.aigc.videosynthesis.VideoSynthesisParam;
import com.alibaba.dashscope.aigc.videosynthesis.VideoSynthesisResult;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.task.AsyncTaskListParam;
import com.alibaba.dashscope.utils.Constants;
import com.alibaba.dashscope.utils.JsonUtils;

import java.util.HashMap;
import java.util.Map;

public class Image2Video {

    static {
        // The following is the URL for the Singapore region. If you use a model in the Beijing region, replace the URL with: https://dashscope.aliyuncs.com/api/v1
        Constants.baseHttpApiUrl = "https://dashscope-intl.aliyuncs.com/api/v1";
    }

    // If you have not configured an environment variable, replace the following line with your Model Studio API key: apiKey="sk-xxx"
    // The API keys for the Singapore and Beijing regions are different. Get an API key: https://www.alibabacloud.com/help/zh/model-studio/get-api-key
    static String apiKey = System.getenv("DASHSCOPE_API_KEY");
    // Set the input image URL
    static String imgUrl = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/wpimhv/rap.png";

    // Set the audio URL
    static String audioUrl = "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/ozwpvi/rap.mp3";

    public static void image2video() throws ApiException, NoApiKeyException, InputRequiredException {
        // Set the parameters
        Map<String, Object> parameters = new HashMap<>();
        parameters.put("prompt_extend", true);
        parameters.put("watermark", false);
        parameters.put("seed", 12345);

        VideoSynthesis vs = new VideoSynthesis();
        VideoSynthesisParam param =
                VideoSynthesisParam.builder()
                        .apiKey(apiKey)
                        .model("wan2.5-i2v-preview")
                        .prompt("A scene of urban fantasy art. A dynamic graffiti art character. A boy painted with spray paint comes to life from a concrete wall. He sings an English rap song at a very fast pace while striking a classic, energetic rapper pose. The scene is set under an urban railway bridge at night. The lighting comes from a single streetlight, creating a cinematic atmosphere full of high energy and amazing detail. The audio of the video consists entirely of the boy's rap, with no other dialogue or noise.")
                        .imgUrl(imgUrl)
                        .audioUrl(audioUrl)
                        //.audio(true)
                        .duration(10)
                        .parameters(parameters)
                        .resolution("480P")
                        .negativePrompt("")
                        .build();
        // Asynchronous call
        VideoSynthesisResult task = vs.asyncCall(param);
        System.out.println(JsonUtils.toJson(task));
        System.out.println("Please wait...");

        // Get the result
        VideoSynthesisResult result = vs.wait(task, apiKey);
        System.out.println(JsonUtils.toJson(result));
    }

    // Get the task list
    public static void listTask() throws ApiException, NoApiKeyException {
        VideoSynthesis is = new VideoSynthesis();
        AsyncTaskListParam param = AsyncTaskListParam.builder().build();
        param.setApiKey(apiKey);
        VideoSynthesisListResult result = is.list(param);
        System.out.println(result);
    }

    // Get a single task result
    public static void fetchTask(String taskId) throws ApiException, NoApiKeyException {
        VideoSynthesis is = new VideoSynthesis();
        // If DASHSCOPE_API_KEY is set as an environment variable, apiKey can be null.
        VideoSynthesisResult result = is.fetch(taskId, apiKey);
        System.out.println(result.getOutput());
        System.out.println(result.getUsage());
    }

    public static void main(String[] args) {
        try {
            image2video();
        } catch (ApiException | NoApiKeyException | InputRequiredException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}
Response example

1. Response example for creating a task

{
    "request_id": "5dbf9dc5-4f4c-9605-85ea-xxxxxxxx",
    "output": {
        "task_id": "7277e20e-aa01-4709-xxxxxxxx",
        "task_status": "PENDING"
    }
}

2. Response example for querying a task result

The `video_url` is valid for 24 hours. Download the video promptly.
{
    "request_id": "f1bfb531-6e13-4e17-8e93-xxxxxx",
    "output": {
        "task_id": "9ddebba6-f784-4f55-b845-xxxxxx",
        "task_status": "SUCCEEDED",
        "video_url": "https://dashscope-result-sh.oss-cn-shanghai.aliyuncs.com/xxx.mp4?Expires=xxx"
    },
    "usage": {
        "video_count": 1
    }
}

Limitations

  • Data retention: The task_id and video URL are retained for only 24 hours. After this period, they cannot be queried or downloaded.

  • Audio support: The wan2.5 model can generate videos with audio, either by generating the audio automatically or by synchronizing the video to a custom audio file. The wan2.2 and earlier models output silent videos only. If you need audio for those videos, you can generate it separately with speech synthesis.

  • Content moderation: The input prompt and image, along with the output video, are subject to content moderation. Non-compliant content results in an "IPInfringementSuspect" or "DataInspectionFailed" error. For more information, see Error codes.

  • Network access configuration: Video links are stored in Alibaba Cloud OSS. If your business system cannot access external OSS links due to security policies, you must add the following OSS domain names to your network access whitelist.

    # OSS domain name list
    dashscope-result-bj.oss-cn-beijing.aliyuncs.com
    dashscope-result-hz.oss-cn-hangzhou.aliyuncs.com
    dashscope-result-sh.oss-cn-shanghai.aliyuncs.com
    dashscope-result-wlcb.oss-cn-wulanchabu.aliyuncs.com
    dashscope-result-zjk.oss-cn-zhangjiakou.aliyuncs.com
    dashscope-result-sz.oss-cn-shenzhen.aliyuncs.com
    dashscope-result-hy.oss-cn-heyuan.aliyuncs.com
    dashscope-result-cd.oss-cn-chengdu.aliyuncs.com
    dashscope-result-gz.oss-cn-guangzhou.aliyuncs.com
    dashscope-result-wlcb-acdr-1.oss-cn-wulanchabu-acdr-1.aliyuncs.com
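
When you maintain an allowlist like the one above, it can help to check a returned `video_url` against it before attempting the download. The following is a minimal sketch using only the Python standard library. The host set is copied from the list above, and the helper name `is_known_result_host` is hypothetical.

```python
from urllib.parse import urlparse

# Hosts copied from the OSS domain name list above.
OSS_RESULT_HOSTS = {
    "dashscope-result-bj.oss-cn-beijing.aliyuncs.com",
    "dashscope-result-hz.oss-cn-hangzhou.aliyuncs.com",
    "dashscope-result-sh.oss-cn-shanghai.aliyuncs.com",
    "dashscope-result-wlcb.oss-cn-wulanchabu.aliyuncs.com",
    "dashscope-result-zjk.oss-cn-zhangjiakou.aliyuncs.com",
    "dashscope-result-sz.oss-cn-shenzhen.aliyuncs.com",
    "dashscope-result-hy.oss-cn-heyuan.aliyuncs.com",
    "dashscope-result-cd.oss-cn-chengdu.aliyuncs.com",
    "dashscope-result-gz.oss-cn-guangzhou.aliyuncs.com",
    "dashscope-result-wlcb-acdr-1.oss-cn-wulanchabu-acdr-1.aliyuncs.com",
}


def is_known_result_host(video_url):
    """Return True if the URL's host appears in the allowlist above."""
    host = urlparse(video_url).hostname  # hostname is lowercased by urlparse
    return host in OSS_RESULT_HOSTS


if __name__ == "__main__":
    url = "https://dashscope-result-sh.oss-cn-shanghai.aliyuncs.com/xxx.mp4?Expires=xxx"
    print(is_known_result_host(url))
```

A `False` result for a real `video_url` suggests that the list in your firewall or proxy configuration needs to be updated.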

Key parameter descriptions

Input image

The img_url parameter specifies the input image and supports the following three input methods:

Method 1: Public URL

  • A publicly accessible address that supports HTTP/HTTPS.

  • Example: https://example.com/images/cat.png.

Method 2: Base64 encoding

Sample code

import base64
import mimetypes


# --- For Base64 encoding ---
# Format: data:{MIME_type};base64,{base64_data}
def encode_file(file_path):
    mime_type, _ = mimetypes.guess_type(file_path)
    if not mime_type or not mime_type.startswith("image/"):
        raise ValueError("Unsupported or unknown image format")
    with open(file_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read()).decode('utf-8')
    return f"data:{mime_type};base64,{encoded_string}"


if __name__ == "__main__":
    print(encode_file("./image_demo_input.png"))
  • Example: data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAABDg...... (truncated here for length). When making a call, pass the complete string.

  • Encoding format: Use the data:{MIME_type};base64,{base64_data} format, where:

    • {base64_data}: The Base64-encoded string of the image file.

    • {MIME_type}: The media type of the image, which must correspond to the file format.

      Image format    MIME type
      JPEG            image/jpeg
      JPG             image/jpeg
      PNG             image/png
      BMP             image/bmp
      WEBP            image/webp
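
Before sending a Base64 string, you may want to confirm locally that it follows the format and MIME types listed above. The following is a minimal sketch, assuming the MIME types in the table are the only ones accepted. The helper name `check_data_uri` is hypothetical and is not part of the SDK.

```python
import base64
import binascii

# MIME types taken from the table above.
ALLOWED_MIME_TYPES = {"image/jpeg", "image/png", "image/bmp", "image/webp"}


def check_data_uri(value):
    """Return (mime_type, raw_bytes) for a valid data URI, or raise ValueError."""
    prefix, sep, payload = value.partition(";base64,")
    if not sep or not prefix.startswith("data:"):
        raise ValueError("expected data:{MIME_type};base64,{base64_data}")
    mime_type = prefix[len("data:"):]
    if mime_type not in ALLOWED_MIME_TYPES:
        raise ValueError(f"unsupported MIME type: {mime_type!r}")
    try:
        # validate=True rejects characters outside the Base64 alphabet
        raw = base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("payload is not valid Base64") from exc
    return mime_type, raw


if __name__ == "__main__":
    sample = "data:image/png;base64," + base64.b64encode(b"\x89PNG").decode("ascii")
    mime, data = check_data_uri(sample)
    print(mime, len(data))
```

The check only verifies the string's shape. It does not confirm that the decoded bytes are a valid image of the declared type.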

Method 3: Local file path (SDK only)

  • Python SDK: Supports both absolute and relative file paths. The file path rules are as follows:

    System            Passed file path                                  Example (absolute path)         Example (relative path)
    Linux or macOS    file://{absolute or relative path of the file}    file:///home/images/test.png    file://./images/test.png
    Windows           file://{absolute or relative path of the file}    file://D:/images/test.png       file://./images/test.png

  • Java SDK: Supports only the absolute path of the file. The file path rules are as follows:

    System            Passed file path                       Example (absolute path)
    Linux or macOS    file://{absolute path of the file}     file:///home/images/test.png
    Windows           file:///{absolute path of the file}    file:///D:/images/test.png
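
The path rules in the two tables above can be sketched as one small helper. This is an illustration only: the function name `to_file_url` and its `sdk` argument are hypothetical, and converting backslashes to forward slashes is an assumption, since the tables show forward slashes only.

```python
import re


def to_file_url(path, sdk="python"):
    """Build the file:// string for img_url following the tables above."""
    normalized = path.replace("\\", "/")  # assumption: forward slashes are expected
    is_windows_drive = re.match(r"^[A-Za-z]:/", normalized) is not None
    if sdk == "java":
        if is_windows_drive:
            # Java SDK on Windows: file:/// + absolute path
            return "file:///" + normalized
        if not normalized.startswith("/"):
            raise ValueError("the Java SDK accepts absolute paths only")
        return "file://" + normalized
    # Python SDK: file:// + absolute or relative path
    return "file://" + normalized


if __name__ == "__main__":
    print(to_file_url("/home/images/test.png"))
    print(to_file_url("./images/test.png"))
    print(to_file_url("D:\\images\\test.png"))
    print(to_file_url("D:/images/test.png", sdk="java"))
```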

Audio settings

Supported model: wan2.5-i2v-preview.

Audio settings: Use the input.audio_url and parameters.audio parameters to control audio behavior. If both are passed, audio_url takes priority over audio. Three modes are supported:

  1. Generate a silent video

    1. Parameter settings: Do not pass `audio_url`, and set `audio` to `false`.

    2. Scenario: This is useful for purely visual content where you plan to add your own audio or music later.

  2. Generate audio automatically

    1. Parameter settings: Do not pass `audio_url`, and set `audio` to `true`.

    2. Effect description: The model automatically generates matching background audio or music based on the prompt and visual content.

  3. Use custom audio

    1. Parameter settings: Pass an `audio_url`. The `audio` parameter is ignored in this case.

    2. Effect description: The video content is aligned with the audio content, such as lip movements and rhythm.
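
The priority rule and the three modes above can be expressed as a short decision function. This is a sketch for reasoning about parameter combinations, not SDK code. The function name and the returned labels are hypothetical.

```python
def resolve_audio_mode(audio_url, audio):
    """Return which of the three audio modes a parameter combination selects.

    audio_url: the custom audio URL, or None if it is not passed.
    audio: the boolean value of the audio parameter.
    """
    if audio_url:
        # audio_url takes priority, so the audio flag is ignored
        return "custom"
    return "auto" if audio else "silent"


if __name__ == "__main__":
    print(resolve_audio_mode(None, False))
    print(resolve_audio_mode(None, True))
    print(resolve_audio_mode("https://example.com/rap.mp3", False))
```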

Billing and rate limiting

  • For information about the model's free quota and pricing, see Models and pricing.

  • For more information about model rate limiting, see Wan series.

  • Billing description:

    • You are charged based on the duration in seconds of successfully generated videos. A charge is incurred only when the query result API returns a task_status of SUCCEEDED and the video is successfully generated.

    • Failed model calls or processing errors do not incur any fees or consume the free quota.
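
To estimate billable usage from your own task records, you can sum the durations of succeeded tasks only. The following is a minimal sketch. The record shape (a dict with `task_status` and `duration` keys) is an assumption about how you store results locally, and the helper name is hypothetical. See Models and pricing for actual rates.

```python
def billable_seconds(tasks):
    """Sum the video duration, in seconds, of tasks that finished as SUCCEEDED."""
    return sum(t["duration"] for t in tasks if t["task_status"] == "SUCCEEDED")


if __name__ == "__main__":
    # Hypothetical local records, not an API response format.
    records = [
        {"task_status": "SUCCEEDED", "duration": 10},
        {"task_status": "FAILED", "duration": 5},
        {"task_status": "SUCCEEDED", "duration": 5},
    ]
    print(billable_seconds(records))
```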

Error codes

If a model call fails and an error message is returned, see 429-Error messages for more information.

FAQ

For video-related questions, see the FAQ.

Q: How do I generate a video with a specific aspect ratio, such as 3:4?

A: The aspect ratio of the output video is determined by the input first frame image (img_url), but an exact ratio, such as a strict 3:4, cannot be guaranteed.

How it works: The model uses the aspect ratio of the input image as a baseline and then adapts it to a supported resolution based on the resolution parameter, such as 480p, 720p, or 1080p. Because the output width and height must each be divisible by 16, the final aspect ratio may deviate slightly, for example, from 0.75 to 0.739. This is normal behavior.

  • Example: An input image is 750 × 1000 (aspect ratio 3:4 = 0.75), and `resolution` is set to "720p" (target total pixels approx. 920,000). The actual output is 816 × 1104 (aspect ratio ≈ 0.739, total pixels approx. 900,000).

  • Note that the resolution parameter mainly controls the video's definition (total pixel count). The final video aspect ratio is still based on the input image, with only necessary minor adjustments.

Best practice: To strictly match a target aspect ratio, use an input image with that ratio and then post-process the output video by cropping or padding it. For example, you can use a video editing tool to crop the output video to the target ratio, or add black bars or a blurred background for padding.
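
The arithmetic in this answer can be sketched as follows. The service's exact sizing algorithm is not documented here, so this is an approximation under stated assumptions: the 720p target is taken as 1280 × 720 pixels, and each side is rounded down to a multiple of 16. These choices reproduce the 816 × 1104 example above but may not match every case. Both function names are hypothetical.

```python
import math


def estimate_output_size(width, height, target_pixels=1280 * 720, multiple=16):
    """Estimate output dimensions: keep the input ratio, then snap down to a multiple."""
    ratio = width / height
    ideal_height = math.sqrt(target_pixels / ratio)
    ideal_width = ideal_height * ratio
    # Assumption: round each side down to the nearest multiple of 16
    out_w = int(ideal_width // multiple) * multiple
    out_h = int(ideal_height // multiple) * multiple
    return out_w, out_h


def crop_to_ratio(width, height, num, den):
    """Largest crop of a width x height frame with an exact num:den aspect ratio."""
    if width * den > height * num:
        # Frame is too wide for the target ratio: trim the width
        return height * num // den, height
    # Frame is too tall (or already exact): trim the height
    return width, width * den // num


if __name__ == "__main__":
    out_w, out_h = estimate_output_size(750, 1000)
    print(out_w, out_h, round(out_w / out_h, 3))
    print(crop_to_ratio(out_w, out_h, 3, 4))
```

For the 750 × 1000 input, the estimate is 816 × 1104, and cropping that frame to a strict 3:4 ratio leaves 816 × 1088.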

Appendix

Examples of basic image-to-video features

Feature

Input first frame image

Input prompt

Output video

Silent video

image

A cat running on the grass