Intelligent Media Services: Intelligent one-click video creation

Last Updated: Aug 19, 2025

This topic describes the billing for one-click video creation.

One-click video creation

  • Billing rules:

    • Billing is based on the total duration of the input video and the output video clip. Any portion of less than 1 minute is rounded up to 1 minute (for example, 113 seconds is billed as 2 minutes). You are not charged for failed synthesis jobs.

    • Intelligent text generation is billed based on the number of tokens consumed. A token count of less than 1,000 is rounded up to 1,000. You are not charged for failed generation jobs.

  • Billing cycle: Bills are generated hourly. In each billing cycle, Alibaba Cloud measures your usage from the previous cycle and issues a bill. The exact billing time may vary depending on the system.
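The rounding rules above can be sketched in Python. This is a minimal illustration, not an official calculator: the function names are hypothetical, and the sketch assumes that every partial minute and every partial block of 1,000 tokens is rounded up, which the rules state explicitly only for amounts below one full unit.

```python
def billable_minutes(total_seconds: int) -> int:
    """Round a duration in seconds up to whole minutes (assumed ceiling rule)."""
    return -(-total_seconds // 60)  # integer ceiling division


def billable_token_units(tokens: int) -> int:
    """Round a token count up to whole blocks of 1,000 tokens (assumed ceiling rule)."""
    return -(-tokens // 1000)


print(billable_minutes(30))       # 1
print(billable_minutes(113))      # 2
print(billable_token_units(900))  # 1
```

Failed jobs are not charged, so a caller would apply these helpers only to jobs that completed successfully.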

| Task name | Billing method | Chinese mainland | Singapore | US West | Unit | Product document |
| --- | --- | --- | --- | --- | --- | --- |
| Intelligent text generation | Billed by the number of tokens. | 0.017 | 0.025 | 0.025 | USD/1,000 tokens | |
| Script-based video creation | Same as video editing: billed by output duration and resolution. If you only generate a Timeline (that is, GeneratePreviewOnly=true), you are not charged. If you later compose the Timeline into a final video, the composition is billed at the video editing rates. | Same as video editing | Same as video editing | Same as video editing | Same as video editing | Automated script-based video creation |
| Intelligent image-text matching - General-purpose Edition | Billed by the total duration of the input and output videos. If you only generate a Timeline (that is, GeneratePreviewOnly=true), the output duration is calculated based on the total duration of the Timeline. Note: If you use materials from a theme description search as input videos, the actual input duration may be longer than the duration of the search results, because the random nature of intelligent editing requires a certain amount of redundant material. | 0.04 | 0.06 | 0.06 | USD/minute | |
| Intelligent image-text matching - Film and TV Highlights Edition | Billed by the total duration of the input and output videos. If you only generate a Timeline (that is, GeneratePreviewOnly=true), the output duration is calculated based on the total duration of the Timeline. | 0.14 | 0.21 | 0.21 | USD/minute | |
| High-energy montage creation | Billed by the total duration of the input and output videos. If you only generate a Timeline (that is, GeneratePreviewOnly=true), the output duration is calculated based on the total duration of the Timeline. | 0.28 | 0.42 | 0.42 | USD/minute | |
| Intelligent highlight clip extraction | Billed by the input video duration. | 0.28 | 0.42 | 0.42 | USD/minute | |

Billing examples

Assume that between 8:00 and 9:00, you use the General-purpose Edition of intelligent image-text matching for one-click video creation in the US West region. You use a 90-second video as the source material and successfully output a 23-second video clip. During this process, you also use the intelligent text generation feature, which consumes 900 tokens. The total fee is calculated as follows:

  • Video: The total duration of the input and output videos is 90 + 23 = 113 seconds, which is rounded up to 2 minutes for billing. The fee for the video clip is USD 0.06/minute × 2 minutes = USD 0.12.

  • Text generation: The 900 tokens consumed are rounded up to 1,000 tokens for billing. The fee for text generation is USD 0.025/1,000 tokens × 1 = USD 0.025.

  • Total: The fee for one-click video creation between 8:00 and 9:00 is USD 0.12 + USD 0.025 = USD 0.145.
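The same calculation can be reproduced with a short Python sketch. The prices are the General-purpose Edition and intelligent text generation rates listed in this topic. The region keys, dictionary names, and function name are hypothetical and do not correspond to any Alibaba Cloud API, and the sketch assumes that partial minutes and partial blocks of 1,000 tokens are always rounded up.

```python
from decimal import Decimal

# Prices from the pricing table in this topic. Region keys are illustrative.
MATCHING_GENERAL_PER_MINUTE = {
    "cn-mainland": Decimal("0.04"),
    "singapore": Decimal("0.06"),
    "us-west": Decimal("0.06"),
}
TEXT_GENERATION_PER_1K_TOKENS = {
    "cn-mainland": Decimal("0.017"),
    "singapore": Decimal("0.025"),
    "us-west": Decimal("0.025"),
}


def one_click_fee(input_seconds: int, output_seconds: int,
                  tokens: int, region: str) -> Decimal:
    """Estimate the fee in USD for one successful job (assumed ceiling rounding)."""
    # Input and output durations are summed first, then rounded up to minutes.
    minutes = -(-(input_seconds + output_seconds) // 60)
    # Tokens are rounded up to whole blocks of 1,000.
    token_units = -(-tokens // 1000)
    video_fee = MATCHING_GENERAL_PER_MINUTE[region] * minutes
    text_fee = TEXT_GENERATION_PER_1K_TOKENS[region] * token_units
    return video_fee + text_fee


print(one_click_fee(90, 23, 900, "us-west"))  # 0.145
```

`Decimal` is used instead of `float` so that the result matches the three-decimal prices exactly rather than picking up binary floating-point error.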