This topic describes the public parameters for batch video production. They apply to the following features: Script-to-Video, Image-Text Matching (Common Scenarios), Image-Text Matching (Movie Collections), and Highlight Mashup. For usage examples, see the documentation for each feature.
SubHeading
Parameter | Type | Description | Example | Required |
Level | Integer | The level of the subheading. Enum: 1, 2, 3, 4, 5. | 1 | Yes |
TitleArray | List<String> | An array of subheadings. One is randomly selected for each production. Max 50 titles, each up to 50 characters long. | ["****b4549d46c88681030f6e****","****549d46c88b4681030f6e****"] | Yes |
Sticker
Parameter | Type | Description | Example | Required |
MediaId | String | The ID of the sticker, logo, watermark, or other image asset. | ****b4545fg6c88681030f6e**** | Required (choose one: MediaId or MediaURL). MediaId takes priority if both are provided. |
MediaURL | String | The URL of the image, which must be on your own OSS. | Rule: http://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name].png Example: http://example.oss-cn-shanghai.aliyuncs.com/example/example.png | |
X | Float | Refer to VideoTrackClip.X | 10 | No |
Y | Float | Refer to VideoTrackClip.Y | 100 | No |
Width | Float | Refer to VideoTrackClip.Width | 300 | No |
Height | Float | Refer to VideoTrackClip.Height | 300 | No |
DyncFrames | Integer | Number of frames for animated stickers. Value range: [0,100]. | 25 | No, required only for animated stickers. |
Opacity | Float | The opacity. Value range: [0,1]. Default value: 1. | 0.6 | No |
MediaConfig
Parameter | Type | Description | Example | Required |
Volume | String | The volume of the input video. Value range: [0, 10.0]. Supports up to two decimal places. | 0.5 | No, default is 0. |
MediaMetaDataArray | List<MediaMetaData> | A list of media metadata objects. | See MediaMetaData. | No |
MediaMetaData
Parameter | Type | Description | Example | Required |
Media | String | The media asset ID or OSS URL, which must match the assets provided in InputConfig. | 53fg5f9d46c88b******0f6etft6 | Yes |
GroupName | String | The group to which the media asset belongs. | group1 | This parameter is effective only in Script-to-Video mode. Do not use it for other production modes. |
TimeRangeList | List<TimeRange> | A list of in and out points. You can specify multiple time ranges for each asset, and clips will be selected from these ranges for production. | [{"In":"0","Out":"1"},{"In":"2","Out":"3"}] | Yes |
Opacity | Float | The opacity of the asset. Value range: [0,1]. Important: When using opacity, it is recommended to add a background image using BackgroundImageArray. If BackgroundImageConfig is configured, this parameter does not take effect. | 0.5 | No, default is 1. |
TimeRange
Parameter | Type | Description | Example | Required |
In | Float | The in-point of the asset in seconds, accurate to 4 decimal places. | 1.1233 | Yes |
Out | Float | The out-point of the asset in seconds, accurate to 4 decimal places. | 2.4566 | Yes |
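As a rough illustration of how MediaConfig, MediaMetaData, and TimeRange nest, the following Python sketch assembles a MediaConfig object. The helper names `make_time_range` and `make_media_metadata` are hypothetical and not part of any SDK. The check that In is smaller than Out is an assumption, since the tables above do not state it, and the media ID is the masked placeholder from the table.

```python
import json


def make_time_range(in_point: float, out_point: float) -> dict:
    """Build a TimeRange with In/Out in seconds, rounded to 4 decimal places."""
    # Assumption: an out-point must come after a non-negative in-point.
    if in_point < 0 or out_point <= in_point:
        raise ValueError("expected 0 <= In < Out")
    return {"In": round(float(in_point), 4), "Out": round(float(out_point), 4)}


def make_media_metadata(media, time_ranges, group_name=None, opacity=None) -> dict:
    """Build one MediaMetaData entry from (in, out) pairs."""
    entry = {
        "Media": media,
        "TimeRangeList": [make_time_range(i, o) for i, o in time_ranges],
    }
    if group_name is not None:
        # Per the table, GroupName is effective only in Script-to-Video mode.
        entry["GroupName"] = group_name
    if opacity is not None:
        if not 0 <= opacity <= 1:
            raise ValueError("Opacity must be in [0, 1]")
        entry["Opacity"] = opacity
    return entry


media_config = {
    "Volume": "0.5",  # String type, range [0, 10.0]
    "MediaMetaDataArray": [
        make_media_metadata(
            "53fg5f9d46c88b******0f6etft6", [(0, 1), (2, 3)], group_name="group1"
        )
    ],
}
print(json.dumps(media_config, indent=2))
```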
SpeechConfig
Parameter | Type | Description | Example | Required |
Volume | String | The volume of the voiceover audio. Default is 1. Range: [0, 10.0]. Supports two decimal places. | 0.5 | No, default is 1. |
AsrConfig | JSON | The subtitle configuration. For the supported fields, see Text overlay. | {"Alignment":"TopCenter","AdaptMode":"AutoWrap","Font":"Alibaba PuHuiTi 2.0 65 Medium","SizeRequestType":"Nominal","Spacing":-1,"Y":0.8} | No |
Voice | String | Specify one or more official voices, separated by commas. When multiple voices are specified, one is randomly selected. For available voices, see Intelligent voice examples. | zhimiao_emo,zhilun | No |
SpeechRate | | | 100 | No, default is 0. |
Style | String | The voiceover style. A random voice from the specified style is selected. For voice examples, see Intelligent voice examples. | Gentle | No |
CustomizedVoice | String | A custom voice. Provide the VoiceId from a cloned voice (both Standard and Public versions are supported). If set, Voice and Style are ignored. | Voice-test1 | No |
SpecialWordsConfig | JSON | Configuration for local effects on subtitles (voiceover), supporting word highlighting and sensitive word filtering. This parameter is an array. Important: This parameter is supported only in Script-to-Video (Segmented Scripts) mode. | [{"Type":"Highlight","Style":{"FontName":"KaiTi","FontSize":80,"FontColor":"20AEE9","OutlineColour":"2D20E9","Outline":3,"FontFace":{"Bold":true,"Underline":true}},"WordsList":["ApsaraVideo","Intelligent Media Services","Batch video production"]},{"Type":"Highlight","Style":{"FontFace":{"Italic":true}},"WordsList":["product","look"]},{"Type":"Forbidden","WordsList":["pitter-patter","bilibili"],"SoundReplaceMode":"None"}] | No |
SpeechLanguage | String | The language of the subtitles (voiceover). Important: This parameter is supported only in Script-to-Video mode. When it is set to "en", it is recommended to set SpeechConfig.AsrConfig.AdaptMode to AutoWrapAtSpaces. | zh | No. Default is zh. |
SpecialWordsConfig
Parameter | Type | Description | Example | Required |
Type | String | The special word type. The examples in this topic use Highlight and Forbidden. | Highlight | Yes |
WordsList | List<String> | A list of special words. Max 20 words, each up to 5 characters long. | ["ApsaraVideo","Intelligent Media Services","Batch video production"] | Yes |
Style | JSON | Required when Type is Highlight. For fields, see Style. | {"FontFace":{"Italic":true}} | No |
SoundReplaceMode | String | | None | No |
SoundEventUrl | String | Required when SoundReplaceMode is UserDefined. Supports any accessible download URL for an audio file (must be 16-bit, mono, 16000Hz WAV). | https://www.example.com/files/audio/example_sound.wav | No |
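The conditional rules in this table can be checked before a request is submitted. The Python sketch below is one hypothetical way to do that; `validate_special_words` is not an SDK function. It takes the `Style` key name from the SpeechConfig example and checks only the rules stated above: Type and WordsList are required, WordsList holds at most 20 words, Style is required for Highlight, and SoundEventUrl is required for UserDefined. The per-word length limit is left out because the table does not say how characters are counted.

```python
def validate_special_words(entries: list) -> list:
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    for idx, entry in enumerate(entries):
        words = entry.get("WordsList")
        if not entry.get("Type"):
            problems.append(f"[{idx}] Type is required")
        if not words:
            problems.append(f"[{idx}] WordsList is required")
        elif len(words) > 20:
            problems.append(f"[{idx}] WordsList has {len(words)} words, max is 20")
        # Style is required when Type is Highlight.
        if entry.get("Type") == "Highlight" and "Style" not in entry:
            problems.append(f"[{idx}] Style is required when Type is Highlight")
        # SoundEventUrl is required when SoundReplaceMode is UserDefined.
        if entry.get("SoundReplaceMode") == "UserDefined" and not entry.get("SoundEventUrl"):
            problems.append(f"[{idx}] SoundEventUrl is required for UserDefined")
    return problems


config = [
    {"Type": "Highlight", "Style": {"FontFace": {"Italic": True}}, "WordsList": ["product", "look"]},
    {"Type": "Forbidden", "WordsList": ["pitter-patter"], "SoundReplaceMode": "UserDefined"},
]
for problem in validate_special_words(config):
    print(problem)
```

In this usage example the second entry is reported because it selects UserDefined without providing SoundEventUrl.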
Style
Name | Type | Description | Example | Required |
FontName | String | The font name. Custom fonts are not currently supported. | KaiTi | No |
FontColor | String | The font color, in BBGGRR hex format (BGR order). | 00FF7F | No |
OutlineColour | String | The border color, in BBGGRR hex format. | EBCE87 | No |
BackColour | String | The shadow color, in BBGGRR hex format. | EBDE87 | No |
FontSize | Integer | The font size in pixels. | 100 | No |
Outline | Integer | The border width in pixels. | 3 | No |
FontFace | JSON | The font style. For fields, see FontFace. | {"Bold":true,"Italic":true,"Underline":true} | No |
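Because FontColor, OutlineColour, and BackColour in this table use BBGGRR order rather than the more common RRGGBB, a small conversion helper can prevent swapped channels. The following Python sketch is illustrative only, and the function name `rgb_to_bgr_hex` is hypothetical.

```python
import string


def rgb_to_bgr_hex(rgb_hex: str) -> str:
    """Convert an RRGGBB hex color (with or without '#') to BBGGRR."""
    value = rgb_hex.lstrip("#")
    if len(value) != 6 or any(c not in string.hexdigits for c in value):
        raise ValueError(f"not a 6-digit hex color: {rgb_hex!r}")
    rr, gg, bb = value[0:2], value[2:4], value[4:6]
    # Swap the red and blue channels; green stays in the middle.
    return (bb + gg + rr).upper()


print(rgb_to_bgr_hex("#7FFF00"))  # the FontColor example value in the table
```

The same function converts in the other direction, because swapping the first and last byte is its own inverse.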
FontFace
Name | Type | Description | Example | Required |
Bold | Boolean | Bold | false | No |
Italic | Boolean | Italic | false | No |
Underline | Boolean | Underline | false | No |
BackgroundMusicConfig
Parameter | Type | Description | Example | Required |
Volume | String | Background music volume. Range: [0, 10.0]. Supports two decimal places. | 0.5 | No, default is 0.2. |
Style | String | The style of the background music. This field has no effect if background music is already configured in InputConfig. | bgm-beauty | No |
LoopMode | Boolean | The background music loop mode. | true | No, default is true. |
AFadeOutDuration | Float | The duration of the background music fade-out, in seconds. Range: | 2.0 | No, default is 2.0. |
BackgroundImageConfig
Parameter | Type | Description | Example | Required |
SubType | String | The background type. The other parameters in this table refer to the values Color and Blur. | Color | No |
Radius | Float | The blur radius. Effective when SubType is Blur. Range: [0.01, 1]. | 0.5 | No |
Color | String | The background color. Effective when SubType is Color. Hex RGB color value. | #000000 | No |
Text overlay
The following table lists the parameters for TitleConfig, SubHeadingConfig, and SpeechConfig.AsrConfig. Batch video production relies on standard editing features to render text overlays and has pre-optimized styles. Therefore, default values for batch video production may differ from standard editing defaults. If a field in TitleConfig, SubHeadingConfig, or SpeechConfig.AsrConfig does not have a default value listed here, it means no extra configuration is applied, and the default value will be the same as in standard editing.
Parameter | Type | Description | Example | Required | TitleConfig | SubHeadingConfig | SpeechConfig.AsrConfig |
TimelineIn | Float | The in-point for the text, in seconds, accurate to 4 decimal places. | 1.1233 | No | No default value | No default value | Not supported |
TimelineOut | Float | The out-point for the text, in seconds, accurate to 4 decimal places. | 2.4566 | No | No default value | No default value | Not supported |
X | Float | The horizontal distance from the top-left corner of the output video to the top-left corner of the text overlay. Supports both percentage (0-0.9999) and pixel (≥2) values. This coordinate scales with the output resolution. Default is 0. | 0.1 | No | No default value | No default value | No default value |
Y | Float | The vertical distance from the top-left corner of the output video to the top-left corner of the text overlay. Supports both percentage (0-0.9999) and pixel (≥2) values. This coordinate scales with the output resolution. Default is 0. | 0.2 | No | See Default values of Y | See Default values of Y | See Default values of Y |
Font | String | The font for the text overlay. For supported fonts, see Fonts. | SimSun | No | Alibaba PuHuiTi 2.0 95 ExtraBold | Alibaba PuHuiTi 2.0 95 ExtraBold | Alibaba PuHuiTi 2.0 65 Medium |
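FontSize | Int | The font size. This size scales with the output resolution. Default is 0, max 5000. | 24 | No | See Default values of TitleConfig.FontSize | See Default values of SubHeadingConfig.Level.FontSize | See Default values of SpeechConfig.AsrConfig.FontSize |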
SizeRequestType | String | How the font size is calculated for rendering. | Nominal | No | Nominal | Nominal | Nominal |
FixedFontSize | Int | The font size. This size does not scale with the output resolution. | 14 | No | No default value | No default value | No default value |
FixedX | Float | The horizontal distance from the top-left corner of the output video. Supports both percentage (0-0.9999) and pixel (≥2) values. This coordinate does not scale with the output resolution. | 64 | No | No default value | No default value | No default value |
FixedY | Float | The vertical distance from the top-left corner of the output video. Supports both percentage (0-0.9999) and pixel (≥2) values. This coordinate does not scale with the output resolution. | 64 | No | No default value | No default value | No default value |
FontColor | String | The color of the text overlay, in hex format. | #ffffff | No | No default value | No default value | No default value |
FontColorOpacity | String | The opacity of the text overlay. Range: 0 to 1 (1 is opaque, 0 is fully transparent). | 0.5 | No | 1 | 1 | 1 |
FontFace | JSON | The font style. | {"Bold":true,"Italic":true,"Underline":true} | No | No default value | No default value | No default value |
Spacing | Integer | Character spacing in pixels. | 1 | No | 0 | -1 | |
LineSpacing | Integer | Line spacing in pixels. Default is 0. | 1 | No | No default value | No default value | No default value |
Angle | Float | Counter-clockwise rotation angle in degrees. Default is 0. | 5 | No | No default value | No default value | No default value |
BorderStyle | Int | Border and shadow style. | 3 | No | No default value | No default value | No default value |
Outline | Int | The width of the text border in pixels. Default is 0. | 1 | No | No default value | No default value | No default value |
OutlineColour | String | The color of the text border, in hex format. | #ffffff | No | No default value | No default value | No default value |
Shadow | Int | The depth of the text shadow in pixels. Default is 0. | 3 | No | 0 | 0 | 0 |
BackColour | String | The color of the text shadow, in hex format. | #ffffff | No | No default value | No default value | No default value |
Alignment | String | The alignment of the text overlay. | TopLeft | No | TopCenter | TopCenter | TopCenter |
AdaptMode | String | How text wraps or scales when it exceeds the video width or TextWidth. | AutoWrapAtSpaces | No | AutoWrap | AutoWrap | AutoWrap |
TextWidth | String | The width of the text box. Effective when AdaptMode is set. A value (0, 1] is relative to the video width; a value > 1 is an absolute pixel value. | 0.8 | No | 0.8 | 0.8 | Random between 0.8 and 0.9 |
FontUrl | String | The path to a custom font file in your OSS bucket (supports .ttf, .otf, .woff). | Rule: http://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name].ttf Example: https://your-bucket.oss-cn-shanghai.aliyuncs.com/example-font.ttf | No | No default value | No default value | No default value |
EffectColorStyle | String | The style type for stylized text. See Examples of word art effects. This is ignored if other style parameters are set, including FontColor, FontColorOpacity, BorderStyle, Outline, OutlineColour, Shadow, BackColour, SubtitleEffects, TextureURL, BubbleStyleId, BubbleWidth, and BubbleHeight. | CS0001-000001 | No | |||
SubtitleEffects | | Multi-layer effects for the text, supporting multi-layer borders, shadows, Gaussian blur (for shadows), and backgrounds. | [{"Type":"Outline"},{"Type":"Shadow"}] | No | No default value | No default value | No default value |
AaiMotionInEffect | String | The in-animation effect type(s) for the text overlay, separated by commas. See Subtitle effect examples. | blur_in,wave_in | No | No default value | No default value | No default value |
AaiMotionIn | Float | The duration of the in-animation in seconds, accurate to 4 decimal places. Default is 0.5s. If the text duration is less than 0.5s, it will be the total duration minus the out duration. | 0.5 | No | No default value | No default value | No default value |
AaiMotionOutEffect | String | The out-animation effect type(s) for the text overlay, separated by commas. See Subtitle effect examples. | blur_in,wave_in | No | No default value | No default value | No default value |
AaiMotionOut | Float | The duration of the out-animation in seconds, accurate to 4 decimal places. Default is 0.5s. If the text duration is less than 0.5s, it will be the total text duration. | 0.5 | No | No default value | No default value | No default value |
AaiMotionLoopEffect | String | The loop animation effect type(s) for the text overlay, separated by commas. Cannot be used with in/out animations. See Subtitle effect examples. | blur_in,wave_in | No | No default value | No default value | No default value |
Ratio | Float | The playback speed of the loop animation, accurate to 4 decimal places. 1 is normal speed. A value greater than 1 speeds up the animation. | 1.2 | No | No default value | No default value | No default value |
TextureURL | String | An image URL (OSS only) to use as a texture for the text. Supports PNG, JPG, JPEG, BMP. | https://your-bucket.oss-cn-shanghai.aliyuncs.com/your-image.png | No | No default value | No default value | No default value |
BubbleStyleId | String | The text bubble style. See Text bubble examples. | BS0001-000001 | No | No default value | No default value | No default value |
BubbleWidth | Float | The width of the bubble background, relative to the video width (if ≤ 1) or in absolute pixels (if > 1). | 24 | No | No default value | No default value | No default value |
BubbleHeight | Float | The height of the bubble background, relative to the video height (if ≤ 1) or in absolute pixels (if > 1). | 24 | No | No default value | No default value | No default value |
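Several parameters in this table accept either a relative value or an absolute pixel value in the same field. The Python sketch below shows one reading of those rules for X/Y (percentage in 0-0.9999, pixels from 2 upward) and TextWidth (relative in (0, 1], pixels above 1). The function names are hypothetical, values that fall between the two documented ranges are rejected as an assumption, and the sketch does not model how pixel values scale with the output resolution.

```python
def resolve_offset(value: float, output_size: int) -> float:
    """Convert an X or Y value to pixels along an axis of length output_size."""
    value = float(value)
    if 0 <= value <= 0.9999:
        return value * output_size  # percentage of the output dimension
    if value >= 2:
        return value  # already a pixel value
    # Assumption: values outside both documented ranges are treated as invalid.
    raise ValueError(f"{value} is neither a percentage (0-0.9999) nor a pixel value (>= 2)")


def resolve_text_width(value, video_width: int) -> float:
    """Convert TextWidth (a String in the table) to pixels."""
    value = float(value)
    if 0 < value <= 1:
        return value * video_width  # relative to the video width
    if value > 1:
        return value  # absolute pixel value
    raise ValueError("TextWidth must be greater than 0")


print(resolve_offset(0.5, 1920))        # relative X on a 1920-pixel-wide output
print(resolve_text_width("0.8", 1080))  # default TextWidth on a 1080-pixel-wide output
```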
Default value descriptions
Default values of Y
Output aspect ratio | TitleConfig.Y | SubHeadingConfig.Y | SpeechConfig.AsrConfig.Y |
Portrait (≤ 3:4) | 0.1 | No default value | 0.8 |
Landscape (≥ 4:3) | 0.05 | No default value | 0.9 |
Square (> 3:4, < 4:3) | 0.08 | No default value | 0.85 |
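The aspect-ratio brackets in this table can be expressed as a small lookup. The Python sketch below assumes the ratio is width:height and uses integer cross-multiplication to avoid floating-point comparisons. The names `classify_aspect` and `default_y` are hypothetical, and `None` stands for "No default value".

```python
DEFAULT_Y = {
    "Portrait": {"TitleConfig": 0.1, "SubHeadingConfig": None, "SpeechConfig.AsrConfig": 0.8},
    "Landscape": {"TitleConfig": 0.05, "SubHeadingConfig": None, "SpeechConfig.AsrConfig": 0.9},
    "Square": {"TitleConfig": 0.08, "SubHeadingConfig": None, "SpeechConfig.AsrConfig": 0.85},
}


def classify_aspect(width: int, height: int) -> str:
    """Classify an output size, assuming the ratio is width:height."""
    if width * 4 <= height * 3:  # width/height <= 3/4
        return "Portrait"
    if width * 3 >= height * 4:  # width/height >= 4/3
        return "Landscape"
    return "Square"


def default_y(width: int, height: int, config: str):
    """Look up the default Y for a config name; None means no default value."""
    return DEFAULT_Y[classify_aspect(width, height)][config]


print(classify_aspect(1080, 1920), default_y(1080, 1920, "TitleConfig"))
```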
Default values of TitleConfig
The default values below are based on resolutions of 1080×1920 (portrait), 1920×1080 (landscape), and 1080×1080 (square). For other resolutions, calculate the new font size using the formula:
Definitions:
newOutputHeight: height of the target output
oldOutputHeight: height of the original output
newOutputWidth: width of the target output
oldOutputWidth: width of the original output
ratio: scaling ratio
oldFontSize: font size at the original output size
newFontSize: converted font size
min(a, b): the smaller of a and b
round(a): a rounded to the nearest integer
Calculation formula:
ratio = min(newOutputHeight / oldOutputHeight, newOutputWidth / oldOutputWidth)
newFontSize = round(oldFontSize × ratio)
Example:
Assuming fontSize = 80 at 1920×1080; when converting to 960×540, newFontSize = 80 × min(960/1920, 540/1080) = 40
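The conversion can be written as a short function. This Python sketch follows the formula above; the name `scale_font_size` is hypothetical, and ties are rounded up, which is an assumption because the text only says "round to the nearest integer".

```python
import math


def scale_font_size(old_font_size: float, old_size: tuple, new_size: tuple) -> int:
    """Scale a font size from old_size to new_size, both given as (width, height)."""
    old_width, old_height = old_size
    new_width, new_height = new_size
    ratio = min(new_height / old_height, new_width / old_width)
    # Assumption: halves round up; the source does not specify tie handling.
    return math.floor(old_font_size * ratio + 0.5)


# The example from the text: 80 at 1920x1080, converted to 960x540.
print(scale_font_size(80, (1920, 1080), (960, 540)))  # 40
```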
Default values of TitleConfig.FontSize
Output aspect ratio | M: Character count | Default fontSize |
Portrait (1080×1920) | 1 ≤ M ≤ 8 | 119 |
Portrait (1080×1920) | 8 < M ≤ 18 | 102 |
Portrait (1080×1920) | M > 18 | 85 |
Landscape (1920×1080) | 1 ≤ M ≤ 13 | 86 |
Landscape (1920×1080) | 13 < M ≤ 34 | 67 |
Landscape (1920×1080) | M > 34 | 52 |
Square (1080×1080) | 1 ≤ M ≤ 9 | 76 |
Square (1080×1080) | 9 < M ≤ 20 | 67 |
Square (1080×1080) | M > 20 | 57 |
Default values of SpeechConfig.AsrConfig.FontSize
Output aspect ratio | M: Character count | Default fontSize |
Portrait (1080×1920) | 1 ≤ M ≤ 14 | 68 |
Portrait (1080×1920) | M > 14 | 59 |
Landscape (1920×1080) | 1 ≤ M ≤ 24 | 48 |
Landscape (1920×1080) | M > 24 | 38 |
Square (1080×1080) | 1 ≤ M ≤ 16 | 43 |
Square (1080×1080) | M > 16 | 38 |
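The two character-count tables for TitleConfig.FontSize and SpeechConfig.AsrConfig.FontSize can be encoded as ordered brackets of (upper bound, font size). The Python sketch below is one hypothetical way to look up a default at the reference resolutions. Counts below 1 are rejected, and a count of exactly 1 is placed in the first bracket, which is an assumption for the title table.

```python
import math

# Each list holds (inclusive upper bound on character count, default fontSize).
TITLE_FONT_SIZES = {
    "Portrait": [(8, 119), (18, 102), (math.inf, 85)],
    "Landscape": [(13, 86), (34, 67), (math.inf, 52)],
    "Square": [(9, 76), (20, 67), (math.inf, 57)],
}
ASR_FONT_SIZES = {
    "Portrait": [(14, 68), (math.inf, 59)],
    "Landscape": [(24, 48), (math.inf, 38)],
    "Square": [(16, 43), (math.inf, 38)],
}


def default_font_size(table: dict, orientation: str, char_count: int) -> int:
    """Return the default fontSize for a character count at the reference resolution."""
    if char_count < 1:
        raise ValueError("char_count must be at least 1")
    for upper_bound, size in table[orientation]:
        if char_count <= upper_bound:
            return size
    raise AssertionError("unreachable: the last bracket is unbounded")


print(default_font_size(TITLE_FONT_SIZES, "Portrait", 10))  # 102
print(default_font_size(ASR_FONT_SIZES, "Landscape", 30))   # 38
```

For other resolutions, the returned value would still need to be scaled with the font size formula given earlier in this section.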
Default values of SubHeadingConfig.Level.FontSize
Level | Portrait (≤ 3:4) | Landscape (≥ 4:3) | Square (> 3:4, < 4:3) |
1 | 68 | 67 | 57 |
2 | 59 | 57 | 50 |
3 | 51 | 48 | 43 |
4 | 42 | 38 | 36 |
5 | 38 | 33 | 31 |
Default values of SubHeadingConfig.Spacing
Level | Portrait (≤ 3:4) | Landscape (≥ 4:3) | Square (> 3:4, < 4:3) |
1 | 0.03 | 0.02 | 0.02 |
2 | 0.03 | 0.01 | 0.01 |
3 | 0.01 | 0 | 0 |
4 | 0 | 0 | 0 |
5 | 0 | 0 | 0 |
Default style sets (text overlay, subtitle, background)
A style set is a combination of text overlay style, subtitle style, and background. During production, a set is randomly chosen. If you explicitly set style parameters in the API call, they will override the defaults from the chosen set.
Selection rule: If there are 21 style sets and you request 20 videos, the system will pick a random starting index (e.g., 16) and select sets cyclically: 16, 17, ..., 21, 1, 2, ...
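The cyclic selection described above can be sketched in a few lines of Python. The function name `pick_style_sets` and its `start` parameter are hypothetical; in the real service the starting index is chosen by the system, so `start` exists here only to make the example reproducible.

```python
import random


def pick_style_sets(total_sets: int, video_count: int, start=None) -> list:
    """Return 1-based style set numbers, cycling from a starting index."""
    if total_sets < 1 or video_count < 0:
        raise ValueError("total_sets must be >= 1 and video_count >= 0")
    if start is None:
        start = random.randint(1, total_sets)  # random starting index
    if not 1 <= start <= total_sets:
        raise ValueError("start must be between 1 and total_sets")
    # Walk forward from start and wrap around after the last set.
    return [(start - 1 + i) % total_sets + 1 for i in range(video_count)]


# 21 sets, 20 videos, starting at 16: 16..21 followed by 1..14.
print(pick_style_sets(21, 20, start=16))
```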
Solid color background sets
No. | Text overlay style | Subtitle style | Background |
1 | CS0004-000010 | CS0005-000003 | icepublic-76270ecd8bdbb0670e4f830ce4226d78 |
2 | CS0001-000012 | CS0005-000003 | icepublic-867553776f24ba3806d046fd83d623bd |
3 | CS0003-000013 | CS0005-000003 | icepublic-6098a7e3a964e44da0950f8fc1c801c4 |
4 | CS0001-000001 | CS0005-000003 | icepublic-986c6407d749d5a74bcb6e09c6452b86 |
5 | CS0002-000011 | CS0005-000003 | icepublic-51cc7b3a1747b65f3f201e4fb1940584 |
6 | CS0001-000003 | CS0005-000003 | icepublic-7a194df166e57ecaf2a6ea7d5d40fef6 |
7 | CS0002-000016 | CS0005-000003 | icepublic-19220373b09e5964f7594740a0d90d95 |
8 | CS0001-000008 | CS0005-000003 | icepublic-f538fa00133b3119c3975027261c0f16 |
10 | CS0002-000012 | CS0005-000003 | icepublic-4728b79c865a727e43f2777fcd622425 |
11 | CS0001-000007 | CS0005-000003 | icepublic-f38f87e54bd3cb596915f0fca88768a8 |
12 | CS0003-000001 | CS0005-000003 | icepublic-090d7b5c5e6dbf94f77a0b2913f1233c |
13 | CS0002-000002 | CS0005-000003 | icepublic-6fb6aec40ddd18fd97c89f286096b13c |
14 | CS0001-000016 | CS0005-000003 | icepublic-123c94c93b441ee6274cc57af3a5f192 |
15 | CS0001-000013 | CS0005-000003 | icepublic-1daf8b4e4ba14a0b65ce36a74fb6f7ed |
16 | CS0001-000005 | CS0005-000003 | icepublic-6026a4719922367ac9911da573cf4ac3 |
17 | CS0004-000005 | CS0005-000003 | icepublic-e6a95f972fe9c0bade2b4bfbd14bd2b6 |
18 | CS0004-000009 | CS0005-000003 | icepublic-b9d7f849a675c2e88ac5d0b6c9dc13ae |
19 | CS0003-000014 | CS0005-000003 | icepublic-d647e23699457a716ec248002d07e441 |
20 | CS0004-000019 | CS0005-000003 | icepublic-80a6a9a1f1e9c92ed1fbb3c5fb273129 |
21 | CS0004-000012 | CS0005-000003 | icepublic-965162f2ffba0ca333602bc868959ed1 |
Gradient background sets
No. | Text overlay style | Subtitle style | Background |
1 | CS0003-000019 | CS0005-000003 | icepublic-436399df008b57ce0d5bf18b2018fdf7 |
2 | CS0001-000014 | CS0005-000003 | icepublic-03a7dc16f773d21eb5a12c4532e83792 |
3 | CS0004-000007 | CS0005-000003 | icepublic-7793fe1121bbf0c328756d30eb43fbf6 |
4 | CS0004-000013 | CS0005-000003 | icepublic-e51bbb0ac440a66f5402644f15f167bc |
5 | CS0004-000015 | CS0005-000003 | icepublic-0cea2bfd1713e4be89d4cb2b69dcfd2d |
6 | CS0003-000011 | CS0005-000003 | icepublic-d94fd7ea908e5ef3e0f4776749b2ab21 |
7 | CS0003-000006 | CS0005-000003 | icepublic-85a3a18ad145149720af3f60cf365dbb |
8 | CS0003-000019 | CS0005-000003 | icepublic-22e6db400a8df8629836bfd1e7d0c3e0 |
9 | CS0003-000023 | CS0005-000003 | icepublic-ff9242c8573007d65e42f3497e3290fa |
10 | CS0003-000021 | CS0005-000003 | icepublic-e66f8ba4ecf0565bf03a21f818ec7bb7 |
Image background sets
No. | Text overlay style | Subtitle style | Background |
1 | CS0002-000008 | CS0005-000003 | icepublic-e0dafa38070907f0c33ccfcc98f93e80 |
2 | CS0002-000009 | CS0005-000003 | icepublic-5bbf38de39b72af630a78a6a46d5c4f8 |
3 | CS0003-000015 | CS0005-000003 | icepublic-3dfca71bf8a67992a84ec47b08d37a57 |
4 | CS0003-000006 | CS0005-000003 | icepublic-a8ce54acc939865038196ddf2b3e4628 |
5 | CS0004-000006 | CS0005-000003 | icepublic-7bc1ff25d2fe82d3d6780178be8274a8 |
6 | CS0003-000004 | CS0005-000003 | icepublic-afd1755bea61c89fe2d688b2380f9fe8 |
7 | CS0003-000024 | CS0005-000003 | icepublic-69a1b7dc1f3d5a02a1777f9df90861e2 |
8 | CS0004-000016 | CS0005-000003 | icepublic-c14bf83acdd62760f3cbaffce5c6dddb |
9 | CS0004-000008 | CS0005-000003 | icepublic-cc46ea3761f04bc735769dfc22bc1028 |
TemplateConfig
To create and retrieve templates for batch video production, see Template parameters.
Parameter | Type | Description | Example | Required |
BatchEditingTemplateIdArray | List<String> | | ["****b4549d46c88681030f6e****","****549d46c88b4681030f6e****"] | No |
Parameter example
{
  "BatchEditingTemplateIdArray": [
    "****b4549d46c88681030f6e****",
    "****549d46c88b4681030f6e****"
  ]
}