Intelligent Media Services: Script-to-Video

Last Updated: Jan 16, 2026

This topic describes the production parameters, advanced configurations, and SDK examples for the Script-to-Video feature.

Important
  • Both Script-to-Video and Image-Text Matching use the SubmitBatchMediaProducingJob API to submit a task. To differentiate between them based on parameters, see Parameter differences.

  • The region in the OSS URL of every media asset must match the region of the OpenAPI service endpoint that you call.

  • Supported regions: China (Shanghai), China (Beijing), China (Hangzhou), China (Shenzhen), US (Silicon Valley), and Singapore.

  • In practice, replace all placeholders in the examples, such as [your-bucket], [your-region-id], [your-file-name], [your-file-path], and media asset IDs ("****9d46c8b4548681030f6e****"), with your actual values.
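
The region rule above can be checked on the client before a job is submitted. The following is a minimal sketch, assuming that OSS URLs follow the [your-bucket].oss-[your-region-id].aliyuncs.com pattern shown in this topic. The class and method names are hypothetical and are not part of the IMS SDK.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the IMS SDK.
class OssRegionCheck {
    // Matches "<bucket>.oss-<region-id>.aliyuncs.com" in an OSS URL.
    private static final Pattern OSS_HOST =
        Pattern.compile("^https?://[^/]+?\\.oss-([a-z0-9-]+)\\.aliyuncs\\.com(/.*)?$");

    // Returns the region ID embedded in an OSS URL, or empty if the value
    // is not an OSS URL (for example, a media asset ID).
    static Optional<String> regionOf(String mediaIdOrUrl) {
        Matcher m = OSS_HOST.matcher(mediaIdOrUrl);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // True if the value is not an OSS URL, or if its region equals the endpoint region.
    static boolean matchesEndpoint(String mediaIdOrUrl, String endpointRegionId) {
        return regionOf(mediaIdOrUrl).map(endpointRegionId::equals).orElse(true);
    }

    public static void main(String[] args) {
        String url = "http://example.oss-cn-shanghai.aliyuncs.com/example/example.mp4";
        System.out.println(matchesEndpoint(url, "cn-shanghai"));
        System.out.println(matchesEndpoint(url, "cn-beijing"));
    }
}
```

Media asset IDs carry no region in their text, so this sketch lets them pass and only inspects OSS URLs.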

Note
  • To better understand this document, first read the Batch video production guide to familiarize yourself with the concepts and workflow of Script-to-Video.

  • Script-to-Video supports two production modes: Global Scripts and Segmented Scripts.

    • Global Scripts: Randomly combines multiple complete voiceover scripts with video assets to generate a large number of videos with a similar style.

    • Segmented Scripts: Breaks a voiceover script into multiple segments and matches each segment to a specific group of assets.

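    • The mode is determined by where the voiceover scripts are specified. As the Supported modes columns in the following tables show, InputConfig.SpeechTextArray applies to Global Scripts mode, and MediaGroup.SpeechTextArray applies to Segmented Scripts mode.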

Usage notes

  • To submit a batch video production job that intelligently mixes multiple video, audio, and image assets, see SubmitBatchMediaProducingJob. Key API parameters are detailed in the InputConfig, EditingConfig, and OutputConfig sections below.

  • To get detailed information about a batch video creation job, see GetBatchMediaProducingJob.

InputConfig

Configure the InputConfig to specify parameters for basic assets such as video clips, voiceovers, background music, and stickers.

Parameter

Type

Description

Example

Required

Supported modes

MediaGroupArray

List<MediaGroup>

Specify source assets. Supports grouping assets.

Group name: Up to 50 characters. Emojis are not supported.

Material list: Media asset ID or OSS URL of the material.

Supports a maximum of 40 groups, each containing up to 200 materials.

Example: Global Scripts mode

Example: Segmented Scripts mode

Yes

  • Global Scripts

  • Segmented Scripts

TitleArray

List<String>

An array of titles. One title is randomly selected for each production.

Max 50 titles, each up to 50 characters long.

["Title 1","Title 2"]

No

  • Global Scripts

  • Segmented Scripts

SubHeadingArray

List<SubHeading>

Multi-level subheading settings.

[{"Level":1,"TitleArray":["Level 1 subtitle 1","Level 1 subtitle 2"]},{"Level":3,"TitleArray":["Level 3 subtitle"]}]

No

  • Global Scripts

  • Segmented Scripts

SpeechTextArray

List<String>

  • An array of voiceover scripts. One script is randomly selected for each production.

  • Max 50 scripts, each up to 1000 characters long.

  • Supports controlling speech synthesis using SSML.

  • The default language is Chinese (zh). To set other languages, see SpeechLanguage.

    Important

    Currently, only <break>, <s>, <sub>, <w>, <phoneme>, and <say-as> are supported.

["Voiceover content 1","Voiceover content 2"]

No

  • Global Scripts

StickerArray

List<Sticker>

  • An array of stickers. One is randomly selected for each production. Max 50 stickers.

  • Selection rule: If you provide 10 stickers and request 20 videos, the system will pick a random starting index (e.g., 3) and select stickers cyclically: 3, 4, 5, ..., 10, 1, 2, 3, ...

  • For supported formats, see Image formats.

[{"MediaId":"****9d46c8b4548681030f6e****","X":10,"Y":100,"Width":300,"Height":300,"Opacity":0.6}]

No

  • Global Scripts

  • Segmented Scripts

BackgroundMusicArray

List<String>

  • An array of background music tracks. One is randomly selected for each production. Max 50 tracks. Supports media asset IDs or OSS URLs.

  • Selection rule: Works the same as StickerArray.

  • For supported formats, see Audio formats.

["****b4549d46c88681030f6e****","****549d46c88b4681030f6e****"]

No

  • Global Scripts

  • Segmented Scripts

BackgroundImageArray

List<String>

  • An array of background images. One is randomly selected for each production. Max 50 images. Supports media asset IDs or OSS URLs.

  • Selection rule: Works the same as StickerArray.

  • For supported formats, see Image formats.

["****b4549d46c88681030f6e****","****549d46c88b4681030f6e****"]

No

  • Global Scripts

  • Segmented Scripts
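
The cyclic selection rule described for StickerArray, and reused by BackgroundMusicArray and BackgroundImageArray, can be illustrated with a short sketch. The service performs this selection internally. The class below is hypothetical and only reproduces the documented order for a given starting position.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of the documented selection order.
class CyclicSelection {
    // Returns 1-based item positions for videoCount videos, starting at
    // startIndex (1-based) and wrapping around after the last item.
    static List<Integer> pick(int itemCount, int videoCount, int startIndex) {
        if (itemCount <= 0 || startIndex < 1 || startIndex > itemCount) {
            throw new IllegalArgumentException("startIndex must be between 1 and itemCount");
        }
        List<Integer> picks = new ArrayList<>();
        for (int k = 0; k < videoCount; k++) {
            picks.add((startIndex - 1 + k) % itemCount + 1);
        }
        return picks;
    }

    public static void main(String[] args) {
        // 10 stickers, 20 videos, random starting position.
        int start = new Random().nextInt(10) + 1;
        System.out.println(pick(10, 20, start));
    }
}
```

With 10 items, 20 videos, and a starting position of 3, the sequence begins 3, 4, 5 and wraps from 10 back to 1, so every item is used twice.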

MediaGroup

Note

The differences in MediaGroup parameter configurations between Global Scripts mode and Segmented Scripts mode are indicated in the Supported modes column.

Parameter

Type

Description

Example

Required

Supported modes

GroupName

String

The name of the group. Max 50 characters, no emojis.

Group1

Yes

  • Global Scripts

  • Segmented Scripts

MediaArray

List<String>

  • A list of assets, supporting media asset IDs or URLs. Max 200 assets.

  • For supported formats, see Video formats.

****b4549d46c88681030f6e****

Yes

  • Global Scripts

  • Segmented Scripts

SpeechTextArray

List<String>

  • An array of voiceover scripts. One script is randomly selected for each production.

  • Max 50 scripts, each up to 1000 characters long.

  • Supports controlling speech synthesis using SSML.

    Important

    Currently, only <break>, <s>, <sub>, <w>, <phoneme>, and <say-as> are supported.

["Voiceover content 1","Voiceover content 2"]

No

  • Segmented Scripts

Duration

Float

The duration for the current group, in seconds. Use only when SpeechTextArray is empty.

10

No. Default: 5.

  • Global Scripts

SplitMode

String

NoSplit

No. Default: AverageSplit.

  • Global Scripts

  • Segmented Scripts

Volume

Float

  • The volume of the input video for this group. If set, it overrides EditingConfig.MediaConfig.Volume for this group.

  • Range: [0, 10.0]. Supports two decimal places.

0.5

No

  • Global Scripts

DurationAutoAdapt

Boolean

Whether to enable duration auto-adaptation for this group. If enabled and no voiceover is present, the group's duration will be adjusted to ensure video clips play at their original speed.

true

No. Default: false.

  • Global Scripts
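
The group limits listed above (up to 40 groups, up to 200 assets per group, group names of up to 50 characters) can be checked before submission. The following is a rough sketch with hypothetical names. It assumes that the character limit is measured with String.length() and does not check for emojis.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical client-side check; the service remains the source of truth.
class MediaGroupLimits {
    static final int MAX_GROUPS = 40;
    static final int MAX_MEDIA_PER_GROUP = 200;
    static final int MAX_GROUP_NAME_LENGTH = 50;

    // groups maps each GroupName to its MediaArray.
    // An empty result means that no problem was found.
    static List<String> validate(Map<String, List<String>> groups) {
        List<String> problems = new ArrayList<>();
        if (groups.isEmpty()) {
            problems.add("MediaGroupArray is required");
        }
        if (groups.size() > MAX_GROUPS) {
            problems.add("Too many groups: " + groups.size());
        }
        for (Map.Entry<String, List<String>> group : groups.entrySet()) {
            String name = group.getKey();
            int mediaCount = group.getValue().size();
            // Assumption: the limit counts UTF-16 units, as String.length() does.
            if (name.length() > MAX_GROUP_NAME_LENGTH) {
                problems.add("GroupName too long: " + name);
            }
            if (mediaCount == 0 || mediaCount > MAX_MEDIA_PER_GROUP) {
                problems.add("MediaArray of " + name + " must hold 1 to 200 items, found " + mediaCount);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        groups.put("start", Arrays.asList("****9d46c886b45481030f6e****"));
        groups.put("end", new ArrayList<>());
        System.out.println(validate(groups));
    }
}
```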

Example: Global Scripts mode

{
  "MediaGroupArray": [
    {
      "GroupName": "UseMediaId",
      "MediaArray": [
        "****9d46c886b45481030f6e****",
        "****c886810b4549d4630f6e****"
      ],
      "SplitMode": "NoSplit"
    },
    {
      "GroupName": "UseOssUrl",
      "MediaArray": [
        "http://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name].mp4",
        "http://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name].png"
      ]
    }
  ],
  "TitleArray": [
    "Freshippo opens a new location in Huilongguan",
    "A new Freshippo store opens"
  ],
  "SubHeadingArray": [
    {
      "Level": 1,
      "TitleArray": ["Subtitle 1", "Subtitle 2"]
    },
    {
      "Level": 3,
      "TitleArray": ["Level 3 subtitle"]
    }
  ],
  "SpeechTextArray": [
    "A new Freshippo store just opened in the nearby mall. It's the grand opening today, so I rushed over to check it out. The store isn't huge, but it's packed with people. Snacks and drinks are pretty cheap, and the checkout lines are super long. Come and see for yourself!",
    "A new Freshippo store just opened in the nearby mall. It's the grand opening today, so I rushed over to check it out.",
    "<speak>Today, our hero, table tennis legend <phoneme alphabet=\"ipa\" ph=\"mɑː lʊŋ\">Ma Long</phoneme>, is striving for the pinnacle of glory.</speak>"
  ],
  "StickerArray": [
    {
      "MediaId": "****9d46c8b4548681030f6e****",
      "X": 10,
      "Y": 100,
      "Width": 300,
      "Height": 300,
      "Opacity": 0.6
    },
    {
      "MediaURL": "http://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name].png",
      "X": 10,
      "Y": 100,
      "Width": 300,
      "Height": 300
    }
  ],
  "BackgroundMusicArray": [
    "****b4549d46c88681030f6e****",
    "****549d46c88b4681030f6e****",
    "http://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name].mp3"
  ],
  "BackgroundImageArray": [
    "****6c886b4549d481030f6e****",
    "****9d46c8548b4681030f6e****",
    "http://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name].png"
  ]
}
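
Because only a subset of SSML tags is supported, it can help to scan each voiceover script for other tags before submitting a job. The following is a minimal sketch with hypothetical names. It assumes that the <speak> root element is accepted because the example above uses it, and it relies on a simple regular expression rather than an XML parser.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-check for SSML tags in a voiceover script.
class SsmlTagCheck {
    // Tags listed as supported in this topic, plus the <speak> root (assumption).
    private static final Set<String> SUPPORTED = new HashSet<>(
        Arrays.asList("speak", "break", "s", "sub", "w", "phoneme", "say-as"));

    // Captures the element name of opening and closing tags.
    private static final Pattern TAG = Pattern.compile("</?([A-Za-z][A-Za-z0-9-]*)");

    // Returns the tag names found in the text that are not in the supported set.
    static Set<String> unsupportedTags(String speechText) {
        Set<String> found = new LinkedHashSet<>();
        Matcher m = TAG.matcher(speechText);
        while (m.find()) {
            if (!SUPPORTED.contains(m.group(1))) {
                found.add(m.group(1));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "<speak><s>Come and see for yourself!</s><emphasis>Hurry</emphasis></speak>";
        System.out.println(unsupportedTags(text));
    }
}
```

When SSML is embedded in a JSON string, the double quotes around attribute values must be escaped as \" so that the request body stays valid JSON.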

Example: Segmented Scripts

{
  "MediaGroupArray": [{
    "GroupName": "start",
    "MediaArray": ["https://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name].jpeg", "https://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name].mp4"],
    "Duration": 5,
    "SplitMode": "NoSplit",
    "Volume": 1
  },
    {
      "GroupName": "group1",
      "MediaArray": ["https://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name].png", "https://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name].mp4"],
      "SpeechTextArray": ["A new Freshippo store just opened in the nearby mall.", "It's the grand opening today.", "<speak>Today, our hero, table tennis legend <phoneme alphabet=\"ipa\" ph=\"mɑː lʊŋ\">Ma Long</phoneme>, is striving for the pinnacle of glory.</speak>"]
    },
    {
      "GroupName": "group2",
      "MediaArray": ["https://[your-bucket].oss-[your-region-id].aliyuncs.com/0-test-batch-editing-materials/normal%20video.mp4", "https://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name].jpeg"],
      "SpeechTextArray": ["The store isn't huge, but it's packed with people. Snacks and drinks are pretty cheap, and the checkout lines are super long.", "The scene is very lively, with crowds of people and a wide variety of goods."]
    },
    {
      "GroupName": "group3",
      "MediaArray": ["https://[your-bucket].oss-[your-region-id].aliyuncs.com/0-test-batch-editing-materials/young_sunset_walk.mp4"],
      "SpeechTextArray": ["Come and see for yourself!", "Hurry and come take a look!"]
    },
    {
      "GroupName": "end",
      "MediaArray": ["https://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name].jpg", "https://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name].mp4"],
      "Duration": 5
    }
  ],
  "TitleArray": [
    "Freshippo opens a new location in Huilongguan",
    "A new Freshippo store opens"
  ],
  "StickerArray": [
    {
      "MediaId": "****9d46c8b4548681030f6e****",
      "X": 10,
      "Y": 100,
      "Width": 300,
      "Height": 300,
      "Opacity": 0.6
    },
     {
      "MediaURL": "http://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name].png",
      "X": 10,
      "Y": 100,
      "Width": 300,
      "Height": 300
    }
  ],
  "SubHeadingArray": [
    {
      "Level": 1,
      "TitleArray": ["Level 1 subtitle 1", "Level 1 subtitle 2"]
    },
    {
      "Level": 3,
      "TitleArray": ["Level 3 subtitle"]
    }
  ],
  "BackgroundMusicArray": [
    "****b4549d46c88681030f6e****",
    "****549d46c88b4681030f6e****",
    "http://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name].mp3"
  ],
  "BackgroundImageArray": [
    "****6c886b4549d481030f6e****",
    "****9d46c8548b4681030f6e****",
    "http://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name].png"
  ]
}

EditingConfig

Configure EditingConfig to specify parameters for volume, positioning, and other production settings.

Note

Except for the following parameters, all other parameters support both Global Scripts mode and Segmented Scripts mode:

Parameter

Type

Description

Example

Required

MediaConfig

JSON

Configuration for input video assets.

{"Volume":"1","MediaMetaDataArray":[{"Media":"****6c886b4549d481030f6e****","GroupName":"GroupA","TimeRangeList":[{"In":"0","Out":"1"},{"In":"2","Out":"3"}]}]}

No

TitleConfig

JSON

Configuration for titles.

{"Alignment":"TopCenter","AdaptMode":"AutoWrap","Font":"Alibaba PuHuiTi 2.0 95 ExtraBold","SizeRequestType":"Nominal","Y":0.1}

No

SubHeadingConfig

JSON

Configuration for multi-level subheadings.

JSON fields:

{"1":{"Y":0.3,"FontSize":40},"3":{"Y":0.5,"FontSize":30}}

No

SpeechConfig

JSON

Configuration for the voiceover.

See EditingConfig parameter examples

No

BackgroundMusicConfig

JSON

Configuration for background music.

{"Volume":0.2}

No

BackgroundImageConfig

JSON

Configuration for the background image. This field has no effect if a background image is already configured in InputConfig.

{"SubType":"Blur","Radius":0.5}

No

ProcessConfig

JSON

Configuration for the mixing and editing process.

See EditingConfig parameter examples

No

FECanvas

JSON

Canvas configuration for front-end preview.

{"Width": 1080,"Height": 1920}

No

ProduceConfig

JSON

Standard editing and production configuration. For fields, see EditingProduceConfig.

{"AutoRegisterInputVodMedia":true,"OutputWebmTransparentChannel":true,"CoverConfig":{"StartTime":3.3},"AudioChannelCopy":"left","PipelineId":"****d54a97cff4108b555b01166d4****","MaxBitrate":5000,"KeepOriginMaxBitrate":false,"KeepOriginVideoMaxFps":false}

No

ProcessConfig

Parameter

Type

Description

Example

Required

SingleShotDuration

Float

Long video assets are automatically split into shots during editing. This parameter sets the duration of each shot, in seconds.

5

No. Default: 3.

AllowVfxEffect

Boolean

Whether to allow adding special effects.

true

No. Default: false.

VfxEffectProbability

Float

The probability that an effect will be applied to each video clip. Range: 0.0 to 1.0. Supports 2 decimal places.

0.6

No. Default: 0.5.

VfxFirstClipEffectList

List<String>

  • If not empty, the effect for the first clip of the video will be chosen from this list.

  • If empty, a random effect is chosen from the following defaults: slightshow, starfieldshinee, starfieldshinee2, starsparkle, colorfulripples, and starfield.

  • For effect examples, see Special effect examples.

["slightshow","starfieldshinee"]

No

VfxNotFirstClipEffectList

List<String>

  • If not empty, effects for all clips other than the first will be chosen from this list.

  • If empty, a random effect is chosen from the following defaults: zoomslight, zoom, zoominout, and slightshake.

  • For effect examples, see Special effect examples.

["zoomslight","zoom"]

No

AllowTransition

Boolean

Whether to allow adding transition effects.

true

No. Default: false.

TransitionDuration

Float

Duration of transitions in seconds. If TransitionDuration > ClipDuration - 1, the transition for that clip will not be applied.

0.5

No. Default: 0.5.

TransitionList

List<String>

A list of custom transitions. If AllowTransition is true, a random transition from this list will be used. For available transitions, see Transition effects. If this list is empty, a random transition is chosen from: linearblur, colordistance, crosshatch, dreamyzoom, and doomscreentransition_up.

["directional", "linearblur"]

No

UseUniformTransition

Boolean

Whether to use the same transition throughout a single video.

true

No. Default: true.

AllowFilter

Boolean

Whether to allow adding custom filters.

false

No. Default: false.

FilterList

List<String>

A list of custom filters. If AllowFilter is true, a random filter from this list is applied. For available filters, see Filters. If this list is empty, no filter is applied.

["m1", "m2"]

No

AlignmentMode

String

The alignment mode for video and voiceover. Effective only in Global Scripts mode.

Valid values:

  • AutoSpeed: The video track duration is scaled to match the audio track.

  • Cut: The video track is truncated to match the audio track.

AutoSpeed

No. Default: AutoSpeed.

ImageDuration

Float

The duration for static image assets, in seconds.

2

No. Default: 2.
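
The TransitionDuration rule above (a transition is skipped when TransitionDuration > ClipDuration - 1) can be expressed as a one-line check. The sketch below uses hypothetical names and only mirrors the stated inequality. How the service measures clip duration internally is not covered here.

```java
// Hypothetical illustration of the documented TransitionDuration rule.
class TransitionRule {
    // A transition is applied only if transitionDuration <= clipDuration - 1.
    static boolean transitionApplies(double transitionDuration, double clipDuration) {
        return transitionDuration <= clipDuration - 1;
    }

    public static void main(String[] args) {
        double transitionDuration = 0.5; // default TransitionDuration
        double[] clipDurations = {3.0, 1.2}; // 3.0 is the default SingleShotDuration
        for (double clip : clipDurations) {
            System.out.println(clip + "s clip: " + transitionApplies(transitionDuration, clip));
        }
    }
}
```

With the default values, a 3-second shot leaves room for a 0.5-second transition, while a 1.2-second clip does not.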

Parameter example

{
  "MediaConfig": {
    "Volume": 0 // Input video assets are muted by default
  },
  "TitleConfig": {
    "Alignment": "TopCenter",
    "AdaptMode": "AutoWrap",
    "Font": "Alibaba PuHuiTi 2.0 95 ExtraBold",
    "SizeRequestType": "Nominal",
    "Y": 0.1 // Y-coordinate for portrait video. Use 0.05 for landscape video or 0.08 for square video.
  },
   "SubHeadingConfig": {
    "1": {
      "Y": 0.3,
      "FontSize": 40
    },
    "3": {
      "Y": 0.5,
      "FontSize": 30
    }
  },
  "SpeechConfig": {
    "Volume": 1,  // Voiceover uses original volume by default
    "SpeechRate": 0,
    "Voice": null,
    "Style": null,
    "CustomizedVoice": null, // Voice ID. If set, Voice and Style are ignored.
    "AsrConfig": {
      "Alignment": "TopCenter",
      "AdaptMode": "AutoWrap",
      "Font": "Alibaba PuHuiTi 2.0 65 Medium",
      "SizeRequestType": "Nominal",
      "Spacing": -1,
      "Y": 0.8 // Subtitle Y-coordinate for portrait video. Use 0.9 for landscape video or 0.85 for square video.
    },
    "SpecialWordsConfig": [{
      "Type": "Highlight",
      "Style": {
        "FontName": "KaiTi",
        "FontSize": 80,
        "FontColor": "20AEE9",
        "OutlineColour": "2D20E9",
        "Outline": 3,
        "FontFace": {
          "Bold": true,
          "Underline": true
        }
      },
      "WordsList": [
        "ApsaraVideo",
        "Intelligent Media Services",
        "Batch video creation"
      ]
    },
    {
      "Type": "Highlight",
      "Style": {
        "FontFace": {
          "Italic": true
        }
      },
      "WordsList": [
        "product",
        "take a look"
      ]
    },
    {
      "Type": "Forbidden",
      "WordsList": [
        "pilipala",
        "bilibala"
      ],
      "SoundReplaceMode": "None"
    }
  ]},
  "BackgroundMusicConfig": {
    "Volume": 0.2,   // Background music at 20% volume by default
    "Style": null
  },
  "ProcessConfig": {
    "SingleShotDuration": 3,      // Duration of a shot after splitting
    "AllowVfxEffect": false,	  // Specifies whether to add special effects
    "AllowTransition": false,	  // Specifies whether to add transition effects
    "AlignmentMode": "AutoSpeed"  // This field is supported only in Global Scripts mode
  }
}
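
The example above lists different Y values for portrait, landscape, and square output. A JSON object can hold only one Y key, so the value has to be chosen per job. The following sketch, with hypothetical names, picks the value from the output width and height. The numbers are taken from the comments in the example and are not confirmed service defaults.

```java
// Hypothetical helper that selects Y values by output orientation.
class LayoutDefaults {
    // Title Y: 0.1 for portrait, 0.05 for landscape, 0.08 for square.
    static double titleY(int width, int height) {
        if (width == height) {
            return 0.08;
        }
        return height > width ? 0.1 : 0.05;
    }

    // Subtitle (AsrConfig) Y: 0.8 for portrait, 0.9 for landscape, 0.85 for square.
    static double subtitleY(int width, int height) {
        if (width == height) {
            return 0.85;
        }
        return height > width ? 0.8 : 0.9;
    }

    public static void main(String[] args) {
        System.out.println("Portrait title Y: " + titleY(1080, 1920));
        System.out.println("Landscape subtitle Y: " + subtitleY(1920, 1080));
    }
}
```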

TemplateConfig

TemplateConfig contains common parameters for batch video production. For detailed parameters and examples, see TemplateConfig.

OutputConfig parameters

Configure OutputConfig to specify the output destination, naming conventions, resolution, and number of videos to produce.

The parameters are the same for both generation modes.

Parameter

Type

Description

Example

Required

MediaURL

String

The output video URL, which must include the {index} placeholder.

Rule: http://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name]_{index}.mp4

Example: http://example.oss-cn-shanghai.aliyuncs.com/example/example_{index}.mp4

Required if GeneratePreviewOnly is false and output is to OSS.

StorageLocation

String

The storage location for media assets output to ApsaraVideo VOD.

Rule: [your-vod-bucket].oss-[your-region-id].aliyuncs.com

Example: outin-****6c886b4549d481030f6e****.oss-cn-shanghai.aliyuncs.com

Required if GeneratePreviewOnly is false and output is to VOD.

FileName

String

The output file name, which must include the {index} placeholder.

Rule: [your-file-name]_{index}.mp4

Example: example_{index}.mp4

Required if GeneratePreviewOnly is false and output is to VOD.

GeneratePreviewOnly

Boolean

  • If true, the job only generates a preview timeline without actually producing a video. The output URL is not required.

  • After the job completes, you can query the result using GetBatchMediaProducingJob to get the editing project ID (projectId), then call GetEditingProject to retrieve the preview timeline.

false

No. Default: false.

Count

Integer

The number of videos to output. The maximum is 100.

10

No. Default: 1.

MaxDuration

Float

The maximum duration for each output video, in seconds.

20

No. Default: 15.

FixedDuration

Float

The fixed duration for each output video. If set, the video duration will be adjusted to match this value.

  • Not supported in Segmented Scripts mode.

  • In Global Scripts mode, this parameter is supported when SpeechTextArray is empty.

  • Set either FixedDuration or MaxDuration, not both.

  • For more information, see Video duration rules.

20

No. Default: 15.

Width

Integer

The width of the output video in pixels.

1080

Yes

Height

Integer

The height of the output video in pixels.

1920

Yes

Video

JSON

Configuration for the output video stream, such as CRF and codec.

{"Crf": 27}

No

Parameter example

{
  "MediaURL": "http://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name]_{index}.mp4",
  "Count": 20,
  "MaxDuration": 15,
  "Width": 1080,
  "Height": 1920,
  "Video": {"Crf": 27},
  "GeneratePreviewOnly": false
}
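
Several OutputConfig rules from the table above lend themselves to a client-side check: the {index} placeholder, the maximum Count of 100, the choice between FixedDuration and MaxDuration, and the required Width and Height. The following is a minimal sketch with hypothetical names. It treats the configuration as a plain map and does not replace validation by the service.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical pre-submission check for OutputConfig.
class OutputConfigCheck {
    // Returns a list of problems; an empty list means none were found.
    static List<String> validate(Map<String, Object> outputConfig) {
        List<String> problems = new ArrayList<>();
        boolean previewOnly = Boolean.TRUE.equals(outputConfig.get("GeneratePreviewOnly"));
        Object mediaUrl = outputConfig.get("MediaURL");
        Object fileName = outputConfig.get("FileName");

        if (!previewOnly && mediaUrl == null && fileName == null) {
            problems.add("Set MediaURL (OSS output) or StorageLocation and FileName (VOD output)");
        }
        for (Object target : new Object[] {mediaUrl, fileName}) {
            if (target != null && !target.toString().contains("{index}")) {
                problems.add("Missing {index} placeholder: " + target);
            }
        }
        Object count = outputConfig.get("Count");
        if (count instanceof Integer && ((Integer) count) > 100) {
            problems.add("Count exceeds the maximum of 100");
        }
        if (outputConfig.containsKey("FixedDuration") && outputConfig.containsKey("MaxDuration")) {
            problems.add("Set either FixedDuration or MaxDuration, not both");
        }
        if (!outputConfig.containsKey("Width") || !outputConfig.containsKey("Height")) {
            problems.add("Width and Height are required");
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, Object> config = new HashMap<>();
        config.put("MediaURL", "http://example.oss-cn-shanghai.aliyuncs.com/example/example.mp4");
        config.put("Count", 20);
        config.put("Width", 1080);
        config.put("Height", 1920);
        System.out.println(validate(config));
    }
}
```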

Application

Example 1: Configure an intro and outro with Segmented Scripts mode

Use case

This example shows how to add a consistent intro and outro to your videos. By setting MediaGroup.SplitMode to NoSplit for the first and last groups, the system will play a randomly selected asset from those groups in its entirety.

Sample code

InputConfig example

{
    "mediaGroupArray": [
        {
            "duration": 4,
            "splitMode": "NoSplit",
            "groupName": "opening",
            "mediaArray": [
                "****e44009ee71f0b62bf6f7d44b****"
            ]
        },
        {
            "groupName": "group1",
            "mediaArray": [
                "****e44009eef1f0b62bf6f7d44b****"
            ],
            "speechTextArray": [
                "Wondering where to go for the holiday?",
                "Still hesitant about your holiday plans?"
            ]
        },
        {
            "groupName": "group2",
            "mediaArray": [
                "****e44009eeferfb62bf6f7d44b****",
                "****e440094fghf0b62bf6f7d44b****",
                "****e44009ee74fgh62bf6f7d44b****"
            ],
            "speechTextArray": [
                "Lugu Lake in Yunnan invites you for a date with nature. The azure lake is like a mirror, reflecting the unique customs of the Mosuo Kingdom of Women, as picturesque as a painting.",
                "Why not consider a natural feast at Lugu Lake in Yunnan? The azure, mirror-like lake reflects the unique folk customs of the Mosuo Kingdom of Women, picturesque and fascinating."
            ]
        },
        {
            "groupName": "group3",
            "mediaArray": [
                "****e44009ee7ft5662bf6f7d44b****"
            ],
            "speechTextArray": [
                "Come to Lugu Lake and share this quiet and charming landscape!",
                "Share the endless poetry brought by this quiet and charming landscape!"
            ]
        },
        {
            "duration": 4,
            "splitMode": "NoSplit",
            "groupName": "ending",
            "mediaArray": [
                "****e44009ee5fgfg62bf6f7d44b****"
            ]
        }
    ]
}

EditingConfig example

{
    "MediaConfig": {
        "MediaMetaDataArray": [
            {
                "Media": "****e44009eedttg62bf6f7d44b****",
                "GroupName": "opening",
                "TimeRangeList": [
                    {
                        "In": 1.5,
                        "Out": 5.5
                    }
                ]
            },
            {
                "Media": "****e44009ee7dfrf62bf6f7d44b****",
                "GroupName": "ending",
                "TimeRangeList": [
                    {
                        "In": 1.5,
                        "Out": 5.5
                    }
                ]
            }
        ]
    }
}

OutputConfig example

{
    "count": 10,
    "height": 1920,
    "mediaURL": "http://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name]_{index}.mp4",
    "width": 1080,
    "widthHeightRatio": 0.5625
}

Example 2: Create a face montage video

See Best practices for creating face montage videos.

SDK example

Prerequisites

You have installed the IMS server SDK. For more information, see Preparations.

Code example

This example uses the Global Scripts mode.

package com.example;

import java.util.*;

import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;

import com.aliyun.ice20201109.Client;
import com.aliyun.ice20201109.models.*;
import com.aliyun.teaopenapi.models.Config;


/**
 *  You need to add the following Maven dependencies:
 *   <dependency>
 *      <groupId>com.aliyun</groupId>
 *      <artifactId>ice20201109</artifactId>
 *      <version>2.3.0</version>
 *  </dependency>
 *  <dependency>
 *      <groupId>com.alibaba</groupId>
 *      <artifactId>fastjson</artifactId>
 *      <version>1.2.9</version>
 *  </dependency>
 */
public class ScriptBatchEditingService {

    static final String regionId = "[your-region-id]"; // For example, cn-shanghai. Use a region listed under Supported regions in this topic.
    static final String bucket = "[your-bucket]";
    private Client iceClient;

    public static void main(String[] args) throws Exception {
        ScriptBatchEditingService scriptBatchEditingService = new ScriptBatchEditingService();
        scriptBatchEditingService.initClient();
        scriptBatchEditingService.runExample();
    }

    public void initClient() throws Exception {
        // An Alibaba Cloud account AccessKey has full access to all APIs. We recommend that you use a RAM user for API calls and routine O&M.
        // This example shows how to store the AccessKey ID and AccessKey secret in environment variables. For more information about how to configure them, see https://www.alibabacloud.com/help/en/sdk/developer-reference/v2-manage-access-credentials
        com.aliyun.credentials.Client credentialClient = new com.aliyun.credentials.Client();

        Config config = new Config();
        config.setCredential(credentialClient);

        // To hard-code the AccessKey ID and AccessKey secret, use the following code. However, we strongly recommend that you do not hard-code them in your project code. Otherwise, the AccessKey pair may be leaked, which compromises the security of all your resources.
        // config.accessKeyId = <The AccessKey ID created in Step 2>;
        // config.accessKeySecret = <The AccessKey secret created in Step 2>;
        config.endpoint = "ice." + regionId + ".aliyuncs.com";
        config.regionId = regionId;
        iceClient = new Client(config);
    }

    public void runExample() throws Exception {

        // Video materials
        JSONObject mediaGroup1 = new JSONObject();
        mediaGroup1.put("GroupName", "start");
        mediaGroup1.put("MediaArray", Arrays.asList(
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/lgh/lgh-start-1.mp4"
        ));

        JSONObject mediaGroup2 = new JSONObject();
        mediaGroup2.put("GroupName", "middle");
        mediaGroup2.put("MediaArray", Arrays.asList(
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/lgh/lgh-m-1.mp4",
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/lgh/lgh-m-2.mp4",
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/lgh/lgh-m-3.mp4"
        ));

        JSONObject mediaGroup3 = new JSONObject();
        mediaGroup3.put("GroupName", "end");
        mediaGroup3.put("MediaArray", Arrays.asList(
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/lgh/lgh-end-1.mp4"
        ));

        JSONArray mediaGroupArray = new JSONArray();
        mediaGroupArray.add(mediaGroup1);
        mediaGroupArray.add(mediaGroup2);
        mediaGroupArray.add(mediaGroup3);

        // Narration scripts
        List<String> speechTextArray = Arrays.asList(
            "Wondering where to go for the holiday? Lugu Lake in Yunnan invites you for a date with nature. The azure lake is like a mirror, reflecting the unique customs of the Mosuo Kingdom of Women, as picturesque as a painting.",
            "Still hesitant about your holiday plans? Why not consider a natural feast at Lugu Lake in Yunnan? The azure, mirror-like lake reflects the unique folk customs of the Mosuo Kingdom of Women, picturesque and fascinating."
        );

        // Video titles
        List<String> titleArray = Arrays.asList(
            "Lugu Lake: Mosuo customs in a beautiful landscape",
            "Exploring the mysterious Lugu Lake",
            "Immersive experience of Lugu Lake"
        );

        JSONObject inputConfig = new JSONObject();
        inputConfig.put("MediaGroupArray", mediaGroupArray);
        inputConfig.put("SpeechTextArray", speechTextArray);
        inputConfig.put("TitleArray", titleArray);

        // Number of videos to produce
        int produceCount = 4;

        // Output resolution (portrait)
        //int outputWidth = 1080;
        //int outputHeight = 1920;

        // Output resolution (landscape)
        int outputWidth = 1920;
        int outputHeight = 1080;

        // Output OSS URL, must include the {index} placeholder
        String mediaUrl = "http://" + bucket + ".oss-" + regionId + ".aliyuncs.com/script/output_{index}_w.mp4";

        JSONObject outputConfig = new JSONObject();
        outputConfig.put("MediaURL", mediaUrl);
        outputConfig.put("Count", produceCount);
        outputConfig.put("Width", outputWidth);
        outputConfig.put("Height", outputHeight);

        // Submit batch video production job
        SubmitBatchMediaProducingJobRequest request = new SubmitBatchMediaProducingJobRequest();
        request.setInputConfig(inputConfig.toJSONString());
        request.setOutputConfig(outputConfig.toJSONString());

        SubmitBatchMediaProducingJobResponse response = iceClient.submitBatchMediaProducingJob(request);
        String jobId = response.getBody().getJobId();
        System.out.println("Start script batch job, batchJobId: " + jobId);

        // Poll job status until all are finished
        System.out.println("Waiting job finished...");
        int maxTry = 3000;
        int i = 0;
        while (i < maxTry) {
            Thread.sleep(3000);
            i++;
            GetBatchMediaProducingJobRequest getRequest = new GetBatchMediaProducingJobRequest();
            getRequest.setJobId(jobId);
            GetBatchMediaProducingJobResponse getResponse = iceClient.getBatchMediaProducingJob(getRequest);
            String status = getResponse.getBody().getEditingBatchJob().getStatus();
            System.out.println("BatchJobId: " + jobId + ", status:" + status);

            if ("Failed".equals(status)) {
                System.out.println("Batch job failed. JobInfo: " + JSONObject.toJSONString(getResponse.getBody().getEditingBatchJob()));
                throw new Exception("Produce failed. BatchJobId: " + jobId);
            }

            if ("Finished".equals(status)) {
                System.out.println("Batch job finished. JobInfo: " + JSONObject.toJSONString(getResponse.getBody().getEditingBatchJob()));
                break;
            }
        }
    }
}
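
The polling loop in the example above ends without an error if the job is still running after maxTry attempts. One possible variation is to separate the polling logic and fail explicitly on timeout. The sketch below uses hypothetical names. The status supplier stands in for a call such as iceClient.getBatchMediaProducingJob(...), and "Processing" is only a placeholder for any status other than Finished or Failed.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

// Hypothetical polling helper with an explicit timeout.
class JobPoller {
    // Polls until the status is "Finished" or "Failed" and returns it.
    // Throws IllegalStateException if maxAttempts is reached first.
    static String waitForTerminalStatus(Supplier<String> statusSupplier,
                                        int maxAttempts, long intervalMillis) {
        String status = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            status = statusSupplier.get();
            if ("Finished".equals(status) || "Failed".equals(status)) {
                return status;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for the job", e);
            }
        }
        throw new IllegalStateException(
            "Job not finished after " + maxAttempts + " attempts, last status: " + status);
    }

    public static void main(String[] args) {
        // Simulated status sequence instead of real API calls.
        Iterator<String> statuses = Arrays.asList("Processing", "Processing", "Finished").iterator();
        System.out.println(waitForTerminalStatus(statuses::next, 10, 10L));
    }
}
```

In the SDK example, the supplier could wrap the existing GetBatchMediaProducingJob call and return getEditingBatchJob().getStatus().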

API input parameters

InputConfig

{
  "MediaGroupArray": [{
    "GroupName": "start",
    "MediaArray": [
      "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/lgh/lgh-start-1.mp4"
    ]
  },
    {
      "GroupName": "middle",
      "MediaArray": [
        "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/lgh/lgh-m-1.mp4",
        "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/lgh/lgh-m-2.mp4",
        "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/lgh/lgh-m-3.mp4"
      ]
    },
    {
      "GroupName": "end",
      "MediaArray": [
        "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/lgh/lgh-end-1.mp4"
      ]
    }
  ],
  "SpeechTextArray": [
    "Wondering where to go for the holiday? Lugu Lake in Yunnan invites you for a date with nature. The azure lake is like a mirror, reflecting the unique customs of the Mosuo Kingdom of Women, as picturesque as a painting.",
    "Still hesitant about your holiday plans? Why not consider a natural feast at Lugu Lake in Yunnan? The azure, mirror-like lake reflects the unique folk customs of the Mosuo Kingdom of Women, picturesque and fascinating."
  ],
  "TitleArray": [
    "Lugu Lake: Mosuo customs in a beautiful landscape",
    "Exploring the mysterious Lugu Lake",
    "Immersive experience of Lugu Lake"
  ]
}

OutputConfig

{
  "Count": 4,
  "Height": 1080,
  "Width": 1920,
  "MediaURL": "http://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name]_{index}_w.mp4"
}

Advanced configurations

For advanced settings, see Editing logic and advanced configurations.

FAQ

For frequently asked questions about Script-to-Video, see FAQ.
