Intelligent Media Services: Smart highlight extraction

Last Updated: Sep 10, 2025

This topic describes the request parameters for the SubmitHighlightExtractionJob operation and the response parameters returned by the GetSmartHandleJob operation.

Important
  • The Object Storage Service (OSS) URLs of all media assets must be in the same region as the OpenAPI endpoint that you call.

  • Supported regions: China (Shanghai), China (Beijing), China (Hangzhou), China (Shenzhen), US (West), and Singapore. The action tag recognition feature, which corresponds to the Strategy.EnableActionRecog and Strategy.CustomActions parameters, is currently available only in the China (Shanghai) region.

  • Currently, video materials that lack captions or human voices are not supported. Ensure that your video materials meet this requirement.

Usage instructions

InputConfig parameter description

You can configure InputConfig to specify parameters, such as video materials and highlight splitting configurations.

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| MediaArray | List<Media> | Only the video format is supported for film and television materials. You can upload materials by providing a list of media asset IDs or OSS URLs. The total duration of the videos can be up to two hours, and the maximum number of videos is 30. For more information about supported formats, see Video formats. Important: Video materials that do not contain captions or human voices are not supported. | Yes |
| Strategy | Strategy | Highlight clip extraction policy settings. Count: The number of highlight clips to extract from each material. Valid values: [1,10]. Default value: 5. ClipDuration: The expected duration of each highlight clip in seconds. Valid values: [3,60]. Default value: 15. The actual duration of each highlight clip may vary slightly from this value. EnableActionRecog: Specifies whether to enable action recognition. Default value: false. CustomActions: The custom action tags. The system prioritizes mapping based on the provided tags. For example: ["Fight","Cry"]. The array supports a maximum of 50 tags. Each tag can be a maximum of 5 characters. | No |

Strategy parameter description

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| Count | Integer | The number of highlight clips to extract from a single material. The value must be in the range of [1, 10]. The default value is 5. | No |
| ClipDuration | Float | The expected duration of each highlight clip in seconds. The value must be in the range of [3, 60]. The default value is 15. The actual duration of each highlight clip may fluctuate around this value. | No |
| EnableActionRecog | Boolean | Specifies whether to enable action recognition. The default value is false. Note: Action recognition is supported only in the China (Shanghai) region. | No |
| CustomActions | List<String> | Custom action tags. The system preferentially maps the tags based on the provided tag names. Example: ["Fight", "Cry"]. The array can contain up to 50 tags. Each tag can contain up to 5 characters. Note: Action recognition is supported only in the China (Shanghai) region. | No |
| HighlightDescription | String | A description of the highlight extraction policy. This parameter takes effect only when ThemeConfig.ThemeType is set to SmoothHighlight. Example: Prioritize scenes that feature strong emotional expression, high contrast, concentrated plot conflicts, and dramatic highlights. Examples include scenes where the male lead XXX expresses intense emotions like anger or protectiveness, where tension is created by contrasting identities and behaviors, where the story revolves around core conflicts like family feuds, and where unusual dialogue or plot twists occur to enhance viewer immersion and create buzz. | No |
| FaceInfo | FaceInfo | Set face information to help identify characters. Configure this parameter if you want to feature specific characters more prominently in the highlights. | No |

FaceInfo parameter description

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| ImageInfoList | List<ImageInfo> | A list of character (face) photos. The list can contain up to 200 photos. | No |

ImageInfo parameter description

| Parameter | Type | Description | Example | Required |
| --- | --- | --- | --- | --- |
| Name | String | The name of the character (face). | Daniel | Yes |
| ImageURL | String | The storage address of the character (face) photo. The address must be a publicly accessible URL. Make sure that the face image contains only one person and that the face is clear, without significant obstructions or missing parts. | http://[your-cdn-domain]/[your-file-path]/face1.png | One of ImageURL and ImageId is required. |
| ImageId | String | The ID of the image media asset. | ****9d46c886b45481030f6e**** | One of ImageURL and ImageId is required. |
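
The FaceInfo and ImageInfo rules above can be checked on the client before a job is submitted. The following Python sketch is only an illustration: the helper name build_face_info and its tuple-based input are hypothetical and are not part of any SDK. It enforces the limits stated in this topic (up to 200 photos, Name required, and at least one of ImageURL and ImageId). Whether the service accepts both fields on the same entry is not stated here, so the sketch passes through whatever the caller supplies.

```python
def build_face_info(images):
    """Build a FaceInfo dict from (name, image_url, image_id) tuples.

    Hypothetical helper: it only applies the limits documented in this topic.
    """
    if len(images) > 200:
        raise ValueError("ImageInfoList can contain up to 200 photos")
    image_info_list = []
    for name, image_url, image_id in images:
        if not name:
            raise ValueError("Name is required")
        if not (image_url or image_id):
            raise ValueError("specify ImageURL or ImageId for " + name)
        info = {"Name": name}
        # Add only the fields that the caller actually provided.
        if image_url:
            info["ImageURL"] = image_url
        if image_id:
            info["ImageId"] = image_id
        image_info_list.append(info)
    return {"ImageInfoList": image_info_list}


# Placeholder URL for illustration only.
face_info = build_face_info([("Daniel", "http://example.com/face1.png", None)])
```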

Media parameter description

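| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| MediaId | String | The ID of the media asset. | Specify either MediaId or MediaURL. If you specify both, MediaId takes precedence. |
| MediaURL | String | The URL of the media asset in OSS. Only your own OSS buckets are supported. | Specify either MediaId or MediaURL. If you specify both, MediaId takes precedence. |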

Parameter example

{
  "MediaArray": [
    {
      "MediaId": "1cb94770da*******75e6e6c5486302"
    }
  ],
  "Strategy": {
    "Count": 5,
    "ClipDuration": 15,
    "EnableActionRecog": true,
    "CustomActions": ["Fight", "Cry"],
    "HighlightDescription": "Prioritize scenes that feature strong emotional expression, high contrast, concentrated plot conflicts, and dramatic highlights. Examples include scenes where the male lead XXX expresses intense emotions like anger or protectiveness, where tension is created by contrasting identities and behaviors, where the story revolves around core conflicts like family feuds, and where unusual dialogue or plot twists occur to enhance viewer immersion and create buzz.",
    "FaceInfo": {
      "ImageInfoList": [
        {
          "Name": "Ning X",
          "ImageURL": "http://[your-cdn-domain]/[your-file-path]/face1.png"
        }
      ]
    }
  }
}
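
The value ranges described for InputConfig can be validated locally before the request is sent. The following Python sketch shows one way to do this. The helper name build_input_config is hypothetical and is not part of any SDK, and the final json.dumps call assumes that the operation accepts InputConfig as a JSON string. The two-hour total duration limit is not checked because it depends on media metadata that is not available on the client.

```python
import json


def build_input_config(media, count=5, clip_duration=15.0,
                       enable_action_recog=False, custom_actions=None):
    """Build an InputConfig dict and check the limits documented in this topic.

    `media` is a list of dicts, each carrying "MediaId" or "MediaURL".
    Hypothetical helper, not part of any SDK.
    """
    if not 1 <= len(media) <= 30:
        raise ValueError("MediaArray must contain 1 to 30 videos")
    for item in media:
        if not (item.get("MediaId") or item.get("MediaURL")):
            raise ValueError("each media item needs MediaId or MediaURL")
    if not 1 <= count <= 10:
        raise ValueError("Count must be in [1, 10]")
    if not 3 <= clip_duration <= 60:
        raise ValueError("ClipDuration must be in [3, 60]")
    custom_actions = list(custom_actions or [])
    if len(custom_actions) > 50:
        raise ValueError("CustomActions supports at most 50 tags")
    if any(len(tag) > 5 for tag in custom_actions):
        raise ValueError("each custom action tag can contain at most 5 characters")

    strategy = {
        "Count": count,
        "ClipDuration": clip_duration,
        "EnableActionRecog": enable_action_recog,
    }
    if custom_actions:
        strategy["CustomActions"] = custom_actions
    return {"MediaArray": media, "Strategy": strategy}


config = build_input_config(
    [{"MediaId": "example-media-id"}],  # placeholder ID, not a real asset
    enable_action_recog=True,
    custom_actions=["Fight", "Cry"],
)
# Assumption: the API expects InputConfig as JSON text.
input_config_json = json.dumps(config)
```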

OutputConfig parameter description

You can configure OutputConfig to specify synthesis parameters, such as the output location and naming rules for the final video.

| Parameter | Type | Description | Required | Example |
| --- | --- | --- | --- | --- |
| NeedExport | Boolean | Specifies whether to export the clips directly. Valid values: true: The highlight clips are split and exported to the output location. false: Only the time ranges of the highlight clips are returned. The clips are not split. | No. The default value is false. | false |
| OutputMediaTarget | String | Required when NeedExport = true. The target type of the output file. oss-object: an OSS object in your Alibaba Cloud OSS bucket. | No. The default value is oss-object. | oss-object |
| Endpoint | String | The S3-compatible endpoint. The region of the OSS endpoint must be the same as the region where the service is called. The default value is the OSS endpoint in the same region. | No | https://oss-cn-shanghai.aliyuncs.com |
| Bucket | String | Required when NeedExport = true. The S3-compatible storage bucket. This must be your own OSS bucket. | No | your-bucket |
| ObjectKey | String | Required when NeedExport = true. The name of the S3-compatible object. Supported placeholder: {index}. This placeholder must be included in the file path. | No | dir/to/testOutput_{index}.mp4 |
| ExportAsNewMedia | Boolean | Optional when NeedExport = true. Specifies whether to export the output as a new media asset. This parameter is supported only when OutputMediaTarget is set to oss-object. | No. The default value is false. | false |
| Width | Integer | Optional when NeedExport = true. The width of the output video in pixels. If you do not specify this parameter, the width of the output video is the same as that of the source video. | No | 1280 |
| Height | Integer | Optional when NeedExport = true. The height of the output video in pixels. If you do not specify this parameter, the height of the output video is the same as that of the source video. | No | 720 |
| Video | JSONObject | Optional when NeedExport = true. The configurations of the output video stream, such as Crf and Codec. | No | {"Bitrate": 3000} |

Parameter example

{
  "NeedExport": true,
  "OutputMediaTarget": "oss-object",
  "Endpoint": "https://oss-cn-shanghai.aliyuncs.com",
  "Bucket": "your-bucket",
  "ObjectKey": "dir/to/testOutput_{index}.mp4",
  "ExportAsNewMedia": false,
  "Width": 1280,
  "Height": 720,
  "Video": {
    "Bitrate": 3000
  }
}
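
The dependencies between NeedExport and the other OutputConfig fields can be expressed as a small builder. The following Python sketch is illustrative only. The helper name build_output_config is hypothetical and is not part of any SDK. It requires Bucket and ObjectKey when NeedExport is true, checks that ObjectKey contains the {index} placeholder, and omits optional fields that are not set so that the documented defaults apply.

```python
def build_output_config(need_export=False, bucket=None, object_key=None,
                        endpoint=None, export_as_new_media=False,
                        width=None, height=None, video=None):
    """Build an OutputConfig dict following the rules in this topic.

    Hypothetical helper, not part of any SDK.
    """
    if not need_export:
        # Only time ranges are returned, so no output location is needed.
        return {"NeedExport": False}
    if not bucket or not object_key:
        raise ValueError("Bucket and ObjectKey are required when NeedExport is true")
    if "{index}" not in object_key:
        raise ValueError("ObjectKey must contain the {index} placeholder")

    config = {
        "NeedExport": True,
        "OutputMediaTarget": "oss-object",
        "Bucket": bucket,
        "ObjectKey": object_key,
        "ExportAsNewMedia": export_as_new_media,
    }
    # Optional fields are added only when the caller sets them.
    optional = {"Endpoint": endpoint, "Width": width, "Height": height, "Video": video}
    config.update({key: value for key, value in optional.items() if value is not None})
    return config


output_config = build_output_config(
    need_export=True,
    bucket="your-bucket",
    object_key="dir/to/testOutput_{index}.mp4",
    width=1280,
    height=720,
    video={"Bitrate": 3000},
)
```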

GetSmartHandleJob

Call the GetSmartHandleJob operation to retrieve the result of a highlight extraction task. The following section describes the parameters in AiResult.

AiResult parameter description

{
  "HighlightResults": [
    {
      "Media": "MediaId1", // If a URL is specified in InputConfig, a URL is also returned here.
      "TimeRanges": [
        {
          "In": 20,
          "Out": 30,
          "Tags": ["Fight","Cry"], // The detected action tags.
          "OutputURL": "http://your-bucket.oss-cn-shanghai.aliyuncs.com/output_0.mp4", // Returned only when NeedExport is set to true.
          "MediaId": "MediaId11" // Returned only when ExportAsNewMedia is set to true.
        }
      ]
    },
    {
      "Media": "MediaId2", // If a URL is specified in InputConfig, a URL is also returned here.
      "TimeRanges": [
        {
          "In": 2,
          "Out": 10,
          "Tags": ["Run","Shout"],
          "OutputURL": "http://your-bucket.oss-cn-******.aliyuncs.com/output_1.mp4" // Returned only when NeedExport is set to true.
        },
        {
          "In": 40,
          "Out": 50,
          "OutputURL": "http://your-bucket.oss-cn-******.aliyuncs.com/output_2.mp4" // Returned only when NeedExport is set to true.
        }
      ]
    }
  ]
}
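
To consume the result, the nested structure above can be flattened into one record per highlight clip. The following Python sketch assumes that AiResult arrives as a JSON string without the explanatory comments shown in the sample, and that In and Out are offsets in seconds. The function name list_highlight_clips is hypothetical. Tags, OutputURL, and MediaId are read with fallbacks because the sample shows that they are not always present.

```python
import json


def list_highlight_clips(ai_result_json):
    """Flatten AiResult into one record per highlight clip."""
    result = json.loads(ai_result_json)
    clips = []
    for item in result.get("HighlightResults", []):
        for time_range in item.get("TimeRanges", []):
            clips.append({
                "media": item["Media"],
                "start": time_range["In"],
                "end": time_range["Out"],
                "duration": time_range["Out"] - time_range["In"],
                # These fields are optional in the response.
                "tags": time_range.get("Tags", []),
                "output_url": time_range.get("OutputURL"),
                "new_media_id": time_range.get("MediaId"),
            })
    return clips


# Sample data modeled on the response above, with comments and URLs removed.
sample = json.dumps({
    "HighlightResults": [
        {"Media": "MediaId1",
         "TimeRanges": [{"In": 20, "Out": 30, "Tags": ["Fight", "Cry"]}]},
        {"Media": "MediaId2",
         "TimeRanges": [{"In": 2, "Out": 10, "Tags": ["Run", "Shout"]},
                        {"In": 40, "Out": 50}]},
    ]
})
for clip in list_highlight_clips(sample):
    print(clip["media"], clip["start"], clip["end"], clip["tags"])
```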