Intelligent Media Services: Multi-audio track transcoding and packaging

Last Updated: Nov 20, 2025

This guide explains how to use Intelligent Media Services (IMS) to transcode and package media with multiple audio tracks. By following these steps, you can generate multi-language HLS content that is compatible with a wide range of devices.

Workflow

Example of an output file structure:

#EXTM3U

# Audio stream definitions (multi-language)
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",NAME="Chinese",LANGUAGE="zh",DEFAULT=YES,URI="audio/chinese.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",NAME="English",LANGUAGE="en",DEFAULT=NO,URI="audio/english.m3u8"

# Video stream definitions (multi-bitrate)
#EXT-X-STREAM-INF:BANDWIDTH=400000,RESOLUTION=360x640,AUDIO="audio",CODECS="hvc1,mp4a.40.5"
video/360p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=900000,RESOLUTION=720x1280,AUDIO="audio",CODECS="hvc1,mp4a.40.5"
video/720p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2000000,RESOLUTION=1080x1920,AUDIO="audio",CODECS="hvc1,mp4a.40.5"
video/1080p.m3u8
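
To check a packaged output, a master playlist like the one above can be inspected programmatically. The following is a minimal Python sketch, assuming the manifest text is already in memory. The helper names parse_attributes and summarize_master_playlist are illustrative and are not part of IMS.

```python
import re

# Matches KEY=VALUE pairs where VALUE is a quoted string or an unquoted token.
_ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')

SAMPLE = """#EXTM3U
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",NAME="Chinese",LANGUAGE="zh",DEFAULT=YES,URI="audio/chinese.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",NAME="English",LANGUAGE="en",DEFAULT=NO,URI="audio/english.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=400000,RESOLUTION=360x640,AUDIO="audio",CODECS="hvc1,mp4a.40.5"
video/360p.m3u8
"""


def parse_attributes(attr_text):
    """Parse an HLS attribute list into a dict, stripping surrounding quotes."""
    return {key: value.strip('"') for key, value in _ATTR_RE.findall(attr_text)}


def summarize_master_playlist(text):
    """Return (audio_renditions, variants) found in a master playlist."""
    audio, variants = [], []
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    for index, line in enumerate(lines):
        if line.startswith("#EXT-X-MEDIA:"):
            attrs = parse_attributes(line.split(":", 1)[1])
            if attrs.get("TYPE") == "AUDIO":
                audio.append(attrs)
        elif line.startswith("#EXT-X-STREAM-INF:"):
            attrs = parse_attributes(line.split(":", 1)[1])
            # The URI of a variant stream is the next non-empty line.
            if index + 1 < len(lines):
                attrs["URI"] = lines[index + 1]
            variants.append(attrs)
    return audio, variants


if __name__ == "__main__":
    renditions, streams = summarize_master_playlist(SAMPLE)
    for item in renditions:
        print("audio:", item["NAME"], item["LANGUAGE"], item["URI"])
    for item in streams:
        print("video:", item["RESOLUTION"], item["BANDWIDTH"], item["URI"])
```

Each variant should reference the same AUDIO group that the audio renditions declare in GROUP-ID. A mismatch between the two is a quick sign that the package was assembled incorrectly.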

Before you begin

Activate IMS to use its features.

Configuration

Basic configuration

  • Storage: Associate an Object Storage Service (OSS) bucket with IMS. For more information, see Configure storage addresses.

  • Callback: Configure an HTTP or MNS callback to receive task status notifications. For callback methods and events, see Overview.

Transcoding template configuration

Procedure

Example requirements

Codec: H.264, H.265

Resolution: 360P, 540P, 720P, 1080P

Audio: HE-AAC, 64 Kbps (default)

Example configuration

This example shows how to configure transcoding templates for the four required video resolutions. To learn how to create a template, see Create a transcoding template.

Note

To perform Narrowband HD™ transcoding, create a basic template based on the following table, and then submit a ticket to request a backend upgrade.

H.264

| Template | Codec | Container format | Other parameters |
| --- | --- | --- | --- |
| Video-360P | H.264 | m3u8 (.ts) | Resolution (long edge fixed): 640px; disable audio; segment length: 5s; configure other parameters as needed |
| Video-540P | H.264 | m3u8 (.ts) | Resolution (long edge fixed): 960px; disable audio; segment length: 5s; configure other parameters as needed |
| Video-720P | H.264 | m3u8 (.ts) | Resolution (long edge fixed): 1280px; disable audio; segment length: 5s; configure other parameters as needed |
| Video-1080P | H.264 | m3u8 (.ts) | Resolution (long edge fixed): 1920px; disable audio; segment length: 5s; configure other parameters as needed |
| Audio-64Kbps | HE-AAC | m3u8 (.ts) | Disable video; segment length: 5s |

Note

You cannot create the Audio-64Kbps template in the console. Use the API or submit a ticket for assistance.
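
The "long edge fixed" setting can be read as scaling the longer side of the source to the given value while keeping the aspect ratio. The sketch below illustrates that arithmetic in Python. It is an assumption about the behavior, not the transcoder's actual implementation, and rounding to even dimensions is an added assumption because many encoders expect even widths and heights. The function name scale_long_edge is hypothetical.

```python
def scale_long_edge(width, height, long_edge):
    """Scale (width, height) so the longer side equals long_edge, keeping aspect ratio."""
    if width <= 0 or height <= 0 or long_edge <= 0:
        raise ValueError("dimensions must be positive")
    ratio = long_edge / max(width, height)

    def to_even(value):
        # Assumption: round to an even integer, never below 2.
        return max(2, int(round(value / 2)) * 2)

    return to_even(width * ratio), to_even(height * ratio)


if __name__ == "__main__":
    # A portrait 1080x1920 source against the four template long-edge values.
    for edge in (640, 960, 1280, 1920):
        print(edge, scale_long_edge(1080, 1920, edge))
```

Under this reading, a portrait 1080x1920 source with a 640px long edge yields 360x640, and a landscape 1920x1080 source yields 640x360, so one template serves both orientations.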

H.265

Note

  • Recommended: fMP4 container format. Apple's HLS authoring specification requires fMP4 for H.265 (HEVC) streams, and the format is compatible with Safari.

  • Alternative: The TS container format works but is not compatible with Safari.

  • Console limitation: You cannot create a template with the fMP4 container format in the console. Create a template with the m3u8 (.ts) container format first, and then contact Alibaba Cloud to upgrade the configuration in the backend.

| Template | Codec | Container format | Other parameters |
| --- | --- | --- | --- |
| Video-360P | H.265 | m3u8 (.fmp4) | Resolution (long edge fixed): 640px; disable audio; segment length: 5s; configure other parameters as needed |
| Video-540P | H.265 | m3u8 (.fmp4) | Resolution (long edge fixed): 960px; disable audio; segment length: 5s; configure other parameters as needed |
| Video-720P | H.265 | m3u8 (.fmp4) | Resolution (long edge fixed): 1280px; disable audio; segment length: 5s; configure other parameters as needed |
| Video-1080P | H.265 | m3u8 (.fmp4) | Resolution (long edge fixed): 1920px; disable audio; segment length: 5s; configure other parameters as needed |
| Audio-64Kbps | HE-AAC | m3u8 (.fmp4) | Disable video; segment length: 5s |

Note

You cannot create the Audio-64Kbps template in the console. Use the API or submit a ticket for assistance.

Submit a transcoding task

Start a multi-bitrate task

Call the SubmitMediaConvertJob operation to submit a transcoding task.

Config parameter (HlsGroupConfig)

| Parameter | Type | Description |
| --- | --- | --- |
| Type | string | The stream type. Valid values: video (a video stream; only video-related settings are processed), audio (an audio stream; only audio-related settings are processed), hybrid (a hybrid stream; both audio-related and video-related settings are processed). |
| Bandwidth | string | The bandwidth. Optional. Valid when Type is set to video or hybrid. |
| AudioGroup | string | The audio group referenced by the video stream. Applies when Type is set to video. |
| SubtitleGroup | string | The subtitle group referenced by the video stream. Applies when Type is set to video or hybrid. |
| Name | string | The NAME attribute of the output stream in the HLS manifest. Required when Type is set to audio or subtitle. |
| Group | string | The GROUP-ID attribute of the output stream in the HLS manifest. Applies when Type is set to audio or subtitle. By default, the value is the same as Type. |
| Language | string | The LANGUAGE attribute of the output stream in the HLS manifest. Applies when Type is set to audio or subtitle. The value must comply with RFC 5646. |
| Default | boolean | Specifies whether to set the stream as the default stream. Valid when Type is set to audio. |
| AutoSelect | boolean | Specifies whether to automatically select the stream. Valid when Type is set to audio. |
| Forced | boolean | Specifies whether to forcibly display the stream. Valid when Type is set to audio. |
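
To make the mapping between these parameters and the manifest concrete, the following Python sketch renders an #EXT-X-MEDIA line from an audio HlsGroupConfig. It only illustrates how the fields correspond to HLS attributes according to the parameter descriptions. It is not how IMS builds the manifest, and the function name render_ext_x_media and its uri argument are hypothetical.

```python
def render_ext_x_media(config, uri):
    """Render an #EXT-X-MEDIA line for an audio HlsGroupConfig (illustrative only)."""
    if config.get("Type") != "audio":
        raise ValueError("this sketch only handles Type=audio")
    if not config.get("Name"):
        raise ValueError("Name is required when Type is audio")

    # Group defaults to the value of Type, as described for the Group parameter.
    group = config.get("Group", config["Type"])
    parts = ["TYPE=AUDIO", f'GROUP-ID="{group}"', f'NAME="{config["Name"]}"']
    if "Language" in config:
        parts.append(f'LANGUAGE="{config["Language"]}"')

    # Boolean flags become YES/NO attribute values in the manifest.
    for key, attr in (("Default", "DEFAULT"),
                      ("AutoSelect", "AUTOSELECT"),
                      ("Forced", "FORCED")):
        if key in config:
            parts.append(f"{attr}={'YES' if config[key] else 'NO'}")

    parts.append(f'URI="{uri}"')
    return "#EXT-X-MEDIA:" + ",".join(parts)


if __name__ == "__main__":
    chinese = {"Type": "audio", "Name": "Chinese", "Language": "zh", "Default": True}
    print(render_ext_x_media(chinese, "audio/chinese.m3u8"))
```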

Scenario 1: Transcode and generate a multi-bitrate package

{
  "Inputs": [
    {
      "Name": "video",
      "InputFile": {
        "Type": "OSS",
        "Media": "https://<Bucket>.<Public_Endpoint>/<video_1_chinese>"
      }
    },
    {
      "Name": "EnglishAudio",
      "InputFile": {
        "Type": "OSS",
        "Media": "https://<Bucket>.<Public_Endpoint>/<video_1_english>"
      }
    },
    {
      "Name": "JapaneseAudio",
      "InputFile": {
        "Type": "OSS",
        "Media": "https://<Bucket>.<Public_Endpoint>/<video_1_japanese>"
      }
    }
  ],
  "OutputGroups": [
    {
      "GroupConfig": {
        "Type": "Hls",
        "OutputFileBase": {
          "Type": "OSS",
          "Media": "https://<Bucket>.<Public_Endpoint>/<URI>/"
        },
        "ManifestName": "<m3u8_filename>"
      },
      "Outputs": [
        {
          "Name": "360P",
          "InputRef": "video",
          "OutputFileName": "video/360p/360p",
          "TemplateId": "Video-360P"
        },
        {
          "Name": "540P",
          "InputRef": "video",
          "OutputFileName": "video/540p/540p",
          "TemplateId": "Video-540P"
        },
        {
          "Name": "720P",
          "InputRef": "video",
          "OutputFileName": "video/720p/720p",
          "TemplateId": "Video-720P"
        },
        {
          "Name": "1080P",
          "InputRef": "video",
          "OutputFileName": "video/1080p/1080p",
          "TemplateId": "Video-1080P"
        },
        {
          "OutputFileName": "audio/chinese/chinese",
          "TemplateId": "Audio-64Kbps",
          "HlsGroupConfig": {
            "Name": "Chinese",
            "Type": "audio",
            "Language": "zh",
            "AutoSelect": true,
            "Default": true
          }
        },
        {
          "InputRef": "EnglishAudio",
          "OutputFileName": "audio/english/english",
          "TemplateId": "Audio-64Kbps",
          "HlsGroupConfig": {
            "Name": "English",
            "Type": "audio",
            "Language": "en",
            "AutoSelect": true
          }
        },
        {
          "InputRef": "JapaneseAudio",
          "OutputFileName": "audio/japanese/japanese",
          "TemplateId": "Audio-64Kbps",
          "HlsGroupConfig": {
            "Name": "Japanese",
            "Type": "audio",
            "Language": "ja",
            "AutoSelect": true
          }
        }
      ]
    }
  ]
}
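
When the list of languages varies from title to title, it can be easier to generate this configuration than to edit it by hand. The following Python sketch builds a structure with the same shape as the example above. The function name build_hls_config and the audio_tracks input format are hypothetical, and the flag names and boolean values follow the HlsGroupConfig parameter descriptions. The sketch only produces the JSON. It does not call the API.

```python
import json


def build_hls_config(video_media, audio_tracks, output_base, manifest_name,
                     resolutions=("360P", "540P", "720P", "1080P"),
                     audio_template_id="Audio-64Kbps"):
    """Build a Config dict with one video input and several audio tracks.

    audio_tracks: list of dicts with keys "name" and "language", plus an
    optional "media" (a separate source file) and "default" (bool).
    """
    inputs = [{"Name": "video", "InputFile": {"Type": "OSS", "Media": video_media}}]
    outputs = []

    # One video-only output per resolution, all reading from the main input.
    for res in resolutions:
        label = res.lower()
        outputs.append({
            "Name": res,
            "InputRef": "video",
            "OutputFileName": f"video/{label}/{label}",
            "TemplateId": f"Video-{res}",
        })

    # One audio-only output per language.
    for track in audio_tracks:
        slug = track["name"].lower()
        group_config = {"Name": track["name"], "Type": "audio",
                        "Language": track["language"], "AutoSelect": True}
        if track.get("default"):
            group_config["Default"] = True
        output = {"OutputFileName": f"audio/{slug}/{slug}",
                  "TemplateId": audio_template_id,
                  "HlsGroupConfig": group_config}
        if "media" in track:
            # A track with its own source file gets a dedicated input.
            input_name = f"{track['name']}Audio"
            inputs.append({"Name": input_name,
                           "InputFile": {"Type": "OSS", "Media": track["media"]}})
            output["InputRef"] = input_name
        outputs.append(output)

    return {
        "Inputs": inputs,
        "OutputGroups": [{
            "GroupConfig": {
                "Type": "Hls",
                "OutputFileBase": {"Type": "OSS", "Media": output_base},
                "ManifestName": manifest_name,
            },
            "Outputs": outputs,
        }],
    }


if __name__ == "__main__":
    config = build_hls_config(
        "https://<Bucket>.<Public_Endpoint>/<video_1_chinese>",
        [{"name": "Chinese", "language": "zh", "default": True},
         {"name": "English", "language": "en",
          "media": "https://<Bucket>.<Public_Endpoint>/<video_1_english>"}],
        "https://<Bucket>.<Public_Endpoint>/<URI>/",
        "<m3u8_filename>",
    )
    print(json.dumps(config, indent=2))
```

As in the example above, the track that comes from the main video file carries no InputRef of its own, while each translated track adds one input and one output.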

Scenario 2: Add audio tracks to an existing HLS manifest

Steps:

  1. Specify an input named ExtraAudio. In the output, reference this input to transcode it into an audio HLS stream.

  2. In GroupConfig, set InputRef under ManifestExtend to RefManifest, the input that points to the existing manifest file. The task reuses that manifest and adds the extra audio track to it.

{
  "Inputs": [
    {
      "Name": "ExtraAudio",
      "InputFile": {
        "Type": "OSS",
        "Media": "http://your-bucket.oss-region.aliyuncs.com/in/extra-audio.mp4"
      }
    },
    {
      "Name": "RefManifest",
      "InputFile": {
        "Type": "OSS",
        "Media": "http://your-bucket.oss-region.aliyuncs.com/in/manifest.m3u8"
      }
    }
  ],
  "OutputGroups": [
    {
      "GroupConfig": {
        "Type": "Hls",
        "OutputFileBase": {
          "Type": "OSS",
          "Media": "http://your-bucket.oss-region.aliyuncs.com/out/demo"
        },
        "ManifestName": "manifest",
        "ManifestExtend": {
          "InputRef": "RefManifest"
        }
      },
      "Outputs": [
        {
          "Name": "ExtraAudioOut",
          "InputRef": "ExtraAudio",
          "OutputFileName": "extra-audio",
          "TemplateId": "#AudioTemplateId",
          "HlsGroupConfig": {
            "Type": "audio",
            "Name": "Chinese",
            "Language": "zh-cn"
          }
        }
      ]
    }
  ]
}
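
Because this configuration relies on names matching between Inputs, the outputs, and ManifestExtend, a quick local check before submitting can catch typos. The following Python sketch lists every InputRef that does not match a declared input. The function name find_unknown_input_refs is hypothetical, and the check reflects only the structure shown in the examples here, not validation performed by the service.

```python
def find_unknown_input_refs(config):
    """Return InputRef values that do not match any name declared under Inputs."""
    declared = {item["Name"] for item in config.get("Inputs", [])}
    unknown = []
    for group in config.get("OutputGroups", []):
        # ManifestExtend.InputRef points at the existing manifest input.
        extend = group.get("GroupConfig", {}).get("ManifestExtend", {})
        refs = [extend.get("InputRef")]
        # Each output may point at the input it transcodes.
        refs += [output.get("InputRef") for output in group.get("Outputs", [])]
        unknown += [ref for ref in refs if ref is not None and ref not in declared]
    return unknown


if __name__ == "__main__":
    config = {
        "Inputs": [{"Name": "ExtraAudio"}, {"Name": "RefManifest"}],
        "OutputGroups": [{
            "GroupConfig": {"ManifestExtend": {"InputRef": "RefManifest"}},
            "Outputs": [{"Name": "ExtraAudioOut", "InputRef": "ExtraAudio"}],
        }],
    }
    print(find_unknown_input_refs(config))
```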

Scenario 3: Replace an audio track in an existing HLS manifest

This scenario builds on Scenario 2. Add the Excludes option in ManifestExtend to exclude specific streams from the original manifest. Each entry in Excludes supports the following parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| Name | string | The Name of the stream to exclude. |
| Type | string | The Type of the stream to exclude. Valid values: Audio, Subtitle. |
| Language | string | The Language of the stream to exclude. The value must comply with RFC 5646. |

{
  "Inputs": [
    {
      "Name": "ExtraAudio",
      "InputFile": {
        "Type": "OSS",
        "Media": "http://your-bucket.oss-region.aliyuncs.com/in/extra-audio.mp4"
      }
    },
    {
      "Name": "RefManifest",
      "InputFile": {
        "Type": "OSS",
        "Media": "http://your-bucket.oss-region.aliyuncs.com/in/manifest.m3u8"
      }
    }
  ],
  "OutputGroups": [
    {
      "GroupConfig": {
        "Type": "Hls",
        "OutputFileBase": {
          "Type": "OSS",
          "Media": "http://your-bucket.oss-region.aliyuncs.com/out/demo"
        },
        "ManifestName": "manifest",
        "ManifestExtend": {
          "InputRef": "RefManifest",
          "Excludes": [{
              "Language": "en",
              "Type": "Audio"
            }]
        }
      },
      "Outputs": [
        {
          "Name": "ExtraAudioOut",
          "InputRef": "ExtraAudio",
          "OutputFileName": "extra-audio",
          "TemplateId": "#AudioTemplateId",
          "HlsGroupConfig": {
            "Type": "audio",
            "Name": "Chinese",
            "Language": "zh-cn"
          }
        }
      ]
    }
  ]
}
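
To preview which entries of an existing manifest an Excludes rule would drop, the matching can be approximated locally. The Python sketch below is a conceptual illustration only. It assumes that the fields of one rule are combined with AND and compared as exact strings, which this guide does not state, and the helper names is_excluded and apply_excludes are hypothetical.

```python
import re

# Matches KEY=VALUE pairs where VALUE is a quoted string or an unquoted token.
_ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')
# Excludes Type values mapped to the TYPE attribute used in HLS manifests.
_HLS_TYPES = {"Audio": "AUDIO", "Subtitle": "SUBTITLES"}


def is_excluded(media_line, excludes):
    """Return True if an #EXT-X-MEDIA line matches any exclude rule."""
    pairs = _ATTR_RE.findall(media_line.split(":", 1)[1])
    attrs = {key: value.strip('"') for key, value in pairs}
    for rule in excludes:
        checks = []
        if "Type" in rule:
            checks.append(attrs.get("TYPE") == _HLS_TYPES.get(rule["Type"]))
        if "Language" in rule:
            checks.append(attrs.get("LANGUAGE") == rule["Language"])
        if "Name" in rule:
            checks.append(attrs.get("NAME") == rule["Name"])
        # Assumption: every field present in the rule must match (AND).
        if checks and all(checks):
            return True
    return False


def apply_excludes(manifest_text, excludes):
    """Drop #EXT-X-MEDIA lines that match the exclude rules; keep all other lines."""
    kept = [line for line in manifest_text.splitlines()
            if not (line.startswith("#EXT-X-MEDIA:") and is_excluded(line, excludes))]
    return "\n".join(kept)


if __name__ == "__main__":
    manifest = "\n".join([
        "#EXTM3U",
        '#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",NAME="Chinese",LANGUAGE="zh",URI="audio/chinese.m3u8"',
        '#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",NAME="English",LANGUAGE="en",URI="audio/english.m3u8"',
    ])
    print(apply_excludes(manifest, [{"Language": "en", "Type": "Audio"}]))
```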

Query task results

Call the GetMediaConvertJob operation to obtain the details of a transcoding task.

Callback events

Event type: MediaConvertComplete

Configuration method: This event cannot be configured in the console. Configure it by calling SetEventCallback.

Key callback parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| Name | String | Yes | The name of the main task. |
| JobId | String | Yes | The ID of the task. |
| Status | String | Yes | The task status. Success indicates that at least one output (subtask) succeeded. |
| TriggerSource | String | No | The source that triggered the task. API indicates that the task was submitted through an API call. |
| FinishTime | String | No | The time when the task was completed, in UTC. Example: 2025-05-09T08:03:21Z. |
| UserData | String | No | A custom string specified when submitting the task. It is passed through and returned in the callback. |

Example

{
	"FinishTime": "2025-05-09T08:03:21Z",
	"JobId": "your-job-id",
	"Status": "Success",
	"TriggerSource": "IceWorkflow",
	"UserData": "{\"ImsSrc\":\"Workflow\",\"TaskId\":\"e89a955d88ca47f0b9b79c562e5c622f\"}"
}
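
A callback receiver usually needs the job ID, the outcome, and whatever was passed through in UserData. The following Python sketch parses a payload of the shape shown above. It assumes that the message body is exactly this JSON object and that FinishTime uses the format in the example. UserData is treated as JSON when it parses and is returned unchanged otherwise, because it is a free-form string. The function name parse_callback is hypothetical.

```python
import json
from datetime import datetime, timezone


def parse_callback(body):
    """Extract the key fields from a MediaConvertComplete callback body."""
    event = json.loads(body)
    result = {
        "job_id": event["JobId"],
        "name": event.get("Name"),
        "succeeded": event["Status"] == "Success",
        "trigger_source": event.get("TriggerSource"),
    }

    # FinishTime is optional; parse it as a timezone-aware UTC datetime.
    finish = event.get("FinishTime")
    result["finished_at"] = (
        datetime.strptime(finish, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
        if finish else None
    )

    # UserData is a custom string; decode it only if it happens to be JSON.
    user_data = event.get("UserData")
    try:
        result["user_data"] = json.loads(user_data) if user_data else None
    except json.JSONDecodeError:
        result["user_data"] = user_data
    return result


if __name__ == "__main__":
    sample = json.dumps({
        "FinishTime": "2025-05-09T08:03:21Z",
        "JobId": "your-job-id",
        "Status": "Success",
        "TriggerSource": "IceWorkflow",
        "UserData": "{\"ImsSrc\":\"Workflow\"}",
    })
    print(parse_callback(sample))
```

Because Success only means that at least one output succeeded, a receiver that needs per-output results would still query the task details after the callback arrives.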

Play the multi-bitrate video

Use ApsaraVideo Player to play the packaged video.

Video translation + Multi-bitrate packaging

  1. Prepare the source file.

  2. Translate the source file into target languages (such as English and Japanese) to generate corresponding audio or video files.

  3. Call the SubmitMediaConvertJob operation to transcode and package the multi-language content into a standardized, multi-bitrate video.