Transcoding details and usage

Transcoding converts an audio or video file into one or more audio or video files to adapt to different network bandwidths, terminal devices, and user needs. The transcoding service of Intelligent Media Services (IMS) supports the standard transcoding, subtitle, audio and image enhancement, and watermark features. This topic describes the transcoding features of IMS and how to use these features.

Transcoding templates

Standard transcoding

Video transcoding refers to the process of converting a compressed stream into another stream to adapt to different terminals and network bandwidth and meet different user requirements. Transcoding is a process in which decoding and encoding are performed. Streams before and after transcoding may use the same or different video encoding standards.

The following table describes the standard transcoding methods supported by IMS.

Standard transcoding method	Description	Scenario
Regular transcoding	Provides comprehensive video transcoding features to convert media files between multiple formats. You can choose different container formats, such as MP4, AVI, and MKV, and resolutions to adapt to different playback devices.	Scenarios in which long video content needs to be formatted.
Audio transcoding	Audio transcoding provides a variety of processing capabilities, including converting audio files from one format to another. Audio transcoding also supports the extraction of audio streams from video files, and audio processing and enhancement.	Scenarios in which you need to convert audio files to different formats, adjust audio quality parameters, or extract audio from videos to meet the requirements of playback compatibility, storage optimization, and content production.
Container format conversion	Container format conversion only converts the container format of videos and does not change the resolution or bitrate.	Scenarios in which you do not need to change the image size or bitrate of videos.

Subtitle

A subtitle template is a transcoding template used to embed subtitles into a video. This type of template ensures that subtitles are part of the video and not an external file. This improves playback compatibility and user experience.

Audio and image enhancement

You can use the following audio and image enhancement methods to enhance the quality of input videos: noise reduction, color and contrast enhancement, super resolution, and transcoding standard dynamic range (SDR) videos to high dynamic range (HDR) videos. This way, viewer experience can be improved in terms of color, brightness, and definition. You can flexibly select the processing module and the processing intensity based on the characteristics of input videos.

The following table describes the audio and image enhancement methods supported by IMS.

Audio and image enhancement method	Description	Scenario
Deinterlacing processing	After you remove the interlaced frames of an interlaced video such as odd or even frames, you can double the frame rate of the video to ensure that the frame rate of the output video is consistent with that of the input video. Then, you can transcode the interlaced video to a progressive video.	This method is suitable for scenarios in which an interlaced video needs to be transcoded to a progressive video. Interlaced videos include videos that are disseminated by using radio and television and produced many years ago.
Multi-frame noise reduction	You can remove the time-domain noise of a video to make the video image cleaner and more stable in timing.	This method is applicable to most videos. We recommend that you enable the multi-frame noise reduction feature except for high-definition videos. You can adjust the noise reduction intensity by configuring a parameter.
Compression artifact removal	You can remove the compression noise such as edge glitches and blocking artifacts caused by encoding and enhance edge and detailed textures. This way, the noise is reduced, and the image definition is improved.	This method is applicable to most videos. You can use this method to enhance the detailed textures of high-definition videos.
Color and contrast enhancement	You can adjust the local contrast and global contrast of an image and increase the color clarity.	This method is applicable to most videos. You can adjust the saturation enhancement level by configuring the relevant parameter.
Super resolution	You can enhance the resolution and edge texture of a video. This way, the overall video definition can be significantly improved. Double super-resolution and triple super-resolution are supported.	This method is applicable to most videos that require resolution enhancement. We recommend that you use this method together with the compression artifact removal method.
Transcoding SDR videos to HDR videos	You can transcode regular SDR videos to HDR videos with a wide color gamut. This greatly improves the contrast, brightness, and color of the image. Hybrid log-gamma (HLG) HDR videos and perceptual quantizer (PQ) HDR videos are supported.	This method is suitable for scenarios in which SDR videos need to be transcoded to HDR videos.

Watermark

IMS allows you to add watermarks to a video. During the video transcoding, you can add information such as images or text as watermarks to a video stream. Then, a new video file that has the watermarks is generated. You can add information such as enterprise or brand logos, TV station logos, user IDs, and nicknames as watermarks for video copyright declaration or brand promotion.

The following table describes the watermark types supported by IMS.

Watermark type	Description
Image watermark	You can add static images in the PNG format and dynamic images in the GIF, APNG, and MOV formats as watermarks to videos. An image watermark can be displayed in a specific position throughout a video or within a specific period of time based on the start and end time that you specify.
Text watermark	You can add one or more pieces of text as watermarks to videos. You can configure text properties such as the font, font size, color, transparency, and outline, and add different text content to different videos.

Note

If files are used as dynamic image watermarks, the file name extensions such as GIF, APNG, and MOV must be in lowercase. This limit is not applied to file name extensions of files that are used as static image watermarks.
The files that are used as watermarks and the video to which the watermarks are added must be stored on the same origin server. For example, videos that are stored on an origin server in the China (Shanghai) region can use only watermarks that are stored on the same origin server in the China (Shanghai) region. Videos cannot use watermarks that are stored in another region or on another origin server. For more information about how to add or configure storage addresses in a region, see Configure storage addresses.
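
The two rules in this note can be checked before a task is submitted. The following is a minimal sketch, not part of any IMS SDK: the function name `check_watermark_file` and the storage identifiers passed to it are hypothetical and used only for illustration.

```python
import os
from typing import List

# Extensions treated as dynamic image watermarks, per the note above.
DYNAMIC_EXTENSIONS = {"gif", "apng", "mov"}


def check_watermark_file(file_name: str, watermark_storage: str, video_storage: str) -> List[str]:
    """Return a list of problems; an empty list means both rules pass.

    Hypothetical helper for illustration only. The storage arguments are
    opaque identifiers for wherever the files are stored.
    """
    problems = []
    extension = os.path.splitext(file_name)[1].lstrip(".")
    # Dynamic watermark extensions must be lowercase; static ones (PNG) are exempt.
    if extension.lower() in DYNAMIC_EXTENSIONS and extension != extension.lower():
        problems.append(f"extension '{extension}' must be lowercase")
    # The watermark and the video must be stored in the same location.
    if watermark_storage != video_storage:
        problems.append("watermark and video are stored in different locations")
    return problems
```

For example, `check_watermark_file("logo.GIF", "storage-a", "storage-a")` reports the uppercase extension, whereas `logo.PNG` passes because the lowercase rule does not apply to static images.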

Create a transcoding template

Use the IMS console

Log on to the IMS console.
In the left-side navigation pane, choose VOD Media Processing > Template Management. The Template Management page appears.
In the top navigation bar, select the desired region from the drop-down list next to the Workbench button.

On the Transcoding tab, create a template based on your business requirements.

Create a standard transcoding template

In the Basic Parameters section, you can specify a transcoding template name, transcoding type, and Container Format. Regular transcoding, audio transcoding, and container format conversion are supported. The parameters that you need to configure vary based on the transcoding type.

Regular transcoding template

Configure video parameters

Note

The parameters that you need to configure vary based on the selected container format. The parameters displayed in the IMS console take precedence.

Parameter		Description
Encoding Format		Select the encoding format that you want to use. H.264, H.265, and AV1 are supported.
Bitrate Control		Bitrate control is used to determine the bitrate of the output file during video encoding. You can select a control mode from the drop-down list. The following modes are supported: Bitrate of Input Video: The original video bitrate is used. Fixed Bitrate: The bitrate of the output video is fixed regardless of whether a scene is complex or simple. If you select Fixed Bitrate, the size of the output video is large. Average Bitrate: The average bitrate of the output video is fixed. The bitrate of each scene varies based on the complexity of the scene. More bits are allocated to complex scenes and fewer bits are allocated to simple scenes. This ensures that the bitrate of the output video is within the expected range and that bits are appropriately allocated. CRF: The bitrate of the output video varies based on the quality of the output video. You can specify an integer in the range of 0 to 51 to control the quality of the output video. A value of 0 specifies a lossless output. A value of 51 specifies an output of the worst quality possible. The constant rate factor (CRF) ensures that the quality of the output video is stable. However, the bitrate of the output video varies based on the complexity of the scene and is unpredictable. Configure the parameters related to the bitrate.
Peak Bitrate		The maximum bitrate. Valid values: 10 to 50000. Unit: Kbit/s.
Resolution		Note If your input videos contain videos in landscape mode and videos in portrait mode, we recommend that you select Set by Long and Short Sides to prevent image distortion. The resolution determines the size of the output video. You can choose one of the following values based on your business requirements and the characteristics of the input video. Resolution of Input Video: The original resolution of the input video is used. Set by Long and Short Sides: The resolution is set based on the long or short side of the video. This method automatically adapts to videos in landscape and portrait modes to prevent image distortion. Set by Width and Height: Specify the width and height of the output video. The horizontal side is the width and the vertical side is the height, regardless of the screen orientation.
Resolution Check		This parameter is required if you set the Resolution parameter to Set by Long and Short Sides or Set by Width and Height. This parameter specifies the transcoding method to use when the resolution of the original video is lower than the specified resolution. Valid values: Transcode Based on Resolution of Input Video: Transcoding is performed based on the actual resolution of the original video. Transcode Based on Specific Resolution: Transcoding is performed based on the specified resolution. Do Not Transcode: The original video is used without transcoding.
Frame Rate		The number of frames displayed per second. Frame Rate of Input Video: Use the frame rate of the input video. If the frame rate of the input video exceeds 60, a value of 60 is used. Custom Frame Rate: Specify an integer value between 0 and 60. Unit: fps. Common values: 15, 25, and 30.
Advanced Parameters	Segment Duration	Enter an integer. Valid values: 1 to 60. Unit: seconds. Note Each segment should contain at least one keyframe. Set the segment duration to a multiple of the group of pictures (GOP) value. If the segment duration is less than the GOP value or is not a multiple of the GOP value, the GOP value is adapted during transcoding.
	GOP	Valid values: Maximum Time Interval between Keyframes and Maximum Number of Frames between Keyframes. Specify the maximum interval between keyframes. You must enter an integer. Valid values: 1 to 100000. Unit: seconds. Note A greater GOP value indicates a higher compression ratio, a lower encoding speed, a longer duration of a single segment of streaming media, and a longer response time to seeking.
	Encoding Profile	This parameter is required only if the Encoding Format parameter is set to H.264. Valid values: For High-Resolution Devices, For Standard-Resolution Devices, and For Mobile Devices. Different devices support different encoding profiles. If you want to play the output video in multiple definitions, we recommend that you select For Mobile Devices to ensure normal playback on low-end devices. In other scenarios, select For High-Resolution Devices or For Standard-Resolution Devices.
	Buffer	Enter a value in the text box. Valid values: 1000 to 128000. Unit: Kbit/s. Note The buffer size is used to control the bitrate fluctuation. A larger buffer size indicates a greater fluctuation in the bitrate and higher video quality.
	Scan Mode	Select the scan mode of the output video. Valid values: Scan Mode of Input Video, Automatic Deinterlacing, Interlaced Scan, and Sequential Scan. Note If you set Scan Mode to Sequential Scan or Interlaced Scan but this scan mode does not match that of the input video, the video fails to be transcoded. We recommend that you leave this parameter empty or set this parameter to Automatic Deinterlacing for higher compatibility.
	HDR of Source Video	When you upload an HDR source video, you can set the dynamic range for the source video after it is transcoded. To play HDR videos as expected, your display device and player must support HDR. Otherwise, issues such as overexposure and color cast may occur during playback. SDR Mapping: uses the HDR-to-SDR conversion technology to convert an HDR video to an SDR video that is supported by general devices and keeps the color saturation of the source video as much as possible. Note The service is in public preview and is free of charge. After the public preview ends, fees for using the service are billed by minute. Constant HDR: keeps the dynamic range of the source video.
	Color Format	Select a color format. By default, the color format of the original video is used.
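
The interaction between the Resolution and Resolution Check parameters can be sketched in a few lines. This is an illustration under stated assumptions, not the IMS implementation: the function names and the mode strings `"input"`, `"specific"`, and `"skip"` are hypothetical stand-ins for the three console options, and "lower than the specified resolution" is assumed to mean smaller in both width and height.

```python
from typing import Optional, Tuple

Size = Tuple[int, int]  # (width, height) in pixels


def size_from_long_short(input_size: Size, long_side: int, short_side: int) -> Size:
    """Map long/short side targets onto width/height using the input orientation."""
    width, height = input_size
    if width >= height:  # landscape (or square): the width is the long side
        return (long_side, short_side)
    return (short_side, long_side)  # portrait: the height is the long side


def apply_resolution_check(input_size: Size, target_size: Size, mode: str) -> Optional[Size]:
    """Choose the output size; None means the video is not transcoded.

    mode is "input", "specific", or "skip" (hypothetical names for the
    three Resolution Check options).
    """
    if mode not in ("input", "specific", "skip"):
        raise ValueError(f"unknown mode: {mode}")
    # Assumption: the check applies only if the input is smaller in both dimensions.
    input_is_smaller = input_size[0] < target_size[0] and input_size[1] < target_size[1]
    if not input_is_smaller or mode == "specific":
        return target_size
    if mode == "input":
        return input_size
    return None  # mode == "skip"
```

With a target of 1280 (long side) by 720 (short side), a 1080 × 1920 portrait input maps to 720 × 1280 rather than being stretched to 1280 × 720, which is why Set by Long and Short Sides is the safer choice for mixed-orientation inputs.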

Configure audio parameters

Parameter	Description
Disable Audio	Specifies whether to disable the audio. If you select Disable Audio, the output file does not include the audio stream or the corresponding audio information. This option is applicable if you want to extract the video stream from a video file.
Encoding Format	Select an encoding format. Different container formats support different encoding formats. The encoding formats displayed in the IMS console shall prevail. Note If you set the parameter to AC3 or EAC3, the audio effect of normal audio files is converted to Dolby. You can use ApsaraVideo Player SDK to enable exclusive audio effects on Dolby devices. The bill for the audio services is generated based on the billing rules of the Dolby sound production feature. If you want to play HLS videos on web pages, we recommend that you set the parameter to AAC. If you set the parameter to other values, the player may mute the videos.
Encoding Profile	The encoding profile of the output audio. This parameter is displayed only if you set the Encoding Format parameter to AAC. If the input audio uses the surround sound format such as 5.1 or 7.1, we recommend that you set this parameter to aac_low. If you want to play the output audio on general playback devices, we recommend that you set this parameter to aac_he. This reduces the bitrate by half compared with the bitrate that is used when this parameter is set to aac_low. A common low bitrate is 64 Kbit/s. If you want to play the output audio on high-end playback devices, we recommend that you set this parameter to aac_he_v2. This way, audio files are encoded in smaller sizes and have higher sound quality. Common low bitrates range from 32 Kbit/s to 48 Kbit/s.
Sample Rate	Select a sampling rate from the drop-down list. Unit: Hz. The supported sample rates of audio vary based on the encoding format or container format. For more information, see Audio sample rates.
Audio Bitrate	Valid values: Bitrate of Source Audio and Average Bitrate. Bitrate of Source Audio: The original bitrate of the input audio is used. Average Bitrate: Set an average bitrate for the output audio.
Bitrate check	This parameter is required if you set the Audio Bitrate parameter to Average Bitrate. This parameter is used to specify the transcoding method when the bitrate of the original audio is lower than the specified bitrate. Valid values: Transcode Based on Bitrate of Input Audio: The actual bitrate of the original audio is used for transcoding. Transcode Based on Specific Bitrate: The audio is re-encoded based on the specified average bitrate. Do Not Transcode: The original audio is used without transcoding.
Audio Channels	Select the number of sound channels. You can select Audio Channels of Input Audio. Default value: 2.
Volume Normalization	After you turn on Volume Normalization, the system automatically adjusts the volume of audio files to ensure volume consistency. This resolves the issue of unstable volume due to the volume differences of input files. This parameter is supported only if a single output audio stream is configured. Note Volume Normalization Professional Edition provides more natural and accurate volume control and is billed based on the processing duration. The service is suitable for applications that require higher audio quality. For more information, see Audio and video enhancement fees.
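
The Encoding Profile recommendations in the table can be expressed as a small decision function. This is only a sketch: the function name and its two inputs are hypothetical, and treating any input with more than two channels as surround sound is an assumption (5.1 audio has six channels and 7.1 audio has eight).

```python
def recommend_aac_profile(channels: int, high_end_device: bool) -> str:
    """Return the AAC profile suggested by the Encoding Profile row.

    Hypothetical helper for illustration; the profile names come from the table.
    """
    # Assumption: more than two channels means a surround layout such as 5.1 or 7.1.
    if channels > 2:
        return "aac_low"
    # aac_he_v2 targets high-end playback devices, aac_he targets general devices.
    return "aac_he_v2" if high_end_device else "aac_he"
```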

Audio transcoding

Configure audio parameters

Parameter	Description
Disable Audio	Specifies whether to disable the audio. If you select Disable Audio, the output file does not include the audio stream or the corresponding audio information. This option is applicable if you want to extract the video stream from a video file.
Encoding Format	Select an encoding format. Different container formats support different encoding formats. The encoding formats displayed in the IMS console shall prevail. Note If you set the parameter to AC3 or EAC3, the audio effect of normal audio files is converted to Dolby. You can use ApsaraVideo Player SDK to enable exclusive audio effects on Dolby devices. The bill for the audio services is generated based on the billing rules of the Dolby sound production feature. If you want to play HLS videos on web pages, we recommend that you set the parameter to AAC. If you set the parameter to other values, the player may mute the videos.
Encoding Profile	The encoding profile of the output audio. This parameter is displayed only if you set the Encoding Format parameter to AAC. If the input audio uses the surround sound format such as 5.1 or 7.1, we recommend that you set this parameter to aac_low. If you want to play the output audio on general playback devices, we recommend that you set this parameter to aac_he. This reduces the bitrate by half compared with the bitrate that is used when this parameter is set to aac_low. A common low bitrate is 64 Kbit/s. If you want to play the output audio on high-end playback devices, we recommend that you set this parameter to aac_he_v2. This way, audio files are encoded in smaller sizes and have higher sound quality. Common low bitrates range from 32 Kbit/s to 48 Kbit/s.
Sample Rate	Select a sampling rate from the drop-down list. Unit: Hz. The supported sample rates of audio vary based on the encoding format or container format. For more information, see Audio sample rates.
Audio Bitrate	Valid values: Bitrate of Source Audio and Average Bitrate. Bitrate of Source Audio: The original bitrate of the input audio is used. Average Bitrate: Set an average bitrate for the output audio.
Bitrate check	This parameter is required if you set the Audio Bitrate parameter to Average Bitrate. This parameter is used to specify the transcoding method when the bitrate of the original audio is lower than the specified bitrate. Valid values: Transcode Based on Bitrate of Input Audio: The actual bitrate of the original audio is used for transcoding. Transcode Based on Specific Bitrate: The audio is re-encoded based on the specified average bitrate. Do Not Transcode: The original audio is used without transcoding.
Audio Channels	Select the number of sound channels. You can select Audio Channels of Input Audio. Default value: 2.
Volume Normalization	After you turn on Volume Normalization, the system automatically adjusts the volume of audio files to ensure volume consistency. This resolves the issue of unstable volume due to the volume differences of input files. This parameter is supported only if a single output audio stream is configured. Note Volume Normalization Professional Edition provides more natural and accurate volume control and is billed based on the processing duration. The service is suitable for applications that require higher audio quality. For more information, see Audio and video enhancement fees.
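
The Bitrate check behavior described in the table can be sketched as follows. The function name and the mode strings `"input"`, `"specific"`, and `"skip"` are hypothetical stand-ins for the three console options, and the sketch assumes that the check applies only when the source bitrate is strictly lower than the configured average bitrate.

```python
from typing import Optional


def output_audio_bitrate(source_kbps: int, target_kbps: int, mode: str) -> Optional[int]:
    """Return the output bitrate in Kbit/s, or None if the audio is not transcoded.

    Hypothetical helper for illustration; it does not call any IMS API.
    """
    if mode not in ("input", "specific", "skip"):
        raise ValueError(f"unknown mode: {mode}")
    # The check matters only when the source bitrate is below the target.
    if source_kbps >= target_kbps or mode == "specific":
        return target_kbps
    if mode == "input":
        return source_kbps  # keep the lower source bitrate instead of inflating it
    return None  # mode == "skip": the original audio is used as is
```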

Container format conversion

Container format conversion only converts the container format of videos and does not change the resolution or bitrate. The following output formats are supported: MP4, FLV, M3U8 (TS), and M3U8 (FMP4).

Create a subtitle template

Parameter	Description
Template Name	The name of the subtitle template.
External Subtitle File Format	The format of the external subtitle file. Valid values: srt and ass.
External Subtitle Encoding Format	The encoding format of the external subtitle. If this parameter is set to auto, the detected encoding format may not be the actual encoding format. We recommend that you set this parameter to a specific encoding format.

Create an audio and image enhancement template

Parameter	Description
Template Name	The name of the audio and image enhancement template.
Container Format	The container format.
Deinterlace	Specifies whether to enable deinterlacing processing if the video contains interlaced clips.
Remove Frame Type	The type of the frames to be removed. This parameter is required if Deinterlace is turned on. Valid values: Even Frames and Odd Frames.
Multi-frame Denoising	Specifies whether to enable multi-frame noise reduction if the video has time-domain noise.
Denoising Intensity	The noise reduction intensity. This parameter is required if Multi-frame Denoising is turned on. Valid values: 0.5 to 5. A smaller value indicates better noise reduction performance.
Remove Compression Distortion	Specifies whether to remove compression artifacts.
Color and Contrast Enhancement	Specifies whether to enable color and contrast enhancement.
Degree of Saturation Enhancement	The saturation enhancement level. This parameter is required if Color and Contrast Enhancement is turned on. Valid values: 0 to 1. A smaller value indicates a higher saturation level.
Super Resolution	Specifies whether to enable the super resolution feature.
Super-resolution Ratio	The resolution magnification ratio. This parameter is required if Super Resolution is turned on. After this parameter is configured, the height and width of the video image are proportionally scaled up.
Output Video Size	Preset Resolution: Select a resolution that is preset by the system. Custom Resolution (Width × Height): Specify a custom resolution. Valid values: 128 to 4096. Unit: pixel. Note If you leave this parameter empty, the video image is scaled up based on the value of the Super-resolution Ratio parameter.
SDR to HDR	Specifies whether to transcode an SDR video to an HDR video. This parameter is required if the encoding format of the output video is set to H.265.
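
How the Super-resolution Ratio and Output Video Size parameters combine can be sketched as follows. The function name is hypothetical, and restricting the ratio to 2 or 3 is an assumption based on the double and triple super-resolution options mentioned earlier in this topic.

```python
from typing import Optional, Tuple

Size = Tuple[int, int]  # (width, height) in pixels


def super_resolution_size(input_size: Size, ratio: int, custom_size: Optional[Size] = None) -> Size:
    """Return the output size for a super-resolution template.

    Hypothetical helper for illustration only.
    """
    if custom_size is not None:
        # A custom resolution must stay within 128 to 4096 pixels per side.
        for side in custom_size:
            if not 128 <= side <= 4096:
                raise ValueError(f"side {side} is outside 128 to 4096")
        return custom_size
    # Assumption: only double (2) and triple (3) magnification are offered.
    if ratio not in (2, 3):
        raise ValueError(f"unsupported ratio: {ratio}")
    # With no explicit output size, width and height scale by the same ratio.
    return (input_size[0] * ratio, input_size[1] * ratio)
```

For example, a 640 × 360 input with a ratio of 2 and no output size yields 1280 × 720.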

Create a watermark template

Parameter	Description
Template Name	The name of the watermark template.
Watermark Type	The watermark type. Valid values: Image Watermark and Text Watermark.
Image watermark configurations
Watermark Material	The image to be used as the watermark. Only PNG and GIF images are supported. The image can be up to 20 MB in size.
Watermark Position	The watermark position.
Image Width	By Value: the width value of the image watermark. Valid values: 8 to 4096. Unit: pixel. By Percentage: the percentage of the width of the image watermark relative to the video width. The value of this parameter is accurate to two decimal places. Valid values: 0 to 100. Unit: percent (%).
Image Height	By Value: the height value of the image watermark. Valid values: 8 to 4096. Unit: pixel. By Percentage: the percentage of the height of the image watermark relative to the video height. The value of this parameter is accurate to two decimal places. Valid values: 0 to 100. Unit: percent (%).
Horizontal Offset	By Value: the horizontal offset value of the image watermark. Valid values: 8 to 4096. Unit: pixel. By Percentage: the percentage of the horizontal offset of the image watermark relative to the video width. The value of this parameter is accurate to two decimal places. Valid values: 0 to 100. Unit: percent (%).
Vertical Offset	By Value: the vertical offset value of the image watermark. Valid values: 8 to 4096. Unit: pixel. By Percentage: the percentage of the vertical offset of the image watermark relative to the video height. The value of this parameter is accurate to two decimal places. Valid values: 0 to 100. Unit: percent (%).
Animated Watermark (Timeline)	Specifies whether to enable the animated watermark feature. If you enable this feature, the watermark is displayed only within a specific period of time.
Duration	The period of time within which the watermark is displayed. This parameter is required if Animated Watermark (Timeline) is turned on. The value of this parameter ranges from 0 to the value of the video duration. Unit: seconds.
Ending Mode	The end mode of the watermark. This parameter is required if Animated Watermark (Timeline) is turned on. Valid values: To End: The watermark lasts until the video ends. Duration: The watermark is displayed within the specified period of time. The value of this parameter ranges from 0 to the value of the video duration. Unit: seconds.
Text watermark configurations
Text Watermark Content	The text to be displayed as the watermark.
Font	The font.
Font Size	The font size. Valid values: 5 to 119. Unit: pixel.
Font Color	The font color.
Font Transparency	The font transparency. Valid values: 0 to 100. Unit: percent (%).
Horizontal Offset	By Value: the horizontal offset value of the text watermark. Valid values: 8 to 4096. Unit: pixel. By Percentage: the percentage of the horizontal offset of the text watermark relative to the video width. The value of this parameter is accurate to two decimal places. Valid values: 0 to 100. Unit: percent (%).
Vertical Offset	By Value: the vertical offset value of the text watermark. Valid values: 8 to 4096. Unit: pixel. By Percentage: the percentage of the vertical offset of the text watermark relative to the video height. The value of this parameter is accurate to two decimal places. Valid values: 0 to 100. Unit: percent (%).
Outline Width	The width of the text watermark outline. Valid values: 0 to 4096. Unit: pixel.
Outline Color	The color of the text watermark outline.
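
For the By Percentage options in the tables above, the resulting pixel value depends on the size of the video. The following sketch shows the arithmetic. The function name is hypothetical, and rounding the result to the nearest whole pixel is an assumption, because the exact rounding rule is not stated here.

```python
def percent_to_pixels(percent: float, video_side: int) -> int:
    """Convert a By Percentage value into pixels for a video side of the given length.

    Hypothetical helper for illustration only.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be in the range of 0 to 100")
    # Percentages are accurate to two decimal places.
    percent = round(percent, 2)
    # Assumption: the result is rounded to the nearest whole pixel.
    return round(video_side * percent / 100)
```

For example, a watermark width of 10% on a 1920-pixel-wide video corresponds to 192 pixels, and the same template applied to a 1280-pixel-wide video gives 128 pixels.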

Create a transcoding task

Use the IMS console

Log on to the IMS console.
In the upper-left corner, select a region based on your business requirements.
In the left-side navigation pane, choose VOD Media Processing > Task Management.

On the Transcoding tab, click Create Transcoding Task to create a transcoding task.

Parameter		Description
Basic Parameters	Task Name	The name of the transcoding task.
	File Source	The source of the media asset. Valid values: OSS and Media Asset Management.
	OSS Path	The Object Storage Service (OSS) directory in which the media asset is stored. This parameter is required if the File Source parameter is set to OSS.
	Storage Address	The IMS directory in which the media asset is stored. This parameter is required if the File Source parameter is set to Media Asset Management. After you configure this parameter, you must click Add File to add the media asset stored in the directory.
	Select a template	The transcoding template.
Subtitle Configuration	Subtitle	Specifies whether to enable the subtitle feature.
	External Subtitle File	The one or more subtitle files to be added to the video. This parameter is required if Subtitle is turned on.
	Subtitle Template	The subtitle template. This parameter is required if Subtitle is turned on.
Watermark Configuration	Watermark	Specifies whether to enable the watermark feature.
	Watermark Template	The watermark template.
Output Information	Output Location	Original Storage Location: The processed file is saved in the directory in which the source file is stored. Custom: Specify a custom directory in which the processed file is saved.
	Output Address	The directory in which the processed file is saved. This parameter is required if the Output Location parameter is set to Custom.
	Output File Name	The name of the transcoded file. You must include a file name extension unless the output is an M3U8 file.

Query the details of the transcoding task

Use callback information

You can query the details of the transcoding task based on the callback that indicates a transcoding subtask is complete or a main transcoding task is complete.

Query the usage duration of the transcoding task

Log on to the IMS console.
In the left-side navigation pane, choose Data Center > Usage.
On the VOD Tasks tab, select Transcoding to query the details and export the usage data of the corresponding task.
Note
The URL that is used to download the usage duration file is valid for 30 minutes. This ensures data security. If the URL expires, refresh the page to obtain another URL.