Intelligent Media Management (IMM) provides a media transcoding feature that lets you perform format conversion, container format conversion, resolution adjustment, frame rate adjustment, video snapshots, sprite generation, and watermarking for audio and video files. This topic describes the media transcoding feature in detail.
Feature overview
The media transcoding feature of Intelligent Media Management (IMM) is a one-stop solution for processing media files. It handles operations such as encoding format conversion, container format conversion, resolution adjustment, and frame rate adjustment, so that your video and audio files suit different playback devices and requirements.

Prerequisites
An AccessKey pair is created and obtained. For more information, see Create an AccessKey pair.
OSS is activated, a bucket is created, and objects are uploaded to the bucket. For more information, see Upload objects.
IMM is activated. For more information, see Activate IMM.
A project is created in the IMM console. For more information about how to create a project by using the IMM console, see Create a project.
Note: You can also call the CreateProject operation to create a project. For more information, see CreateProject.
You can call the ListProjects operation to query existing projects in a specific region. For more information, see ListProjects.
Notes
If you encounter problems, join the DingTalk group (ID: 88490020073) to communicate with Alibaba Cloud Intelligent Media Management engineers in real time.
You are charged for using media transcoding API operations. For more information, see Billing overview.
Supported audio and video formats
Category | Supported formats |
Audio | All mainstream formats, such as AAC, MP3, WAV, FLAC, WMA, AC3, and OPUS. |
Video | Multiple mainstream formats, such as MP4, MPEG-TS, MKV, MOV, AVI, FLV, M3U8, WebM, WMV, RM, and VOB. |
Benefits
Comparison item | Alibaba Cloud transcoding | Self-managed transcoding |
Transcoding capability | A high-speed and stable parallel transcoding system that dynamically adjusts transcoding resources on demand. It automatically scales resources in or out and seamlessly expands cluster resources to handle high-concurrency transcoding requests. | Difficult to support large-scale, high-concurrency transcoding tasks. |
Transcoding algorithm | Powerful computing resources and advanced video processing algorithms. | Relies on open source transcoding services. |
Features | Video transcoding, container format conversion, conversion to HLS, video-to-animated-image conversion, video merging, video snapshots, video sprite generation, audio transcoding, audio extraction, and caption extraction. | You must integrate with open source transcoding services and build the transcoding service from the ground up. |
Feature descriptions
The following table describes the features of IMM media transcoding.
Feature | Description |
Video transcoding | Converts the video encoding format and container format, and adjusts the resolution, frame rate, and bitrate. |
Video snapshots | Extracts specific frames from a video and saves them as static images to capture specific moments. |
Audio transcoding | Converts the audio format and adjusts the bitrate, number of sound channels, and sample rate. |
Video sprite generation | Combines multiple video frames into a single image file and arranges them in a grid to form a sprite. |
Video-to-animated-image conversion | Converts video files to animated image formats, such as GIF or WebP. |
Caption extraction | Extracts caption information from video files. |
Video merging | Merges multiple video clips into a complete video and converts the video to the required format. |
Audio merging | Merges multiple audio clips into a continuous audio file. |
FAQ
What do I do if a video transcoding request fails?
If a video transcoding request fails, set the TargetAudio.Codec parameter to a value other than `copy` and retry the request. For more information, see TargetAudio.
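The retry described above can be sketched as a small helper that rewrites the audio codec before the request is resubmitted. This is only an illustration: the dictionary layout (a target with a `TargetAudio` entry that has a `Codec` key) and the fallback value `aac` are assumptions made for the example, not the confirmed request schema. Check the TargetAudio reference for the exact field names and allowed values.

```python
import copy

def with_audio_reencode(target: dict, fallback_codec: str = "aac") -> dict:
    """Return a copy of `target` in which TargetAudio.Codec is no longer "copy".

    The dict layout and the "aac" fallback are assumptions for illustration.
    """
    retry_target = copy.deepcopy(target)  # leave the caller's dict untouched
    audio = retry_target.get("TargetAudio")
    if audio is not None and audio.get("Codec") == "copy":
        audio["Codec"] = fallback_codec  # force the audio stream to be re-encoded
    return retry_target

# Hypothetical target from a failed request.
failed_target = {"TargetVideo": {"Codec": "h264"}, "TargetAudio": {"Codec": "copy"}}
retry_target = with_audio_reencode(failed_target)
print(retry_target["TargetAudio"])  # {'Codec': 'aac'}
```

The helper copies the target instead of modifying it in place, so the original request parameters remain available for logging or comparison.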
Why is the file size larger after video transcoding?
The output file from video transcoding can be larger than the source file. Transcoding re-encodes the audio and video streams with lossy compression, and the target settings may produce a higher bitrate than the source. A larger output file does not improve the image or sound quality compared to the source file. It only indicates that less quality was lost during compression.
To control the output file size, set the `BitrateOption`, `Bitrate`, or `CRF` parameter in `TargetVideo` to control the video stream bitrate. You can also set the `BitrateOption`, `Bitrate`, or `Quality` parameter in `TargetAudio` to control the audio stream bitrate.
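As a rough guide when choosing bitrate values, the output size is approximately the total stream bitrate multiplied by the duration. The sketch below assumes that `Bitrate` is expressed in bit/s and uses hypothetical codec values (`h264`, `aac`). It ignores container overhead, so treat the result as an estimate rather than an exact figure.

```python
def estimate_output_bytes(video_bps: int, audio_bps: int, duration_s: float) -> int:
    """Approximate payload size: total bitrate (bit/s) x duration (s) / 8."""
    return int((video_bps + audio_bps) * duration_s / 8)

# Hypothetical settings: 2 Mbit/s video and 128 kbit/s audio (bit/s is an assumption).
target_video = {"Codec": "h264", "Bitrate": 2_000_000}
target_audio = {"Codec": "aac", "Bitrate": 128_000}

size = estimate_output_bytes(target_video["Bitrate"], target_audio["Bitrate"], duration_s=60)
print(size)  # 15960000 bytes, about 16 MB for a 60-second clip
```

If the estimate exceeds the size of the source file, lowering the target bitrate is the most direct way to reduce the output size.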
Can a video transcoding task be canceled?
No. A video transcoding task cannot be canceled.
The video orientation is incorrect after transcoding. How do I adjust it?
In the TargetVideo parameter configuration, set `Codec` to a value other than `copy` and `AdaptiveResolutionDirection` to `true`. This enables automatic rotation based on the video's resolution.
The video is stretched after transcoding. How do I prevent this?
In the TargetVideo parameter configuration, set `Codec` to a value other than `copy`, `AdaptiveResolutionDirection` to `true` to enable automatic rotation, and `ScaleType` to `fit` to scale the video proportionally without black bars.
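The orientation and scaling settings from the two preceding answers can be combined in a single TargetVideo configuration. The following minimal sketch builds such a configuration as a plain dictionary. The helper name `build_target_video` and the default codec `h264` are hypothetical; only `Codec`, `AdaptiveResolutionDirection`, `ScaleType`, and the value `fit` come from this topic.

```python
def build_target_video(codec: str = "h264", scale_type: str = "fit") -> dict:
    """Build a TargetVideo dict that rotates automatically and scales proportionally.

    The helper and the "h264" default are illustrative assumptions.
    """
    if codec == "copy":
        # With "copy" the stream is not re-encoded, so these settings cannot take effect.
        raise ValueError("Codec must be a value other than 'copy'")
    return {
        "Codec": codec,
        "AdaptiveResolutionDirection": True,  # automatic rotation based on resolution
        "ScaleType": scale_type,              # "fit": proportional scaling, no black bars
    }

target_video = build_target_video()
print(target_video)
```

Rejecting `copy` early makes the dependency explicit: both settings apply only when the video stream is re-encoded.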
How do I set a fixed bitrate for video transcoding? Why does setting only the bitrate parameter not work?
In the TargetVideo parameter, the `bitrate` parameter sets a variable bitrate (VBR) and does not support a constant bitrate (CBR). The main difference between VBR and CBR is the instantaneous bitrate. If you require a fixed bitrate, set the `maxbitrate` parameter.
How do I get the playback duration of a successfully transcoded video?
To obtain the playback duration, call the DetectMediaMeta API operation or use the `video/info` parameter of `x-oss-process`. For more information, see Extract video information.
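For the `x-oss-process` approach, the request is an ordinary GET on the object with `video/info` passed as the `x-oss-process` query parameter. The sketch below only builds that URL. The bucket endpoint and object key are hypothetical, and a private bucket additionally requires a signed request, which is not shown here.

```python
from urllib.parse import quote, urlencode

def video_info_url(bucket_endpoint: str, object_key: str) -> str:
    """Build an object URL that carries x-oss-process=video/info."""
    # Keep "/" unescaped so the parameter value reads video/info.
    query = urlencode({"x-oss-process": "video/info"}, safe="/")
    return f"{bucket_endpoint.rstrip('/')}/{quote(object_key)}?{query}"

# Hypothetical bucket endpoint and object key.
url = video_info_url(
    "https://examplebucket.oss-cn-hangzhou.aliyuncs.com",
    "output/demo video.mp4",
)
print(url)
# https://examplebucket.oss-cn-hangzhou.aliyuncs.com/output/demo%20video.mp4?x-oss-process=video/info
```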
Does the CreateMediaConvertTask API operation support RocketMQ 5.0?
No. It currently supports RocketMQ 4.0.