Drive and Photo Service: Media transcoding

Last Updated: May 19, 2025

Drive and Photo Service (PDS) provides a media transcoding feature that lets you preview audio and video files on a default web page. You can also call an API operation or use an SDK to transcode audio and video files based on specified parameters and obtain the preview URLs of the transcoded files. You must enable the feature before you use it. This topic describes the media transcoding feature and how to use it.

Procedure

To use the media transcoding feature, perform the following steps:

1. Enable the media transcoding feature.

2. Upload audio or video files.

3. Obtain the information about media transcoding.

Feature overview

The media transcoding feature of PDS is a domain-specific feature that transcodes audio and video files by using templates. The same media file in a domain is transcoded based on a template only once. Files with the same hash value are considered the same media file. This provides a better preview experience and helps you reduce costs.
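The "transcode only once per domain, hash, and template" rule can be pictured as a lookup keyed on those three values. The following Python sketch is illustrative only: the helper name transcode_once, the in-memory dictionary, and the use of SHA-1 are assumptions, because the source does not state which hash algorithm or storage PDS uses.

```python
import hashlib
from typing import Callable, Dict, Tuple

# Hypothetical in-memory store: (domain ID, content hash, template ID) -> result.
_results: Dict[Tuple[str, str, str], str] = {}


def transcode_once(domain_id: str, content: bytes, template_id: str,
                   transcode: Callable[[bytes, str], str]) -> str:
    """Run `transcode` only for the first request with a given key."""
    # SHA-1 is an assumption; the real hash algorithm is not documented here.
    key = (domain_id, hashlib.sha1(content).hexdigest(), template_id)
    if key not in _results:
        _results[key] = transcode(content, template_id)
    return _results[key]
```

Two uploads with identical content in the same domain share one transcoding result for a template, whereas the same content in another domain is transcoded again.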

To meet the requirements in different business scenarios, PDS supports the following transcoding types: offline transcoding (offline_video or offline_audio), live transcoding (live_transcoding), and quick transcoding (quick_video). Offline transcoding is designed for offline processing of audio and video files and supports diverse format conversions of audio and video files. Live transcoding and quick transcoding are designed for real-time video streaming, catering to low latency and high real-time performance requirements.

Offline transcoding allows you to transcode a complete media file to another file. This transcoding type provides high compatibility and supports most transcoding parameters. However, you cannot obtain the preview URL before the transcoding is complete.

Live transcoding 2.0 and quick transcoding divide a transcoding task into the following subtasks: one subtask that generates a playlist and multiple subtasks that each transcode a segment. This way, you can quickly preview the transcoding result. After a video is uploaded and the transcoding information is obtained, the playlist subtask completes immediately and you can obtain the preview URL. When you use the preview URL to play the video in a player, playback starts immediately if the transcoding is complete. If the transcoding is not complete, playback starts after the requested segment is transcoded by the corresponding subtask, which often takes no more than 10 seconds. If you seek to a segment that has not been transcoded, you need to wait only a few seconds for that segment to be transcoded. Live transcoding, which was launched earlier than quick transcoding, can cause long buffering times when you seek during playback in some players. Therefore, only quick transcoding is supported now. For customers who have enabled live transcoding, a seamless migration solution is provided to help transition from live transcoding to quick transcoding. For more information, see Transition from live transcoding to quick transcoding.

The following figure shows the media transcoding logic of PDS, which consists of three main steps: check for transcoding information, select a transcoding template, and initiate a transcoding task.

[Figure: media transcoding logic of PDS]

The logic of selecting transcoding templates varies based on media types. PDS uses all templates that are configured for a domain to transcode an audio file. The following table describes the default templates for offline audio transcoding.

| Parameter | LQ | HQ | SQ |
| --- | --- | --- | --- |
| Audio codec | MP3 | MP3 | MP3 |
| Audio sampling rate (Hz) | 44,100 | 44,100 | 44,100 |
| Number of audio channels | 2 | 2 | 2 |
| Audio bitrate (Kbit/s) | 128 | 320 | 640 |

By default, PDS selects transcoding templates for a video file based on the resolution of the original video. PDS selects all templates whose resolutions are lower than the resolution of the original video, plus the first template whose resolution is not lower than that of the original video. Before the comparison, PDS swaps the width and height of the original video if necessary so that the width is no smaller than the height. After the swap, if the width or height of the video is greater than that of the template, the resolution of the original video is considered greater than that of the template. When PDS transcodes the video based on the template whose resolution is not lower than that of the original video, the output keeps the resolution of the original video. The keep_original_resolution field is added to the preview URL to indicate whether the resolution of the original video is retained.

For example, if the resolutions of the templates configured for a domain are 720 × 480, 1280 × 720, and 1920 × 1080, and the resolution of the original video is 540 × 720, the templates whose resolutions are 720 × 480 and 1280 × 720 are selected, and the resolutions of the transcoded videos are 720 × 480 and 720 × 540. The following table describes the default video transcoding templates for all three video transcoding types.

| Parameter | 264_480p | 264_720p | 264_1080p |
| --- | --- | --- | --- |
| Video codec | H.264 | H.264 | H.264 |
| Video resolution | 720 × 480 | 1280 × 720 | 1920 × 1080 |
| Video bitrate (Kbit/s) | 600 | 1500 | 3000 |
| Video frame rate (frames per second) | 25 | 25 | 25 |
| Audio codec | AAC | AAC | AAC |
| Audio sampling rate (Hz) | 44,100 | 44,100 | 44,100 |
| Number of audio channels | 2 | 2 | 2 |
| Audio bitrate (Kbit/s) | 72 | 128 | 160 |
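The default video template selection rule described earlier can be sketched in Python as follows. This is a minimal illustration, not the PDS implementation: the function name select_templates and the tuple layout are hypothetical, and the sketch assumes that templates are compared in ascending order of resolution and that a lower-resolution template produces output at the template's own resolution, as the 540 × 720 example suggests.

```python
from typing import List, Tuple

Template = Tuple[str, int, int]            # (template ID, width, height)
Selection = Tuple[str, Tuple[int, int], bool]


def select_templates(templates: List[Template],
                     width: int, height: int) -> List[Selection]:
    """Return (template ID, output resolution, keeps original resolution)."""
    # Swap so that the width is never smaller than the height.
    if width < height:
        width, height = height, width
    selected: List[Selection] = []
    # Assumption: templates are considered from lowest to highest resolution.
    for template_id, t_width, t_height in sorted(templates, key=lambda t: (t[1], t[2])):
        if width > t_width or height > t_height:
            # The original is larger than the template: use the template size.
            selected.append((template_id, (t_width, t_height), False))
        else:
            # First template that is not lower: keep the original size and stop.
            selected.append((template_id, (width, height), True))
            break
    return selected
```

With the three default templates and a 540 × 720 original, the sketch selects 264_480p at 720 × 480 and 264_720p at 720 × 540, which matches the example in the text.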

If the default templates and the template selection policy cannot meet your business requirements, contact us to customize templates and template selection policies. (PDS Enterprise Edition does not support custom templates or template selection policies.) The following table describes the template parameters that are supported by PDS.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| TemplateID | string | Yes | The ID of the audio or video transcoding template. |
| Width | int64 | Yes | The width of the output video. The value must be a positive even integer. Valid values: (0,4096]. |
| Height | int64 | Yes | The height of the output video. The value must be a positive even integer. Valid values: (0,4096]. Warning: If the value of this parameter is greater than the value of the Width parameter, the template is ignored during transcoding. |
| VideoCodec | string | Yes | The video encoding format. Valid values for offline transcoding or live transcoding: copy, h264, h265, and vp9. Valid values for quick transcoding: h264 and h265. Warning: If you set this parameter to copy, the video stream to be processed is directly copied to the output file. In this case, other video-related parameters become invalid. You cannot set this parameter to copy if you want to merge videos. This option is often used in container format conversion. |
| VideoFrameRate | int64 | Yes | The frame rate of the video. |
| VideoBitrate | int64 | Yes | The bitrate of the video. Unit: Kbit/s. Note: This parameter conflicts with the VideoCRF parameter. If both this parameter and the VideoCRF parameter are left empty, the video is transcoded based on a value of 23 for the VideoCRF parameter by default. |
| VideoBFrames | int64 | No | This parameter is valid only for offline transcoding and quick transcoding. The number of consecutive B-frames. Default value: 3. |
| VideoBufferSize | int64 | No | This parameter is valid only for offline transcoding and quick transcoding. The size of the buffer for decoding when the dynamic bitrate is used. Unit: bit/s. Note: This parameter must be used together with the VideoCRF parameter. |
| VideoCRF | float64 | No | This parameter is valid only for offline transcoding and quick transcoding. The constant rate factor (CRF) of the video. This parameter conflicts with the VideoBitrate parameter. Valid values: [0,51]. A larger value indicates poorer image quality. We recommend that you set this parameter to a value from 18 to 38. |
| VideoGOPSize | int64 | No | This parameter is valid only for offline transcoding and quick transcoding. The group of pictures (GOP) size. Default value: 150. |
| VideoMaxBitrate | int64 | No | This parameter is valid only for offline transcoding and quick transcoding. The maximum bitrate when the dynamic bitrate is used. The VideoBufferSize parameter is required if you specify this parameter. Note: This parameter must be used together with the VideoCRF parameter. |
| VideoRefs | int64 | No | This parameter is valid only for offline transcoding and quick transcoding. The number of reference frames. Default value: 2. |
| VideoResolution | string | No | This parameter is valid only for offline transcoding and quick transcoding. The resolution of the output video. The format is WidthxHeight. If you specify this parameter and the Width and Height parameters at the same time, the value of this parameter takes precedence. Valid values for the width and the height: (0,4096]. Note: If the original video is rotated, the width, height, long side, and short side of the rotated video prevail. |
| VideoPreset | string | No | This parameter is valid only for offline transcoding and quick transcoding. |
| VideoResolutionFixPolicy | string | No | This parameter is valid only for offline transcoding. If only the height or width of the output video is configured in the VideoResolution parameter, this parameter specifies the policy that is used to fix the length of a specific side. Valid values: align_long_side and align_short_side. Example: The VideoResolution parameter is set to 480x and the VideoResolutionFixPolicy parameter is set to align_long_side. If the resolution of the input video is 960 × 720, the resolution of the output video is 480 × 360. If the resolution of the input video is 720 × 960, the resolution of the output video is 360 × 480. Note: This parameter must be used together with the VideoResolution parameter, and the VideoResolution parameter must specify only the height or width of the video, such as 480x. |
| AudioCodec | string | Yes | The audio encoding method. Valid values for offline transcoding or live transcoding: copy, mp3, vorbis, aac, flac, ac3, opus, and amr. Valid value for quick transcoding: aac. Note: If you set this parameter to copy, the audio stream to be processed is directly copied to the output file. In this case, other audio-related parameters become invalid. You cannot set this parameter to copy if you want to merge audio. This option is often used in container format conversion. |
| AudioBitrate | int64 | Yes | The audio bitrate. Unit: Kbit/s. Valid values: [1,10000]. |
| AudioSampleRate | int64 | Yes | The audio sampling rate. Unit: Hz. Valid values: 8000, 12025, 12000, 16000, 22050, 24000, 32000, 44100, 48000, 64000, 88200, and 96000. Note: The supported audio sampling rates vary based on the audio format. The MP3 format supports 48,000 Hz and below. The Opus format supports 8,000 Hz, 12,000 Hz, 16,000 Hz, 24,000 Hz, and 48,000 Hz. The AC3 format supports 32,000 Hz, 44,100 Hz, and 48,000 Hz. The AMR format supports only 8,000 Hz and 16,000 Hz. |
| AudioChannel | int64 | Yes | The number of audio channels. Valid values: [1,8]. |
| AudioStream | []int64 | No | The indexes of the audio streams to be processed in the original file. By default, the first audio stream is processed. A value that is greater than 100 indicates that all audio streams are processed. Examples: [0,1] indicates that the audio streams whose indexes are 0 and 1 are processed. [1] indicates that the audio stream whose index is 1 is processed. [101] indicates that all audio streams are processed. Note: Only existing audio streams corresponding to the specified indexes are processed. If the audio stream corresponding to an index does not exist, the index is ignored. |
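Before you request a custom template, it can help to check a draft against the constraints listed in the preceding table. The following Python sketch checks a few of them. The function name validate_template and the dictionary representation are hypothetical, and the sketch covers only the rules stated in the table, not whatever validation PDS performs on the server side.

```python
from typing import Any, Dict, List


def validate_template(template: Dict[str, Any]) -> List[str]:
    """Return the problems found in a draft template (empty list if none)."""
    problems: List[str] = []

    width = template.get("Width")
    height = template.get("Height")
    for name, value in (("Width", width), ("Height", height)):
        if not isinstance(value, int) or not 0 < value <= 4096 or value % 2 != 0:
            problems.append(f"{name} must be a positive even integer in (0,4096]")
    if isinstance(width, int) and isinstance(height, int) and height > width:
        problems.append("Height is greater than Width, so the template is ignored")

    bitrate = template.get("VideoBitrate")
    crf = template.get("VideoCRF")
    if bitrate is not None and crf is not None:
        problems.append("VideoBitrate conflicts with VideoCRF")
    if crf is not None and not 0 <= crf <= 51:
        problems.append("VideoCRF must be in [0,51]")
    if template.get("VideoMaxBitrate") is not None and template.get("VideoBufferSize") is None:
        problems.append("VideoMaxBitrate requires VideoBufferSize")

    audio_bitrate = template.get("AudioBitrate")
    if not isinstance(audio_bitrate, int) or not 1 <= audio_bitrate <= 10000:
        problems.append("AudioBitrate must be in [1,10000]")
    channels = template.get("AudioChannel")
    if not isinstance(channels, int) or not 1 <= channels <= 8:
        problems.append("AudioChannel must be in [1,8]")
    return problems
```

A draft that mirrors the default 264_480p template passes these checks, whereas a portrait-oriented draft (Height greater than Width) is reported because such a template is ignored during transcoding.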

Enable the media transcoding feature

By default, the quick transcoding and offline transcoding features are enabled for PDS Enterprise Edition.

If you want to enable the media transcoding feature for PDS Developer Edition, contact us by joining the official DingTalk group. It is free of charge to enable the media transcoding feature, but you are charged fees when you use this feature. For more information, see Value-added billable items.

Transition from live transcoding to quick transcoding

Live transcoding, which was launched earlier than quick transcoding, can cause long buffering times when you seek during playback in specific players. To address this issue, a solution for seamlessly transitioning from live transcoding to quick transcoding is provided. If you have enabled live transcoding, you can use this solution to reuse your existing code, get faster transcoding and smoother playback, and reuse most of the audio and video files that were previously transcoded by live transcoding. Before the migration, take note of the following limits:

  1. After the migration, the Live transcoding billable item is used. The original billable items Transcoding-Transcode-xxx, such as Transcoding-Transcode264-HD, will gradually decrease and eventually be phased out.

  2. After the migration, the returned playback URL no longer supports Range GET requests when a client or player fetches TS segment files from the m3u8 playlist.

  3. After the migration, the playback URL returned will contain unencoded { and } characters in the query values. Sample URL:

http://sample.oss-cn-hangzhou.aliyuncs.com/xxx/media-token-0.ts?x-oss-process=if_status_eq_404{hls/ts,from_dGFyZ2V0L2ZhbmdhbmxhaS92aWRlby8xMDgwcC8xMDgwcDMzLm0zdTg}&x-oss-expires=12345&x-oss-signature-version=OSS2&x-oss-access-key-id=ak&x-oss-signature=sk

For iOS platforms, you must decode and then re-encode the returned URLs to prevent a decoding error that may occur when you use the NSURL class. Sample code:

#import "NSString+URLReEncode.h"

@implementation NSString (URLEncoding)

- (NSString *)urlReEncode {
    NSRange queryStartRange = [self rangeOfString:@"?"];
    if (queryStartRange.location == NSNotFound) {
        return self;
    }
    
    NSString *baseURL = [self substringToIndex:queryStartRange.location];
    NSMutableCharacterSet *allowedCharacterSet = NSCharacterSet.URLQueryAllowedCharacterSet.mutableCopy;
    [allowedCharacterSet removeCharactersInString:@"=&+"];
    
    NSString *query = [self substringFromIndex:queryStartRange.location + 1];
    NSArray *parameters = [query componentsSeparatedByString:@"&"];
    NSMutableArray *encodedParameters = [NSMutableArray arrayWithCapacity:[parameters count]];
    
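    for (NSString *parameter in parameters) {
        // Split on the first "=" only so that values containing "=" are still re-encoded.
        NSRange separatorRange = [parameter rangeOfString:@"="];
        if (separatorRange.location != NSNotFound) {
            NSString *key = [parameter substringToIndex:separatorRange.location];
            NSString *rawValue = [parameter substringFromIndex:separatorRange.location + 1];
            // Decode first to avoid double encoding; fall back to the raw value if decoding fails.
            NSString *value = [rawValue stringByRemovingPercentEncoding] ?: rawValue;
            NSString *encodedValue = [value stringByAddingPercentEncodingWithAllowedCharacters:allowedCharacterSet] ?: rawValue;
            [encodedParameters addObject:[NSString stringWithFormat:@"%@=%@", key, encodedValue]];
        } else {
            // Parameters without a value are kept as-is.
            [encodedParameters addObject:parameter];
        }
    }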
    
    NSString *encodedQuery = [encodedParameters componentsJoinedByString:@"&"];
    return [NSString stringWithFormat:@"%@?%@", baseURL, encodedQuery];
}

@end
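For readers who want to follow the decode-then-re-encode step outside Objective-C, the same idea can be sketched in Python. This is only an illustration of the technique: the function name url_reencode is hypothetical, and the safe-character set is an approximation of the iOS query-allowed set with =, &, and + removed.

```python
from urllib.parse import quote, unquote

# Characters left unescaped in query values. This approximates the iOS
# query-allowed set minus "=", "&", and "+" (an assumption of this sketch).
SAFE_IN_VALUE = "!$'()*,/:;?@"


def url_reencode(url: str) -> str:
    """Percent-decode, then re-encode, each query value of `url`."""
    base, question_mark, query = url.partition("?")
    if not question_mark:
        return url  # No query string, nothing to re-encode.
    encoded = []
    for parameter in query.split("&"):
        key, equals, value = parameter.partition("=")
        if equals:
            # Decode first so that existing escapes are not encoded twice.
            encoded.append(f"{key}={quote(unquote(value), safe=SAFE_IN_VALUE)}")
        else:
            encoded.append(parameter)
    return base + "?" + "&".join(encoded)
```

Applied to a URL like the sample above, the unencoded { and } characters become %7B and %7D, and applying the function a second time leaves the URL unchanged.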

Upload audio or video files

You can upload new audio or video files for transcoding or transcode existing media files. You must obtain the drive ID and file ID of the original media file, which are required to obtain the transcoding results.

Obtain the information about media transcoding

Call the /v2/file/get_video_preview_play_info operation. Set the category parameter to the transcoding type that is enabled in the domain, such as quick_video. Set the template_id parameter to the ID of the template that you want to use, such as 264_480p. The response contains the playback information of the file that is transcoded based on the specified template. If you do not specify a template ID, the playback information of all transcoded files is returned. If the transcoding is not complete, the HTTP status code 202 and the error code VideoPreviewWaitAndRetry are returned. If the transcoding is complete, a preview URL is returned in the url field.

Sample request

{
    "drive_id": "1",
    "file_id": "abcd", // The ID of the media file.
    "category": "quick_video", // The transcoding type.
    "template_id": "264_480p"       // The ID of the template that is used to transcode the file that you want to preview.
}
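Because the operation returns VideoPreviewWaitAndRetry while transcoding is in progress, a client usually sends the request in a retry loop. The following Python sketch shows one way to do this. The function name get_play_info, the injected post callable (which is expected to send the JSON body to the given path with your own authentication and return the HTTP status code and the parsed response), and the retry settings are all assumptions for illustration, not part of the PDS SDK.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

PostFn = Callable[[str, Dict[str, Any]], Tuple[int, Dict[str, Any]]]


def get_play_info(post: PostFn, drive_id: str, file_id: str, category: str,
                  template_id: Optional[str] = None, max_attempts: int = 10,
                  delay_seconds: float = 2.0,
                  sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Request the playback information and retry while transcoding runs."""
    body: Dict[str, Any] = {"drive_id": drive_id, "file_id": file_id, "category": category}
    if template_id is not None:
        body["template_id"] = template_id
    for attempt in range(max_attempts):
        status, payload = post("/v2/file/get_video_preview_play_info", body)
        if status == 202 or payload.get("Code") == "VideoPreviewWaitAndRetry":
            # Transcoding is not complete yet: wait, then try again.
            if attempt < max_attempts - 1:
                sleep(delay_seconds)
            continue
        if "Code" in payload:
            # Any other error code is treated as a failure.
            raise RuntimeError(f"{payload['Code']}: {payload.get('Message', '')}")
        return payload
    raise TimeoutError("transcoding did not finish within the retry budget")
```

Injecting post keeps the sketch independent of any particular HTTP client or signing scheme. In practice, the retry interval and the number of attempts depend on the file size and the transcoding type.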

Sample responses

  1. Response of waiting for retry

{
    "Code": "VideoPreviewWaitAndRetry",
    "Message": "media is transcoding, please wait and retry. xxx"
}

  2. Response of failed video transcoding

{
    "Code": "NotFound.VideoPreviewInfo",
    "Message": "ErrVideoPreviewLastErrorExist"
}

  3. Response of offline video transcoding

{
    "domain_id": "xxxxx",
    "drive_id": "1",
    "file_id": "abcd", // The ID of the video file.
    "video_preview_play_info": {
        "category": "offline_video", // The transcoding type.
        // Other video information.
        "offline_video_transcoding_list": [
            {
                "template_id": "264_480p", // The template ID.
                "status": "finished", // The transcoding task is complete.
                "url": "https://example.aliyundoc.com/c/d?e=f..." // The preview URL.
            }
        ]
    }
}

  4. Response of offline audio transcoding

{
    "domain_id": "xxxxx",
    "drive_id": "1",
    "file_id": "abcd", // The ID of the audio file.
    "video_preview_play_info": {
        "category": "offline_audio", // The transcoding type.
        // Other audio information.
        "offline_audio_list": [
            {
                "template_id": "264_480p", // The template ID.
                "status": "finished", // The transcoding task is complete.
                "url": "https://example.aliyundoc.com/c/d?e=f..." // The preview URL.
            }
        ]
    }
}

  5. Response of quick transcoding

{
    "domain_id": "xxxxx",
    "drive_id": "1",
    "file_id": "abcd", // The ID of the video file.
    "video_preview_play_info": {
        "category": "quick_video", // The transcoding type.
        // Other video information.
        "quick_video_list": [
            {
                "template_id": "264_480p", // The template ID.
                "status": "finished", // The playlist is generated.
                "url": "https://example.aliyundoc.com/c/d?e=f..." // The preview URL.
            }
        ]
    }
}
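The successful responses differ only in the name of the list that holds the per-template results. The following Python sketch collects the preview URLs of finished entries from any of the three response shapes shown above. The function name finished_urls is hypothetical, and the sketch assumes that the response has already been parsed into a dictionary and that only the fields shown in the samples are needed.

```python
from typing import Any, Dict

# List field used by each transcoding type, taken from the sample responses.
LIST_FIELD_BY_CATEGORY = {
    "offline_video": "offline_video_transcoding_list",
    "offline_audio": "offline_audio_list",
    "quick_video": "quick_video_list",
}


def finished_urls(response: Dict[str, Any]) -> Dict[str, str]:
    """Map template ID to preview URL for every finished entry."""
    info = response.get("video_preview_play_info", {})
    list_field = LIST_FIELD_BY_CATEGORY.get(info.get("category"))
    if list_field is None:
        return {}  # Unknown or missing transcoding type.
    return {
        item["template_id"]: item["url"]
        for item in info.get(list_field, [])
        if item.get("status") == "finished" and item.get("url")
    }
```

For the quick transcoding sample, the function returns a single entry that maps 264_480p to its preview URL. Entries that are not finished or have no URL are skipped.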