Convert audio files by using audio transcoding - Object Storage Service

Convert audio files stored in OSS to a different format using an asynchronous processing request. This topic describes the parameters and provides code examples for Java, Python, and Go.

Prerequisites

Before you begin, make sure that you have:

An OSS bucket bound to an Intelligent Media Management (IMM) project. For setup instructions, see Quick start and AttachOSSBucket.
The required permissions for audio transcoding.
Access credentials configured as environment variables. Java and Python use OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET. The Go example also reads OSS_SESSION_TOKEN, which you need to set only when you use temporary (STS) credentials.

Use cases

Format compatibility: Convert audio files to formats supported by a target device or player.
Storage optimization: Transcode lossless audio (such as FLAC or WAV) to compressed lossy formats like MP3 to reduce storage footprint on mobile devices.
Media streaming: Produce multiple bitrate versions of a source file to support adaptive streaming across different network conditions.
Video post-processing: Convert audio tracks to compressed formats during video editing workflows to improve transfer efficiency.

How it works

Submit an asynchronous transcoding request using the x-oss-async-process parameter: pass it in the query string and supply the processing instructions in the request body. The action is audio/convert. After the task completes, OSS saves the output audio to the path you specify with sys/saveas.

Audio transcoding is asynchronous only. The synchronous x-oss-process parameter used for image processing is not supported.
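The processing string described above can be assembled in a few lines. The following is a minimal Python sketch, not part of any SDK: the helper names `b64_urlsafe` and `build_process` are hypothetical, and it assumes that sys/saveas and notify take URL-safe Base64 values without padding, as the SDK examples in this topic do.

```python
import base64


def b64_urlsafe(text: str) -> str:
    """URL-safe Base64 without trailing '=' padding (assumed value format)."""
    return base64.urlsafe_b64encode(text.encode("utf-8")).decode("ascii").rstrip("=")


def build_process(params: dict, bucket: str, key: str, topic: str = "") -> str:
    """Join audio/convert options, the sys/saveas target, and an optional notify topic."""
    # Each option is written as name_value, for example f_aac or ab_96000.
    options = ",".join(f"{name}_{value}" for name, value in params.items())
    process = (
        f"audio/convert,{options}"
        f"|sys/saveas,b_{b64_urlsafe(bucket)},o_{b64_urlsafe(key)}"
    )
    if topic:
        process += f"/notify,topic_{b64_urlsafe(topic)}"
    return process


if __name__ == "__main__":
    # Hypothetical bucket, key, and topic names, for illustration only.
    print(build_process({"f": "aac", "ab": 96000}, "outbucket", "out.aac", "AudioConvert"))
```

The resulting string is what you would pass as the process argument of an SDK call, or as the value of x-oss-async-process in a request body.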

Parameters

Action: audio/convert

Parameter	Type	Required	Description
`f`	string	Yes	Output container format. Supported values: `mp3`, `aac`, `flac`, `oga`, `ac3`, `opus`, `amr`
`ss`	int	No	Start time in milliseconds. `0` (default) starts from the beginning. Any positive integer starts from that offset.
`t`	int	No	Duration in milliseconds after the start time. `0` (default) runs until the end of the audio.
`ar`	int	No	Sample rate of the output audio in Hz. Defaults to the source sample rate. See Format constraints for per-format limits. Supported values: 8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000, 64000, 88200, 96000
`ac`	int	No	Number of audio channels in the output. Defaults to the source channel count. Valid values: 1–8. See Format constraints for per-format limits.
`aq`	int	No	Audio compression quality. Valid values: 0–100. Mutually exclusive with `ab`.
`ab`	int	No	Audio bitrate in bit/s. Valid values: 1000–10000000. Mutually exclusive with `aq`.
`abopt`	string	No	Bitrate behavior when the source bitrate is lower than the target. `0` (default): always use the target bitrate. `1`: use the source bitrate. `2`: return a failure.
`adepth`	int	No	Sampling bit depth of the output audio. Valid values: `16`, `24`. Applies only when `f=flac`.

Use sys/saveas to specify the output path and notify to receive a completion notification. See sys/saveas and Use the notification feature.
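Checking parameters on the client before you submit a task can catch mistakes early. The following Python sketch encodes the rules from the table above; the `validate` function is a hypothetical helper, not an SDK call, and the service remains the authority on what it accepts.

```python
FORMATS = {"mp3", "aac", "flac", "oga", "ac3", "opus", "amr"}
SAMPLE_RATES = {8000, 11025, 12000, 16000, 22050, 24000,
                32000, 44100, 48000, 64000, 88200, 96000}


def validate(params: dict) -> list:
    """Return a list of problems in audio/convert parameters (empty if none found)."""
    errors = []
    if params.get("f") not in FORMATS:
        errors.append("f is required and must be a supported format")
    if "aq" in params and "ab" in params:
        errors.append("aq and ab are mutually exclusive")
    if "aq" in params and not 0 <= params["aq"] <= 100:
        errors.append("aq must be in the range 0-100")
    if "ab" in params and not 1000 <= params["ab"] <= 10_000_000:
        errors.append("ab must be in the range 1000-10000000")
    if "ar" in params and params["ar"] not in SAMPLE_RATES:
        errors.append("ar is not a supported sample rate")
    if "ac" in params and not 1 <= params["ac"] <= 8:
        errors.append("ac must be in the range 1-8")
    if "adepth" in params:
        # adepth applies only to FLAC output and accepts 16 or 24.
        if params.get("f") != "flac" or params["adepth"] not in (16, 24):
            errors.append("adepth requires f=flac and a value of 16 or 24")
    for name in ("ss", "t"):
        if params.get(name, 0) < 0:
            errors.append(f"{name} must not be negative")
    return errors


if __name__ == "__main__":
    print(validate({"f": "mp3", "aq": 50, "ab": 96000}))
```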

Format constraints

Different output formats impose additional restrictions on ar (sample rate) and ac (audio channels). If you use the default sample rate or channel count, transcoding may fail when the source values are incompatible with the target format. Set ar and ac explicitly when targeting a format with strict requirements.

Format	Supported sample rates	Supported audio channels
MP3	Up to 48 kHz	1–2
AAC	All supported values	No format-specific restriction
FLAC	All supported values	No format-specific restriction
OGA	All supported values	No format-specific restriction
AC-3	32 kHz, 44.1 kHz, 48 kHz	Up to 6 (5.1)
Opus	8 kHz, 12 kHz, 16 kHz, 24 kHz, 48 kHz	No format-specific restriction
AMR	8 kHz, 16 kHz	1
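The table above can be expressed as a lookup that flags an incompatible ar or ac value before a task is submitted. This is a minimal Python sketch under the constraints listed in this topic; `CONSTRAINTS` and `check_format` are hypothetical names, and formats with no format-specific channel restriction are assumed to fall back to the general 1–8 range.

```python
ALL_RATES = (8000, 11025, 12000, 16000, 22050, 24000,
             32000, 44100, 48000, 64000, 88200, 96000)

# format -> (allowed sample rates in Hz, maximum channel count)
CONSTRAINTS = {
    "mp3": (tuple(r for r in ALL_RATES if r <= 48000), 2),
    "aac": (ALL_RATES, 8),
    "flac": (ALL_RATES, 8),
    "oga": (ALL_RATES, 8),
    "ac3": ((32000, 44100, 48000), 6),
    "opus": ((8000, 12000, 16000, 24000, 48000), 8),
    "amr": ((8000, 16000), 1),
}


def check_format(fmt: str, sample_rate: int, channels: int) -> list:
    """Return the constraint violations for a target format (empty if compatible)."""
    rates, max_channels = CONSTRAINTS[fmt]
    problems = []
    if sample_rate not in rates:
        problems.append(f"{fmt} does not support {sample_rate} Hz")
    if not 1 <= channels <= max_channels:
        problems.append(f"{fmt} supports at most {max_channels} channel(s)")
    return problems


if __name__ == "__main__":
    # A 44.1 kHz stereo source cannot be written to AMR without setting ar and ac.
    print(check_format("amr", 44100, 2))
```

If the check reports a problem for the source file's own sample rate or channel count, set ar and ac explicitly to values the target format allows.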

Limitations

Audio transcoding does not support adjusting bit depth, except for FLAC output (use adepth). For bit depth control on video tracks, see Video transcoding.
Anonymous requests are denied. Authentication is required.
Only the Java, Python, and Go SDKs support asynchronous audio transcoding.

Use the RESTful API

All examples submit asynchronous transcoding tasks with the x-oss-async-process parameter, which appears in the query string and carries the processing instructions in the request body.

Convert MP3 to AAC

Transcode a 60-second clip of example.mp3 starting at the 10,000 ms (10-second) mark, output as AAC at 96 kbit/s, with a Simple Message Queue (SMQ) completion notification.

POST /example.mp3?x-oss-async-process HTTP/1.1
Host: video-demo.oss-cn-hangzhou.aliyuncs.com
Date: Fri, 28 Oct 2022 06:40:10 GMT
Authorization: OSS4-HMAC-SHA256 Credential=LTAI********************/20250417/cn-hangzhou/oss/aliyun_v4_request,Signature=a7c3554c729d71929e0b84489addee6b2e8d5cb48595adfc51868c299c0c218e

x-oss-async-process=audio/convert,ss_10000,t_60000,f_aac,ab_96000|sys/saveas,b_b3V0YnVja2V0,o_b3V0b2JqcHJlZml4LnthdXRvZXh0fQ/notify,topic_QXVkaW9Db252ZXJ0

Convert WAV to Opus

Transcode the entire example.wav file to Opus at 48 kHz, dual channel, 96 kbit/s, saved to oss://outbucket/outobj.opus (the encoded object key is outobj.{autoext}), with a Simple Message Queue (SMQ, formerly MNS) completion notification.

POST /example.wav?x-oss-async-process HTTP/1.1
Host: video-demo.oss-cn-hangzhou.aliyuncs.com
Date: Fri, 28 Oct 2022 06:40:10 GMT
Authorization: OSS4-HMAC-SHA256 Credential=LTAI********************/20250417/cn-hangzhou/oss/aliyun_v4_request,Signature=a7c3554c729d71929e0b84489addee6b2e8d5cb48595adfc51868c299c0c218e

x-oss-async-process=audio/convert,f_opus,ab_96000,ar_48000,ac_2|sys/saveas,b_b3V0YnVja2V0,o_b3V0b2JqLnthdXRvZXh0fQ/notify,topic_QXVkaW9Db252ZXJ0
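Because the bucket, object key, and topic are Base64-encoded, a request body is hard to review by eye. The following Python sketch decodes a processing string back into readable parts; `b64_decode` and `parse_process` are hypothetical helpers, and the sample string uses an illustrative object key (out.opus) rather than a value from the requests above.

```python
import base64


def b64_decode(value: str) -> str:
    """Decode URL-safe Base64, restoring any stripped '=' padding first."""
    padded = value + "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")


def parse_process(process: str) -> dict:
    """Split a processing string into convert options, saveas target, and notify topic."""
    convert, _, rest = process.partition("|")
    saveas, _, notify = rest.partition("/notify,")
    action, *pairs = convert.split(",")
    # Split on the first underscore only: Base64 values may contain '_' themselves.
    options = dict(pair.split("_", 1) for pair in pairs)
    target = dict(pair.split("_", 1) for pair in saveas.split(",")[1:])
    result = {
        "action": action,
        "options": options,
        "bucket": b64_decode(target["b"]),
        "object": b64_decode(target["o"]),
    }
    if notify.startswith("topic_"):
        result["topic"] = b64_decode(notify[len("topic_"):])
    return result


if __name__ == "__main__":
    body = ("audio/convert,f_opus,ab_96000,ar_48000,ac_2"
            "|sys/saveas,b_b3V0YnVja2V0,o_b3V0Lm9wdXM/notify,topic_QXVkaW9Db252ZXJ0")
    print(parse_process(body))
```

Decoding a value before sending it also reveals stray characters, such as a trailing newline picked up when the value was encoded on the command line.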

Use OSS SDKs

Java

Requires OSS SDK for Java V3.17.4 or later.

The following example transcodes src.mp3 to AAC format, starting at 10 seconds, for a duration of 60 seconds, at 96 kbit/s, and saves the result to dest.aac in the same bucket.

import com.aliyun.oss.ClientBuilderConfiguration;
import com.aliyun.oss.OSS;
import com.aliyun.oss.OSSClientBuilder;
import com.aliyun.oss.common.auth.CredentialsProviderFactory;
import com.aliyun.oss.common.auth.EnvironmentVariableCredentialsProvider;
import com.aliyun.oss.common.comm.SignVersion;
import com.aliyun.oss.model.AsyncProcessObjectRequest;
import com.aliyun.oss.model.AsyncProcessObjectResult;
import com.aliyuncs.exceptions.ClientException;

import java.util.Base64;

public class Demo {
    public static void main(String[] args) throws ClientException {
        // Endpoint of the region where the bucket is located
        String endpoint = "https://oss-cn-hangzhou.aliyuncs.com";
        // Region ID of the bucket
        String region = "cn-hangzhou";
        // Load credentials from environment variables OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET
        EnvironmentVariableCredentialsProvider credentialsProvider = CredentialsProviderFactory.newEnvironmentVariableCredentialsProvider();
        String bucketName = "examplebucket";
        String targetKey = "dest.aac";
        String sourceKey = "src.mp3";

        ClientBuilderConfiguration clientBuilderConfiguration = new ClientBuilderConfiguration();
        clientBuilderConfiguration.setSignatureVersion(SignVersion.V4);
        OSS ossClient = OSSClientBuilder.create()
                .endpoint(endpoint)
                .credentialsProvider(credentialsProvider)
                .clientConfiguration(clientBuilderConfiguration)
                .region(region)
                .build();

        try {
            // Transcoding parameters: start at 10 s, duration 60 s, output AAC at 96 kbit/s
            String style = "audio/convert,ss_10000,t_60000,f_aac,ab_96000";
            // Base64-encode the bucket name and output object key for sys/saveas
            String bucketEncoded = Base64.getUrlEncoder().withoutPadding().encodeToString(bucketName.getBytes());
            String targetEncoded = Base64.getUrlEncoder().withoutPadding().encodeToString(targetKey.getBytes());
            String process = String.format("%s|sys/saveas,b_%s,o_%s/notify,topic_QXVkaW9Db252ZXJ0", style, bucketEncoded, targetEncoded);

            AsyncProcessObjectRequest request = new AsyncProcessObjectRequest(bucketName, sourceKey, process);
            AsyncProcessObjectResult response = ossClient.asyncProcessObject(request);
            System.out.println("EventId: " + response.getEventId());
            System.out.println("RequestId: " + response.getRequestId());
            System.out.println("TaskId: " + response.getTaskId());
        } finally {
            ossClient.shutdown();
        }
    }
}

Python

Requires OSS SDK for Python V2.18.4 or later.

The following example transcodes src.mp3 to AAC format, starting at 10 seconds, for a duration of 60 seconds, at 96 kbit/s, and saves the result to dest.aac in the same bucket.

# -*- coding: utf-8 -*-
import base64
import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider

def main():
    # Load credentials from environment variables OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET.
    # ProviderAuthV4 accepts a credentials provider and signs requests with signature V4;
    # oss2.Auth expects an AccessKey ID and secret and cannot take a provider.
    auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())
    endpoint = 'https://oss-cn-hangzhou.aliyuncs.com'
    # Region ID of the bucket, required for V4 signatures
    region = 'cn-hangzhou'
    bucket_name = 'examplebucket'
    bucket = oss2.Bucket(auth, endpoint, bucket_name, region=region)

    source_key = 'src.mp3'
    target_key = 'dest.aac'

    # Transcoding parameters: start at 10 s, duration 60 s, output AAC at 96 kbit/s
    style = 'audio/convert,ss_10000,t_60000,f_aac,ab_96000'

    # Base64-encode the bucket name and output object key for sys/saveas
    bucket_name_encoded = base64.urlsafe_b64encode(bucket_name.encode()).decode().rstrip('=')
    target_key_encoded = base64.urlsafe_b64encode(target_key.encode()).decode().rstrip('=')
    process = f"{style}|sys/saveas,b_{bucket_name_encoded},o_{target_key_encoded}/notify,topic_QXVkaW9Db252ZXJ0"

    try:
        result = bucket.async_process_object(source_key, process)
        print(f"EventId: {result.event_id}")
        print(f"RequestId: {result.request_id}")
        print(f"TaskId: {result.task_id}")
    except Exception as e:
        print(f"Error: {e}")


if __name__ == "__main__":
    main()

Go

Requires OSS SDK for Go V3.0.2 or later.

The following example transcodes src.mp3 to AAC format, starting at 10 seconds, for a duration of 60 seconds, at 96 kbit/s, and saves the result to dest.aac in the same bucket.

package main

import (
	"encoding/base64"
	"fmt"
	"log"
	"os"

	"github.com/aliyun/aliyun-oss-go-sdk/oss"
)

func main() {
	// Load credentials from environment variables OSS_ACCESS_KEY_ID, OSS_ACCESS_KEY_SECRET, and OSS_SESSION_TOKEN
	provider, err := oss.NewEnvironmentVariableCredentialsProvider()
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}

	// Create an OSS client for the bucket's region
	client, err := oss.New("https://oss-cn-hangzhou.aliyuncs.com", "", "", oss.SetCredentialsProvider(&provider), oss.AuthVersion(oss.AuthV4), oss.Region("cn-hangzhou"))
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}

	bucketName := "examplebucket"
	bucket, err := client.Bucket(bucketName)
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}

	sourceKey := "src.mp3"
	targetKey := "dest.aac"

	// Transcoding parameters: start at 10 s, duration 60 s, output AAC at 96 kbit/s
	style := "audio/convert,ss_10000,t_60000,f_aac,ab_96000"

	// Base64-encode the bucket name and output object key for sys/saveas.
	// RawURLEncoding omits '=' padding, matching the Java and Python examples.
	bucketNameEncoded := base64.RawURLEncoding.EncodeToString([]byte(bucketName))
	targetKeyEncoded := base64.RawURLEncoding.EncodeToString([]byte(targetKey))
	process := fmt.Sprintf("%s|sys/saveas,b_%v,o_%v/notify,topic_QXVkaW9Db252ZXJ0", style, bucketNameEncoded, targetKeyEncoded)

	result, err := bucket.AsyncProcessObject(sourceKey, process)
	if err != nil {
		log.Fatalf("Failed to async process object: %s", err)
	}

	fmt.Printf("EventId: %s\n", result.EventId)
	fmt.Printf("RequestId: %s\n", result.RequestId)
	fmt.Printf("TaskId: %s\n", result.TaskId)
}

What's next

sys/saveas: Save transcoded output to a specified OSS path.
Use the notification feature: Receive a callback when transcoding completes.
Video transcoding: Transcode video files and adjust parameters including bit depth.