Convert audio files stored in OSS to a different format using an asynchronous processing request. This topic describes the parameters and provides code examples for Java, Python, and Go.
Prerequisites
Before you begin, make sure that you have:
An OSS bucket bound to an Intelligent Media Management (IMM) project. For setup instructions, see Quick start and AttachOSSBucket.
The required permissions for audio transcoding.
Access credentials configured as environment variables. Java and Python use OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET. Go additionally requires OSS_SESSION_TOKEN.
Use cases
Format compatibility: Convert audio files to formats supported by a target device or player.
Storage optimization: Transcode lossless audio (such as FLAC or WAV) to compressed lossy formats like MP3 to reduce storage footprint on mobile devices.
Media streaming: Produce multiple bitrate versions of a source file to support adaptive streaming across different network conditions.
Video post-processing: Convert audio tracks to compressed formats during video editing workflows to improve transfer efficiency.
How it works
Submit an asynchronous transcoding request using the x-oss-async-process header. The action is audio/convert. After the task completes, OSS saves the output audio to the path you specify with sys/saveas.
Audio transcoding is asynchronous only. The x-oss-process header, which is used for synchronous image processing, is not supported.
Parameters
Action: audio/convert
| Parameter | Type | Required | Description |
|---|---|---|---|
| f | string | Yes | Output container format. Supported values: mp3, aac, flac, oga, ac3, opus, amr. |
| ss | int | No | Start time in milliseconds. 0 (default) starts from the beginning. Any positive integer starts from that offset. |
| t | int | No | Duration in milliseconds after the start time. 0 (default) runs until the end of the audio. |
| ar | int | No | Sample rate of the output audio in Hz. Defaults to the source sample rate. Supported values: 8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000, 64000, 88200, 96000. See Format constraints for per-format limits. |
| ac | int | No | Number of audio channels in the output. Defaults to the source channel count. Valid values: 1–8. See Format constraints for per-format limits. |
| aq | int | No | Audio compression quality. Valid values: 0–100. Mutually exclusive with ab. |
| ab | int | No | Audio bitrate in bit/s. Valid values: 1000–10000000. Mutually exclusive with aq. |
| abopt | string | No | Bitrate behavior when the source bitrate is lower than the target. 0 (default): always use the target bitrate. 1: use the source bitrate. 2: return a failure. |
| adepth | int | No | Sampling bit depth of the output audio. Valid values: 16, 24. Applies only when f=flac. |
Use sys/saveas to specify the output path and notify to receive a completion notification. See sys/saveas and Use the notification feature.
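The ranges and the aq/ab exclusivity rule in the parameter table can be checked on the client before a request is submitted. The following Python sketch is one possible way to do that. The function name `build_audio_convert` is hypothetical, the service remains the authority on what it accepts, and the parameter order simply mirrors the request examples in this topic (the order is assumed not to matter).

```python
from typing import Optional

FORMATS = {"mp3", "aac", "flac", "oga", "ac3", "opus", "amr"}
SAMPLE_RATES = {8000, 11025, 12000, 16000, 22050, 24000, 32000,
                44100, 48000, 64000, 88200, 96000}


def build_audio_convert(f: str, ss: int = 0, t: int = 0,
                        ab: Optional[int] = None, aq: Optional[int] = None,
                        ar: Optional[int] = None, ac: Optional[int] = None,
                        abopt: Optional[int] = None,
                        adepth: Optional[int] = None) -> str:
    """Validate options against the parameter table and join them into an action."""
    if f not in FORMATS:
        raise ValueError(f"unsupported output format: {f}")
    if ss < 0 or t < 0:
        raise ValueError("ss and t must be non-negative milliseconds")
    if ab is not None and aq is not None:
        raise ValueError("ab and aq are mutually exclusive")
    if ab is not None and not 1000 <= ab <= 10_000_000:
        raise ValueError("ab must be 1000-10000000 bit/s")
    if aq is not None and not 0 <= aq <= 100:
        raise ValueError("aq must be 0-100")
    if ar is not None and ar not in SAMPLE_RATES:
        raise ValueError(f"unsupported sample rate: {ar}")
    if ac is not None and not 1 <= ac <= 8:
        raise ValueError("ac must be 1-8")
    if abopt is not None and abopt not in (0, 1, 2):
        raise ValueError("abopt must be 0, 1, or 2")
    if adepth is not None and (f != "flac" or adepth not in (16, 24)):
        raise ValueError("adepth must be 16 or 24 and requires f=flac")

    parts = ["audio/convert"]
    # Defaults (0 or unset) are omitted so that the service applies its own defaults.
    if ss:
        parts.append(f"ss_{ss}")
    if t:
        parts.append(f"t_{t}")
    parts.append(f"f_{f}")
    for name, value in (("ab", ab), ("aq", aq), ("ar", ar), ("ac", ac),
                        ("abopt", abopt), ("adepth", adepth)):
        if value is not None:
            parts.append(f"{name}_{value}")
    return ",".join(parts)


print(build_audio_convert("aac", ss=10_000, t=60_000, ab=96_000))
# audio/convert,ss_10000,t_60000,f_aac,ab_96000
```

Passing both ab and aq, or adepth with a format other than flac, raises ValueError instead of producing a request that the service would be expected to reject.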
Format constraints
Different output formats impose additional restrictions on ar (sample rate) and ac (audio channels). If you use the default sample rate or channel count, transcoding may fail when the source values are incompatible with the target format. Set ar and ac explicitly when targeting a format with strict requirements.
| Format | Supported sample rates | Supported audio channels |
|---|---|---|
| MP3 | Up to 48 kHz | 1–2 |
| AAC | All supported values | No format-specific restriction |
| FLAC | All supported values | No format-specific restriction |
| OGA | All supported values | No format-specific restriction |
| AC-3 | 32 kHz, 44.1 kHz, 48 kHz | Up to 6 (5.1) |
| Opus | 8 kHz, 12 kHz, 16 kHz, 24 kHz, 48 kHz | No format-specific restriction |
| AMR | 8 kHz, 16 kHz | 1 |
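The table above can be turned into a small lookup that flags an incompatible sample rate or channel count before a task is submitted. This is only a sketch: `FORMAT_LIMITS` and `check_constraints` are hypothetical names, and formats listed with no format-specific restriction are assumed to inherit the general 1–8 channel range of the ac parameter.

```python
ALL_RATES = {8000, 11025, 12000, 16000, 22050, 24000, 32000,
             44100, 48000, 64000, 88200, 96000}

# (allowed sample rates in Hz, maximum channel count) per output format.
FORMAT_LIMITS = {
    "mp3": ({r for r in ALL_RATES if r <= 48000}, 2),
    "aac": (ALL_RATES, 8),
    "flac": (ALL_RATES, 8),
    "oga": (ALL_RATES, 8),
    "ac3": ({32000, 44100, 48000}, 6),
    "opus": ({8000, 12000, 16000, 24000, 48000}, 8),
    "amr": ({8000, 16000}, 1),
}


def check_constraints(fmt: str, ar: int, ac: int) -> list:
    """Return a list of human-readable problems; an empty list means compatible."""
    rates, max_channels = FORMAT_LIMITS[fmt]
    problems = []
    if ar not in rates:
        problems.append(f"{fmt} does not accept a {ar} Hz sample rate")
    if not 1 <= ac <= max_channels:
        problems.append(f"{fmt} accepts at most {max_channels} channel(s), got {ac}")
    return problems


print(check_constraints("amr", 44100, 2))   # two problems reported
print(check_constraints("opus", 48000, 2))  # []
```

For example, a 44.1 kHz stereo source targeted at AMR fails both checks, which suggests setting ar to 8000 or 16000 and ac to 1 explicitly rather than relying on the source values.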
Limitations
Audio transcoding does not support adjusting bit depth, except for FLAC output (use adepth). For bit depth control on video tracks, see Video transcoding.
Anonymous requests are denied. Authentication is required.
Only the Java, Python, and Go SDKs support asynchronous audio transcoding.
Use the RESTful API
All examples use the x-oss-async-process header to submit asynchronous transcoding tasks.
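The b_ and topic_ values in the requests that follow are URL-safe Base64 strings. To see what a value stands for, it can be decoded locally. The helper below is a minimal sketch with a hypothetical name, `decode_urlsafe`, that restores any stripped padding before decoding.

```python
import base64


def decode_urlsafe(value: str) -> str:
    """Decode URL-safe Base64, re-adding '=' padding if it was stripped."""
    padded = value + "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")


print(decode_urlsafe("b3V0YnVja2V0"))      # outbucket
print(decode_urlsafe("QXVkaW9Db252ZXJ0"))  # AudioConvert
```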
Convert MP3 to AAC
Transcode a 60-second clip of example.mp3 starting at the 10-second mark (ss_10000), output as AAC at 96 kbit/s, with a Simple Message Queue (SMQ) completion notification.
POST /example.mp3?x-oss-async-process HTTP/1.1
Host: video-demo.oss-cn-hangzhou.aliyuncs.com
Date: Fri, 28 Oct 2022 06:40:10 GMT
Authorization: OSS4-HMAC-SHA256 Credential=LTAI********************/20250417/cn-hangzhou/oss/aliyun_v4_request,Signature=a7c3554c729d71929e0b84489addee6b2e8d5cb48595adfc51868c299c0c218e

x-oss-async-process=audio/convert,ss_10000,t_60000,f_aac,ab_96000|sys/saveas,b_b3V0YnVja2V0,o_b3V0b2JqcHJlZml4LnthdXRvZXh0fQo/notify,topic_QXVkaW9Db252ZXJ0
Convert WAV to Opus
Transcode the entire example.wav file to Opus at 48 kHz, two channels, 96 kbit/s, saved to oss://outbucket/outobj.opus, with a Simple Message Queue (SMQ) completion notification.
POST /example.wav?x-oss-async-process HTTP/1.1
Host: video-demo.oss-cn-hangzhou.aliyuncs.com
Date: Fri, 28 Oct 2022 06:40:10 GMT
Authorization: OSS4-HMAC-SHA256 Credential=LTAI********************/20250417/cn-hangzhou/oss/aliyun_v4_request,Signature=a7c3554c729d71929e0b84489addee6b2e8d5cb48595adfc51868c299c0c218e

x-oss-async-process=audio/convert,f_opus,ab_96000,ar_48000,ac_2|sys/saveas,b_b3V0YnVja2V0,o_b3V0b2JqLnthdXRvZXh0fQo/notify,topic_QXVkaW9Db252ZXJ0
Use OSS SDKs
Java
Requires OSS SDK for Java V3.17.4 or later.
The following example transcodes src.mp3 to AAC format, starting at 10 seconds, for a duration of 60 seconds, at 96 kbit/s, and saves the result to dest.aac in the same bucket.
import com.aliyun.oss.ClientBuilderConfiguration;
import com.aliyun.oss.OSS;
import com.aliyun.oss.OSSClientBuilder;
import com.aliyun.oss.common.auth.CredentialsProviderFactory;
import com.aliyun.oss.common.auth.EnvironmentVariableCredentialsProvider;
import com.aliyun.oss.common.comm.SignVersion;
import com.aliyun.oss.model.AsyncProcessObjectRequest;
import com.aliyun.oss.model.AsyncProcessObjectResult;
import com.aliyuncs.exceptions.ClientException;
import java.util.Base64;

public class Demo {
    public static void main(String[] args) throws ClientException {
        // Endpoint of the region where the bucket is located
        String endpoint = "https://oss-cn-hangzhou.aliyuncs.com";
        // Region ID of the bucket
        String region = "cn-hangzhou";
        // Load credentials from environment variables OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET
        EnvironmentVariableCredentialsProvider credentialsProvider = CredentialsProviderFactory.newEnvironmentVariableCredentialsProvider();
        String bucketName = "examplebucket";
        String targetKey = "dest.aac";
        String sourceKey = "src.mp3";

        // Use the V4 signature algorithm
        ClientBuilderConfiguration clientBuilderConfiguration = new ClientBuilderConfiguration();
        clientBuilderConfiguration.setSignatureVersion(SignVersion.V4);
        OSS ossClient = OSSClientBuilder.create()
                .endpoint(endpoint)
                .credentialsProvider(credentialsProvider)
                .clientConfiguration(clientBuilderConfiguration)
                .region(region)
                .build();
        try {
            // Transcoding parameters: start at 10 s, duration 60 s, output AAC at 96 kbit/s
            String style = "audio/convert,ss_10000,t_60000,f_aac,ab_96000";
            // Base64-encode the bucket name and output object key for sys/saveas
            String bucketEncoded = Base64.getUrlEncoder().withoutPadding().encodeToString(bucketName.getBytes());
            String targetEncoded = Base64.getUrlEncoder().withoutPadding().encodeToString(targetKey.getBytes());
            String process = String.format("%s|sys/saveas,b_%s,o_%s/notify,topic_QXVkaW9Db252ZXJ0", style, bucketEncoded, targetEncoded);
            // Submit the asynchronous task
            AsyncProcessObjectRequest request = new AsyncProcessObjectRequest(bucketName, sourceKey, process);
            AsyncProcessObjectResult response = ossClient.asyncProcessObject(request);
            System.out.println("EventId: " + response.getEventId());
            System.out.println("RequestId: " + response.getRequestId());
            System.out.println("TaskId: " + response.getTaskId());
        } finally {
            ossClient.shutdown();
        }
    }
}
Python
Requires OSS SDK for Python V2.18.4 or later.
The following example transcodes src.mp3 to AAC format, starting at 10 seconds, for a duration of 60 seconds, at 96 kbit/s, and saves the result to dest.aac in the same bucket.
# -*- coding: utf-8 -*-
import base64
import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider


def main():
    # Load credentials from environment variables OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET.
    # oss2.Auth expects a key pair, so a provider-based V4 auth object is used here.
    auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())
    endpoint = 'https://oss-cn-hangzhou.aliyuncs.com'
    # Region ID of the bucket, required for V4 signatures
    region = 'cn-hangzhou'
    bucket = oss2.Bucket(auth, endpoint, 'examplebucket', region=region)
    source_key = 'src.mp3'
    target_key = 'dest.aac'
    # Transcoding parameters: start at 10 s, duration 60 s, output AAC at 96 kbit/s
    style = 'audio/convert,ss_10000,t_60000,f_aac,ab_96000'
    # Base64-encode the bucket name and output object key for sys/saveas
    bucket_name_encoded = base64.urlsafe_b64encode('examplebucket'.encode()).decode().rstrip('=')
    target_key_encoded = base64.urlsafe_b64encode(target_key.encode()).decode().rstrip('=')
    process = f"{style}|sys/saveas,b_{bucket_name_encoded},o_{target_key_encoded}/notify,topic_QXVkaW9Db252ZXJ0"
    try:
        # Submit the asynchronous task
        result = bucket.async_process_object(source_key, process)
        print(f"EventId: {result.event_id}")
        print(f"RequestId: {result.request_id}")
        print(f"TaskId: {result.task_id}")
    except Exception as e:
        print(f"Error: {e}")


if __name__ == "__main__":
    main()
Go
Requires OSS SDK for Go V3.0.2 or later.
The following example transcodes src.mp3 to AAC format, starting at 10 seconds, for a duration of 60 seconds, at 96 kbit/s, and saves the result to dest.aac in the same bucket.
package main

import (
	"encoding/base64"
	"fmt"
	"log"
	"os"

	"github.com/aliyun/aliyun-oss-go-sdk/oss"
)

func main() {
	// Load credentials from environment variables OSS_ACCESS_KEY_ID, OSS_ACCESS_KEY_SECRET, and OSS_SESSION_TOKEN
	provider, err := oss.NewEnvironmentVariableCredentialsProvider()
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}
	// Create an OSS client for the bucket's region
	client, err := oss.New("https://oss-cn-hangzhou.aliyuncs.com", "", "", oss.SetCredentialsProvider(&provider), oss.AuthVersion(oss.AuthV4), oss.Region("cn-hangzhou"))
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}
	bucketName := "examplebucket"
	bucket, err := client.Bucket(bucketName)
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}
	sourceKey := "src.mp3"
	targetKey := "dest.aac"
	// Transcoding parameters: start at 10 s, duration 60 s, output AAC at 96 kbit/s
	style := "audio/convert,ss_10000,t_60000,f_aac,ab_96000"
	// Base64-encode the bucket name and output object key for sys/saveas
	bucketNameEncoded := base64.URLEncoding.EncodeToString([]byte(bucketName))
	targetKeyEncoded := base64.URLEncoding.EncodeToString([]byte(targetKey))
	process := fmt.Sprintf("%s|sys/saveas,b_%v,o_%v/notify,topic_QXVkaW9Db252ZXJ0", style, bucketNameEncoded, targetKeyEncoded)
	// Submit the asynchronous task
	result, err := bucket.AsyncProcessObject(sourceKey, process)
	if err != nil {
		log.Fatalf("Failed to async process object: %s", err)
	}
	fmt.Printf("EventId: %s\n", result.EventId)
	fmt.Printf("RequestId: %s\n", result.RequestId)
	fmt.Printf("TaskId: %s\n", result.TaskId)
}
What's next
sys/saveas — save transcoded output to a specified OSS path
Use the notification feature — receive a callback when transcoding completes
Video transcoding — transcode video files and adjust parameters including bit depth