Audio transcoding converts audio files to your desired formats. This topic describes the parameters for audio transcoding and provides examples.
Use cases
- Convert music file formats to ensure compatibility with various playback devices.
- Optimize storage space by transcoding large, lossless audio files into highly compressed, lossy formats such as MP3.
- Adapt to various network conditions for online streaming services by transcoding audio into multiple bitrates to ensure smooth playback on low-bandwidth connections.
- Prepare audio assets for video production and post-production by transcoding them to match project requirements or distribution standards.
Usage notes
- Audio transcoding supports only asynchronous processing, invoked using the x-oss-async-process method.
- Before you use audio transcoding, you must associate an Intelligent Media Management (IMM) project with an OSS bucket. For more information, see Quick start and AttachOSSBucket.
- Anonymous access will be denied.
- You must have the required permissions to use the feature. For more information, see permissions.
- Transcoding may fail if the default sampling rate or number of audio channels is incompatible with the target container format.
- For audio transcoding, you can set the sampling bit depth only for the flac format. For video transcoding, you can set the bit depth by using the pixfmt parameter of the OSS x-oss-process action. For more information, see Video transcoding.
Parameters
Action: audio/convert
The following table describes the parameters.
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| ss | int | No | The start time of transcoding, in milliseconds. Valid values: |
| t | int | No | The transcoding duration, in milliseconds. Valid values: |
| f | string | Yes | The output container format. Valid values: |
| ar | int | No | The output audio sampling rate. By default, the sampling rate of the source audio is used. Valid values: Note: Supported sampling rates vary by format. MP3 supports sampling rates up to 48 kHz. Opus supports 8 kHz, 12 kHz, 16 kHz, 24 kHz, and 48 kHz. AC3 supports 32 kHz, 44.1 kHz, and 48 kHz. AMR supports only 8 kHz and 16 kHz. |
| ac | int | No | The number of output audio channels. By default, the number of channels in the source audio is used. Value range: 1 to 8. Note: The supported number of audio channels varies by format. MP3 supports only mono and stereo. AC3 supports up to 6 channels (5.1). AMR supports only mono. |
| aq | int | No | The audio quality. This parameter is mutually exclusive with the ab parameter. Value range: 0 to 100. |
| ab | int | No | The audio bitrate, in bits per second (bps). This parameter is mutually exclusive with the aq parameter. Value range: 1,000 to 10,000,000. |
| abopt | string | No | The audio bitrate option. Valid values: |
| adepth | int | No | The sampling bit depth of the output audio. Valid values: 16 and 24. Note: This parameter is valid only when f is set to flac. |
The sys/saveas and notify parameters are also used for audio transcoding. For more information, see sys/saveas and Notifications.
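The constraints in the table can be checked on the client side before a request is sent. The following is a minimal sketch under stated assumptions, not part of any OSS SDK: `build_audio_convert` and the lookup tables are hypothetical names, the per-format limits are copied from the notes in the table, and formats without a note are not checked. The abopt parameter is left out because its valid values are not listed here.

```python
# Per-format limits copied from the notes in the parameter table.
# Formats that are not listed are not checked by this sketch.
FORMAT_SAMPLE_RATES = {
    "opus": {8000, 12000, 16000, 24000, 48000},
    "ac3": {32000, 44100, 48000},
    "amr": {8000, 16000},
}
FORMAT_MAX_SAMPLE_RATE = {"mp3": 48000}
FORMAT_MAX_CHANNELS = {"mp3": 2, "ac3": 6, "amr": 1}


def build_audio_convert(f, ss=None, t=None, ar=None, ac=None,
                        aq=None, ab=None, adepth=None):
    """Return an audio/convert style string, or raise ValueError on a conflict."""
    if aq is not None and ab is not None:
        raise ValueError("aq and ab are mutually exclusive")
    if aq is not None and not 0 <= aq <= 100:
        raise ValueError("aq must be between 0 and 100")
    if ab is not None and not 1000 <= ab <= 10_000_000:
        raise ValueError("ab must be between 1,000 and 10,000,000 bps")
    if ac is not None:
        if not 1 <= ac <= 8:
            raise ValueError("ac must be between 1 and 8")
        if ac > FORMAT_MAX_CHANNELS.get(f, 8):
            raise ValueError(f"{f} does not support {ac} channels")
    if ar is not None:
        allowed = FORMAT_SAMPLE_RATES.get(f)
        if allowed is not None and ar not in allowed:
            raise ValueError(f"{f} does not support a {ar} Hz sampling rate")
        if ar > FORMAT_MAX_SAMPLE_RATE.get(f, ar):
            raise ValueError(f"{f} does not support a {ar} Hz sampling rate")
    if adepth is not None:
        if f != "flac":
            raise ValueError("adepth is valid only when f is flac")
        if adepth not in (16, 24):
            raise ValueError("adepth must be 16 or 24")

    # Emit only the parameters that were set, as name_value pairs.
    parts = ["audio/convert"]
    for name, value in (("ss", ss), ("t", t), ("f", f), ("ar", ar),
                        ("ac", ac), ("aq", aq), ("ab", ab), ("adepth", adepth)):
        if value is not None:
            parts.append(f"{name}_{value}")
    return ",".join(parts)


style = build_audio_convert("aac", ss=10000, t=60000, ab=96000)
print(style)
```

Failing early with a `ValueError` is a design choice of this sketch: because processing is asynchronous, a rejected combination such as `build_audio_convert("amr", ac=2)` is cheaper to catch locally than after a job has been submitted.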
Use the REST API
Convert MP3 to AAC
Job configuration
- Container format: mp3 to aac
- Source file: example.mp3
- Transcoding duration: 60,000 milliseconds, starting from the 10,000th millisecond.
- Audio settings: Keep the original sampling rate and number of audio channels, and set the audio bitrate to 96 Kbps.
- Completion notification: Send a message using MNS.
Sample request
// Transcode the audio file named example.mp3.
POST /example.mp3?x-oss-async-process HTTP/1.1
Host: video-demo.oss-cn-hangzhou.aliyuncs.com
Date: Fri, 28 Oct 2022 06:40:10 GMT
Authorization: OSS4-HMAC-SHA256 Credential=LTAI********************/20250417/cn-hangzhou/oss/aliyun_v4_request,Signature=a7c3554c729d71929e0b84489addee6b2e8d5cb48595adfc51868c299c0c218e
x-oss-async-process=audio/convert,ss_10000,t_60000,f_aac,ab_96000|sys/saveas,b_b3V0YnVja2V0,o_b3V0b2JqcHJlZml4LnthdXRvZXh0fQo/notify,topic_QXVkaW9Db252ZXJ0
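The b_, o_, and topic_ values in the request body are URL-safe Base64 strings. To see what a given request will write and where it will notify, the fields can be decoded locally. This is a minimal sketch using only the Python standard library; `decode_field` and `decode_process` are hypothetical helper names, and the parsing assumes the `style|sys/saveas,.../notify,...` layout shown in the sample request.

```python
import base64


def decode_field(value: str) -> str:
    """Decode a URL-safe Base64 value, restoring any stripped '=' padding."""
    padded = value + "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")


def decode_process(process: str) -> dict:
    """Return the decoded sys/saveas and notify fields of a process string."""
    _style, _, tail = process.partition("|")
    saveas, _, notify = tail.partition("/notify,")
    fields = {}
    # Skip the leading "sys/saveas" token, then read name_value pairs.
    for item in saveas.split(",")[1:] + notify.split(","):
        if not item:
            continue
        # Split on the first underscore only: Base64 values may contain "_".
        name, _, value = item.partition("_")
        fields[name] = decode_field(value)
    return fields


process = (
    "audio/convert,ss_10000,t_60000,f_aac,ab_96000"
    "|sys/saveas,b_b3V0YnVja2V0,o_b3V0b2JqcHJlZml4LnthdXRvZXh0fQo"
    "/notify,topic_QXVkaW9Db252ZXJ0"
)
fields = decode_process(process)
print(fields)
```

For the sample request, the bucket decodes to outbucket and the topic to AudioConvert. The o_ value decodes to outobjprefix.{autoext} followed by a line break, so the value appears to have been encoded with a trailing newline; the SDK samples encode the object name without one.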
Convert WAV to Opus
Job configuration
- Container format: wav to opus
- Transcoding duration: The entire audio file.
- Audio settings: 48 kHz sampling rate, two channels, and 96 Kbps audio bitrate.
- Output path: oss://outbucket/outobject.opus
- Completion notification: Send a message using MNS.
Sample request
// Transcode the audio file named example.wav.
POST /example.wav?x-oss-async-process HTTP/1.1
Host: video-demo.oss-cn-hangzhou.aliyuncs.com
Date: Fri, 28 Oct 2022 06:40:10 GMT
Authorization: OSS4-HMAC-SHA256 Credential=LTAI********************/20250417/cn-hangzhou/oss/aliyun_v4_request,Signature=a7c3554c729d71929e0b84489addee6b2e8d5cb48595adfc51868c299c0c218e
x-oss-async-process=audio/convert,f_opus,ab_96000,ar_48000,ac_2|sys/saveas,b_b3V0YnVja2V0,o_b3V0b2JqLnthdXRvZXh0fQo/notify,topic_QXVkaW9Db252ZXJ0
Use SDKs
You can use the Java, Python, and Go SDKs to perform asynchronous audio transcoding.
Java
Requires SDK for Java 3.17.4 or later.
import com.aliyun.oss.ClientBuilderConfiguration;
import com.aliyun.oss.OSS;
import com.aliyun.oss.OSSClientBuilder;
import com.aliyun.oss.common.auth.CredentialsProviderFactory;
import com.aliyun.oss.common.auth.EnvironmentVariableCredentialsProvider;
import com.aliyun.oss.common.comm.SignVersion;
import com.aliyun.oss.model.AsyncProcessObjectRequest;
import com.aliyun.oss.model.AsyncProcessObjectResult;
import com.aliyuncs.exceptions.ClientException;

import java.util.Base64;

public class Demo {
    public static void main(String[] args) throws ClientException {
        // yourEndpoint: The endpoint of the region where your bucket is located.
        String endpoint = "https://oss-cn-hangzhou.aliyuncs.com";
        // The region ID for the endpoint, for example, cn-hangzhou.
        String region = "cn-hangzhou";
        // Obtain access credentials from environment variables. Before you run this sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured.
        EnvironmentVariableCredentialsProvider credentialsProvider = CredentialsProviderFactory.newEnvironmentVariableCredentialsProvider();
        // Specify the bucket name.
        String bucketName = "examplebucket";
        // Specify the transcoded audio file.
        String targetKey = "dest.aac";
        // Specify the source audio file.
        String sourceKey = "src.mp3";

        // Create an OSSClient instance.
        // When the OSSClient instance is no longer used, call the shutdown method to release resources.
        ClientBuilderConfiguration clientBuilderConfiguration = new ClientBuilderConfiguration();
        clientBuilderConfiguration.setSignatureVersion(SignVersion.V4);
        OSS ossClient = OSSClientBuilder.create()
                .endpoint(endpoint)
                .credentialsProvider(credentialsProvider)
                .clientConfiguration(clientBuilderConfiguration)
                .region(region)
                .build();

        try {
            // Build the audio processing style string and audio transcoding parameters.
            String style = "audio/convert,ss_10000,t_60000,f_aac,ab_96000";
            // Build the asynchronous processing instruction.
            String bucketEncoded = Base64.getUrlEncoder().withoutPadding().encodeToString(bucketName.getBytes());
            String targetEncoded = Base64.getUrlEncoder().withoutPadding().encodeToString(targetKey.getBytes());
            String process = String.format("%s|sys/saveas,b_%s,o_%s/notify,topic_QXVkaW9Db252ZXJ0", style, bucketEncoded, targetEncoded);
            // Create an AsyncProcessObjectRequest object.
            AsyncProcessObjectRequest request = new AsyncProcessObjectRequest(bucketName, sourceKey, process);
            // Execute the asynchronous processing task.
            AsyncProcessObjectResult response = ossClient.asyncProcessObject(request);
            System.out.println("EventId: " + response.getEventId());
            System.out.println("RequestId: " + response.getRequestId());
            System.out.println("TaskId: " + response.getTaskId());
        } finally {
            // Shut down the OSSClient.
            ossClient.shutdown();
        }
    }
}
Python
Requires SDK for Python 2.18.4 or later.
# -*- coding: utf-8 -*-
import base64

import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider


def main():
    # Obtain access credentials from environment variables. Before you run this sample code, make sure that the environment variables are configured.
    auth = oss2.ProviderAuth(EnvironmentVariableCredentialsProvider())
    # The endpoint of the region where the bucket is located. For example, for China (Hangzhou), use https://oss-cn-hangzhou.aliyuncs.com.
    endpoint = 'https://oss-cn-hangzhou.aliyuncs.com'
    # Specify the bucket name, for example, examplebucket.
    bucket = oss2.Bucket(auth, endpoint, 'examplebucket')
    # Specify the name of the source audio file.
    source_key = 'src.mp3'
    # Specify the name of the transcoded audio file.
    target_key = 'dest.aac'
    # Build the audio processing style string and audio transcoding parameters.
    audio_style = 'audio/convert,ss_10000,t_60000,f_aac,ab_96000'
    # Build the processing instruction, including the save path and the Base64-encoded bucket name and target object name.
    bucket_name_encoded = base64.urlsafe_b64encode('examplebucket'.encode()).decode().rstrip('=')
    target_key_encoded = base64.urlsafe_b64encode(target_key.encode()).decode().rstrip('=')
    process = f"{audio_style}|sys/saveas,b_{bucket_name_encoded},o_{target_key_encoded}/notify,topic_QXVkaW9Db252ZXJ0"
    try:
        # Execute the asynchronous processing task.
        result = bucket.async_process_object(source_key, process)
        print(f"EventId: {result.event_id}")
        print(f"RequestId: {result.request_id}")
        print(f"TaskId: {result.task_id}")
    except Exception as e:
        print(f"Error: {e}")


if __name__ == "__main__":
    main()
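For the multi-bitrate streaming use case, the same source object can be submitted once per target bitrate, each job with its own output name. The following is a hedged sketch that only builds the process strings and makes no network calls. `build_bitrate_ladder`, the `dest_<n>k` naming scheme, and the chosen bitrates are illustrative assumptions, and the notify action is omitted on the assumption that it is optional.

```python
import base64


def b64(value: str) -> str:
    """URL-safe Base64 without '=' padding, as in the Python sample above."""
    return base64.urlsafe_b64encode(value.encode()).decode().rstrip("=")


def build_bitrate_ladder(bucket_name, target_stem, bitrates, fmt="aac"):
    """Map each output object name to the process string that produces it."""
    processes = {}
    for ab in bitrates:
        # Hypothetical naming scheme: dest_64k.aac, dest_96k.aac, ...
        target_key = f"{target_stem}_{ab // 1000}k.{fmt}"
        processes[target_key] = (
            f"audio/convert,f_{fmt},ab_{ab}"
            f"|sys/saveas,b_{b64(bucket_name)},o_{b64(target_key)}"
        )
    return processes


ladder = build_bitrate_ladder("examplebucket", "dest", [64000, 96000, 128000])
for target_key, process in ladder.items():
    print(target_key, "->", process)
    # With an oss2.Bucket created as in the sample above, each job would be
    # submitted with: bucket.async_process_object("src.mp3", process)
```

Each submission returns its own task ID, so the jobs complete independently of one another.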
Go
Requires SDK for Go 3.0.2 or later.
package main

import (
	"encoding/base64"
	"fmt"
	"log"
	"os"

	"github.com/aliyun/aliyun-oss-go-sdk/oss"
)

func main() {
	// Obtain temporary access credentials from environment variables. Before you run this sample code, make sure that the OSS_ACCESS_KEY_ID, OSS_ACCESS_KEY_SECRET, and OSS_SESSION_TOKEN environment variables are configured.
	provider, err := oss.NewEnvironmentVariableCredentialsProvider()
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}
	// Create an OSSClient instance.
	// yourEndpoint: The endpoint of the region where your bucket is located. For example, for China (Hangzhou), use https://oss-cn-hangzhou.aliyuncs.com. Set this parameter based on the actual region.
	// yourRegion: The region ID, for example, cn-hangzhou.
	client, err := oss.New("https://oss-cn-hangzhou.aliyuncs.com", "", "", oss.SetCredentialsProvider(&provider), oss.AuthVersion(oss.AuthV4), oss.Region("cn-hangzhou"))
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}
	// Specify the bucket name, for example, examplebucket.
	bucketName := "examplebucket"
	bucket, err := client.Bucket(bucketName)
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}
	// Specify the name of the source audio file.
	sourceKey := "src.mp3"
	// Specify the name of the transcoded audio file.
	targetKey := "dest.aac"
	// Build the audio processing style string and audio transcoding parameters.
	audioStyle := "audio/convert,ss_10000,t_60000,f_aac,ab_96000"
	// Build the processing instruction, including the save path and the Base64-encoded bucket name and target object name.
	bucketNameEncoded := base64.URLEncoding.EncodeToString([]byte(bucketName))
	targetKeyEncoded := base64.URLEncoding.EncodeToString([]byte(targetKey))
	process := fmt.Sprintf("%s|sys/saveas,b_%v,o_%v/notify,topic_QXVkaW9Db252ZXJ0", audioStyle, bucketNameEncoded, targetKeyEncoded)
	// Execute the asynchronous processing task.
	result, err := bucket.AsyncProcessObject(sourceKey, process)
	if err != nil {
		log.Fatalf("Failed to async process object: %s", err)
	}
	fmt.Printf("EventId: %s\n", result.EventId)
	fmt.Printf("RequestId: %s\n", result.RequestId)
	fmt.Printf("TaskId: %s\n", result.TaskId)
}