The video concatenation feature lets you combine multiple videos into a single video and convert it to a specified format.
Feature introduction
Video merging is the capability to combine multiple video clips into a complete video and convert it to the required format.

Scenarios
Film production: In the production process of movies, TV series, and short films, video merging is one of the core steps that helps editors integrate different shots and scenes to build a complete narrative structure.
Content creation: On short video social media platforms, content creators often use video merging technologies to produce vlogs, tutorials, or themed videos, enhancing the attractiveness and visibility of their content.
Education and training: Teachers and trainers can create instructional videos by merging different video clips to combine theory and practice, thereby promoting student understanding and learning.
Sporting event playback: In sports broadcasts, video merging technologies are used to produce highlight reels to help the audience review exciting moments in the event.
How to use
Prerequisites
The Intelligent Media Management (IMM) service is activated. For more information, see Activate a product.
An IMM project is attached. To attach a project in the Object Storage Service (OSS) console, see Step 1: Attach an IMM project. To attach a project by calling an API operation, see AttachOSSBucket - Attach an OSS bucket.
Concatenate videos
Video concatenation is available only through asynchronous processing, and only the Java, Python, and Go SDKs support it.
Java
Use Java SDK 3.17.4 or later.
import com.aliyun.oss.ClientBuilderConfiguration;
import com.aliyun.oss.OSS;
import com.aliyun.oss.OSSClientBuilder;
import com.aliyun.oss.common.auth.CredentialsProviderFactory;
import com.aliyun.oss.common.auth.EnvironmentVariableCredentialsProvider;
import com.aliyun.oss.common.comm.SignVersion;
import com.aliyun.oss.model.AsyncProcessObjectRequest;
import com.aliyun.oss.model.AsyncProcessObjectResult;
import com.aliyuncs.exceptions.ClientException;

import java.util.Base64;

public class Demo {
    public static void main(String[] args) throws ClientException {
        // Replace yourEndpoint with the Endpoint of the region where the bucket is located.
        String endpoint = "https://oss-cn-hangzhou.aliyuncs.com";
        // Specify a common Alibaba Cloud region ID, such as cn-hangzhou.
        String region = "cn-hangzhou";
        // Obtain access credentials from environment variables. Before you run the sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured.
        EnvironmentVariableCredentialsProvider credentialsProvider = CredentialsProviderFactory.newEnvironmentVariableCredentialsProvider();
        // Specify the bucket name.
        String bucketName = "examplebucket";
        // Specify the name of the concatenated video file.
        String targetObject = "dest.mp4";
        // Specify the name of the source video file.
        String sourceVideo = "src.mp4";
        // Specify the names of the video files to concatenate.
        String video1 = "concat1.mp4";
        String video2 = "concat2.mp4";

        // Create an OSSClient instance.
        // When the OSSClient instance is no longer used, call the shutdown method to release resources.
        ClientBuilderConfiguration clientBuilderConfiguration = new ClientBuilderConfiguration();
        clientBuilderConfiguration.setSignatureVersion(SignVersion.V4);
        OSS ossClient = OSSClientBuilder.create()
                .endpoint(endpoint)
                .credentialsProvider(credentialsProvider)
                .clientConfiguration(clientBuilderConfiguration)
                .region(region)
                .build();

        try {
            // Encode the video file names.
            String video1Encoded = Base64.getUrlEncoder().withoutPadding().encodeToString(video1.getBytes());
            String video2Encoded = Base64.getUrlEncoder().withoutPadding().encodeToString(video2.getBytes());
            // Build the video processing style string and the video concatenation parameters.
            String style = String.format("video/concat,ss_0,f_mp4,vcodec_h264,fps_25,vb_1000000,acodec_aac,ab_96000,ar_48000,ac_2,align_1/pre,o_%s/sur,o_%s,t_0", video1Encoded, video2Encoded);
            // Build the asynchronous processing instruction.
            String bucketEncoded = Base64.getUrlEncoder().withoutPadding().encodeToString(bucketName.getBytes());
            String targetEncoded = Base64.getUrlEncoder().withoutPadding().encodeToString(targetObject.getBytes());
            String process = String.format("%s|sys/saveas,b_%s,o_%s/notify,topic_QXVkaW9Db252ZXJ0", style, bucketEncoded, targetEncoded);
            // Create an AsyncProcessObjectRequest object.
            AsyncProcessObjectRequest request = new AsyncProcessObjectRequest(bucketName, sourceVideo, process);
            // Execute the asynchronous processing task.
            AsyncProcessObjectResult response = ossClient.asyncProcessObject(request);
            System.out.println("EventId: " + response.getEventId());
            System.out.println("RequestId: " + response.getRequestId());
            System.out.println("TaskId: " + response.getTaskId());
        } finally {
            // Shut down the OSSClient.
            ossClient.shutdown();
        }
    }
}
Python
Use Python SDK 2.18.4 or later.
# -*- coding: utf-8 -*-
import base64

import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider


def main():
    # Obtain access credentials from environment variables. Before you run the sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured.
    auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())
    # Replace the Endpoint with the one for the region where the bucket is located. For example, if the bucket is in the China (Hangzhou) region, set the Endpoint to https://oss-cn-hangzhou.aliyuncs.com.
    endpoint = 'https://oss-cn-hangzhou.aliyuncs.com'
    # Specify a common Alibaba Cloud region ID, such as cn-hangzhou.
    region = 'cn-hangzhou'
    # Specify the bucket name, such as examplebucket.
    bucket_name = 'examplebucket'
    bucket = oss2.Bucket(auth, endpoint, bucket_name, region=region)
    # Specify the name of the concatenated video.
    target_object = 'out.mp4'
    # Specify the name of the source video file.
    source_video = 'emrfinal.mp4'
    # Specify the names of the video files to concatenate.
    video1 = 'osshdfs.mp4'
    video2 = 'product.mp4'

    # Build the video processing style string and the video concatenation parameters.
    video1_encoded = base64.urlsafe_b64encode(video1.encode()).decode().rstrip('=')
    video2_encoded = base64.urlsafe_b64encode(video2.encode()).decode().rstrip('=')
    style = f"video/concat,ss_0,f_mp4,vcodec_h264,fps_25,vb_1000000,acodec_aac,ab_96000,ar_48000,ac_2,align_1/pre,o_{video1_encoded}/sur,o_{video2_encoded},t_0"
    # Build the asynchronous processing instruction.
    bucket_encoded = base64.urlsafe_b64encode(bucket_name.encode()).decode().rstrip('=')
    target_encoded = base64.urlsafe_b64encode(target_object.encode()).decode().rstrip('=')
    process = f"{style}|sys/saveas,b_{bucket_encoded},o_{target_encoded}/notify,topic_QXVkaW9Db252ZXJ0"
    print(process)

    # Execute the asynchronous processing task.
    try:
        result = bucket.async_process_object(source_video, process)
        print(f"EventId: {result.event_id}")
        print(f"RequestId: {result.request_id}")
        print(f"TaskId: {result.task_id}")
    except Exception as e:
        print(f"Error: {e}")


if __name__ == "__main__":
    main()
Go
Use Go SDK 3.0.2 or later.
package main

import (
	"encoding/base64"
	"fmt"
	"os"
	"strings"

	"github.com/aliyun/aliyun-oss-go-sdk/oss"
)

func main() {
	// Obtain access credentials from environment variables. Before you run the sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured.
	provider, err := oss.NewEnvironmentVariableCredentialsProvider()
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}
	// Create an OSSClient instance.
	// Replace yourEndpoint with the Endpoint of the bucket. For example, if the bucket is in the China (Hangzhou) region, set the Endpoint to https://oss-cn-hangzhou.aliyuncs.com. For other regions, specify the actual Endpoint.
	// Replace yourRegion with a common Alibaba Cloud region ID, such as cn-hangzhou.
	client, err := oss.New("yourEndpoint", "", "", oss.SetCredentialsProvider(&provider), oss.AuthVersion(oss.AuthV4), oss.Region("yourRegion"))
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}
	// Specify the bucket name, such as examplebucket.
	bucketName := "examplebucket"
	bucket, err := client.Bucket(bucketName)
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}
	// Specify the name of the concatenated video file.
	targetObject := "dest.mp4"
	// Specify the name of the source video file.
	sourceVideo := "src.mp4"
	// Specify the names of the video files to concatenate.
	video1 := "concat1.mp4"
	video2 := "concat2.mp4"
	// Build the video processing style string and the video concatenation parameters.
	style := fmt.Sprintf("video/concat,ss_0,f_mp4,vcodec_h264,fps_25,vb_1000000,acodec_aac,ab_96000,ar_48000,ac_2,align_1/pre,o_%s/sur,o_%s,t_0", strings.TrimRight(base64.URLEncoding.EncodeToString([]byte(video1)), "="), strings.TrimRight(base64.URLEncoding.EncodeToString([]byte(video2)), "="))
	// Build the asynchronous processing instruction.
	process := fmt.Sprintf("%s|sys/saveas,b_%v,o_%v/notify,topic_QXVkaW9Db252ZXJ0", style, strings.TrimRight(base64.URLEncoding.EncodeToString([]byte(bucketName)), "="), strings.TrimRight(base64.URLEncoding.EncodeToString([]byte(targetObject)), "="))
	fmt.Printf("%#v\n", process)
	// Execute the asynchronous processing task.
	rs, err := bucket.AsyncProcessObject(sourceVideo, process)
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}
	fmt.Printf("EventId:%s\n", rs.EventId)
	fmt.Printf("RequestId:%s\n", rs.RequestId)
	fmt.Printf("TaskId:%s\n", rs.TaskId)
}
Asynchronous processing requests do not return processing results. To obtain the results of an asynchronous task, use Simple Message Queue (SMQ), formerly known as MNS. For more information, see Message notifications.
Parameter description
Action: video/concat
The following table describes the parameters.
Concatenation parameters
The video/concat operation concatenates videos in the order that pre and sur appear in the request string. Details are as follows:
/pre: The video file to concatenate at the beginning.
/sur: The video file to concatenate at the end.
Parameter | Type | Required | Description |
ss | int | No | The start time for concatenating the prefix or suffix video, in milliseconds. Valid values:
|
t | int | No | The duration for concatenating the prefix or suffix video, in milliseconds. Valid values:
|
o | string | Yes | The OSS object in the current bucket. The object name must be Base64 URL-safe encoded. |
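As a minimal sketch of how one /pre or /sur segment could be assembled from the parameters above: the helper names `encode_object_name`, `decode_object_name`, and `concat_segment` are hypothetical and are not part of any SDK. The encoding mirrors what the SDK samples in this section do (URL-safe Base64 with the trailing `=` padding removed).

```python
import base64
from typing import Optional


def encode_object_name(name: str) -> str:
    """URL-safe Base64 without '=' padding, as in the SDK samples."""
    raw = base64.urlsafe_b64encode(name.encode("utf-8"))
    return raw.decode("ascii").rstrip("=")


def decode_object_name(encoded: str) -> str:
    """Reverse of encode_object_name; restores the stripped padding first."""
    padded = encoded + "=" * (-len(encoded) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")


def concat_segment(kind: str, object_name: str,
                   ss: Optional[int] = None, t: Optional[int] = None) -> str:
    """Build one /pre or /sur segment; ss and t are in milliseconds."""
    if kind not in ("pre", "sur"):
        raise ValueError("kind must be 'pre' or 'sur'")
    parts = [f"/{kind}", f"o_{encode_object_name(object_name)}"]
    if ss is not None:
        parts.append(f"ss_{ss}")
    if t is not None:
        parts.append(f"t_{t}")
    return ",".join(parts)


if __name__ == "__main__":
    print(concat_segment("pre", "concat1.mp4"))
    print(concat_segment("sur", "concat2.mp4", t=0))
```

Decoding is useful mainly for debugging: it lets you confirm which object an `o_` value in an existing request string points to.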
Transcoding parameters
Parameter | Type | Required | Description |
ss | int | No | The transcoding start time for the video being concatenated, in milliseconds. Valid values:
|
t | int | No | The transcoding duration for the video being concatenated, in milliseconds. Valid values:
|
f | string | Yes | The video container. Valid values:
|
vn | int | No | Specifies whether to disable the video stream. Valid values:
|
vcodec | string | Yes | The video codec (encoding format). Valid values:
Note The mxf and flv formats do not support H.265. |
fps | float | No | The video frame rate. By default, this parameter is the same as that of the source video specified by align. The value range is 0 to 240. |
fpsopt | int | No | The video frame rate option. Valid values:
Note This parameter must be configured with fps. |
pixfmt | string | No | The pixel sampling format. By default, this parameter is the same as that of the source video specified by align. Valid values:
|
s | string | No | The resolution.
|
sopt | int | No | The resolution option. Valid values:
Note This parameter must be configured with s. |
scaletype | string | No | The scaling method. Valid values:
|
arotate | int | No | The automatic rotation of resolution direction. Valid values:
|
g | int | No | The Group of Pictures (GOP) size. The default value is 150. The value range is 1 to 100000. |
vb | int | No | The video bitrate, in bits per second (bps). The value range is 10000 to 100000000. Note This parameter is mutually exclusive with crf. They represent different bitrate control algorithms. If neither is set, the video is encoded at the default bitrate for the output resolution. |
vbopt | int | No | The video bitrate option. Valid values:
Note This parameter must be configured with vb. |
crf | float | No | The constant rate factor. The value range is 0 to 51. A larger value indicates lower image quality. We recommend a value from 18 to 38. |
maxrate | int | No | The maximum bitrate, in bits per second (bps). The default value is 0. The value range is 10000 to 100000000. Note This parameter must be configured with crf. |
bufsize | int | No | The buffer size, in bits. The default value is 0. The value range is 10000 to 200000000. Note This parameter must be configured with crf. |
an | int | No | Specifies whether to disable the audio stream. Valid values:
|
acodec | string | Yes | The audio codec (encoding format). Valid values:
Note mp4 does not support pcm. mov does not support flac or opus. asf does not support opus. avi does not support opus. mxf supports only pcm. ts does not support flac, vorbis, amr, or pcm. flv does not support flac, vorbis, amr, opus, or pcm. |
ar | int | No | The audio sampling rate. By default, this parameter is the same as that of the source video specified by align. Valid values:
Note The supported sample rates vary by format. mp3 supports only 48 kHz and lower. opus supports 8 kHz, 12 kHz, 16 kHz, 24 kHz, and 48 kHz. ac3 supports 32 kHz, 44.1 kHz, and 48 kHz. amr supports only 8 kHz and 16 kHz. |
ac | int | No | The number of sound channels. By default, this parameter is the same as that of the source video specified by align. The value range is 1 to 8. Note The supported number of channels varies by format. mp3 supports only mono and stereo. ac3 supports up to 6 channels (5.1). amr supports only mono. |
aq | int | No | The audio compression quality. The value range is 0 to 100. Note This parameter is mutually exclusive with ab. If neither is set, the audio is encoded at the default bitrate of the encoder. |
ab | int | No | The audio bitrate, in bits per second (bps). The value range is 1000 to 10000000. |
abopt | string | No | The audio bitrate option. Valid values:
Note This parameter must be configured with ab. |
align | int | No | The ordinal number of the main video file (which provides the default transcoding parameters) in the concatenation list. The default value is 0, which aligns with the first video file in the concatenation list. |
adepth | int | No | The audio sampling bit depth. Valid values: 16 or 24. Note This parameter is valid only when acodec is set to flac. |
Video concatenation also uses the sys/saveas and notify parameters. For more information, see Save as and Message notifications.
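Several constraints in the transcoding parameter table (required keys, value ranges, parameters that must be configured together, and mutually exclusive pairs) can be checked locally before a request is sent. The following is a rough sketch only: `check_transcode_params` is a hypothetical helper, the rules are copied from the table above, and it does not check the enumerated values of parameters such as f, vcodec, or pixfmt.

```python
from typing import Any, Dict, List

# (min, max) ranges taken from the transcoding parameter table.
RANGES = {
    "fps": (0, 240),
    "g": (1, 100000),
    "vb": (10000, 100000000),
    "crf": (0, 51),
    "ac": (1, 8),
    "aq": (0, 100),
    "ab": (1000, 10000000),
}
# Parameter -> parameter it must be configured with.
REQUIRES = {
    "fpsopt": "fps",
    "sopt": "s",
    "vbopt": "vb",
    "abopt": "ab",
    "maxrate": "crf",
    "bufsize": "crf",
}
# Pairs that the table marks as mutually exclusive.
EXCLUSIVE = [("vb", "crf"), ("aq", "ab")]


def check_transcode_params(params: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key in ("f", "vcodec", "acodec"):
        if key not in params:
            problems.append(f"{key} is required")
    for key, (low, high) in RANGES.items():
        if key in params and not (low <= params[key] <= high):
            problems.append(f"{key} must be between {low} and {high}")
    for key, needed in REQUIRES.items():
        if key in params and needed not in params:
            problems.append(f"{key} must be configured with {needed}")
    for first, second in EXCLUSIVE:
        if first in params and second in params:
            problems.append(f"{first} and {second} are mutually exclusive")
    if "adepth" in params:
        if params["adepth"] not in (16, 24):
            problems.append("adepth must be 16 or 24")
        if params.get("acodec") != "flac":
            problems.append("adepth is valid only when acodec is flac")
    return problems


if __name__ == "__main__":
    sample = {"f": "mp4", "vcodec": "h264", "fps": 25, "vb": 1000000,
              "acodec": "aac", "ab": 96000, "ar": 48000, "ac": 2, "align": 1}
    print(check_transcode_params(sample))
```

A check like this only catches mistakes that the table makes explicit; the service remains the authority on whether a request is valid.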
Media sharding parameters
/segment: Sharding parameters
Parameter | Type | Required | Description |
f | string | Yes | The sharding format. Valid values:
|
t | int | Yes | The shard length, in milliseconds. The value range is 0 to 3600000. |
Media sharding supports only mp4 and ts containers.
Related API operations
Concatenate videos into an MP4 file
Concatenation information
Source video names: pre.mov, example.mkv, sur.mov
Concatenation duration and order:
Video name | Order | Duration |
pre.mov | 1 | Entire video |
example.mkv | 2 | From the 10th second to the end |
sur.mov | 3 | From the beginning to the 10th second |
Transcoding completion notification: An SMQ (formerly MNS) message is sent.
Concatenated video information
Video codec: H.264
Video frame rate: 25 fps
Video bitrate: 1 Mbps
Audio codec: AAC
Audio configuration: 48 kHz sample rate, two channels
Audio bitrate: 96 Kbps
File storage path
MP4 file: oss://outbucket/outobj.mp4
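The scenario above can be mapped onto a process string roughly as follows. This is an illustrative sketch, not the service's reference implementation: `b64` and `build_process` are hypothetical helpers, and treating the notify topic as the Base64-encoded name `AudioConvert` is an inference from the topic value used in the samples on this page. Because the encoded names are computed directly from the object names listed in the scenario, they may not match the sample request character for character.

```python
import base64


def b64(text: str) -> str:
    """URL-safe Base64 without '=' padding."""
    return base64.urlsafe_b64encode(text.encode("utf-8")).decode("ascii").rstrip("=")


def build_process() -> str:
    # Main video (example.mkv, the object the request is sent to):
    # start at 10 s, transcode to H.264/AAC in an MP4 container.
    # align_1 makes the second item in the list (the main video) the
    # source of default transcoding parameters.
    main = ("video/concat,ss_10000,f_mp4,vcodec_h264,fps_25,vb_1000000,"
            "acodec_aac,ab_96000,ar_48000,ac_2,align_1")
    # pre.mov is prepended in full.
    pre = f"/pre,o_{b64('pre.mov')}"
    # Only the first 10 s of sur.mov are appended.
    sur = f"/sur,o_{b64('sur.mov')},t_10000"
    # Save the result to oss://outbucket/outobj.mp4.
    save = f"sys/saveas,b_{b64('outbucket')},o_{b64('outobj.mp4')}"
    # Topic name is an assumption inferred from the sample value.
    notify = f"/notify,topic_{b64('AudioConvert')}"
    return f"{main}{pre}{sur}|{save}{notify}"


if __name__ == "__main__":
    print(build_process())
```

The resulting string is what would be passed as the process argument to `async_process_object` in the Python SDK sample, with example.mkv as the source object.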
Processing example
// Concatenate the video file example.mkv.
POST /example.mkv?x-oss-async-process HTTP/1.1
Host: video-demo.oss-cn-hangzhou.aliyuncs.com
Date: Fri, 28 Oct 2022 06:40:10 GMT
Authorization: OSS4-HMAC-SHA256 Credential=LTAI********************/20250417/cn-hangzhou/oss/aliyun_v4_request,Signature=a7c3554c729d71929e0b84489addee6b2e8d5cb48595adfc51868c299c0c218e
x-oss-async-process=video/concat,ss_10000,f_mp4,vcodec_h264,fps_25,vb_1000000,acodec_aac,ab_96000,ar_48000,ac_2,align_1/pre,o_cHJlLm1vdgo/sur,o_c3VyMS5hYWMK,t_10000|sys/saveas,b_b3V0YnVja2V0,o_b3V0b2JqLnthdXRvZXh0fQo/notify,topic_QXVkaW9Db252ZXJ0
Permissions
An Alibaba Cloud account has all permissions by default. A Resource Access Management (RAM) user or RAM role does not have any permissions by default. You must grant permissions to the RAM user or RAM role using a RAM policy or a bucket policy.
API | Action | Definition |
GetObject |
| Downloads an object. |
| When downloading an object, if you specify the object version through versionId, this permission is required. | |
| When downloading an object, if the object metadata contains X-Oss-Server-Side-Encryption: KMS, this permission is required. |
API | Action | Definition |
HeadObject |
| Queries the metadata of an object. |
API | Action | Definition |
PutObject |
| Uploads an object. |
| When uploading an object, if you specify object tags through | |
| When uploading an object, if the object metadata contains | |
|
API | Action | Definition |
CreateMediaConvertTask |
| Permission to use IMM for media transcoding. |
Billing
Video concatenation calls the IMM service, so billable items are generated on both the OSS side and the IMM side:
OSS side: The GetObject operation is called with the x-oss-async-process parameter to concatenate videos, and the HeadObject operation is called to retrieve the object metadata. After concatenation, the PutObject operation is called to upload the generated video to the bucket. The following billable items are generated. For detailed pricing, see OSS Pricing:
API | Billable item | Description |
GetObject | GET requests | You are charged request fees based on the number of successful requests. |
| Outbound traffic over the Internet | If you call the GetObject operation by using a public endpoint, such as oss-cn-hangzhou.aliyuncs.com, or an acceleration endpoint, such as oss-accelerate.aliyuncs.com, you are charged fees for outbound traffic over the Internet based on the data size. |
| Retrieval of IA objects | If IA objects are retrieved, you are charged IA data retrieval fees based on the size of the retrieved IA data. |
| Retrieval of Archive objects in a bucket for which real-time access is enabled | If you retrieve Archive objects in a bucket for which real-time access is enabled, you are charged Archive data retrieval fees based on the size of retrieved Archive objects. |
| Transfer acceleration fees | If you enable transfer acceleration and use an acceleration endpoint to access your bucket, you are charged transfer acceleration fees based on the data size. |
API | Billable item | Description |
PutObject | PUT requests | You are charged request fees based on the number of successful requests. |
| Storage fees | You are charged storage fees based on the storage class, size, and storage duration of the object. |
API | Billable item | Description |
HeadObject | GET requests | Request fees are calculated based on the number of successful requests. |
IMM side: The following billable items are generated. For detailed pricing, see IMM billable items:
API | Billable item | Description |
CreateMediaConvertTask | ApsaraVideo Media Processing fees | ApsaraVideo Media Processing fees are calculated based on the definition and actual duration (in seconds) of the concatenated video. |
Notes
Video concatenation supports only asynchronous processing (using the x-oss-async-process method).
Anonymous requests are denied.
If ar or ac is left at its default value (inherited from the video specified by align), concatenation may fail when the inherited sample rate or number of sound channels is not supported by the target container or audio codec.
A maximum of 11 videos can be concatenated at a time.
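The compatibility note above can be turned into a local pre-flight check. The sketch below encodes only the audio rules stated in the notes of the acodec, ar, and ac rows of the transcoding parameter table; `audio_issues` is a hypothetical helper, sample rates are assumed to be passed in Hz (as in `ar_48000`), and codecs or containers not mentioned in those notes are simply not checked.

```python
from typing import List, Optional

# Audio codecs each container does NOT support (from the acodec note).
UNSUPPORTED = {
    "mp4": {"pcm"},
    "mov": {"flac", "opus"},
    "asf": {"opus"},
    "avi": {"opus"},
    "ts": {"flac", "vorbis", "amr", "pcm"},
    "flv": {"flac", "vorbis", "amr", "opus", "pcm"},
}
# Sample rates in Hz that each codec supports (from the ar note).
ALLOWED_RATES = {
    "opus": {8000, 12000, 16000, 24000, 48000},
    "ac3": {32000, 44100, 48000},
    "amr": {8000, 16000},
}
# Maximum channel count per codec (from the ac note).
MAX_CHANNELS = {"mp3": 2, "ac3": 6, "amr": 1}


def audio_issues(container: str, acodec: str,
                 ar: Optional[int] = None, ac: Optional[int] = None) -> List[str]:
    """List known audio incompatibilities for the given output settings."""
    issues = []
    if container == "mxf" and acodec != "pcm":
        issues.append("mxf supports only pcm")
    elif acodec in UNSUPPORTED.get(container, set()):
        issues.append(f"{container} does not support {acodec}")
    if ar is not None:
        if acodec == "mp3" and ar > 48000:
            issues.append("mp3 supports only 48 kHz and lower")
        elif acodec in ALLOWED_RATES and ar not in ALLOWED_RATES[acodec]:
            issues.append(f"{acodec} does not support a {ar} Hz sample rate")
    if ac is not None and acodec in MAX_CHANNELS and ac > MAX_CHANNELS[acodec]:
        issues.append(f"{acodec} supports at most {MAX_CHANNELS[acodec]} channel(s)")
    return issues


if __name__ == "__main__":
    print(audio_issues("mp4", "aac", ar=48000, ac=2))
    print(audio_issues("mp4", "amr", ar=44100, ac=2))
```

When ar or ac is omitted from the request, pass the values of the video selected by align, since those are the values the service would inherit.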