What is video concatenation - Object Storage Service - Alibaba Cloud Documentation Center

The video concatenation feature lets you combine multiple videos into a single video and convert it to a specified format.

Feature introduction

Video concatenation combines multiple video clips into one complete video and converts it to the required format.

Scenarios

Film production: When producing movies, TV series, and short films, editors use video concatenation to join shots and scenes into a complete narrative.
Content creation: Creators on short-video platforms concatenate clips to produce vlogs, tutorials, and themed videos, which makes their content more engaging and visible.
Education and training: Teachers and trainers concatenate clips into instructional videos that combine theory and practice, which helps students understand the material.
Sporting event playback: Sports broadcasters concatenate clips into highlight reels so that the audience can review key moments of an event.

How to use

Prerequisites

The Intelligent Media Management (IMM) service is activated. For more information, see Activate a product.
An IMM project is attached. To attach a project in the Object Storage Service (OSS) console, see Step 1: Attach an IMM project. To attach a project by calling an API operation, see AttachOSSBucket - Attach an OSS bucket.

Concatenate videos

Video concatenation runs only as an asynchronous processing task, and only the Java, Python, and Go SDKs support it.

Java

Use Java SDK 3.17.4 or later.

import com.aliyun.oss.ClientBuilderConfiguration;
import com.aliyun.oss.OSS;
import com.aliyun.oss.OSSClientBuilder;
import com.aliyun.oss.common.auth.CredentialsProviderFactory;
import com.aliyun.oss.common.auth.EnvironmentVariableCredentialsProvider;
import com.aliyun.oss.common.comm.SignVersion;
import com.aliyun.oss.model.AsyncProcessObjectRequest;
import com.aliyun.oss.model.AsyncProcessObjectResult;
import com.aliyuncs.exceptions.ClientException;

import java.util.Base64;

public class Demo {
    public static void main(String[] args) throws ClientException {
        // Specify the endpoint of the region where the bucket is located. This example uses the China (Hangzhou) region.
        String endpoint = "https://oss-cn-hangzhou.aliyuncs.com";
        // Specify a common Alibaba Cloud region ID, such as cn-hangzhou.
        String region = "cn-hangzhou";
        // Obtain access credentials from environment variables. Before you run the sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured.
        EnvironmentVariableCredentialsProvider credentialsProvider = CredentialsProviderFactory.newEnvironmentVariableCredentialsProvider();
        // Specify the bucket name.
        String bucketName = "examplebucket";
        // Specify the name of the concatenated video file.
        String targetObject = "dest.mp4";
        // Specify the name of the source video file.
        String sourceVideo = "src.mp4";
        // Specify the names of the video files to concatenate.
        String video1 = "concat1.mp4";
        String video2 = "concat2.mp4";
        // Create an OSSClient instance.
        // When the OSSClient instance is no longer used, call the shutdown method to release resources.
        ClientBuilderConfiguration clientBuilderConfiguration = new ClientBuilderConfiguration();
        clientBuilderConfiguration.setSignatureVersion(SignVersion.V4);
        OSS ossClient = OSSClientBuilder.create()
                .endpoint(endpoint)
                .credentialsProvider(credentialsProvider)
                .clientConfiguration(clientBuilderConfiguration)
                .region(region)
                .build();

        try {
            // Encode the video file names.
            String video1Encoded = Base64.getUrlEncoder().withoutPadding().encodeToString(video1.getBytes());
            String video2Encoded = Base64.getUrlEncoder().withoutPadding().encodeToString(video2.getBytes());
            // Build the video processing style string and the video concatenation parameters.
            String style = String.format("video/concat,ss_0,f_mp4,vcodec_h264,fps_25,vb_1000000,acodec_aac,ab_96000,ar_48000,ac_2,align_1/pre,o_%s/sur,o_%s,t_0", video1Encoded, video2Encoded);
            // Build the asynchronous processing instruction.
            String bucketEncoded = Base64.getUrlEncoder().withoutPadding().encodeToString(bucketName.getBytes());
            String targetEncoded = Base64.getUrlEncoder().withoutPadding().encodeToString(targetObject.getBytes());
            String process = String.format("%s|sys/saveas,b_%s,o_%s/notify,topic_QXVkaW9Db252ZXJ0", style, bucketEncoded, targetEncoded);
            // Create an AsyncProcessObjectRequest object.
            AsyncProcessObjectRequest request = new AsyncProcessObjectRequest(bucketName, sourceVideo, process);
            // Execute the asynchronous processing task.
            AsyncProcessObjectResult response = ossClient.asyncProcessObject(request);
            System.out.println("EventId: " + response.getEventId());
            System.out.println("RequestId: " + response.getRequestId());
            System.out.println("TaskId: " + response.getTaskId());

        } finally {
            // Shut down the OSSClient.
            ossClient.shutdown();
        }
    }
}

Python

Use Python SDK 2.18.4 or later.

# -*- coding: utf-8 -*-
import base64
import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider

def main():
    # Obtain access credentials from environment variables. Before you run the sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured.
    auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())
    # Replace the Endpoint with the one for the region where the bucket is located. For example, if the bucket is in the China (Hangzhou) region, set the Endpoint to https://oss-cn-hangzhou.aliyuncs.com.
    endpoint = 'https://oss-cn-hangzhou.aliyuncs.com'
    # Specify a common Alibaba Cloud region ID, such as cn-hangzhou.
    region = 'cn-hangzhou'

    # Specify the bucket name, such as examplebucket.
    bucket = oss2.Bucket(auth, endpoint, 'examplebucket', region=region)

    # Specify the name of the concatenated video.
    target_object = 'out.mp4'
    # Specify the name of the source video file.
    source_video = 'emrfinal.mp4'
    # Specify the names of the video files to concatenate.
    video1 = 'osshdfs.mp4'
    video2 = 'product.mp4'
    # Build the video processing style string and the video concatenation parameters.
    video1_encoded = base64.urlsafe_b64encode(video1.encode()).decode().rstrip('=')
    video2_encoded = base64.urlsafe_b64encode(video2.encode()).decode().rstrip('=')
    style = f"video/concat,ss_0,f_mp4,vcodec_h264,fps_25,vb_1000000,acodec_aac,ab_96000,ar_48000,ac_2,align_1/pre,o_{video1_encoded}/sur,o_{video2_encoded},t_0"
    # Build the asynchronous processing instruction.
    bucket_encoded = base64.urlsafe_b64encode('examplebucket'.encode()).decode().rstrip('=')
    target_encoded = base64.urlsafe_b64encode(target_object.encode()).decode().rstrip('=')
    process = f"{style}|sys/saveas,b_{bucket_encoded},o_{target_encoded}/notify,topic_QXVkaW9Db252ZXJ0"
    print(process)
    # Execute the asynchronous processing task.
    try:
        result = bucket.async_process_object(source_video, process)
        print(f"EventId: {result.event_id}")
        print(f"RequestId: {result.request_id}")
        print(f"TaskId: {result.task_id}")
    except Exception as e:
        print(f"Error: {e}")


if __name__ == "__main__":
    main()

Go

Use Go SDK 3.0.2 or later.

package main

import (
    "encoding/base64"
    "fmt"
    "os"
    "strings"

    "github.com/aliyun/aliyun-oss-go-sdk/oss"
)

func main() {
    // Obtain access credentials from environment variables. Before you run the sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are configured.
    provider, err := oss.NewEnvironmentVariableCredentialsProvider()
    if err != nil {
        fmt.Println("Error:", err)
        os.Exit(-1)
    }
    // Create an OSSClient instance.
    // Replace yourEndpoint with the Endpoint of the bucket. For example, if the bucket is in the China (Hangzhou) region, set the Endpoint to https://oss-cn-hangzhou.aliyuncs.com. For other regions, specify the actual Endpoint.
    // Replace yourRegion with a common Alibaba Cloud region ID, such as cn-hangzhou.
    client, err := oss.New("yourEndpoint", "", "", oss.SetCredentialsProvider(&provider), oss.AuthVersion(oss.AuthV4), oss.Region("yourRegion"))
    if err != nil {
        fmt.Println("Error:", err)
        os.Exit(-1)
    }
    // Specify the bucket name, such as examplebucket.
    bucketName := "examplebucket"

    bucket, err := client.Bucket(bucketName)
    if err != nil {
        fmt.Println("Error:", err)
        os.Exit(-1)
    }
    // Specify the name of the concatenated video file.
    targetObject := "dest.mp4"
    // Specify the name of the source video file.
    sourcevideo := "src.mp4"
    // Specify the names of the video files to concatenate.
    video1 := "concat1.mp4"
    video2 := "concat2.mp4"
    // Build the video processing style string and the video concatenation parameters.    
    style := fmt.Sprintf("video/concat,ss_0,f_mp4,vcodec_h264,fps_25,vb_1000000,acodec_aac,ab_96000,ar_48000,ac_2,align_1/pre,o_%s/sur,o_%s,t_0", strings.TrimRight(base64.URLEncoding.EncodeToString([]byte(video1)), "="), strings.TrimRight(base64.URLEncoding.EncodeToString([]byte(video2)), "="))
    // Build the asynchronous processing instruction.
    process := fmt.Sprintf("%s|sys/saveas,b_%v,o_%v/notify,topic_QXVkaW9Db252ZXJ0", style, strings.TrimRight(base64.URLEncoding.EncodeToString([]byte(bucketName)), "="), strings.TrimRight(base64.URLEncoding.EncodeToString([]byte(targetObject)), "="))
    fmt.Printf("%#v\n", process)
    rs, err := bucket.AsyncProcessObject(sourcevideo, process)
    if err != nil {
        fmt.Println("Error:", err)
        os.Exit(-1)
    }
    fmt.Printf("EventId:%s\n", rs.EventId)
    fmt.Printf("RequestId:%s\n", rs.RequestId)
    fmt.Printf("TaskId:%s\n", rs.TaskId)
}

Note

Asynchronous processing requests do not return processing results. To obtain the results of an asynchronous task, use Simple Message Queue (SMQ), formerly known as MNS. For more information, see Message notifications.

Parameter description

Action: video/concat

The following table describes the parameters.

Concatenation parameters

The video/concat operation concatenates videos in the order that pre and sur appear in the request string. Details are as follows:

/pre: The video file to concatenate at the beginning.
/sur: The video file to concatenate at the end.

Parameter	Type	Required	Description
ss	int	No	The start time within the prefix or suffix video, in milliseconds. Valid values: 0 (default): Starts from the beginning. A value greater than 0: Starts ss milliseconds into the video.
t	int	No	The duration of the prefix or suffix video to concatenate, in milliseconds. Valid values: 0 (default): Continues to the end. A value greater than 0: Lasts for t milliseconds.
o	string	Yes	The OSS object in the current bucket. The object name must be Base64 URL-safe encoded.
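As a rough sketch of how the /pre and /sur segments are assembled, the following Python helper encodes an object name with URL-safe Base64 (padding removed, as in the SDK samples above) and appends the optional ss and t values. The function names encode_name and concat_segment are illustrative and are not part of any SDK.

```python
import base64


def encode_name(name: str) -> str:
    """Return the URL-safe Base64 form of an object name, without '=' padding."""
    return base64.urlsafe_b64encode(name.encode("utf-8")).decode("ascii").rstrip("=")


def concat_segment(position: str, object_name: str, ss: int = 0, t: int = 0) -> str:
    """Build one /pre or /sur segment. ss and t are in milliseconds."""
    if position not in ("pre", "sur"):
        raise ValueError("position must be 'pre' or 'sur'")
    if ss < 0 or t < 0:
        raise ValueError("ss and t must not be negative")
    parts = [f"/{position}", f"o_{encode_name(object_name)}"]
    # 0 is the documented default for ss and t, so only non-zero values are emitted.
    if ss > 0:
        parts.append(f"ss_{ss}")
    if t > 0:
        parts.append(f"t_{t}")
    return ",".join(parts)


if __name__ == "__main__":
    # Whole prefix video, then the first 10 seconds of the suffix video.
    print(concat_segment("pre", "pre.mov"))
    print(concat_segment("sur", "sur.mov", t=10000))
```

Because videos are concatenated in the order in which the segments appear, the caller decides the final order by the order in which it joins the returned strings.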

Transcoding parameters

Parameter	Type	Required	Description
ss	int	No	The start time for transcoding the video being concatenated, in milliseconds. Valid values: 0 (default): Starts from the beginning. A value greater than 0: Starts ss milliseconds into the video.
t	int	No	The transcoding duration for the video being concatenated, in milliseconds. Valid values: 0 (default): Continues to the end. A value greater than 0: Lasts for t milliseconds.
f	string	Yes	The video container. Valid values: mp4, mkv, mov, asf, avi, mxf, ts, flv, webm
vn	int	No	Specifies whether to disable the video stream. Valid values: 0 (default): Keeps the video stream. 1: Disables the video stream.
vcodec	string	Yes	The video codec (encoding format). Valid values: h264: H.264 encoding format. h265: H.265 encoding format. vp9: VP9 encoding format. Note The mxf and flv formats do not support H.265.
fps	float	No	The video frame rate. By default, this parameter is the same as that of the source video specified by align. The value range is 0 to 240.
fpsopt	int	No	The video frame rate option. Valid values: 0: Always uses the target frame rate. 1: If the frame rate of a source video in the concatenation list is less than the value of fps, the smallest source video frame rate in the list is used. 2: If the frame rate of a source video in the concatenation list is less than the value of fps, the operation fails. Note This parameter must be configured with fps.
pixfmt	string	No	The pixel sampling format. By default, this parameter is the same as that of the source video specified by align. Valid values: yuv420p yuva420p yuv420p10le yuv422p yuv422p10le yuv444p yuv444p10le
s	string	No	The resolution, in the format `wxh` (width x height, with no spaces). The width and height must be multiples of 2 and range from 64 to 4096. For example, 4096x4096 and 64x64 are valid.
sopt	int	No	The resolution option. Valid values: 0: Always uses the target resolution. 1: If the resolution of a source video in the concatenation list is smaller than the value of s, the smallest source video resolution in the list is used. 2: If the resolution of a source video in the concatenation list is smaller than the value of s, the operation fails. Note This parameter must be configured with s.
scaletype	string	No	The scaling method. Valid values: crop: Scales and crops. stretch (default): Stretches to fill. fill: Scales and pads with black bars. fit: Scales without padding with black bars. The aspect ratio is preserved.
arotate	int	No	Specifies whether to automatically rotate the output resolution to match the orientation of the video. Valid values: 0 (default): Disabled. 1: Enabled.
g	int	No	The Group of Pictures (GOP) size. The default value is 150. The value range is 1 to 100000.
vb	int	No	The video bitrate, in bits per second (bps). The value range is 10000 to 100000000. Note This parameter is mutually exclusive with crf. They represent different bitrate control algorithms. If neither is set, the video is encoded at the default bitrate for the output resolution.
vbopt	int	No	The video bitrate option. Valid values: 0: Always uses the target video bitrate. 1: If the bitrate of a source video in the concatenation list is less than the value of vb, the smallest source video bitrate in the list is used. 2: If the bitrate of a source video in the concatenation list is less than the value of vb, the operation fails. Note This parameter must be configured with vb.
crf	float	No	The constant rate factor. The value range is 0 to 51. A larger value indicates lower image quality. We recommend a value from 18 to 38.
maxrate	int	No	The maximum bitrate, in bits per second (bps). The default value is 0. The value range is 10000 to 100000000. Note This parameter must be configured with crf.
bufsize	int	No	The buffer size, in bits. The default value is 0. The value range is 10000 to 200000000. Note This parameter must be configured with crf.
an	int	No	Specifies whether to disable the audio stream. Valid values: 0 (default): Keeps the audio stream. 1: Disables the audio stream.
acodec	string	Yes	The audio codec (encoding format). Valid values: mp3 aac flac vorbis ac3 opus pcm Note mp4 does not support pcm. mov does not support flac or opus. asf does not support opus. avi does not support opus. mxf supports only pcm. ts does not support flac, vorbis, amr, or pcm. flv does not support flac, vorbis, amr, opus, or pcm.
ar	int	No	The audio sampling rate. By default, this parameter is the same as that of the source video specified by align. Valid values: 8000 11025 12000 16000 22050 24000 32000 44100 48000 64000 88200 96000 Note The supported sample rates vary by format. mp3 supports only 48 kHz and lower. opus supports 8 kHz, 12 kHz, 16 kHz, 24 kHz, and 48 kHz. ac3 supports 32 kHz, 44.1 kHz, and 48 kHz. amr supports only 8 kHz and 16 kHz.
ac	int	No	The number of sound channels. By default, this parameter is the same as that of the source video specified by align. The value range is 1 to 8. Note The supported number of channels varies by format. mp3 supports only mono and stereo. ac3 supports up to 6 channels (5.1). amr supports only mono.
aq	int	No	The audio compression quality. The value range is 0 to 100. Note This parameter is mutually exclusive with ab. If neither is set, the audio is encoded at the default bitrate of the encoder.
ab	int	No	The audio bitrate, in bits per second (bps). The value range is 1000 to 10000000.
abopt	string	No	The audio bitrate option. Valid values: 0 (default): Always uses the target audio bitrate. 1: If the bitrate of a source audio in the concatenation list is less than the value of ab, the smallest source audio bitrate in the list is used. 2: If the bitrate of a source audio in the concatenation list is less than the value of ab, the operation fails. Note This parameter must be configured with ab.
align	int	No	The ordinal number of the main video file (which provides the default transcoding parameters) in the concatenation list. The default value is 0, which aligns with the first video file in the concatenation list.
adepth	int	No	The audio sampling bit depth. Valid values: 16 or 24. Note This parameter is valid only when acodec is set to flac.

Note

Video concatenation also uses the sys/saveas and notify parameters. For more information, see Save as and Message notifications.
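Several rules in the transcoding parameter table (required parameters, mutually exclusive pairs, parameters that must be configured together, and value ranges) can be checked on the client before a request is sent. The following Python function is a minimal sketch of such a check. It covers only a subset of the rules, and the function name and error messages are illustrative rather than part of any SDK.

```python
import re

# Numeric ranges copied from the transcoding parameter table.
RANGES = {
    "fps": (0, 240),
    "g": (1, 100000),
    "vb": (10000, 100000000),
    "crf": (0, 51),
    "ab": (1000, 10000000),
    "aq": (0, 100),
    "ac": (1, 8),
}

# (dependent parameter, parameter it must be configured with)
DEPENDENCIES = [
    ("fpsopt", "fps"), ("sopt", "s"), ("vbopt", "vb"),
    ("abopt", "ab"), ("maxrate", "crf"), ("bufsize", "crf"),
]


def validate_transcode_params(params: dict) -> list:
    """Return a list of problems. An empty list means no checked rule is violated."""
    errors = []
    # f, vcodec, and acodec are marked as required.
    for key in ("f", "vcodec", "acodec"):
        if key not in params:
            errors.append(f"{key} is required")
    # vb/crf and aq/ab select alternative rate-control modes.
    for first, second in (("vb", "crf"), ("aq", "ab")):
        if first in params and second in params:
            errors.append(f"{first} and {second} are mutually exclusive")
    for dependent, base in DEPENDENCIES:
        if dependent in params and base not in params:
            errors.append(f"{dependent} must be configured with {base}")
    for key, (low, high) in RANGES.items():
        if key in params and not low <= params[key] <= high:
            errors.append(f"{key} must be between {low} and {high}")
    # Resolution: <width>x<height>, both even and within 64 to 4096.
    if "s" in params:
        match = re.fullmatch(r"(\d+)x(\d+)", params["s"])
        if match is None:
            errors.append("s must have the form <width>x<height>")
        else:
            for side in map(int, match.groups()):
                if side % 2 != 0 or not 64 <= side <= 4096:
                    errors.append("width and height must be even and between 64 and 4096")
                    break
    if params.get("vcodec") == "h265" and params.get("f") in ("mxf", "flv"):
        errors.append("mxf and flv do not support H.265")
    return errors


if __name__ == "__main__":
    sample = {"f": "mp4", "vcodec": "h264", "acodec": "aac",
              "fps": 25, "vb": 1000000, "ab": 96000, "ar": 48000, "ac": 2}
    print(validate_transcode_params(sample))
```

The service remains the authority on what is accepted. A local check like this only catches the documented conflicts earlier, before an asynchronous task is submitted.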

Media sharding parameters

/segment: Sharding parameters

Parameter	Type	Required	Description
	string	Yes	The sharding format. Valid values: hls, dash.
	int	Yes	The shard length, in milliseconds. The value range is 0 to 3600000.

Note

Media sharding supports only mp4 and ts containers.

Related API operations

Concatenate videos into an MP4 file

Concatenation information

Source video names: pre.mov, example.mkv, sur.mov
Concatenation duration and order:
Video name	Order	Duration
pre.mov	1	Entire video
example.mkv	2	From the 10th second to the end
sur.mov	3	From the beginning to the 10th second
Transcoding completion notification: An MNS message is sent.
Concatenated video information
- Video format: h264
- Video frame rate: 25 fps
- Video bitrate: 1 Mbps
- Audio format: aac
- Audio configuration: 48 kHz sample rate, dual channel
- Audio bitrate: 96 Kbps
File storage path
- MP4 file: oss://outbucket/outobj.mp4

Processing example

// Concatenate the video file example.mkv.
POST /example.mkv?x-oss-async-process HTTP/1.1
Host: video-demo.oss-cn-hangzhou.aliyuncs.com
Date: Fri, 28 Oct 2022 06:40:10 GMT
Authorization: OSS4-HMAC-SHA256 Credential=LTAI********************/20250417/cn-hangzhou/oss/aliyun_v4_request,Signature=a7c3554c729d71929e0b84489addee6b2e8d5cb48595adfc51868c299c0c218e
 
x-oss-async-process=video/concat,ss_10000,f_mp4,vcodec_h264,fps_25,vb_1000000,acodec_aac,ab_96000,ar_48000,ac_2,align_1/pre,o_cHJlLm1vdg/sur,o_c3VyLm1vdg,t_10000|sys/saveas,b_b3V0YnVja2V0,o_b3V0b2JqLnthdXRvZXh0fQ/notify,topic_QXVkaW9Db252ZXJ0
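When a request like this one reads the wrong object or is rejected, decoding the Base64 fields is a quick way to see what was actually sent. The following Python sketch splits an instruction into its steps and decodes the o, b, and topic values. The helper names are illustrative, and the parsing assumes the separators shown in the samples in this topic (a vertical bar between instructions, a slash between steps, and commas between parameters).

```python
import base64

# Parameters whose values are URL-safe Base64 in the samples in this topic.
ENCODED_KEYS = {"o", "b", "topic"}


def b64url_decode(value: str) -> str:
    """Decode URL-safe Base64 whose '=' padding was stripped."""
    padded = value + "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")


def explain(process: str) -> list:
    """Return (step name, parameters) pairs with the Base64 fields decoded."""
    steps = []
    for group in process.split("|"):
        for piece in group.split("/"):
            name, *pairs = piece.split(",")
            params = {}
            for pair in pairs:
                # Split at the first underscore only: URL-safe Base64 values
                # may themselves contain '_' characters.
                key, _, value = pair.partition("_")
                params[key] = b64url_decode(value) if key in ENCODED_KEYS else value
            steps.append((name, params))
    return steps


if __name__ == "__main__":
    instruction = ("video/concat,ss_10000,f_mp4/pre,o_cHJlLm1vdg"
                   "|sys/saveas,b_b3V0YnVja2V0,o_b3V0b2JqLm1wNA"
                   "/notify,topic_QXVkaW9Db252ZXJ0")
    for name, params in explain(instruction):
        # repr() makes stray whitespace, such as a trailing newline, visible.
        print(name, {key: repr(value) for key, value in params.items()})
```

A decoded name that ends in a newline usually means that the value was encoded with a shell command such as echo without the -n option.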

Permissions

An Alibaba Cloud account has all permissions by default. A Resource Access Management (RAM) user or RAM role does not have any permissions by default. You must grant permissions to the RAM user or RAM role using a RAM policy or a bucket policy.

API	Action	Definition
GetObject	`oss:GetObject`	Downloads an object.
	`oss:GetObjectVersion`	When downloading an object, if you specify the object version through versionId, this permission is required.
	`kms:Decrypt`	When downloading an object, if the object metadata contains X-Oss-Server-Side-Encryption: KMS, this permission is required.

API	Action	Definition
HeadObject	`oss:GetObject`	Queries the metadata of an object.

API	Action	Definition
PutObject	`oss:PutObject`	Uploads an object.
	`oss:PutObjectTagging`	When uploading an object, if you specify object tags through `x-oss-tagging`, this permission is required.
	`kms:GenerateDataKey`	When uploading an object, if the object metadata contains `X-Oss-Server-Side-Encryption: KMS`, these two permissions are required.
	`kms:Decrypt`

API	Action	Definition
CreateMediaConvertTask	`imm:CreateMediaConvertTask`	Permission to use IMM for media transcoding.
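The actions in the preceding tables can be collected into a single RAM policy document. The following Python sketch prints such a document as JSON. It is an illustration under stated assumptions, not a recommended policy: the bucket name examplebucket is a placeholder, the build_policy helper is hypothetical, and the wildcard resource used for the IMM and KMS actions is an assumption that should be narrowed for production use.

```python
import json


def build_policy(bucket: str, versioned: bool = False,
                 tagging: bool = False, kms: bool = False) -> dict:
    """Assemble a RAM policy that covers the actions listed in the tables."""
    oss_actions = ["oss:GetObject", "oss:PutObject"]
    if versioned:
        # Needed when a version is specified through versionId.
        oss_actions.append("oss:GetObjectVersion")
    if tagging:
        # Needed when object tags are specified through x-oss-tagging.
        oss_actions.append("oss:PutObjectTagging")

    other_actions = ["imm:CreateMediaConvertTask"]
    if kms:
        # Needed when objects use KMS server-side encryption.
        other_actions += ["kms:GenerateDataKey", "kms:Decrypt"]

    return {
        "Version": "1",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": oss_actions,
                "Resource": [f"acs:oss:*:*:{bucket}/*"],
            },
            {
                "Effect": "Allow",
                "Action": other_actions,
                # Assumption: wildcard resource, kept broad for brevity.
                "Resource": ["*"],
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy("examplebucket", kms=True), indent=2))
```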

Billing

Video concatenation calls the IMM service, so it generates billable items for both OSS and IMM. The details are as follows:

OSS side: You need to call the GetObject operation and add the x-oss-async-process parameter to concatenate videos. You also need to call the HeadObject operation to retrieve the object metadata. After concatenation, the PutObject operation is called to upload the generated video to the bucket. The following billable items are generated. For detailed pricing, see OSS Pricing:

API	Billable item	Description
GetObject	GET requests	You are charged request fees based on the number of successful requests.
	Outbound traffic over the Internet	If you call the GetObject operation by using a public endpoint, such as oss-cn-hangzhou.aliyuncs.com, or an acceleration endpoint, such as oss-accelerate.aliyuncs.com, you are charged fees for outbound traffic over the Internet based on the data size.
	Retrieval of IA objects	If IA objects are retrieved, you are charged IA data retrieval fees based on the size of the retrieved IA data.
	Retrieval of Archive objects in a bucket for which real-time access is enabled	If you retrieve Archive objects in a bucket for which real-time access is enabled, you are charged Archive data retrieval fees based on the size of retrieved Archive objects.
	Transfer acceleration fees	If you enable transfer acceleration and use an acceleration endpoint to access your bucket, you are charged transfer acceleration fees based on the data size.

API	Billable item	Description
PutObject	PUT requests	You are charged request fees based on the number of successful requests.
PutObject	Storage fees	You are charged storage fees based on the storage class, size, and storage duration of the object.

API	Billable item	Description
HeadObject	GET requests	Request fees are calculated based on the number of successful requests.

IMM side: The following billable items are generated. For detailed pricing, see IMM billable items:

API	Billable item	Description
CreateMediaConvertTask	ApsaraVideo Media Processing fees	ApsaraVideo Media Processing fees are calculated based on the definition and actual duration (in seconds) of the concatenated video.

Notes

Video concatenation supports only asynchronous processing (using the x-oss-async-process method).
Anonymous requests are denied.
If ar or ac is not set, the sample rate or number of sound channels inherited from the source video may be incompatible with the target video container, and concatenation may fail.
A maximum of 11 videos can be concatenated at a time.
