
Object Storage Service: Video concatenation

Last Updated: Mar 20, 2026

Combine multiple videos stored in an OSS bucket into a single output file with transcoding control. The operation is submitted as an asynchronous task via the OSS SDK or REST API.


Use cases

  • Film and video production: Stitch shots and scenes from different clips into a complete narrative sequence.

  • Short-form content: Combine clips into vlogs, tutorials, or themed videos for social media platforms.

  • Education and training: Merge theory and demonstration clips into a single instructional video.

  • Sports highlights: Assemble highlight reels from multiple footage segments.

How it works

The video/concat action concatenates videos in the order that /pre and /sur segments appear in the processing string, then transcodes the result to the specified format.

All examples in this topic use the x-oss-async-process header to submit the task as an asynchronous job.

Prerequisites

Before you begin, ensure that you have an OSS bucket that contains the source videos and that Intelligent Media Management (IMM) can be invoked for the bucket, because video concatenation uses IMM for transcoding (see the Permissions and Billing sections below).

Limitations

Review these constraints before submitting a task:

  • Asynchronous only: Video concatenation supports only asynchronous processing (x-oss-async-process). Synchronous requests are not supported.

  • Maximum input videos: A single task can concatenate up to 11 videos.

  • Anonymous access: Requests from anonymous users are denied.

  • Container compatibility: When using default audio sample rates or audio channel counts, concatenation may fail if those defaults are incompatible with the target container. Specify explicit values for ar and ac to avoid this.

  • Codec and container restrictions: See Transcoding parameters for codec-to-container compatibility rules.
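The clip-count limit can be checked client-side before a task is submitted. A minimal sketch, assuming hypothetical `pre_clips` and `sur_clips` lists of objects passed to /pre and /sur, and assuming the source video counts toward the 11-video total:

```python
def validate_clip_count(pre_clips, sur_clips, max_total=11):
    """Raise if the task would exceed the 11-video concatenation limit.

    Assumes the source video itself counts as one input, so the total is
    len(pre_clips) + 1 + len(sur_clips).
    """
    total = len(pre_clips) + 1 + len(sur_clips)
    if total > max_total:
        raise ValueError(
            f"{total} videos requested; a single task supports at most {max_total}"
        )
    return total

print(validate_clip_count(["a.mp4"], ["b.mp4", "c.mp4"]))  # 4
```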

Concatenate videos

Use Java SDK 3.17.4 or later, Python SDK 2.18.4 or later, or Go SDK 3.0.2 or later.

Java

import com.aliyun.oss.ClientBuilderConfiguration;
import com.aliyun.oss.OSS;
import com.aliyun.oss.OSSClientBuilder;
import com.aliyun.oss.common.auth.CredentialsProviderFactory;
import com.aliyun.oss.common.auth.EnvironmentVariableCredentialsProvider;
import com.aliyun.oss.common.comm.SignVersion;
import com.aliyun.oss.model.AsyncProcessObjectRequest;
import com.aliyun.oss.model.AsyncProcessObjectResult;
import com.aliyuncs.exceptions.ClientException;

import java.util.Base64;

public class Demo {
    public static void main(String[] args) throws ClientException {
        // Replace with the endpoint for the region where your bucket is located.
        String endpoint = "https://oss-cn-hangzhou.aliyuncs.com";
        // Specify the region ID, such as cn-hangzhou.
        String region = "cn-hangzhou";
        // Read credentials from environment variables OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET.
        EnvironmentVariableCredentialsProvider credentialsProvider = CredentialsProviderFactory.newEnvironmentVariableCredentialsProvider();
        // Specify the bucket name.
        String bucketName = "examplebucket";
        // Specify the output video name.
        String targetObject = "dest.mp4";
        // Specify the source video (the main file used as the transcoding base).
        String sourceVideo = "src.mp4";
        // Specify the videos to prepend and append.
        String video1 = "concat1.mp4";
        String video2 = "concat2.mp4";

        ClientBuilderConfiguration clientBuilderConfiguration = new ClientBuilderConfiguration();
        clientBuilderConfiguration.setSignatureVersion(SignVersion.V4);
        OSS ossClient = OSSClientBuilder.create()
                .endpoint(endpoint)
                .credentialsProvider(credentialsProvider)
                .clientConfiguration(clientBuilderConfiguration)
                .region(region)
                .build();

        try {
            // Base64 URL-safe encode the video object names (required by the /pre and /sur parameters).
            String video1Encoded = Base64.getUrlEncoder().withoutPadding().encodeToString(video1.getBytes());
            String video2Encoded = Base64.getUrlEncoder().withoutPadding().encodeToString(video2.getBytes());

            // Build the processing style string:
            // - /pre,o_<encoded>: prepend video1 to the beginning
            // - /sur,o_<encoded>,t_0: append video2 to the end (t_0 means use the full clip)
            // - Other parameters control the output encoding.
            String style = String.format(
                "video/concat,ss_0,f_mp4,vcodec_h264,fps_25,vb_1000000,acodec_aac,ab_96000,ar_48000,ac_2,align_1" +
                "/pre,o_%s/sur,o_%s,t_0",
                video1Encoded, video2Encoded
            );

            // Append save-as and notification instructions.
            String bucketEncoded = Base64.getUrlEncoder().withoutPadding().encodeToString(bucketName.getBytes());
            String targetEncoded = Base64.getUrlEncoder().withoutPadding().encodeToString(targetObject.getBytes());
            String process = String.format(
                "%s|sys/saveas,b_%s,o_%s/notify,topic_QXVkaW9Db252ZXJ0",
                style, bucketEncoded, targetEncoded
            );

            // Submit the asynchronous task.
            AsyncProcessObjectRequest request = new AsyncProcessObjectRequest(bucketName, sourceVideo, process);
            AsyncProcessObjectResult response = ossClient.asyncProcessObject(request);
            System.out.println("EventId: " + response.getEventId());
            System.out.println("RequestId: " + response.getRequestId());
            System.out.println("TaskId: " + response.getTaskId());
        } finally {
            ossClient.shutdown();
        }
    }
}

Python

# -*- coding: utf-8 -*-
import base64
import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider

def main():
    # Read credentials from environment variables OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET.
    auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())
    # Replace with the endpoint for the region where your bucket is located.
    endpoint = 'https://oss-cn-hangzhou.aliyuncs.com'
    region = 'cn-hangzhou'

    bucket = oss2.Bucket(auth, endpoint, 'examplebucket', region=region)

    # Specify the output video name.
    target_object = 'out.mp4'
    # Specify the source video (the main file used as the transcoding base).
    source_video = 'emrfinal.mp4'
    # Specify the videos to prepend and append.
    video1 = 'osshdfs.mp4'
    video2 = 'product.mp4'

    # Base64 URL-safe encode the video object names (required by the /pre and /sur parameters).
    video1_encoded = base64.urlsafe_b64encode(video1.encode()).decode().rstrip('=')
    video2_encoded = base64.urlsafe_b64encode(video2.encode()).decode().rstrip('=')

    # Build the processing style string:
    # - /pre,o_<encoded>: prepend video1 to the beginning
    # - /sur,o_<encoded>,t_0: append video2 to the end (t_0 means use the full clip)
    # - Other parameters control the output encoding.
    style = (
        f"video/concat,ss_0,f_mp4,vcodec_h264,fps_25,vb_1000000,acodec_aac,ab_96000,ar_48000,ac_2,align_1"
        f"/pre,o_{video1_encoded}/sur,o_{video2_encoded},t_0"
    )

    # Append save-as and notification instructions.
    bucket_encoded = base64.urlsafe_b64encode('examplebucket'.encode()).decode().rstrip('=')
    target_encoded = base64.urlsafe_b64encode(target_object.encode()).decode().rstrip('=')
    process = f"{style}|sys/saveas,b_{bucket_encoded},o_{target_encoded}/notify,topic_QXVkaW9Db252ZXJ0"

    # Submit the asynchronous task.
    try:
        result = bucket.async_process_object(source_video, process)
        print(f"EventId: {result.event_id}")
        print(f"RequestId: {result.request_id}")
        print(f"TaskId: {result.task_id}")
    except Exception as e:
        print(f"Error: {e}")

if __name__ == "__main__":
    main()

Go

package main

import (
    "encoding/base64"
    "fmt"
    "os"
    "strings"

    "github.com/aliyun/aliyun-oss-go-sdk/oss"
)

func main() {
    // Read credentials from environment variables OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET.
    provider, err := oss.NewEnvironmentVariableCredentialsProvider()
    if err != nil {
        fmt.Println("Error:", err)
        os.Exit(-1)
    }

    // Replace yourEndpoint with the endpoint for the region where your bucket is located,
    // and yourRegion with the corresponding region ID (e.g., cn-hangzhou).
    client, err := oss.New(
        "yourEndpoint", "", "",
        oss.SetCredentialsProvider(&provider),
        oss.AuthVersion(oss.AuthV4),
        oss.Region("yourRegion"),
    )
    if err != nil {
        fmt.Println("Error:", err)
        os.Exit(-1)
    }

    bucketName := "examplebucket"
    bucket, err := client.Bucket(bucketName)
    if err != nil {
        fmt.Println("Error:", err)
        os.Exit(-1)
    }

    // Specify the output video name.
    targetObject := "dest.mp4"
    // Specify the source video (the main file used as the transcoding base).
    sourcevideo := "src.mp4"
    // Specify the videos to prepend and append.
    video1 := "concat1.mp4"
    video2 := "concat2.mp4"

    // Base64 URL-safe encode the video object names (required by the /pre and /sur parameters).
    encode := func(s string) string {
        return strings.TrimRight(base64.URLEncoding.EncodeToString([]byte(s)), "=")
    }

    // Build the processing style string:
    // - /pre,o_<encoded>: prepend video1 to the beginning
    // - /sur,o_<encoded>,t_0: append video2 to the end (t_0 means use the full clip)
    // - Other parameters control the output encoding.
    style := fmt.Sprintf(
        "video/concat,ss_0,f_mp4,vcodec_h264,fps_25,vb_1000000,acodec_aac,ab_96000,ar_48000,ac_2,align_1/pre,o_%s/sur,o_%s,t_0",
        encode(video1), encode(video2),
    )

    // Append save-as and notification instructions.
    process := fmt.Sprintf(
        "%s|sys/saveas,b_%s,o_%s/notify,topic_QXVkaW9Db252ZXJ0",
        style, encode(bucketName), encode(targetObject),
    )

    // Submit the asynchronous task.
    rs, err := bucket.AsyncProcessObject(sourcevideo, process)
    if err != nil {
        fmt.Println("Error:", err)
        os.Exit(-1)
    }
    fmt.Printf("EventId: %s\n", rs.EventId)
    fmt.Printf("RequestId: %s\n", rs.RequestId)
    fmt.Printf("TaskId: %s\n", rs.TaskId)
}

The response contains a TaskId. The API does not return the processing result directly. To receive a completion notification, configure Simple Message Queue (SMQ), formerly known as MNS. For details, see Message notifications.

For details on the sys/saveas and notify parameters, see Save as and Message notifications.

Parameter reference

Clip parameters

Clip parameters are set per /pre or /sur segment. They control which portion of each clip is used before concatenation.

  • o (string, required): The OSS object name of the clip. Must be Base64 URL-safe encoded.
  • ss (int, optional): Start offset within the clip, in milliseconds. 0 (default) starts from the beginning.
  • t (int, optional): Duration to use from the clip, in milliseconds. 0 (default) uses the clip to the end.
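As a sketch, the clip parameters compose into a /pre or /sur segment as follows (the object name `intro.mp4` and the helper itself are illustrative, not part of any SDK):

```python
import base64

def clip_segment(kind, obj, ss=None, t=None):
    """Build a /pre or /sur segment string with optional ss/t clip parameters."""
    # The object name must be Base64 URL-safe encoded, without padding.
    enc = base64.urlsafe_b64encode(obj.encode()).decode().rstrip("=")
    seg = f"{kind},o_{enc}"
    if ss is not None:
        seg += f",ss_{ss}"
    if t is not None:
        seg += f",t_{t}"
    return seg

# Use intro.mp4 from the 5-second mark, for 10 seconds:
print(clip_segment("pre", "intro.mp4", ss=5000, t=10000))
# pre,o_aW50cm8ubXA0,ss_5000,t_10000
```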

Transcoding parameters

Transcoding parameters are set at the action level (video/concat,...) and control the output file encoding.

Video

  • ss (int, optional): Transcoding start time for the source video, in milliseconds. 0 (default) starts from the beginning.
  • t (int, optional): Transcoding duration for the source video, in milliseconds. 0 (default) continues to the end.
  • f (string, required): Output container format. Valid values: mp4, mkv, mov, asf, avi, mxf, ts, flv, webm.
  • vcodec (string, required): Video codec. Valid values: h264, h265, vp9. mxf and flv do not support h265.
  • vn (int, optional): Disable the video stream. 0 (default) keeps the video stream. 1 disables it.
  • fps (float, optional): Output frame rate. Range: 0–240. Defaults to the frame rate of the source video selected by align.
  • fpsopt (int, optional): Frame rate option when a source clip's frame rate is less than fps. 0: always use the target fps. 1: use the lowest source fps. 2: fail the task. Requires fps.
  • pixfmt (string, optional): Pixel format. Defaults to the format of the source video selected by align. Valid values: yuv420p, yuva420p, yuv420p10le, yuv422p, yuv422p10le, yuv444p, yuv444p10le.
  • s (string, optional): Output resolution in <width>x<height> format. Width and height must be multiples of 2, in the range 64–4096. Example: 1920x1080.
  • sopt (int, optional): Resolution option when a source clip's resolution is smaller than s. 0: always use the target resolution. 1: use the smallest source resolution. 2: fail the task. Requires s.
  • scaletype (string, optional): Scaling method. stretch (default): stretch to fill. crop: scale and crop. fill: scale and pad with black bars. fit: scale to fit without padding.
  • arotate (int, optional): Auto-rotate the resolution based on video orientation metadata. 0 (default): off. 1: on.
  • g (int, optional): Group of Pictures (GOP) size. Default: 150. Range: 1–100000.
  • vb (int, optional): Video bitrate, in bps. Range: 10000–100000000. Mutually exclusive with crf.
  • vbopt (int, optional): Video bitrate option when a source clip's bitrate is less than vb. 0: always use the target bitrate. 1: use the lowest source bitrate. 2: fail the task. Requires vb.
  • crf (float, optional): Constant rate factor (CRF). Range: 0–51. Lower values produce higher quality. Recommended range: 18–38. Mutually exclusive with vb.
  • maxrate (int, optional): Maximum bitrate, in bps. Default: 0. Range: 10000–100000000. Requires crf.
  • bufsize (int, optional): Buffer size, in bits. Default: 0. Range: 10000–200000000. Requires crf.
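Several of these parameters constrain each other: vb and crf are mutually exclusive, while maxrate and bufsize require crf. These rules can be checked client-side before submission; the helper below is a hypothetical sketch, not an SDK function:

```python
def check_video_params(params):
    """Validate the mutual-exclusion rules from the video transcoding table."""
    if "vb" in params and "crf" in params:
        raise ValueError("vb and crf are mutually exclusive")
    for dependent in ("maxrate", "bufsize"):
        if dependent in params and "crf" not in params:
            raise ValueError(f"{dependent} requires crf")
    return True

print(check_video_params({"vcodec": "h264", "crf": 23, "maxrate": 2_000_000}))  # True
```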

Audio

  • acodec (string, required): Audio codec. Valid values: mp3, aac, flac, vorbis, ac3, opus, pcm. See the codec compatibility notes below.
  • an (int, optional): Disable the audio stream. 0 (default) keeps the audio stream. 1 disables it.
  • ar (int, optional): Audio sample rate, in Hz. Defaults to the sample rate of the source video selected by align. Valid values: 8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000, 64000, 88200, 96000.
  • ac (int, optional): Number of audio channels. Range: 1–8. Defaults to the channel count of the source video selected by align.
  • ab (int, optional): Audio bitrate, in bps. Range: 1000–10000000. Mutually exclusive with aq.
  • abopt (int, optional): Audio bitrate option when a source clip's audio bitrate is less than ab. 0 (default): always use the target bitrate. 1: use the lowest source bitrate. 2: fail the task. Requires ab.
  • aq (int, optional): Audio quality. Range: 0–100. Mutually exclusive with ab.
  • adepth (int, optional): Audio bit depth. Valid values: 16, 24. Valid only when acodec is flac.
  • align (int, optional): Index of the video in the concatenation list to use as the default source for transcoding parameters (frame rate, resolution, sample rate, channel count). Default: 0 (the first clip in the list).
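Per the container-compatibility limitation described earlier, it is safer to set ar and ac explicitly rather than rely on the align defaults. A sketch of assembling the action-level parameter string (the helper and values are illustrative):

```python
def concat_action(f, vcodec, acodec, ar, ac, **extra):
    """Assemble the video/concat action with explicit audio settings."""
    parts = [f"f_{f}", f"vcodec_{vcodec}", f"acodec_{acodec}", f"ar_{ar}", f"ac_{ac}"]
    # Any further transcoding parameters (vb, crf, fps, ...) append in order.
    parts += [f"{k}_{v}" for k, v in extra.items()]
    return "video/concat," + ",".join(parts)

print(concat_action("mp4", "h264", "aac", 48000, 2, vb=1000000))
# video/concat,f_mp4,vcodec_h264,acodec_aac,ar_48000,ac_2,vb_1000000
```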

Codec and container compatibility. The codecs listed for each container are NOT supported:

  • mp4: pcm
  • mov: flac, opus
  • asf: opus
  • avi: opus
  • mxf: all audio codecs except pcm; h265 (video)
  • ts: flac, vorbis, amr, pcm
  • flv: flac, vorbis, amr, opus, pcm; h265 (video)

Audio codec sample rate and channel constraints:

  • mp3: sample rates of 48000 Hz and below only; mono or stereo only.
  • opus: sample rates of 8000, 12000, 16000, 24000, or 48000 Hz.
  • ac3: sample rates of 32000, 44100, or 48000 Hz; up to 6 channels (5.1).
  • amr: sample rates of 8000 or 16000 Hz only; mono only.
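These constraints can also be enforced before submitting a task. The lookup table below is a hypothetical client-side mirror of the constraints above; codecs not listed are treated as unconstrained here:

```python
# Hypothetical lookup built from the constraints table above.
AUDIO_CONSTRAINTS = {
    "mp3":  {"max_rate": 48000, "rates": None, "max_channels": 2},
    "opus": {"max_rate": None, "rates": {8000, 12000, 16000, 24000, 48000}, "max_channels": None},
    "ac3":  {"max_rate": None, "rates": {32000, 44100, 48000}, "max_channels": 6},
    "amr":  {"max_rate": None, "rates": {8000, 16000}, "max_channels": 1},
}

def check_audio(acodec, ar, ac):
    """Return True if (ar, ac) satisfy the codec's constraints."""
    c = AUDIO_CONSTRAINTS.get(acodec)
    if c is None:
        return True  # no constraints tracked for this codec
    if c["rates"] is not None and ar not in c["rates"]:
        return False
    if c["max_rate"] is not None and ar > c["max_rate"]:
        return False
    if c["max_channels"] is not None and ac > c["max_channels"]:
        return False
    return True

print(check_audio("mp3", 44100, 2), check_audio("ac3", 22050, 2))  # True False
```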

Segmenting parameters

Use the /segment action to split the concatenated output into HLS (HTTP Live Streaming) or DASH (Dynamic Adaptive Streaming over HTTP) segments. Segmenting supports only mp4 and ts containers.

  • f (string, required): Segment format. Valid values: hls, dash.
  • t (int, required): Segment length, in milliseconds. Range: 0–3600000.
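Appending segmenting to a processing string could look like the following sketch. The helper is hypothetical, the style string is abbreviated, and the assumption here is that /segment chains onto the style string the same way /pre and /sur do:

```python
def with_segment(style, fmt, length_ms):
    """Append a /segment action producing HLS or DASH output."""
    if fmt not in ("hls", "dash"):
        raise ValueError("segment format must be hls or dash")
    if not 0 <= length_ms <= 3_600_000:
        raise ValueError("segment length must be in 0-3600000 ms")
    return f"{style}/segment,f_{fmt},t_{length_ms}"

print(with_segment("video/concat,f_ts,vcodec_h264,acodec_aac,ar_48000,ac_2", "hls", 10000))
# video/concat,f_ts,vcodec_h264,acodec_aac,ar_48000,ac_2/segment,f_hls,t_10000
```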

Example: Concatenate three clips into an MP4 file

This example concatenates three videos — pre.mov (full clip), example.mkv (from the 10-second mark to the end), and sur.mov (first 10 seconds) — and saves the result as oss://outbucket/outobj.mp4.

Output settings: H.264, 25 fps, 1 Mbps video bitrate; AAC audio, 48 kHz, dual-channel, 96 Kbps.

POST /example.mkv?x-oss-async-process HTTP/1.1
Host: video-demo.oss-cn-hangzhou.aliyuncs.com
Date: Fri, 28 Oct 2022 06:40:10 GMT
Authorization: OSS4-HMAC-SHA256 Credential=LTAI********************/20250417/cn-hangzhou/oss/aliyun_v4_request,Signature=a7c3554c729d71929e0b84489addee6b2e8d5cb48595adfc51868c299c0c218e

x-oss-async-process=video/concat,ss_10000,f_mp4,vcodec_h264,fps_25,vb_1000000,acodec_aac,ab_96000,ar_48000,ac_2,align_1/pre,o_cHJlLm1vdg/sur,o_c3VyLm1vdg,t_10000|sys/saveas,b_b3V0YnVja2V0,o_b3V0b2JqLnthdXRvZXh0fQ/notify,topic_QXVkaW9Db252ZXJ0

How the style string maps to the example:

  • Action, ss_10000: Start the source video (example.mkv) from the 10-second mark (10000 ms).
  • Action, align_1: Use example.mkv (index 1 in the clip list) as the source of default transcoding parameters.
  • /pre, o_cHJlLm1vdg: Prepend the full pre.mov (object name Base64 URL-safe encoded).
  • /sur, o_c3VyLm1vdg: Append sur.mov (object name Base64 URL-safe encoded).
  • /sur, t_10000: Use only the first 10 seconds (10000 ms) of sur.mov.
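The o_ values can be produced and verified with a quick Base64 round trip (a sketch; padding must be restored before decoding because the encoded values are emitted without it):

```python
import base64

def encode_obj(name):
    """Encode an object name the way /pre, /sur, and sys/saveas expect."""
    return base64.urlsafe_b64encode(name.encode()).decode().rstrip("=")

def decode_obj(value):
    """Decode an o_/b_ value back to the object name."""
    value += "=" * (-len(value) % 4)  # restore stripped padding
    return base64.urlsafe_b64decode(value).decode()

print(encode_obj("pre.mov"))     # cHJlLm1vdg
print(decode_obj("c3VyLm1vdg"))  # sur.mov
```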

Permissions

An Alibaba Cloud account has all permissions by default. A Resource Access Management (RAM) user or RAM role requires explicit grants via a RAM policy or bucket policy.

  • GetObject (oss:GetObject): Always required to read the source videos.
  • GetObject (oss:GetObjectVersion): Required when accessing a specific object version.
  • GetObject (kms:Decrypt): Required when a source object uses KMS server-side encryption.
  • HeadObject (oss:GetObject): Required to query object metadata.
  • PutObject (oss:PutObject): Required to write the concatenated output.
  • PutObject (oss:PutObjectTagging): Required when using x-oss-tagging.
  • PutObject (kms:GenerateDataKey, kms:Decrypt): Required when the output uses KMS encryption.
  • CreateMediaConvertTask (imm:CreateMediaConvertTask): Required to invoke IMM for transcoding.
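A minimal RAM policy covering the core actions above might look like the following sketch. The bucket name and resource scopes are placeholders; add the version, tagging, and KMS actions only if your workload needs them:

```python
import json

# Hypothetical policy; replace examplebucket and narrow the IMM resource as needed.
policy = {
    "Version": "1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["oss:GetObject", "oss:PutObject"],
            "Resource": ["acs:oss:*:*:examplebucket/*"],
        },
        {
            "Effect": "Allow",
            "Action": ["imm:CreateMediaConvertTask"],
            "Resource": ["*"],
        },
    ],
}
print(json.dumps(policy, indent=2))
```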

Billing

Video concatenation calls both OSS and IMM, generating billable items on both sides.

OSS charges (for pricing, see OSS Pricing):

  • GetObject, GET requests: Charged per successful request.
  • GetObject, outbound Internet traffic: Charged when using a public endpoint (for example, oss-cn-hangzhou.aliyuncs.com) or an acceleration endpoint.
  • GetObject, Infrequent Access (IA) data retrieval: Charged when retrieving IA objects.
  • GetObject, Archive data retrieval: Charged when retrieving Archive objects in a bucket with real-time access enabled.
  • GetObject, transfer acceleration: Charged when transfer acceleration is enabled and an acceleration endpoint is used.
  • PutObject, PUT requests: Charged per successful request.
  • PutObject, storage: Charged based on the storage class, size, and retention duration of the output object.
  • HeadObject, GET requests: Charged per successful request.

IMM charges (for pricing, see IMM billable items):

  • CreateMediaConvertTask, ApsaraVideo Media Processing: Billed based on the definition and actual duration (in seconds) of the concatenated output.

What's next