Multipart upload in OSS lets you split a large object into multiple parts and upload them separately. After all parts are uploaded, call the CompleteMultipartUpload operation to combine them into a single object. Because each part is uploaded independently, an interrupted upload can be resumed by uploading only the parts that are still missing.

Multipart upload process

To implement multipart upload, perform the following operations:

  1. Initiate a multipart upload task.

    Call bucket.init_multipart_upload. OSS returns an upload ID that uniquely identifies this multipart upload task.

  2. Upload the parts.

    Call bucket.upload_part to upload the parts.

    Note
    • A part number identifies the position of a part relative to the other parts that share the same upload ID. If you upload a part with a part number that is already in use, the new part overwrites the existing one.
    • OSS returns the MD5 hash of each part's data in the ETag header of the response.
    • OSS calculates the MD5 hash of uploaded data and compares it with the MD5 hash calculated by the SDK. If the two hashes are different, the InvalidDigest error code is returned.
  3. Complete the multipart upload task.

    After all parts are uploaded, call bucket.complete_multipart_upload to combine the parts into a complete object.
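
The part numbering and the per-part MD5 hash described in the steps above can be tried locally without calling OSS. The following is a minimal sketch, not part of the SDK: plan_parts, part_md5, and etag_matches are hypothetical helper names, and the assumption that an ETag can be compared with a local MD5 digest after stripping quotes and ignoring case should be checked against the responses you actually receive.

```python
import hashlib


def plan_parts(total_size, part_size):
    """Return (part_number, offset, size) tuples that cover total_size bytes."""
    if part_size <= 0:
        raise ValueError('part_size must be positive')
    plan = []
    offset = 0
    part_number = 1  # Part numbers start at 1.
    while offset < total_size:
        # The last part may be smaller than part_size.
        size = min(part_size, total_size - offset)
        plan.append((part_number, offset, size))
        offset += size
        part_number += 1
    return plan


def part_md5(data):
    """Return the hexadecimal MD5 digest of one part's bytes."""
    return hashlib.md5(data).hexdigest()


def etag_matches(etag, data):
    """Compare an ETag string with the local MD5 digest.

    Assumption: quotes and letter case are the only formatting differences.
    """
    return etag.strip('"').lower() == part_md5(data)
```

For example, plan_parts(250, 100) yields three parts of 100, 100, and 50 bytes, which is the same split that the upload loop in the complete sample performs.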

Complete sample code of multipart upload

The following code provides a complete example that describes the process of multipart upload:

# -*- coding: utf-8 -*-
import os
from oss2 import SizedFileAdapter, determine_part_size
from oss2.models import PartInfo
import oss2

# Security risks may arise if you use the AccessKey pair of an Alibaba Cloud account to log on to OSS because the account has permissions on all API operations. We recommend that you use your RAM user's credentials to call API operations or perform routine operations and maintenance. To create a RAM user, log on to the RAM console.
auth = oss2.Auth('<yourAccessKeyId>', '<yourAccessKeySecret>')
# The endpoint of the China (Hangzhou) region is used in this example. Specify the actual endpoint.
bucket = oss2.Bucket(auth, 'http://oss-cn-hangzhou.aliyuncs.com', '<yourBucketName>')

key = '<yourObjectName>'
filename = '<yourLocalFile>'

total_size = os.path.getsize(filename)
# Use the determine_part_size method to determine the size of each part.
part_size = determine_part_size(total_size, preferred_size=100 * 1024)

# Initiate a multipart upload task.
# To set the storage class for the object when you initiate the multipart upload task, set the headers parameter in the init_multipart_upload method.
# headers = dict()
# headers["x-oss-storage-class"] = "Standard"
# upload_id = bucket.init_multipart_upload(key, headers=headers).upload_id
upload_id = bucket.init_multipart_upload(key).upload_id
parts = []

# Upload each part sequentially.
with open(filename, 'rb') as fileobj:
    part_number = 1
    offset = 0
    while offset < total_size:
        num_to_upload = min(part_size, total_size - offset)
        # SizedFileAdapter(fileobj, size) wraps the file object so that at most num_to_upload bytes are read, starting at the current file position.
        result = bucket.upload_part(key, upload_id, part_number,
                                    SizedFileAdapter(fileobj, num_to_upload))
        parts.append(PartInfo(part_number, result.etag))

        offset += num_to_upload
        part_number += 1

# Complete the multipart upload task.
# To configure ACL for the object when you complete the multipart upload task, set the headers parameter in the complete_multipart_upload function.
# headers = dict()
# headers["x-oss-object-acl"] = oss2.OBJECT_ACL_PRIVATE
# bucket.complete_multipart_upload(key, upload_id, parts, headers=headers)
bucket.complete_multipart_upload(key, upload_id, parts)

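# Verify the upload by downloading the object and comparing it with the local file.
# Both are read fully into memory, so this check is only practical for small files.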
with open(filename, 'rb') as fileobj:
    assert bucket.get_object(key).read() == fileobj.read()
            
Note We recommend that you use a larger part size when network conditions are good and a smaller part size when network conditions are poor, so that a failed part costs less to retry.
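
The trade-off in the note above can be expressed as a small calculation. The following is a rough sketch of how a part size might be chosen. It is not the implementation of determine_part_size in the SDK. The function name choose_part_size is hypothetical, and the two limits (a 100 KB minimum part size and at most 10,000 parts per task) are assumptions that you should verify against the current OSS limits.

```python
MIN_PART_SIZE = 100 * 1024  # Assumed minimum size of every part except the last.
MAX_PART_COUNT = 10000      # Assumed maximum number of parts in one task.


def choose_part_size(total_size, preferred_size):
    """Pick a part size close to preferred_size that respects both limits."""
    # Never go below the assumed minimum part size.
    part_size = max(preferred_size, MIN_PART_SIZE)
    # Ceiling division: the smallest part size that still fits
    # total_size into MAX_PART_COUNT parts.
    smallest_allowed = -(-total_size // MAX_PART_COUNT)
    return max(part_size, smallest_allowed)
```

With these assumptions, a small preferred size is kept for small files and raised automatically for very large files so that the part count stays within the limit.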

For more information about multipart upload, see InitiateMultipartUpload and CompleteMultipartUpload.

Cancel a multipart upload task

Call bucket.abort_multipart_upload to cancel a multipart upload task. After a task is canceled, its upload ID can no longer be used to upload parts, and the parts that were already uploaded are deleted.

The following code shows how to cancel a multipart upload task:

# -*- coding: utf-8 -*-
import oss2

# Security risks may arise if you use the AccessKey pair of an Alibaba Cloud account to log on to OSS because the account has permissions on all API operations. We recommend that you use your RAM user's credentials to call API operations or perform routine operations and maintenance. To create a RAM user, log on to the RAM console.
auth = oss2.Auth('<yourAccessKeyId>', '<yourAccessKeySecret>')
# The endpoint of the China (Hangzhou) region is used in this example. Specify the actual endpoint.
bucket = oss2.Bucket(auth, 'http://oss-cn-hangzhou.aliyuncs.com', '<yourBucketName>')

key = '<yourObjectName>'
# Specify the upload ID that init_multipart_upload returned for this task.
upload_id = '<yourUploadId>'

# When you cancel the multipart upload task with the specified upload ID, the uploaded parts are deleted.
bucket.abort_multipart_upload(key, upload_id)

For more information about how to cancel a multipart upload task, see AbortMultipartUpload.
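
Because uploaded parts remain in the bucket until the task is completed or canceled, it can be useful to cancel the task automatically when an upload fails. The following is one possible pattern, sketched under assumptions: abort_on_failure is a hypothetical helper, not an SDK function, and it only relies on the bucket object providing init_multipart_upload and abort_multipart_upload as used in the samples on this page.

```python
from contextlib import contextmanager


@contextmanager
def abort_on_failure(bucket, key):
    """Start a multipart upload task and cancel it if the body raises."""
    upload_id = bucket.init_multipart_upload(key).upload_id
    try:
        # The caller uploads the parts and completes the task inside the with block.
        yield upload_id
    except BaseException:
        # Cancel the task so that the parts uploaded so far are deleted,
        # then let the original error propagate.
        bucket.abort_multipart_upload(key, upload_id)
        raise
```

Inside the with block, call bucket.upload_part and bucket.complete_multipart_upload as in the complete sample. If any of those calls raises an exception, the task is canceled before the exception reaches the caller.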

List uploaded parts

The following code shows how to list the parts that have been uploaded for a specified upload ID:

# -*- coding: utf-8 -*-
import oss2

# Security risks may arise if you use the AccessKey pair of an Alibaba Cloud account to log on to OSS because the account has permissions on all API operations. We recommend that you use your RAM user's credentials to call API operations or perform routine operations and maintenance. To create a RAM user, log on to the RAM console.
auth = oss2.Auth('<yourAccessKeyId>', '<yourAccessKeySecret>')
# The endpoint of the China (Hangzhou) region is used in this example. Specify the actual endpoint.
bucket = oss2.Bucket(auth, 'http://oss-cn-hangzhou.aliyuncs.com', '<yourBucketName>')

key = '<yourObjectName>'
# Specify the upload ID that init_multipart_upload returned for this task.
upload_id = '<yourUploadId>'

# List uploaded parts using the specified upload ID.
for part_info in oss2.PartIterator(bucket, key, upload_id):
    print('part_number:', part_info.part_number)
    print('etag:', part_info.etag)
    print('size:', part_info.size)

For more information about how to list uploaded parts, see ListParts.
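
Listing uploaded parts is what makes it possible to resume an interrupted upload: the parts that OSS already holds can be skipped. The following is a minimal sketch of that bookkeeping. The function name missing_parts is hypothetical, and it assumes that the same part size is used as in the original attempt and that each listed item exposes part_number and size attributes, as the items printed in the sample above do.

```python
def missing_parts(total_size, part_size, uploaded):
    """Return (part_number, offset, size) for every part that still needs uploading.

    uploaded is an iterable of objects with part_number and size attributes.
    A part whose recorded size differs from the expected size is uploaded again.
    """
    done = {part.part_number: part.size for part in uploaded}
    todo = []
    part_number = 1
    offset = 0
    while offset < total_size:
        size = min(part_size, total_size - offset)
        if done.get(part_number) != size:
            todo.append((part_number, offset, size))
        offset += size
        part_number += 1
    return todo
```

In a real resume flow, the uploaded argument would be the result of oss2.PartIterator(bucket, key, upload_id). Keep the part number and ETag of the parts that are skipped, because the complete_multipart_upload call needs the full list of parts.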

List multipart upload tasks

  • List multipart upload tasks of a specified object

    The following code shows how to list the multipart upload tasks of a specified object:

    # -*- coding: utf-8 -*-
    import oss2
    
    # Security risks may arise if you use the AccessKey pair of an Alibaba Cloud account to log on to OSS because the account has permissions on all API operations. We recommend that you use your RAM user's credentials to call API operations or perform routine operations and maintenance. To create a RAM user, log on to the RAM console.
    auth = oss2.Auth('<yourAccessKeyId>', '<yourAccessKeySecret>')
    # The endpoint of the China (Hangzhou) region is used in this example. Specify the actual endpoint.
    bucket = oss2.Bucket(auth, 'http://oss-cn-hangzhou.aliyuncs.com', '<yourBucketName>')
    key = '<yourObjectName>'
    
    # List all multipart upload tasks of an object. Each call to init_multipart_upload for the same object returns a different upload ID.
    # One upload ID corresponds to one multipart upload task.
    for upload_info in oss2.ObjectUploadIterator(bucket, key):
        print('key:', upload_info.key)
        print('upload_id:', upload_info.upload_id)
  • List all multipart upload tasks in a bucket

    The following code shows how to list all multipart upload tasks in a bucket:

    # -*- coding: utf-8 -*-
    import oss2
    
    # Security risks may arise if you use the AccessKey pair of an Alibaba Cloud account to log on to OSS because the account has permissions on all API operations. We recommend that you use your RAM user's credentials to call API operations or perform routine operations and maintenance. To create a RAM user, log on to the RAM console.
    auth = oss2.Auth('<yourAccessKeyId>', '<yourAccessKeySecret>')
    # The endpoint of the China (Hangzhou) region is used in this example. Specify the actual endpoint.
    bucket = oss2.Bucket(auth, 'http://oss-cn-hangzhou.aliyuncs.com', '<yourBucketName>')
    
    # List all multipart upload tasks in a bucket.
    for upload_info in oss2.MultipartUploadIterator(bucket):
        print('key:', upload_info.key)
        print('upload_id:', upload_info.upload_id)
  • List all multipart upload tasks for objects whose names start with a specified prefix in a bucket

    The following code shows how to list the multipart upload tasks for objects whose names start with a specified prefix in a bucket:

    # -*- coding: utf-8 -*-
    import oss2
    
    # Security risks may arise if you use the AccessKey pair of an Alibaba Cloud account to log on to OSS because the account has permissions on all API operations. We recommend that you use your RAM user's credentials to call API operations or perform routine operations and maintenance. To create a RAM user, log on to the RAM console.
    auth = oss2.Auth('<yourAccessKeyId>', '<yourAccessKeySecret>')
    # The endpoint of the China (Hangzhou) region is used in this example. Specify the actual endpoint.
    bucket = oss2.Bucket(auth, 'http://oss-cn-hangzhou.aliyuncs.com', '<yourBucketName>')
    
    # List multipart upload tasks for objects whose names start with the prefix test in a bucket.
    for upload_info in oss2.MultipartUploadIterator(bucket, prefix='test'):
        print('key:', upload_info.key)
        print('upload_id:', upload_info.upload_id)

For more information about how to list multipart upload tasks, see ListMultipartUploads.
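
Listing and canceling can be combined to clean up tasks that were never completed. The following is a minimal sketch of one way to do this. The function name abort_uploads is hypothetical, and it assumes only that each listed item has key and upload_id attributes, as in the samples above, and that the bucket object provides abort_multipart_upload.

```python
def abort_uploads(bucket, uploads):
    """Cancel every listed multipart upload task.

    Returns the (key, upload_id) pairs that were canceled.
    """
    aborted = []
    # Collect the listing first so that canceling tasks does not
    # interfere with a listing that is still being paged through.
    for info in list(uploads):
        bucket.abort_multipart_upload(info.key, info.upload_id)
        aborted.append((info.key, info.upload_id))
    return aborted
```

For example, abort_uploads(bucket, oss2.MultipartUploadIterator(bucket, prefix='test')) would cancel every task for objects whose names start with test. Canceling a task deletes its uploaded parts, so run this only against tasks that you no longer intend to resume.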