Object Storage Service (OSS) provides the multipart upload feature, which lets you split a large object into multiple parts and upload the parts separately. After all parts are uploaded, call the CompleteMultipartUpload operation to combine the parts into a single object. Because each part is uploaded independently, you can use this feature to implement resumable upload.

Process

To upload an object by using multipart upload, perform the following steps:

  1. Initiate a multipart upload task.

    Call the bucket.init_multipart_upload method to obtain a unique upload ID in Object Storage Service (OSS).

  2. Upload parts.

    Call the bucket.upload_part method to upload the parts.

    Note
    • Part numbers identify the relative positions of parts in an object that share the same upload ID. If you upload a part by using an existing part number, the existing part is overwritten.
    • OSS includes the MD5 hash of part data in the ETag header and returns the MD5 hash to the user.
    • OSS calculates the MD5 hash of uploaded data and compares the MD5 hash with the MD5 hash calculated by the SDK. If the two hashes are different, the InvalidDigest error code is returned.
  3. Complete the multipart upload task.

    After you upload all the parts, call the bucket.complete_multipart_upload method to combine the parts into a complete object.
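
The ETag note in step 2 can also be checked on the client side. The following is a minimal sketch and not part of the SDK. It assumes that the ETag returned for a part is the hexadecimal MD5 digest of the part data, possibly wrapped in double quotation marks, and it compares the two values case-insensitively. The helper name part_matches_etag is hypothetical.

```python
import hashlib


def part_matches_etag(part_data: bytes, etag: str) -> bool:
    """Return True if the MD5 digest of part_data equals the ETag of the part."""
    local_md5 = hashlib.md5(part_data).hexdigest()
    # Assumption: the ETag is a hex MD5 digest that may be quoted and uppercase.
    return local_md5.lower() == etag.strip('"').lower()
```

With the SDK, a check of this kind could be applied to the etag attribute of the result returned by bucket.upload_part before the part is added to the parts list.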

Complete sample code

The following code provides a complete example that describes the process of multipart upload:

# -*- coding: utf-8 -*-
import os
from oss2 import SizedFileAdapter, determine_part_size
from oss2.models import PartInfo
import oss2

# Security risks may arise if you use the AccessKey pair of an Alibaba Cloud account to access OSS because the account has permissions on all API operations. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console. 
auth = oss2.Auth('yourAccessKeyId', 'yourAccessKeySecret')
# In this example, the endpoint of the China (Hangzhou) region is used. Specify the endpoint based on your business requirements. 
# Specify the name of the bucket. Example: examplebucket. 
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'examplebucket')
# Specify the full path of the object. The full path cannot contain the bucket name. Example: exampledir/exampleobject.txt. 
key = 'exampledir/exampleobject.txt'
# Specify the full path of the local file that you want to upload. Example: D:\\localpath\\examplefile.txt. 
filename = 'D:\\localpath\\examplefile.txt'

total_size = os.path.getsize(filename)
# Use the determine_part_size method to determine the size of each part. 
part_size = determine_part_size(total_size, preferred_size=100 * 1024)

# Initiate a multipart upload task. 
# If you want to specify the storage class of the object when you initiate the multipart upload task, configure the related headers when you call the init_multipart_upload method. 
# headers = dict()
# Specify the caching behavior of the web page for the object. 
# headers['Cache-Control'] = 'no-cache'
# Specify the name of the object when it is downloaded. 
# headers['Content-Disposition'] = 'oss_MultipartUpload.txt'
# Specify the encoding format for the content of the object. 
# headers['Content-Encoding'] = 'utf-8'
# Specify the validity period. Unit: milliseconds. 
# headers['Expires'] = '1000'
# Specify whether to overwrite the existing object with the same name as the uploaded object when you initiate the multipart upload task. In this example, this parameter is set to true, which indicates that the existing object with the same name cannot be overwritten by the uploaded object. 
# headers['x-oss-forbid-overwrite'] = 'true'
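# Note: OSS_SERVER_SIDE_ENCRYPTION and the other uppercase constants used in the following lines are not imported by this example. Import them before you uncomment these lines (they are assumed to be defined in oss2.headers and oss2.models; verify the module names against your SDK version). 
# Specify the server-side encryption method that is used to encrypt each part of the object that you want to upload. 
# headers[OSS_SERVER_SIDE_ENCRYPTION] = SERVER_SIDE_ENCRYPTION_KMS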
# Specify the algorithm that is used to encrypt the object. If you do not configure this parameter, objects are encrypted by using AES-256. 
# headers[OSS_SERVER_SIDE_DATA_ENCRYPTION] = SERVER_SIDE_ENCRYPTION_KMS
# Specify the ID of the Customer Master Key (CMK) that is managed by Key Management Service (KMS). 
# headers[OSS_SERVER_SIDE_ENCRYPTION_KEY_ID] = '9468da86-3509-4f8d-a61e-6eab1eac****'
# Specify the storage class of the object. 
# headers['x-oss-storage-class'] = oss2.BUCKET_STORAGE_CLASS_STANDARD
# Specify tags for the object. You can specify multiple tags for the object at the same time. 
# headers[OSS_OBJECT_TAGGING] = 'k1=v1&k2=v2&k3=v3'
# upload_id = bucket.init_multipart_upload(key, headers=headers).upload_id
upload_id = bucket.init_multipart_upload(key).upload_id
parts = []

# Upload the parts one by one. 
with open(filename, 'rb') as fileobj:
    part_number = 1
    offset = 0
    while offset < total_size:
        num_to_upload = min(part_size, total_size - offset)
        # SizedFileAdapter(fileobj, size) wraps the file object so that at most size bytes are read from the current position. This way, each upload_part call sends exactly one part. 
        result = bucket.upload_part(key, upload_id, part_number,
                                    SizedFileAdapter(fileobj, num_to_upload))
        parts.append(PartInfo(part_number, result.etag))

        offset += num_to_upload
        part_number += 1

# Complete the multipart upload task. 
# The following code provides an example on how to configure headers when you complete the multipart upload task: 
headers = dict()
# Specify the access control list (ACL) of the object. In this example, this parameter is set to OBJECT_ACL_PRIVATE, which indicates private. 
# headers["x-oss-object-acl"] = oss2.OBJECT_ACL_PRIVATE
# If you configure x-oss-complete-all:yes in the request, OSS lists all parts that are uploaded by using the current upload ID, sorts the parts by part number, and then performs the CompleteMultipartUpload operation. 
# If you configure x-oss-complete-all:yes in the request, the request body cannot be specified. Otherwise, an error occurs. This example passes the parts list in the request body, so the header is left commented out. 
# headers["x-oss-complete-all"] = 'yes'
bucket.complete_multipart_upload(key, upload_id, parts, headers=headers)
# bucket.complete_multipart_upload(key, upload_id, parts)

# Verify the result of the multipart upload task. 
with open(filename, 'rb') as fileobj:
    assert bucket.get_object(key).read() == fileobj.read()

Note We recommend that you increase the size of each part when network conditions are stable and decrease the size of each part when network conditions are unstable.
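
The effect of the part size on the number of requests can be made concrete. The following sketch is not the implementation of determine_part_size. It is a hypothetical helper that assumes a limit of 10,000 parts per upload and a minimum part size of 100 KB, and it doubles the preferred part size until the file fits within the part limit. The names choose_part_size and count_parts are assumptions made for illustration.

```python
MAX_PART_COUNT = 10000        # Assumed limit on the number of parts per upload.
MIN_PART_SIZE = 100 * 1024    # Assumed minimum part size (100 KB).


def choose_part_size(total_size: int, preferred_size: int) -> int:
    """Pick a part size close to preferred_size that respects the assumed limits."""
    part_size = max(preferred_size, MIN_PART_SIZE)
    # Double the part size until the whole file fits in MAX_PART_COUNT parts.
    while part_size * MAX_PART_COUNT < total_size:
        part_size *= 2
    return part_size


def count_parts(total_size: int, part_size: int) -> int:
    """Number of parts needed; the last part may be smaller than part_size."""
    return -(-total_size // part_size)  # Ceiling division.


one_gib = 1024 ** 3
# Stable network: larger parts mean fewer requests.
print(count_parts(one_gib, choose_part_size(one_gib, 10 * 1024 ** 2)))  # prints 103
# Unstable network: smaller parts mean less data to resend after a failure.
print(count_parts(one_gib, choose_part_size(one_gib, 1024 ** 2)))       # prints 1024
```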

Cancel a multipart upload task

You can call the bucket.abort_multipart_upload method to cancel a multipart upload task. If a multipart upload task is canceled, the upload ID cannot be used to upload parts. In addition, the uploaded parts are deleted.

The following code provides an example on how to cancel a multipart upload task:

# -*- coding: utf-8 -*-
import os
import oss2

# Security risks may arise if you use the AccessKey pair of an Alibaba Cloud account to access OSS because the account has permissions on all API operations. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console. 
auth = oss2.Auth('yourAccessKeyId', 'yourAccessKeySecret')
# In this example, the endpoint of the China (Hangzhou) region is used. Specify the endpoint based on your business requirements. 
# Specify the name of the bucket. Example: examplebucket. 
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'examplebucket')
# Specify the full path of the object. The full path cannot contain the bucket name. Example: exampledir/exampleobject.txt. 
key = 'exampledir/exampleobject.txt'
# Obtain the upload ID returned by the init_multipart_upload method. 
upload_id = 'yourUploadId'

# Cancel the multipart upload task that uses the specified upload ID. The uploaded parts are deleted. 
bucket.abort_multipart_upload(key, upload_id)
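
If you do not plan to resume a failed upload, one option is to cancel the task as soon as an error occurs so that the uploaded parts do not remain in the bucket. The following sketch shows one way to structure this. The helper name upload_or_abort is hypothetical, the bucket argument is assumed to behave like oss2.Bucket for the three methods used in this topic, and the upload_parts callable stands in for the part upload loop from the complete sample code.

```python
def upload_or_abort(bucket, key, upload_parts):
    """Run a multipart upload and cancel it if any step raises an exception.

    upload_parts(upload_id) must upload every part and return the list of
    parts to pass to complete_multipart_upload.
    """
    upload_id = bucket.init_multipart_upload(key).upload_id
    try:
        parts = upload_parts(upload_id)
        return bucket.complete_multipart_upload(key, upload_id, parts)
    except Exception:
        # Cancel the task so that the parts uploaded so far are deleted.
        bucket.abort_multipart_upload(key, upload_id)
        raise
```

The original exception is raised again after the task is canceled, so the caller can still log or handle the failure.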

List the uploaded parts

The following code provides an example on how to list the uploaded parts:

# -*- coding: utf-8 -*-
import os
import oss2

# Security risks may arise if you use the AccessKey pair of an Alibaba Cloud account to access OSS because the account has permissions on all API operations. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console. 
auth = oss2.Auth('yourAccessKeyId', 'yourAccessKeySecret')
# In this example, the endpoint of the China (Hangzhou) region is used. Specify the endpoint based on your business requirements. 
# Specify the name of the bucket. Example: examplebucket. 
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'examplebucket')
# Specify the full path of the object. The full path cannot contain the bucket name. Example: exampledir/exampleobject.txt. 
key = 'exampledir/exampleobject.txt'
# Obtain the upload ID returned by the init_multipart_upload method. 
upload_id = 'yourUploadId'

# List the uploaded parts that use the specified upload ID. 
for part_info in oss2.PartIterator(bucket, key, upload_id):
    print('part_number:', part_info.part_number)
    print('etag:', part_info.etag)
    print('size:', part_info.size)
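
Listing the uploaded parts is what makes it possible to resume an interrupted upload: only the parts that are missing need to be uploaded again. The following sketch assumes that all parts except the last one have the same size and that each listed part exposes part_number and size, as in the preceding example. The helper name missing_part_numbers and the UploadedPart stand-in type are hypothetical.

```python
from collections import namedtuple

# Stand-in for the items yielded by oss2.PartIterator (assumption: only the
# part_number and size attributes are needed here).
UploadedPart = namedtuple('UploadedPart', ['part_number', 'size'])


def missing_part_numbers(total_size, part_size, uploaded_parts):
    """Return the part numbers that still have to be uploaded."""
    part_count = -(-total_size // part_size)  # Ceiling division.
    uploaded = {p.part_number: p.size for p in uploaded_parts}
    missing = []
    for number in range(1, part_count + 1):
        offset = (number - 1) * part_size
        expected_size = min(part_size, total_size - offset)
        # Upload a part again if it is absent or its size is unexpected.
        if uploaded.get(number) != expected_size:
            missing.append(number)
    return missing


done = [UploadedPart(1, 100), UploadedPart(3, 50)]
print(missing_part_numbers(250, 100, done))  # prints [2]
```

With a real bucket, oss2.PartIterator(bucket, key, upload_id) could be passed as uploaded_parts, provided that the same part size is used as in the interrupted upload.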

List multipart upload tasks

  • List the multipart upload tasks of the specified object

    The following code provides an example on how to list the multipart upload tasks of the specified object:

    # -*- coding: utf-8 -*-
    import os
    import oss2
    
    # Security risks may arise if you use the AccessKey pair of an Alibaba Cloud account to access OSS because the account has permissions on all API operations. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console. 
    auth = oss2.Auth('yourAccessKeyId', 'yourAccessKeySecret')
    # In this example, the endpoint of the China (Hangzhou) region is used. Specify the endpoint based on your business requirements. 
    # Specify the name of the bucket. Example: examplebucket. 
    bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'examplebucket')
    # Specify the full path of the object. The full path cannot contain the bucket name. Example: exampledir/exampleobject.txt. 
    key = 'exampledir/exampleobject.txt'
    
    # List all multipart upload tasks of the object. Each time the init_multipart_upload method is called for the same object, a different upload ID is returned. 
    # An upload ID uniquely identifies a multipart upload task. 
    for upload_info in oss2.ObjectUploadIterator(bucket, key):
        print('key:', upload_info.key)
        print('upload_id:', upload_info.upload_id)

  • List all multipart upload tasks initiated for a bucket

    The following code provides an example on how to list all multipart upload tasks initiated for a bucket:

    # -*- coding: utf-8 -*-
    import os
    import oss2
    
    # Security risks may arise if you use the AccessKey pair of an Alibaba Cloud account to access OSS because the account has permissions on all API operations. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console. 
    auth = oss2.Auth('yourAccessKeyId', 'yourAccessKeySecret')
    # In this example, the endpoint of the China (Hangzhou) region is used. Specify the endpoint based on your business requirements. 
    # Specify the name of the bucket. Example: examplebucket. 
    bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'examplebucket')
    
    # List all multipart upload tasks initiated for the bucket. 
    for upload_info in oss2.MultipartUploadIterator(bucket):
        print('key:', upload_info.key)
        print('upload_id:', upload_info.upload_id)

  • List the multipart upload tasks of objects whose names contain the specified prefix in a bucket

    The following code provides an example on how to list the multipart upload tasks of objects whose names contain the specified prefix in a bucket:

    # -*- coding: utf-8 -*-
    import os
    import oss2
    # Security risks may arise if you use the AccessKey pair of an Alibaba Cloud account to access OSS because the account has permissions on all API operations. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console. 
    auth = oss2.Auth('yourAccessKeyId', 'yourAccessKeySecret')
    # In this example, the endpoint of the China (Hangzhou) region is used. Specify the endpoint based on your business requirements. 
    # Specify the name of the bucket. Example: examplebucket. 
    bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'examplebucket')
    
    # List the multipart upload tasks of objects whose names contain the test prefix in the bucket. 
    for upload_info in oss2.MultipartUploadIterator(bucket, prefix='test'):
        print('key:', upload_info.key)
        print('upload_id:', upload_info.upload_id)
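
These listings are often used to find multipart upload tasks that were initiated but never completed. The following sketch selects the tasks that are older than a cutoff so that they can be canceled. It assumes that each listed item exposes an initiation_time attribute as a UNIX timestamp in seconds in addition to key and upload_id. Verify this against the SDK version that you use. The helper name stale_uploads and the UploadInfo stand-in type are hypothetical.

```python
import time
from collections import namedtuple

# Stand-in for the items yielded by oss2.MultipartUploadIterator (assumption:
# initiation_time is a UNIX timestamp in seconds).
UploadInfo = namedtuple('UploadInfo', ['key', 'upload_id', 'initiation_time'])


def stale_uploads(uploads, max_age_seconds, now=None):
    """Yield (key, upload_id) for tasks initiated more than max_age_seconds ago."""
    now = time.time() if now is None else now
    for info in uploads:
        if now - info.initiation_time > max_age_seconds:
            yield info.key, info.upload_id
```

With a real bucket, oss2.MultipartUploadIterator(bucket) could be passed as uploads, and bucket.abort_multipart_upload(key, upload_id) could be called for each pair that is returned.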

References

  • The following API operations are required to perform multipart upload:
    • The API operation that you can call to initiate a multipart upload task. For more information, see InitiateMultipartUpload.
    • The API operation that you can call to upload data by part. For more information, see UploadPart.
    • The API operation that you can call to complete a multipart upload task. For more information, see CompleteMultipartUpload.
  • For more information about the API operation that you can call to cancel a multipart upload task, see AbortMultipartUpload.
  • For more information about the API operation that you can call to list uploaded parts, see ListParts.
  • For more information about the API operation that you can call to list ongoing multipart upload tasks, see ListMultipartUploads. Ongoing multipart upload tasks are tasks that have been initiated but are neither completed nor canceled.