Object Storage Service (OSS) supports multipart upload, which lets you split a large object into multiple parts and upload the parts separately. After all parts are uploaded, call CompleteMultipartUpload to combine them into a complete object. Because each part is uploaded independently, multipart upload can also be used to implement resumable upload.
Usage notes
- In this topic, the public endpoint of the China (Hangzhou) region is used. If you want to access OSS from other Alibaba Cloud services in the same region as OSS, use an internal endpoint. For more information about the regions and endpoints supported by OSS, see Regions and endpoints.
- In this topic, an OSSClient instance is created by using an OSS endpoint. If you want to create an OSSClient instance by using custom domain names or STS, see Initialization.
- The oss:PutObject permission is required to perform multipart upload. For more information, see Attach a custom policy to a RAM user.
Process
To upload an object by using multipart upload, perform the following steps:
- Initiate a multipart upload task.
Call the bucket.init_multipart_upload method to obtain a unique upload ID in OSS.
- Upload parts.
Call the bucket.upload_part method to upload the parts.
Note
- Within a multipart upload task that is identified by an upload ID, part numbers identify the relative positions of the parts in the object. If you upload a new part with a part number that is already in use, the new part overwrites the existing part.
- OSS includes the MD5 hash of each uploaded part in the ETag header in the response.
- OSS calculates the MD5 hash of uploaded data and compares the MD5 hash with the MD5 hash that is calculated by OSS SDK for Python. If the two hashes are different, OSS returns the InvalidDigest error code.
- Complete the multipart upload task.
After you upload all the parts, call the bucket.complete_multipart_upload method to combine the parts into a complete object.
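The splitting step described above can be sketched without any OSS call. The following is a minimal illustration, not the SDK's implementation: choose_part_size and plan_parts are hypothetical helper names, and the limits of 100 KB per part and 10,000 parts per upload ID are stated here as assumptions that you should check against the current OSS limits.

```python
# Hypothetical helpers for illustration only; they are not part of oss2.
MIN_PART_SIZE = 100 * 1024   # assumed lower bound for every part except the last
MAX_PART_COUNT = 10000       # assumed upper bound on parts per upload ID


def choose_part_size(total_size, preferred_size):
    """Return a part size that keeps the part count within MAX_PART_COUNT."""
    part_size = max(preferred_size, MIN_PART_SIZE)
    # Grow the part size if the preferred size would require too many parts.
    if part_size * MAX_PART_COUNT < total_size:
        part_size = -(-total_size // MAX_PART_COUNT)  # ceiling division
    return part_size


def plan_parts(total_size, preferred_size=1024 * 1024):
    """Yield (part_number, offset, length) tuples that cover total_size bytes."""
    part_size = choose_part_size(total_size, preferred_size)
    part_number = 1
    offset = 0
    while offset < total_size:
        length = min(part_size, total_size - offset)
        yield part_number, offset, length
        offset += length
        part_number += 1


# A 250 KB file split into 100 KB parts yields two full parts and one 50 KB part.
for part in plan_parts(250 * 1024, preferred_size=100 * 1024):
    print(part)
```

Each tuple corresponds to one bucket.upload_part call: the part number identifies the position of the part, and the offset and length describe the byte range to read from the local file.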
Examples
After all parts are uploaded, you can combine all parts into a complete object by using one of the following methods:
- Combine all parts into a complete object by including part information in the request body
# -*- coding: utf-8 -*-
import os
from oss2 import SizedFileAdapter, determine_part_size
from oss2.models import PartInfo
import oss2
# The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. Using these credentials to perform operations in OSS is a high-risk operation. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
auth = oss2.Auth('yourAccessKeyId', 'yourAccessKeySecret')
# In this example, the endpoint of the China (Hangzhou) region is used. Specify your actual endpoint.
# Specify the name of the bucket. Example: examplebucket.
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'examplebucket')
# Specify the full path of the object. Do not include the bucket name in the full path. Example: exampledir/exampleobject.txt.
key = 'exampledir/exampleobject.txt'
# Specify the full path of the local file that you want to upload. Example: D:\\localpath\\examplefile.txt.
filename = 'D:\\localpath\\examplefile.txt'
total_size = os.path.getsize(filename)
# Use the determine_part_size method to determine the size of each part.
part_size = determine_part_size(total_size, preferred_size=100 * 1024)
# Initiate a multipart upload task.
# If you want to specify the storage class of the object when you initiate the multipart upload task, configure the related headers when you call the init_multipart_upload operation.
# headers = dict()
# Specify the caching behavior of the web page for the object.
# headers['Cache-Control'] = 'no-cache'
# Specify the name of the object when it is downloaded.
# headers['Content-Disposition'] = 'oss_MultipartUpload.txt'
# Specify the encoding format for the content of the object.
# headers['Content-Encoding'] = 'utf-8'
# Specify the validity period. Unit: milliseconds.
# headers['Expires'] = '1000'
# Specify whether the object that is uploaded by performing multipart upload overwrites the existing object that has the same name when the multipart upload task is initiated. In this example, this parameter is set to true, which indicates that the existing object with the same name cannot be overwritten by the uploaded object.
# headers['x-oss-forbid-overwrite'] = 'true'
# Specify the server-side encryption method that is used to encrypt each part of the uploaded object.
# headers[OSS_SERVER_SIDE_ENCRYPTION] = SERVER_SIDE_ENCRYPTION_KMS
# Specify the algorithm that you want to use to encrypt the object. If you do not configure this parameter, objects are encrypted by using AES-256.
# headers[OSS_SERVER_SIDE_DATA_ENCRYPTION] = SERVER_SIDE_ENCRYPTION_KMS
# Specify the ID of the Customer Master Key (CMK) that is managed by Key Management Service (KMS).
# headers[OSS_SERVER_SIDE_ENCRYPTION_KEY_ID] = '9468da86-3509-4f8d-a61e-6eab1eac****'
# Specify the storage class of the object.
# headers['x-oss-storage-class'] = oss2.BUCKET_STORAGE_CLASS_STANDARD
# Specify tags for the destination object. You can specify multiple tags for the destination object at the same time.
# headers[OSS_OBJECT_TAGGING] = 'k1=v1&k2=v2&k3=v3'
# upload_id = bucket.init_multipart_upload(key, headers=headers).upload_id
upload_id = bucket.init_multipart_upload(key).upload_id
parts = []
# Upload the parts one by one.
with open(filename, 'rb') as fileobj:
    part_number = 1
    offset = 0
    while offset < total_size:
        num_to_upload = min(part_size, total_size - offset)
        # Call the SizedFileAdapter(fileobj, size) method to generate a new object and recalculate the position from which the append operation starts.
        result = bucket.upload_part(key, upload_id, part_number,
                                    SizedFileAdapter(fileobj, num_to_upload))
        parts.append(PartInfo(part_number, result.etag))
        offset += num_to_upload
        part_number += 1
# Complete the multipart upload task.
# The following sample code provides an example on how to configure headers when you complete the multipart upload task:
headers = dict()
# Specify the ACL of the object. In this example, this parameter is set to OBJECT_ACL_PRIVATE, which specifies that the ACL of the object is private.
# headers["x-oss-object-acl"] = oss2.OBJECT_ACL_PRIVATE
bucket.complete_multipart_upload(key, upload_id, parts, headers=headers)
# bucket.complete_multipart_upload(key, upload_id, parts)
# Verify the result of the multipart upload task.
with open(filename, 'rb') as fileobj:
    assert bucket.get_object(key).read() == fileobj.read()
Important We recommend that you increase the size of each part when network conditions are stable, and decrease the size of each part when network conditions are unstable.
- Combine parts into a complete object by listing the parts on the server
Note If you want to combine parts into a complete object by listing the parts on the server, make sure that multiple parts have been uploaded by using the upload_id specified in the following sample code.
# -*- coding: utf-8 -*-
import oss2
# The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. Using these credentials to perform operations in OSS is a high-risk operation. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
auth = oss2.Auth('yourAccessKeyId', 'yourAccessKeySecret')
# In this example, the endpoint of the China (Hangzhou) region is used. Specify your actual endpoint.
# Specify the name of the bucket. Example: examplebucket.
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'examplebucket')
# Specify the full path of the object. Do not include the bucket name in the full path. Example: exampledir/exampleobject.txt.
key = 'exampledir/exampleobject.txt'
# Specify the full path of the local file that you want to upload. Example: D:\\localpath\\examplefile.txt.
filename = 'D:\\localpath\\examplefile.txt'
# Specify the upload ID.
upload_id = '0004B9894A22E5B1888A1E29F823****'
# Complete the multipart upload task.
# If you want to specify the ACL of the object when you complete the multipart upload task, configure the related headers in the complete_multipart_upload function.
headers = dict()
# headers["x-oss-object-acl"] = oss2.OBJECT_ACL_PRIVATE
# If you specify x-oss-complete-all:yes in the request, OSS lists all parts that are uploaded by using the current upload ID, sorts the parts by part number, and then performs the CompleteMultipartUpload operation.
# If x-oss-complete-all:yes is specified in the request, the request body cannot be specified. Otherwise, an error occurs.
headers["x-oss-complete-all"] = 'yes'
bucket.complete_multipart_upload(key, upload_id, None, headers=headers)
# Verify the result of the multipart upload task.
with open(filename, 'rb') as fileobj:
    assert bucket.get_object(key).read() == fileobj.read()
Cancel a multipart upload task
You can call the bucket.abort_multipart_upload method to cancel a multipart upload task. If a multipart upload task is canceled, the upload ID cannot be used to upload parts. In addition, the uploaded parts are deleted.
The following sample code provides an example on how to cancel a multipart upload task:
# -*- coding: utf-8 -*-
import os
import oss2
# The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. Using these credentials to perform operations in OSS is a high-risk operation. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
auth = oss2.Auth('yourAccessKeyId', 'yourAccessKeySecret')
# In this example, the endpoint of the China (Hangzhou) region is used. Specify your actual endpoint.
# Specify the name of the bucket. Example: examplebucket.
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'examplebucket')
# Specify the full path of the object. Do not include the bucket name in the full path. Example: exampledir/exampleobject.txt.
key = 'exampledir/exampleobject.txt'
# Obtain the upload ID returned by the init_multipart_upload operation.
upload_id = 'yourUploadId'
# Cancel the multipart upload task that uses the specified upload ID. The uploaded parts are deleted.
bucket.abort_multipart_upload(key, upload_id)
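Because a canceled task also deletes its uploaded parts, one common pattern is to cancel the task automatically when a part fails to upload, so that no orphaned parts are left in the bucket. The following is a minimal sketch under stated assumptions: upload_parts_or_abort is a hypothetical helper, not part of oss2, and the bucket argument is assumed to behave like oss2.Bucket (only init_multipart_upload, upload_part, and abort_multipart_upload are called).

```python
def upload_parts_or_abort(bucket, key, part_chunks):
    """Upload each chunk as parts 1..N and cancel the task if any call fails.

    bucket      -- an object that behaves like oss2.Bucket (assumption)
    key         -- the full path of the object
    part_chunks -- an iterable of bytes objects, one per part

    Returns (upload_id, [(part_number, etag), ...]) on success.
    """
    upload_id = bucket.init_multipart_upload(key).upload_id
    uploaded = []
    try:
        for part_number, chunk in enumerate(part_chunks, start=1):
            result = bucket.upload_part(key, upload_id, part_number, chunk)
            uploaded.append((part_number, result.etag))
    except Exception:
        # Cancel the task so that the parts uploaded so far are deleted,
        # then re-raise the original error for the caller to handle.
        bucket.abort_multipart_upload(key, upload_id)
        raise
    return upload_id, uploaded
```

On success, the returned pairs can be converted to PartInfo objects and passed to bucket.complete_multipart_upload. If you intend to resume the upload later instead, do not cancel the task, because the upload ID cannot be reused after cancellation.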
List the uploaded parts
The following code provides an example on how to list the uploaded parts:
# -*- coding: utf-8 -*-
import os
import oss2
# The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. Using these credentials to perform operations in OSS is a high-risk operation. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
auth = oss2.Auth('yourAccessKeyId', 'yourAccessKeySecret')
# In this example, the endpoint of the China (Hangzhou) region is used. Specify your actual endpoint.
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'yourBucketName')
# Specify the full path of the object. Do not include the bucket name in the full path. Example: exampledir/exampleobject.txt.
key = 'exampledir/exampleobject.txt'
# Obtain the upload ID returned by the init_multipart_upload operation.
upload_id = 'yourUploadId'
# List the uploaded parts that use the specified upload ID.
for part_info in oss2.PartIterator(bucket, key, upload_id):
    print('part_number:', part_info.part_number)
    print('etag:', part_info.etag)
    print('size:', part_info.size)
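The listed parts can be used to decide which parts still need to be uploaded when you resume an interrupted task. The following is a rough sketch rather than the SDK's own resumable upload logic: missing_part_numbers is a hypothetical helper, and it assumes only that each listed item exposes the part_number and size attributes printed in the preceding example.

```python
def missing_part_numbers(expected_sizes, listed_parts):
    """Return the part numbers that still need to be uploaded, in ascending order.

    expected_sizes -- dict that maps part_number to the expected size in bytes
    listed_parts   -- iterable of objects with part_number and size attributes,
                      such as the items yielded by oss2.PartIterator
    """
    uploaded = {part.part_number: part.size for part in listed_parts}
    # A part is treated as missing if it was never listed or if its size
    # differs from the size planned for that part number.
    return [number for number in sorted(expected_sizes)
            if uploaded.get(number) != expected_sizes[number]]
```

For example, missing_part_numbers({1: 102400, 2: 102400, 3: 51200}, oss2.PartIterator(bucket, key, upload_id)) returns the part numbers to upload again with bucket.upload_part. Comparing sizes is only a coarse check. Comparing the ETag of each part with a locally computed MD5 hash would be stricter.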
List multipart upload tasks
- List the multipart upload tasks of a specified object
The following code provides an example on how to list the multipart upload tasks of a specified object:
# -*- coding: utf-8 -*-
import os
import oss2
# The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. Using these credentials to perform operations in OSS is a high-risk operation. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
auth = oss2.Auth('yourAccessKeyId', 'yourAccessKeySecret')
# In this example, the endpoint of the China (Hangzhou) region is used. Specify your actual endpoint.
# Specify the name of the bucket. Example: examplebucket.
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'examplebucket')
# Specify the full path of the object. Do not include the bucket name in the full path. Example: exampledir/exampleobject.txt.
key = 'exampledir/exampleobject.txt'
# List all multipart upload tasks of the object. Each time the init_multipart_upload operation is called for the same object, different upload IDs are returned.
# An upload ID uniquely identifies a multipart upload task.
for upload_info in oss2.ObjectUploadIterator(bucket, key):
    print('key:', upload_info.key)
    print('upload_id:', upload_info.upload_id)
- List all multipart upload tasks in a bucket
The following code provides an example on how to list all multipart upload tasks in a bucket:
# -*- coding: utf-8 -*-
import os
import oss2
# The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. Using these credentials to perform operations in OSS is a high-risk operation. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
auth = oss2.Auth('yourAccessKeyId', 'yourAccessKeySecret')
# In this example, the endpoint of the China (Hangzhou) region is used. Specify your actual endpoint.
# Specify the name of the bucket. Example: examplebucket.
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'examplebucket')
# List all multipart upload tasks in the bucket.
for upload_info in oss2.MultipartUploadIterator(bucket):
    print('key:', upload_info.key)
    print('upload_id:', upload_info.upload_id)
- List the multipart upload tasks of objects whose names contain a specified prefix in a bucket
The following code provides an example on how to list the multipart upload tasks of objects whose names contain a specified prefix in a bucket:
# -*- coding: utf-8 -*-
import os
import oss2
# The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. Using these credentials to perform operations in OSS is a high-risk operation. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console.
auth = oss2.Auth('yourAccessKeyId', 'yourAccessKeySecret')
# In this example, the endpoint of the China (Hangzhou) region is used. Specify your actual endpoint.
# Specify the name of the bucket. Example: examplebucket.
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'examplebucket')
# List the multipart upload tasks of objects whose names contain the test prefix in the bucket.
for upload_info in oss2.MultipartUploadIterator(bucket, prefix='test'):
    print('key:', upload_info.key)
    print('upload_id:', upload_info.upload_id)
FAQ
How do I delete parts?
You can use one of the following methods to delete parts:
- Automatic deletion
You can configure lifecycle rules to automatically delete parts at a scheduled time. For more information, see Configure lifecycle rules to delete expired parts.
- Manual deletion
You can call the AbortMultipartUpload operation to cancel a multipart upload task and delete the parts. For more information, see AbortMultipartUpload.
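Manual deletion can be combined with the listing iterators shown earlier to clean up tasks that were never completed. The following is a minimal sketch, not an official cleanup tool: abort_stale_uploads is a hypothetical helper, the bucket argument is assumed to behave like oss2.Bucket, and each listed item is assumed to expose key, upload_id, and an initiation_time attribute that holds a Unix timestamp in seconds. Verify the initiation_time attribute against your SDK version before you rely on it.

```python
import time


def abort_stale_uploads(bucket, uploads, max_age_seconds, now=None):
    """Cancel multipart upload tasks that were initiated too long ago.

    bucket          -- an object that behaves like oss2.Bucket (assumption)
    uploads         -- iterable of objects with key, upload_id, and
                       initiation_time (Unix seconds, assumption), such as
                       the items yielded by oss2.MultipartUploadIterator
    max_age_seconds -- tasks at least this old are canceled
    now             -- current Unix time; defaults to time.time()

    Returns the list of (key, upload_id) pairs that were canceled.
    """
    now = time.time() if now is None else now
    aborted = []
    for info in uploads:
        # Defensive check: skip entries that do not carry an upload ID.
        if info.upload_id is None:
            continue
        if now - info.initiation_time < max_age_seconds:
            continue
        bucket.abort_multipart_upload(info.key, info.upload_id)
        aborted.append((info.key, info.upload_id))
    return aborted
```

For example, abort_stale_uploads(bucket, oss2.MultipartUploadIterator(bucket, prefix='test'), 7 * 24 * 3600) cancels tasks under the test prefix that are at least seven days old. Canceling a task deletes its parts, so make sure that no client still uploads to the affected upload IDs.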
References
- The following API operations are required to perform multipart upload:
- The API operation that you can call to initiate a multipart upload task. For more information, see InitiateMultipartUpload.
- The API operation that you can call to upload data by part. For more information, see UploadPart.
- The API operation that you can call to complete a multipart upload task. For more information, see CompleteMultipartUpload.
- For more information about the API operation that you can call to cancel the multipart upload task, see AbortMultipartUpload.
- For more information about the API operation that you can call to list the uploaded parts, see ListParts.
- For more information about the API operation that you can call to list ongoing multipart upload tasks, see ListMultipartUploads. Ongoing multipart upload tasks are tasks that have been initiated but are not completed or canceled.