Downloading a large object can fail if the network is unstable or another exception occurs, and in some cases the download still fails after multiple attempts. To resolve this issue, Object Storage Service (OSS) provides the resumable download feature. In resumable download, the object that you want to download is split into multiple parts and each part is downloaded separately. After all parts are downloaded, the parts are combined into a complete object.

Usage notes

  • In this topic, the public endpoint of the China (Hangzhou) region is used. If you want to access OSS by using other Alibaba Cloud services in the same region as OSS, use an internal endpoint. For more information about the regions and endpoints supported by OSS, see Regions and endpoints.
  • In this topic, an OSSClient instance is created by using an OSS endpoint. If you want to create an OSSClient instance by using custom domain names or STS, see Initialization.
  • To use resumable download, you must have the oss:GetObject permission. For more information, see Attach a custom policy to a RAM user.

Procedure

To use resumable download, perform the following steps:

  1. Create a temporary local file with a name that consists of the original object name and a random suffix.
  2. Specify the Range header in the HTTP request so that the object is read based on the range. Then, the read content is written to the corresponding position of the temporary local file.
  3. After the download is complete, rename the temporary file as the destination file. If the destination file already exists, the downloaded data overwrites the data in the existing file. Otherwise, a new file is created.
Warning Do not use multiple programs or threads to call the oss2.resumable_download method simultaneously to download the same object to the same destination file. Otherwise, one piece of checkpoint information may overwrite another on the local disk, or one temporary file name may conflict with another.
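
The three steps above can be sketched without any network access. The following is a minimal illustration, not the oss2 implementation: split_ranges, download_in_parts, and fetch_range are hypothetical names, and fetch_range stands in for an HTTP GET request that carries a Range header. The sketch also does not record checkpoint information, so it restarts from the first part if it is interrupted.

```python
import os
import uuid


def split_ranges(total_size, part_size):
    """Return inclusive (start, end) byte ranges that cover total_size bytes."""
    return [(start, min(start + part_size, total_size) - 1)
            for start in range(0, total_size, part_size)]


def download_in_parts(fetch_range, total_size, dest_path, part_size):
    """Write an object to dest_path by reading it one byte range at a time.

    fetch_range(start, end) is a hypothetical callable that returns the bytes
    in the inclusive range [start, end], like a ranged GET request would.
    """
    # Step 1: create a temporary file named after the destination plus a random suffix.
    tmp_path = '{0}.{1}'.format(dest_path, uuid.uuid4().hex[:8])
    with open(tmp_path, 'wb') as tmp_file:
        # Step 2: read each range and write it at the matching offset.
        for start, end in split_ranges(total_size, part_size):
            tmp_file.seek(start)
            tmp_file.write(fetch_range(start, end))
    # Step 3: rename the temporary file, overwriting any existing destination file.
    os.replace(tmp_path, dest_path)
```

For example, an in-memory bytes value named data can stand in for the remote object by passing lambda start, end: data[start:end + 1] as fetch_range.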

Sample code

The following code provides an example on how to perform resumable download:

# -*- coding: utf-8 -*-
import oss2
# The AccessKey pair of an Alibaba Cloud account has permissions on all API operations. Using these credentials to perform operations in OSS is a high-risk operation. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console. 
auth = oss2.Auth('yourAccessKeyId', 'yourAccessKeySecret')
# Specify the endpoint of the region in which the bucket is located. For example, if the bucket is located in the China (Hangzhou) region, set the endpoint to https://oss-cn-hangzhou.aliyuncs.com. 
# Specify the name of the bucket. Example: examplebucket. 
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'examplebucket')

# Set yourObjectName to the full path of the object. The full path cannot contain the bucket name. Example: exampledir/exampleobject.txt. 
# Set yourLocalFile to the full path of the local file. Example: D:\\localpath\\examplefile.txt. 
oss2.resumable_download(bucket, 'exampledir/exampleobject.txt', 'D:\\localpath\\examplefile.txt')
# If you do not specify a directory by using the store parameter, the .py-oss-download directory is created in the HOME directory to store the checkpoint information. 

# Optional. You can configure the following parameters in OSS SDK for Python version 2.1.0 and later. 
# import sys
# # If you cannot determine the length of data that you want to download, the value of total_bytes is None. 
# def percentage(consumed_bytes, total_bytes):
#     if total_bytes:
#         rate = int(100 * (float(consumed_bytes) / float(total_bytes)))
#         print('\r{0}% '.format(rate), end='')
#         sys.stdout.flush()
# # If you use the store parameter to specify a directory, the checkpoint information is stored in the specified directory. If you use the num_threads parameter to specify the number of concurrent download threads, make sure that the value of oss2.defaults.connection_pool_size is greater than or equal to the number of concurrent download threads. By default, the number of concurrent threads is 1. 
# oss2.resumable_download(bucket,  'exampledir/exampleobject.txt', 'D:\\localpath\\examplefile.txt',
#                       store=oss2.ResumableDownloadStore(root='/tmp'),
#                       # Specify that resumable download is used when the length of the object is greater than or equal to the value of the multiget_threshold parameter. The multiget_threshold parameter is optional and its default value is 10 MB. 
#                       multiget_threshold=100*1024,
#                       # Specify the size of each part. Unit: bytes. Valid values: 100 KB to 5 GB. Default value: 100 KB. 
#                       part_size=100*1024,
#                       # Configure the callback function that you want to use to indicate the progress of the resumable download task. 
#                       progress_callback=percentage,
#                       # If you use num_threads to set the number of concurrent download threads, set oss2.defaults.connection_pool_size to a value that is greater than or equal to the number of concurrent download threads. By default, the number of concurrent threads is 1. 
#                       num_threads=4)
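
The percentage callback in the commented-out sample prints nothing when total_bytes is None. The following sketch shows one possible variant that reports the raw byte count in that case. The format_progress helper is a hypothetical name introduced here for illustration and is not part of oss2. The callback keeps the same (consumed_bytes, total_bytes) signature as the sample above.

```python
import sys


def format_progress(consumed_bytes, total_bytes):
    """Return a short progress string for a download in flight."""
    if not total_bytes:
        # The total length is unknown (None), so report the bytes received so far.
        return '{0} bytes'.format(consumed_bytes)
    rate = int(100 * (float(consumed_bytes) / float(total_bytes)))
    return '{0}%'.format(rate)


def percentage(consumed_bytes, total_bytes):
    """Progress callback that rewrites a single console line on each call."""
    sys.stdout.write('\r' + format_progress(consumed_bytes, total_bytes) + ' ')
    sys.stdout.flush()
```

A callback written this way can be passed as progress_callback=percentage in the same manner as the sample above.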

References

  • For more information about the complete sample code that is used to perform resumable download, visit GitHub.
  • For more information about the API operation that you can call to perform resumable download, see GetObject.