Downloading a large object can fail if the network is unstable or other exceptions occur, and the download may keep failing even after multiple attempts. To resolve this issue, Object Storage Service (OSS) SDK for Python provides the resumable download feature. In resumable download, the object that you want to download is split into multiple parts and each part is downloaded separately. After all parts are downloaded, they are combined into a complete object.
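
To make the idea of splitting an object into parts concrete, the following minimal sketch computes the byte range of each part and formats it as an HTTP Range header value. The helper names split_into_parts and range_header are hypothetical and are not part of OSS SDK for Python. The sketch only illustrates the arithmetic and does not show how the SDK implements it.

```python
def split_into_parts(object_size, part_size):
    """Return inclusive (start, end) byte ranges that cover the object.

    Hypothetical helper for illustration only; not an SDK function.
    """
    parts = []
    for start in range(0, object_size, part_size):
        # The last part may be shorter than part_size.
        end = min(start + part_size, object_size) - 1
        parts.append((start, end))
    return parts


def range_header(start, end):
    """Format an inclusive byte range as an HTTP Range header value."""
    return 'bytes={0}-{1}'.format(start, end)


# A 10-byte object split into 4-byte parts yields three ranges.
parts = split_into_parts(10, 4)
headers = [range_header(start, end) for start, end in parts]
print(headers)
```

Each range can be requested independently, which is why a failed part can be retried without downloading the whole object again.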

Procedure

To use resumable download, perform the following steps:

  1. Create a temporary local file with a name that consists of the original object name and a random suffix.
  2. Specify the Range header in the HTTP request so that the object is read based on the range. Then, the read content is written to the corresponding position of the temporary local file.
  3. After the download is completed, rename the temporary file as the destination file. If the destination file already exists, the downloaded data overwrites the data in the existing file. Otherwise, a new file is created.
Warning: Do not use multiple programs or threads to call the method simultaneously to download the same object to the same destination file. Otherwise, one piece of checkpoint information may overwrite another on the local disk, or the temporary file names may conflict.
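
The three steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the SDK's implementation: download_in_parts and fetch_range are hypothetical names, and fetch_range stands in for an HTTP request with a Range header by slicing an in-memory byte string, so the example runs without network access. Checkpoint handling is omitted.

```python
import os
import tempfile
import uuid


def download_in_parts(fetch_range, object_size, dest_path, part_size):
    """Download by ranged reads into a temporary file, then rename it.

    fetch_range(start, end) is a hypothetical callable that returns the
    bytes in the inclusive range [start, end].
    """
    # Step 1: temporary file named after the destination plus a random suffix.
    tmp_path = '{0}.{1}.tmp'.format(dest_path, uuid.uuid4().hex[:8])
    with open(tmp_path, 'wb') as f:
        # Step 2: read each range and write it at the matching offset.
        for start in range(0, object_size, part_size):
            end = min(start + part_size, object_size) - 1
            f.seek(start)
            f.write(fetch_range(start, end))
    # Step 3: rename the temporary file; an existing destination is overwritten.
    os.replace(tmp_path, dest_path)


# Usage with an in-memory stand-in for the remote object.
remote = bytes(range(256)) * 10
dest = os.path.join(tempfile.mkdtemp(), 'examplefile.bin')
download_in_parts(lambda s, e: remote[s:e + 1], len(remote), dest, part_size=1000)
with open(dest, 'rb') as f:
    downloaded = f.read()
```

Because the temporary file name is derived from the destination path, two concurrent downloads to the same destination could interfere with each other, which is the situation the warning above describes.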

Sample code

The following code provides an example of how to perform resumable download:

# -*- coding: utf-8 -*-
import oss2
# Security risks may arise if you use the AccessKey pair of an Alibaba Cloud account to access OSS because the account has permissions on all API operations. We recommend that you use a RAM user to call API operations or perform routine O&M. To create a RAM user, log on to the RAM console. 
auth = oss2.Auth('yourAccessKeyId', 'yourAccessKeySecret')
# Set yourEndpoint to the endpoint of the region in which the bucket is located. For example, if the bucket is located in the China (Hangzhou) region, set the endpoint to https://oss-cn-hangzhou.aliyuncs.com. 
# Specify the name of the bucket. Example: examplebucket. 
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'examplebucket')

# Set yourObjectName to the full path of the object. The full path cannot contain the bucket name. Example: exampledir/exampleobject.txt. 
# Set yourLocalFile to the full path of the local file. Example: D:\\localpath\\examplefile.txt. 
oss2.resumable_download(bucket, 'exampledir/exampleobject.txt', 'D:\\localpath\\examplefile.txt')
# If you do not specify a directory by using the store parameter, the .py-oss-download directory is created in the HOME directory to store the checkpoint information. 

# The following optional parameters are supported by OSS SDK for Python that is later than version 2.1.0. 
# import sys
# # If the length of the data to download cannot be determined, the value of total_bytes is None. 
# def percentage(consumed_bytes, total_bytes):
#     if total_bytes:
#         rate = int(100 * (float(consumed_bytes) / float(total_bytes)))
#         print('\r{0}% '.format(rate), end='')
#         sys.stdout.flush()
# # If you use the store parameter to specify a directory, the checkpoint information is stored in the specified directory. If you use the num_threads parameter to specify the number of concurrent download threads, make sure that the value of oss2.defaults.connection_pool_size is greater than or equal to the number of concurrent download threads. The default number of concurrent threads is 1. 
# oss2.resumable_download(bucket, 'exampledir/exampleobject.txt', 'D:\\localpath\\examplefile.txt',
#                       store=oss2.ResumableDownloadStore(root='/tmp'),
#                       # Specify that resumable download is used when the object size is greater than or equal to the optional multiget_threshold parameter. If this parameter is omitted, the value of oss2.defaults.multiget_threshold is used. 
#                       multiget_threshold=100*1024,
#                       # Set the size of each part. Unit: bytes. Valid values: 100 KB to 5 GB. Default value: 100 KB. 
#                       part_size=100*1024,
#                       # Configure the download progress callback function. 
#                       progress_callback=percentage,
#                       # Set the number of concurrent download threads. Make sure that oss2.defaults.connection_pool_size is greater than or equal to this value. 
#                       num_threads=4)
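
As a rough illustration of how the optional parameters interact, the following sketch checks the thread and connection pool constraint and estimates the number of parts for a given object size. The describe_download function is a hypothetical helper, not an SDK function, and the value 10 passed as connection_pool_size is an illustrative assumption rather than a documented default.

```python
def describe_download(object_size, multiget_threshold, part_size,
                      num_threads, connection_pool_size):
    """Summarize how the documented parameters interact (hypothetical helper)."""
    # The connection pool must be able to serve every download thread.
    if num_threads > connection_pool_size:
        raise ValueError('connection_pool_size must be >= num_threads')
    # Objects smaller than the threshold are downloaded in a single request.
    if object_size < multiget_threshold:
        return {'resumable': False, 'parts': 1}
    parts = -(-object_size // part_size)  # ceiling division
    return {'resumable': True, 'parts': parts}


# Same threshold and part size as the commented-out sample above (100 KB each).
small = describe_download(50 * 1024, 100 * 1024, 100 * 1024, 4, 10)
large = describe_download(1024 * 1024, 100 * 1024, 100 * 1024, 4, 10)
print(small, large)
```

With these values, a 50 KB object falls below the threshold, while a 1 MB object is split into parts of at most 100 KB each.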

References

  • For more information about the complete sample code that is used to perform resumable download, visit GitHub.
  • For more information about the API operation that you can call to perform resumable download, see GetObject.