How to use the SDK for resumable downloads - Object Storage Service

Downloading a large object can fail when the network is unstable or another exception occurs, sometimes even after multiple attempts. To resolve this issue, Object Storage Service (OSS) provides the resumable download feature. In a resumable download, the object is split into multiple parts and each part is downloaded separately. After all parts are downloaded, they are combined into a complete local file.

Usage notes

In this topic, the public endpoint of the China (Hangzhou) region is used. If you want to access OSS from other Alibaba Cloud services in the same region as OSS, use an internal endpoint. For more information about OSS regions and endpoints, see Regions and endpoints.
In this topic, access credentials are obtained from environment variables. For more information about how to configure access credentials, see Configure access credentials.
In this topic, an OSSClient instance is created by using an OSS endpoint. If you want to create an OSSClient instance by using custom domain names or Security Token Service (STS), see Initialization.
To use resumable download, you must have the oss:GetObject permission. For more information, see Attach a custom policy to a RAM user.

How it works

The resumable download process is as follows:

A temporary file is created on the local disk. The name of the temporary file consists of the original file name and a random suffix.
The OSS file is read in ranges by specifying the Range header in the HTTP request. The data is then written to the corresponding position in the temporary file.
After the download is complete, the temporary file is renamed to the name of the destination file. If a file with that name already exists, it is overwritten.
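
The three steps above can be sketched with the standard library alone. This is a minimal illustration, not the oss2 implementation: download_in_ranges and read_range are hypothetical names, read_range(start, end) stands in for an HTTP request with a Range: bytes=start-end header, and the temporary file naming is an assumption.

```python
import os
import uuid


def download_in_ranges(read_range, total_size, dest_path, part_size=100 * 1024):
    """Fetch total_size bytes range by range and write them to dest_path.

    read_range(start, end) is a hypothetical callable that returns the bytes
    from start to end (inclusive), like a request with "Range: bytes=start-end".
    """
    # Step 1: create a temporary file named after the destination plus a random suffix.
    tmp_path = dest_path + ".tmp-" + uuid.uuid4().hex
    with open(tmp_path, "wb") as f:
        f.truncate(total_size)
        # Step 2: read each range and write it at the matching offset.
        for start in range(0, total_size, part_size):
            end = min(start + part_size, total_size) - 1
            f.seek(start)
            f.write(read_range(start, end))
    # Step 3: rename the temporary file; os.replace overwrites an existing file.
    os.replace(tmp_path, dest_path)
    return dest_path
```

Because every part is written at its own offset, the parts could be fetched in any order or by several threads. The sketch fetches them sequentially to stay short.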

Warning

Do not call the oss2.resumable_download method from multiple programs or threads at the same time to download the same source object to the same destination file. Concurrent calls can overwrite the breakpoint information on the local disk, and the temporary file names may conflict.
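
One way to follow this advice inside a single process is to serialize downloads per destination path. The sketch below is an assumption-laden illustration: download_exclusively is a hypothetical helper, not part of oss2, and it only protects threads of one program. Separate programs would need another mechanism, such as a file lock.

```python
import os
import threading

_locks = {}
_locks_guard = threading.Lock()


def download_exclusively(dest_path, download):
    """Run download() while holding a lock keyed by the destination path.

    download is any zero-argument callable, for example a lambda that
    wraps a call to oss2.resumable_download for dest_path.
    """
    key = os.path.abspath(dest_path)
    # Look up or create the lock for this destination atomically.
    with _locks_guard:
        lock = _locks.setdefault(key, threading.Lock())
    # Only one thread at a time may download to this destination.
    with lock:
        return download()
```

A call could then look like download_exclusively(path, lambda: oss2.resumable_download(bucket, key, path)), where bucket, key, and path are your own values.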

Sample code

The following code shows how to perform a resumable download:

# -*- coding: utf-8 -*-
import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider
# Obtain access credentials from environment variables. Before you run this sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are set.
auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())

# Specify the endpoint of the region where the bucket is located. For example, if the bucket is in the China (Hangzhou) region, set the endpoint to https://oss-cn-hangzhou.aliyuncs.com.
endpoint = "https://oss-cn-hangzhou.aliyuncs.com"

# Specify the region where the endpoint is located, for example, cn-hangzhou. Note that this parameter is required for V4 signatures.
region = "cn-hangzhou"

# Set yourBucketName to the name of the bucket.
bucket = oss2.Bucket(auth, endpoint, "yourBucketName", region=region)

# Specify the full path of the object. The full path cannot contain the bucket name. Example: exampledir/exampleobject.txt.
# Specify the full path of the local file. Example: D:\\localpath\\examplefile.txt.
oss2.resumable_download(bucket, 'exampledir/exampleobject.txt', 'D:\\localpath\\examplefile.txt')
# If you do not use the store parameter to specify a folder, a .py-oss-download folder is created in the HOME directory to save breakpoint information.

# Python SDK 2.1.0 and later support optional parameters for resumable downloads.
# import sys
# # If the length of the data to be downloaded cannot be determined, the value of total_bytes is None.
# def percentage(consumed_bytes, total_bytes):
#     if total_bytes:
#         rate = int(100 * (float(consumed_bytes) / float(total_bytes)))
#         print('\r{0}% '.format(rate), end='')
#         sys.stdout.flush()
# # If you use the store parameter to specify a folder, breakpoint information is saved in the specified folder.
# oss2.resumable_download(bucket, 'exampledir/exampleobject.txt', 'D:\\localpath\\examplefile.txt',
#                       store=oss2.ResumableDownloadStore(root='/tmp'),
#                       # Use resumable download if the object size is greater than or equal to the optional multiget_threshold parameter. If omitted, oss2.defaults.multiget_threshold is used.
#                       multiget_threshold=100*1024,
#                       # Set the part size in bytes. The value must be in the range of 100 KB to 5 GB. The default value is 100 KB.
#                       part_size=100*1024,
#                       # Set the download progress callback function.
#                       progress_callback=percentage,
#                       # If you use num_threads to set the number of concurrent download threads, set oss2.defaults.connection_pool_size to a value greater than or equal to the number of concurrent download threads. The default number of concurrent download threads is 1.
#                       num_threads=4)
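
The percentage callback in the sample prints on every invocation, which can be noisy for large objects. The following sketch shows one possible variant that reports only when the whole-number percentage changes. make_progress_callback and its report parameter are hypothetical names introduced for illustration. The sketch assumes the callback receives (consumed_bytes, total_bytes) as in the sample above and adds no locking of its own.

```python
def make_progress_callback(report=print):
    """Return a callback that reports progress only when the percentage changes."""
    last = {"percent": None}

    def callback(consumed_bytes, total_bytes):
        # total_bytes is None when the length of the data cannot be determined.
        if not total_bytes:
            return
        percent = int(100 * consumed_bytes / total_bytes)
        if percent != last["percent"]:
            last["percent"] = percent
            report("{0}%".format(percent))

    return callback
```

The returned function can be passed as progress_callback=make_progress_callback(). Passing a different report callable, such as a logger method, redirects the output.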

References

For the complete sample code for resumable downloads, see GitHub examples.
For more information about the API operation for resumable downloads, see GetObject.