This topic describes how to use the Downloader module of Python SDK V2 to download files.
Usage notes
The sample code in this topic uses the China (Hangzhou) region, whose region ID is cn-hangzhou, as an example. By default, a public endpoint is used. If you access OSS from other Alibaba Cloud services in the same region, use an internal endpoint. For more information about the regions and endpoints that OSS supports, see Regions and endpoints.
To download a file, you must have the oss:GetObject permission. For more information, see Attach a custom policy to a RAM user.
Method definition
Downloader features
The Downloader module of Python SDK V2 provides a versatile download method that abstracts the underlying implementation details to offer a convenient file download feature.
The Downloader module uses range download to automatically split a file into smaller parts and download the parts in parallel, which improves download performance.
The Downloader module also provides the resumable download feature. During the download process, the status of completed parts is recorded. If the download is interrupted by issues such as network failures or unexpected program exits, you can resume the download from the last breakpoint.
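To make the splitting concrete, the following is a minimal sketch of how a range download can divide an object into byte ranges, and how a resumed download can skip the parts that were already recorded as complete. It is an illustration only, not the SDK's internal implementation: the names split_ranges, pending_ranges, and range_header are hypothetical, and the record format that the SDK actually uses may differ.

```python
from typing import List, Set, Tuple

MIB = 1024 * 1024


def split_ranges(total_size: int, part_size: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) byte ranges that together cover total_size bytes."""
    if total_size < 0 or part_size <= 0:
        raise ValueError("total_size must be >= 0 and part_size must be > 0")
    ranges = []
    start = 0
    while start < total_size:
        # The last part may be shorter than part_size.
        end = min(start + part_size, total_size) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def pending_ranges(ranges: List[Tuple[int, int]], completed: Set[int]) -> List[Tuple[int, int]]:
    """Return the ranges whose part index is not in the set of completed indexes."""
    return [r for index, r in enumerate(ranges) if index not in completed]


def range_header(start: int, end: int) -> str:
    """Format an inclusive byte range as an HTTP Range header value."""
    return f"bytes={start}-{end}"
```

With a 20 MiB object and a 6 MiB part size, split_ranges yields four parts, the last of which is 2 MiB. If parts 0 and 1 were completed before an interruption, pending_ranges leaves only the last two ranges to fetch.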
The following code shows the common methods of the Downloader module:
class Downloader:
    ...

# Method of the client: creates a Downloader instance.
def downloader(self, **kwargs) -> Downloader:
    ...

# Methods of the Downloader instance.
def download_file(self, request: models.GetObjectRequest, filepath: str, **kwargs: Any) -> DownloadResult:
    ...

def download_to(self, request: models.GetObjectRequest, writer: IO[bytes], **kwargs: Any) -> DownloadResult:
    ...
Request parameters
Parameter | Type | Description |
request | GetObjectRequest | The request parameters for downloading an object. The parameters are the same as those for the GetObject method. For more information, see GetObjectRequest. |
filepath | str | The path of the local file. |
writer | IO[bytes] | The writable binary stream that receives the downloaded data. |
**kwargs | Any | (Optional) Additional configuration options, passed as keyword arguments. Example: part_size. |
Response parameters
Type | Description |
DownloadResult | The response parameters for downloading an object. For more information, see DownloadResult. |
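As an illustration of download_to, the sketch below downloads an object into an in-memory buffer instead of a local file. It assumes that the downloader argument was created with client.downloader() and that the request argument is an oss.GetObjectRequest, as in the sample code in this topic. The helper name download_to_memory is hypothetical, and the size check assumes that the written attribute of the result holds the number of bytes written. The helper relies only on the download_to signature listed above.

```python
import io
from typing import Any


def download_to_memory(downloader: Any, request: Any, **options: Any) -> bytes:
    """Download an object into memory and return its content as bytes.

    `downloader` is expected to expose download_to(request, writer, **kwargs);
    `options` are forwarded unchanged (for example, part_size).
    """
    buffer = io.BytesIO()
    result = downloader.download_to(request, buffer, **options)
    data = buffer.getvalue()
    # Assumption: result.written reports the number of bytes written.
    if result.written != len(data):
        raise IOError(
            f"expected {result.written} bytes but the buffer holds {len(data)}"
        )
    return data
```

Buffering in memory suits small objects only. For large objects, pass a file object opened in binary write mode as the writer, or use download_file.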
When you use client.downloader to initialize a downloader instance, you can specify configuration options to customize the download behavior. You can also specify configuration options for each download call to customize the behavior for each object. For example, you can specify the part size as follows:
Set the configuration parameters of the downloader:
downloader = client.downloader(part_size=1024 * 1024)
Set the configuration parameters for each download request:
result = downloader.download_file(
    oss.GetObjectRequest(
        bucket="example_bucket",
        key="example_key",
    ),
    filepath="/local/dir/example",
    part_size=10 * 1024 * 1024,
)
The following table describes the common configuration options.
Parameter | Type | Description |
part_size | int | Specifies the part size. The default value is 6 MiB. |
parallel_num | int | Specifies the number of concurrent download tasks. The default value is 3. This parameter specifies the concurrency limit for a single call, not the global concurrency limit. |
enable_checkpoint | bool | Specifies whether to enable the resumable download feature. By default, this feature is disabled. |
checkpoint_dir | str | Specifies the path to save the record file. Example: /local/dir/. This parameter is valid only when enable_checkpoint is set to True. |
verify_data | bool | Specifies whether to verify the CRC-64 value of the downloaded data when the download is resumed. By default, the value is not verified. This parameter is valid only when enable_checkpoint is set to True. |
use_temp_file | bool | Specifies whether to use a temporary file when you download a file. By default, a temporary file is used. The file is first downloaded to the temporary file. After the download is successful, the temporary file is renamed to the object file. |
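The options in the table can be combined. The following sketch collects a set of resumable-download options into a dictionary that could be passed as keyword arguments when the downloader is created or on an individual download call. The helper name resumable_options and the chosen defaults are illustrative assumptions; only the option names come from the table.

```python
from typing import Any, Dict

MIB = 1024 * 1024


def resumable_options(
    checkpoint_dir: str,
    part_size: int = 6 * MIB,
    parallel_num: int = 3,
    verify_data: bool = True,
) -> Dict[str, Any]:
    """Build keyword arguments that enable resumable download."""
    if part_size <= 0:
        raise ValueError("part_size must be a positive number of bytes")
    if parallel_num <= 0:
        raise ValueError("parallel_num must be a positive integer")
    return {
        "part_size": part_size,
        "parallel_num": parallel_num,
        # checkpoint_dir and verify_data take effect only when
        # enable_checkpoint is True.
        "enable_checkpoint": True,
        "checkpoint_dir": checkpoint_dir,
        "verify_data": verify_data,
    }
```

For example, client.downloader(**resumable_options("/local/dir/", part_size=10 * MIB)) would create a downloader that uses 10 MiB parts and keeps its record file under /local/dir/.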
For more information about the method definition of the file download manager, see Downloader.
Sample code
You can use the following code to download a file from a bucket to a local device.
import argparse
import alibabacloud_oss_v2 as oss
# Create a command-line argument parser and describe the purpose of the script: download file sample.
parser = argparse.ArgumentParser(description="download file sample")
# Add the command-line argument --region, which indicates the region where the bucket is located. This argument is required.
parser.add_argument('--region', help='The region in which the bucket is located.', required=True)
# Add the command-line argument --bucket, which indicates the name of the bucket from which you want to download the file. This argument is required.
parser.add_argument('--bucket', help='The name of the bucket.', required=True)
# Add the command-line argument --endpoint, which indicates the domain name that other services can use to access OSS. This argument is optional.
parser.add_argument('--endpoint', help='The domain names that other services can use to access OSS')
# Add the command-line argument --key, which indicates the key of the object (file) in OSS. This argument is required.
parser.add_argument('--key', help='The name of the object.', required=True)
# Add the command-line argument --file_path, which indicates the local path to save the downloaded file. This argument is required. For example, "/Users/yourLocalPath/yourFileName".
parser.add_argument('--file_path', help='The path to save the downloaded file.', required=True)
def main():
    # Parse the command-line arguments to obtain the values entered by the user.
    args = parser.parse_args()
    # Load the authentication information required to access OSS from environment variables for identity verification.
    credentials_provider = oss.credentials.EnvironmentVariableCredentialsProvider()
    # Use the default configurations of the SDK to create a configuration object and set the authentication provider.
    cfg = oss.config.load_default()
    cfg.credentials_provider = credentials_provider
    # Set the region property of the configuration object based on the command-line arguments provided by the user.
    cfg.region = args.region
    # If a custom endpoint is provided, update the endpoint property of the configuration object.
    if args.endpoint is not None:
        cfg.endpoint = args.endpoint
    # Use the preceding configurations to initialize the OSS client to interact with OSS.
    client = oss.Client(cfg)
    # Create an object for downloading files.
    downloader = client.downloader()
    # Call the method to perform the file download operation.
    result = downloader.download_file(
        oss.GetObjectRequest(
            bucket=args.bucket,  # Specify the bucket that contains the object.
            key=args.key,  # Specify the name of the object in OSS.
        ),
        filepath=args.file_path,  # Specify the local path to save the downloaded file.
    )
    # Print information about the download result, including the number of bytes written.
    print(f'written: {result.written}')
# When this script is directly executed, call the main function to start the processing logic.
if __name__ == "__main__":
    main()  # The entry point of the script, where the program flow starts.
Common scenarios
References
For more information about the download manager, see Developer Guide.
For the complete sample code for the download manager, see download_file.py.