How to use OSS Python SDK V2 to upload files - Object Storage Service

This topic describes how to use the new Uploader module in Python SDK V2 to upload files.

Notes

The sample code in this topic uses the public endpoint for the China (Hangzhou) region. The region ID is cn-hangzhou. If you want to access OSS from other Alibaba Cloud services in the same region, use an internal endpoint. For more information about the regions and endpoints that OSS supports, see OSS regions and endpoints.
To perform an upload, you must have the oss:PutObject permission. For more information, see Grant custom permissions to a RAM user.
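
To illustrate the difference between the two endpoint types, the following sketch builds the public and internal endpoint host names from a region ID. The helper name oss_endpoint is hypothetical and is not part of the SDK. The host name pattern follows the usual OSS endpoint format, so confirm the exact value for your region in OSS regions and endpoints.

```python
def oss_endpoint(region_id: str, internal: bool = False) -> str:
    """Return the OSS endpoint host name for a region (hypothetical helper)."""
    # Internal endpoints carry the "-internal" suffix after the region ID.
    suffix = "-internal" if internal else ""
    return f"oss-{region_id}{suffix}.aliyuncs.com"


print(oss_endpoint("cn-hangzhou"))                 # public endpoint
print(oss_endpoint("cn-hangzhou", internal=True))  # internal endpoint
```

You can pass either value to the --endpoint argument of the samples in this topic.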

Method definition

Introduction to the upload manager

The new Uploader module in Python SDK V2 provides a unified upload method that abstracts the underlying implementation details to simplify file uploads.

The Uploader uses the multipart upload method to split a file or stream into multiple parts and then uploads the parts concurrently. This process improves upload performance.
The Uploader also provides a resumable upload feature. During the upload, the Uploader records the status of completed parts. If the upload is interrupted by an issue such as a network error or an unexpected program exit, you can resume the upload from the recorded breakpoints.
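
The following sketch illustrates the general idea of splitting data into parts and processing them concurrently. It is not the SDK's implementation: plan_parts, fake_upload_part, and upload_concurrently are hypothetical names, and the part upload is a stand-in that sends nothing over the network.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Part = Tuple[int, int, int]  # (part_number, offset, length)


def plan_parts(total_size: int, part_size: int) -> List[Part]:
    """Return the parts that cover total_size bytes, in order."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    parts = []
    offset = 0
    number = 1
    while offset < total_size:
        # The last part may be shorter than part_size.
        length = min(part_size, total_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


def fake_upload_part(part: Part) -> int:
    """Stand-in for a real part upload. Returns the part number."""
    number, _offset, _length = part
    return number


def upload_concurrently(total_size: int, part_size: int, parallel_num: int) -> List[int]:
    """Process all parts with at most parallel_num worker threads."""
    parts = plan_parts(total_size, part_size)
    with ThreadPoolExecutor(max_workers=parallel_num) as pool:
        # map() returns results in the order of the input parts.
        return list(pool.map(fake_upload_part, parts))


done = upload_concurrently(20 * 1024 * 1024, 6 * 1024 * 1024, 3)
print(done)
```

With a 20 MiB input and 6 MiB parts, the plan contains three full parts and one 2 MiB tail part.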

The following code shows the common methods of the Uploader.

class Uploader:
  ...

def uploader(self, **kwargs) -> Uploader:
  ...

def upload_file(self, request: models.PutObjectRequest, filepath: str, **kwargs: Any) -> UploadResult:
  ...
  
def upload_from(self, request: models.PutObjectRequest, reader: IO[bytes], **kwargs: Any) -> UploadResult:
  ...

Request parameters

Parameter	Type	Description
request	PutObjectRequest	The request parameters for uploading an object. These parameters are the same as those for the PutObject method. For more information, see PutObjectRequest
reader	IO[bytes]	The data stream to be uploaded
filepath	str	The path of the local file
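**kwargs	Any	(Optional) Additional keyword arguments, such as the configuration options described later in this topic. The arguments are collected into a dictionary.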

Response parameters

Type	Description
UploadResult	The response parameters for uploading an object. For more information, see UploadResult

When you use client.uploader to initialize an upload manager instance, you can specify configuration options to customize the upload behavior. You can also specify these options for each upload API call to customize the behavior for a specific object upload. For example, you can specify the part size.

Set the configuration parameters for the uploader

uploader = client.uploader(part_size=10 * 1024 * 1024)

Set the configuration parameters for each upload request

result = uploader.upload_file(oss.PutObjectRequest(
        bucket="example_bucket",
        key="example_key",
    ),
    filepath="/local/dir/example",
    part_size=10 * 1024 * 1024,
)
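
The following sketch shows one way to picture how the two levels of configuration combine. It assumes that an option passed to a single upload call takes precedence over the same option set on the uploader, which in turn takes precedence over the default value. The effective_options function and the DEFAULTS dictionary are hypothetical and only model this precedence. They are not SDK code.

```python
# Default values taken from the configuration options table in this topic.
DEFAULTS = {"part_size": 6 * 1024 * 1024, "parallel_num": 3}


def effective_options(uploader_options: dict, call_options: dict) -> dict:
    """Merge options in order of increasing precedence (hypothetical model)."""
    merged = dict(DEFAULTS)
    merged.update(uploader_options)  # set in client.uploader(...)
    merged.update(call_options)      # set in upload_file(...) or upload_from(...)
    return merged


options = effective_options(
    {"part_size": 10 * 1024 * 1024},
    {"parallel_num": 5},
)
print(options)
```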

The following table describes the common configuration options.

Parameter	Type	Description
part_size	int	Specifies the part size. The default value is 6 MiB.
parallel_num	int	Specifies the number of concurrent upload tasks. The default value is 3. This parameter limits the concurrency for a single call, not the global concurrency.
leave_parts_on_error	bool	Specifies whether to retain the uploaded parts when the upload fails. By default, the parts are not retained.
enable_checkpoint	bool	Specifies whether to enable resumable upload. By default, this feature is disabled. Note The enable_checkpoint parameter is valid only for the upload_file method. The upload_from method does not support this parameter.
checkpoint_dir	str	Specifies the path where the record file is saved, for example, /local/dir/. This parameter is valid only when enable_checkpoint is set to True.

For the complete method definitions of the file upload manager, see Uploader.

Sample code

You can use the following code to upload a local file to a bucket using the upload manager.

import argparse
import alibabacloud_oss_v2 as oss

# Create a command-line argument parser and describe the script's purpose: upload file sample
parser = argparse.ArgumentParser(description="upload file sample")

# Add the --region command-line argument, which specifies the region where the bucket is located. This is a required parameter.
parser.add_argument('--region', help='The region in which the bucket is located.', required=True)
# Add the --bucket command-line argument, which specifies the name of the bucket to which the file is uploaded. This is a required parameter.
parser.add_argument('--bucket', help='The name of the bucket.', required=True)
# Add the --endpoint command-line argument, which specifies the domain name that other services can use to access OSS. This is an optional parameter.
parser.add_argument('--endpoint', help='The domain names that other services can use to access OSS')
# Add the --key command-line argument, which specifies the key of the object (file) in OSS. This is a required parameter.
parser.add_argument('--key', help='The name of the object.', required=True)
# Add the --file_path command-line argument, which specifies the path of the local file to be uploaded. This is a required parameter, for example, "/Users/yourLocalPath/yourFileName".
parser.add_argument('--file_path', help='The path of the file to upload.', required=True)

def main():
    # Parse the command-line arguments to obtain the user-provided values.
    args = parser.parse_args()

    # Load the authentication information required to access OSS from environment variables for identity verification.
    credentials_provider = oss.credentials.EnvironmentVariableCredentialsProvider()

    # Create a configuration object using the default configurations of the SDK and set the credentials provider.
    cfg = oss.config.load_default()
    cfg.credentials_provider = credentials_provider
    
    # Set the region property of the configuration object based on the command-line arguments provided by the user.
    cfg.region = args.region

    # If a custom endpoint is provided, update the endpoint property in the configuration object.
    if args.endpoint is not None:
        cfg.endpoint = args.endpoint

    # Initialize the OSS client using the preceding configurations to prepare for interaction with OSS.
    client = oss.Client(cfg)

    # Create an object for uploading files.
    uploader = client.uploader()

    # Call the method to perform the file upload operation.
    result = uploader.upload_file(
        oss.PutObjectRequest(
            bucket=args.bucket,  # Specify the destination bucket.
            key=args.key,        # Specify the name of the file in OSS.
        ),
        filepath=args.file_path  # Specify the location of the local file.
    )

    # Print information about the upload result, including the status code, request ID, and Content-MD5.
    print(f'status code: {result.status_code},'
          f' request id: {result.request_id},'
          f' content md5: {result.headers.get("Content-MD5")},'
          f' etag: {result.etag},'
          f' hash crc64: {result.hash_crc64},'
          f' version id: {result.version_id},'
          f' server time: {result.headers.get("x-oss-server-time")},'
          )

# When this script is executed directly, call the main function to start the processing logic.
if __name__ == "__main__":
    main()  # The entry point of the script. The program flow starts from here.
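
The sample above loads credentials from environment variables. The following sketch checks whether the variables are set before you run the sample. It assumes that EnvironmentVariableCredentialsProvider reads OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET. The helper name missing_credential_vars is hypothetical.

```python
import os
from typing import List, Mapping

# Variable names assumed to be read by EnvironmentVariableCredentialsProvider.
REQUIRED_VARS = ("OSS_ACCESS_KEY_ID", "OSS_ACCESS_KEY_SECRET")


def missing_credential_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credential_vars()
if missing:
    print("Set these environment variables first: " + ", ".join(missing))
```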

Scenarios

Use the upload manager to enable resumable upload

You can use the following code to enable resumable upload.

import argparse
import alibabacloud_oss_v2 as oss

# Create a command-line argument parser and describe the script's purpose: upload file sample
parser = argparse.ArgumentParser(description="upload file sample")

# Add the --region command-line argument, which specifies the region where the bucket is located. This is a required parameter.
parser.add_argument('--region', help='The region in which the bucket is located.', required=True)
# Add the --bucket command-line argument, which specifies the name of the bucket to which the file is uploaded. This is a required parameter.
parser.add_argument('--bucket', help='The name of the bucket.', required=True)
# Add the --endpoint command-line argument, which specifies the domain name that other services can use to access OSS. This is an optional parameter.
parser.add_argument('--endpoint', help='The domain names that other services can use to access OSS')
# Add the --key command-line argument, which specifies the key of the object (file) in OSS. This is a required parameter.
parser.add_argument('--key', help='The name of the object.', required=True)
# Add the --file_path command-line argument, which specifies the path of the local file to be uploaded. This is a required parameter, for example, "/Users/yourLocalPath/yourFileName".
parser.add_argument('--file_path', help='The path of the file to upload.', required=True)

def main():
    # Parse the command-line arguments to obtain the user-provided values.
    args = parser.parse_args()

    # Load the authentication information required to access OSS from environment variables for identity verification.
    credentials_provider = oss.credentials.EnvironmentVariableCredentialsProvider()

    # Create a configuration object using the default configurations of the SDK and set the credentials provider.
    cfg = oss.config.load_default()
    cfg.credentials_provider = credentials_provider
    
    # Set the region property of the configuration object based on the command-line arguments provided by the user.
    cfg.region = args.region

    # If a custom endpoint is provided, update the endpoint property in the configuration object.
    if args.endpoint is not None:
        cfg.endpoint = args.endpoint

    # Initialize the OSS client using the preceding configurations to prepare for interaction with OSS.
    client = oss.Client(cfg)

    # Create an object for uploading files, enable resumable upload, and specify the path to save the breakpoint record file.
    uploader = client.uploader(enable_checkpoint=True, checkpoint_dir="/Users/yourLocalPath/checkpoint/")

    # Call the method to perform the file upload operation.
    result = uploader.upload_file(
        oss.PutObjectRequest(
            bucket=args.bucket,  # Specify the destination bucket.
            key=args.key,        # Specify the name of the file in OSS.
        ),
        filepath=args.file_path  # Specify the location of the local file.
    )

    # Print information about the upload result, including the status code, request ID, and Content-MD5.
    print(f'status code: {result.status_code},'
          f' request id: {result.request_id},'
          f' content md5: {result.headers.get("Content-MD5")},'
          f' etag: {result.etag},'
          f' hash crc64: {result.hash_crc64},'
          f' version id: {result.version_id},'
          f' server time: {result.headers.get("x-oss-server-time")},'
          )

# When this script is executed directly, call the main function to start the processing logic.
if __name__ == "__main__":
    main()  # The entry point of the script. The program flow starts from here.
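
To show the idea behind resumable upload, the following conceptual sketch records completed parts in a JSON file and skips them on the next run. It does not reproduce the SDK's checkpoint file format or logic. The function names, the file name, and the simulated failure are all assumptions made for illustration.

```python
import json
import os
import tempfile
from typing import List, Optional, Set


def load_done(checkpoint_path: str) -> Set[int]:
    """Read the part numbers recorded by an earlier run, if any."""
    if not os.path.exists(checkpoint_path):
        return set()
    with open(checkpoint_path, "r", encoding="utf-8") as f:
        return set(json.load(f)["done"])


def save_done(checkpoint_path: str, done: Set[int]) -> None:
    """Persist the completed part numbers."""
    with open(checkpoint_path, "w", encoding="utf-8") as f:
        json.dump({"done": sorted(done)}, f)


def run_upload(part_numbers: List[int], checkpoint_path: str,
               fail_at: Optional[int] = None) -> List[int]:
    """Process parts, skipping recorded ones. Return the parts sent in this run."""
    done = load_done(checkpoint_path)
    sent = []
    for number in part_numbers:
        if number in done:
            continue  # Already completed in an earlier run.
        if number == fail_at:
            raise ConnectionError(f"simulated failure at part {number}")
        sent.append(number)
        done.add(number)
        save_done(checkpoint_path, done)  # Record progress after each part.
    return sent


with tempfile.TemporaryDirectory() as checkpoint_dir:
    path = os.path.join(checkpoint_dir, "upload.cp.json")
    try:
        run_upload([1, 2, 3, 4], path, fail_at=3)  # Interrupted at part 3.
    except ConnectionError:
        pass
    resumed = run_upload([1, 2, 3, 4], path)  # Only the remaining parts are sent.
    print(resumed)
```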

Use the upload manager to upload a local file stream

You can use the following code to upload a local file stream using the upload manager.

import argparse
import alibabacloud_oss_v2 as oss

# Create a command-line argument parser and describe the script's purpose: upload from file sample
parser = argparse.ArgumentParser(description="upload from sample")

# Add the --region command-line argument, which specifies the region where the bucket is located. This is a required parameter.
parser.add_argument('--region', help='The region in which the bucket is located.', required=True)
# Add the --bucket command-line argument, which specifies the name of the bucket to which the file is uploaded. This is a required parameter.
parser.add_argument('--bucket', help='The name of the bucket.', required=True)
# Add the --endpoint command-line argument, which specifies the domain name that other services can use to access OSS. This is an optional parameter.
parser.add_argument('--endpoint', help='The domain names that other services can use to access OSS')
# Add the --key command-line argument, which specifies the key of the object (file) in OSS. This is a required parameter.
parser.add_argument('--key', help='The name of the object.', required=True)
# Add the --file_path command-line argument, which specifies the path of the local file to be uploaded. This is a required parameter, for example, "/Users/yourLocalPath/yourFileName".
parser.add_argument('--file_path', help='The path of the file to upload.', required=True)

def main():
    # Parse the command-line arguments to obtain the user-provided values.
    args = parser.parse_args()

    # Load the authentication information required to access OSS from environment variables for identity verification.
    credentials_provider = oss.credentials.EnvironmentVariableCredentialsProvider()

    # Create a configuration object using the default configurations of the SDK and set the credentials provider.
    cfg = oss.config.load_default()
    cfg.credentials_provider = credentials_provider
    
    # Set the region property of the configuration object based on the command-line arguments provided by the user.
    cfg.region = args.region

    # If a custom endpoint is provided, update the endpoint property in the configuration object.
    if args.endpoint is not None:
        cfg.endpoint = args.endpoint

    # Initialize the OSS client using the preceding configurations to prepare for interaction with OSS.
    client = oss.Client(cfg)

    # Create an object for uploading files.
    uploader = client.uploader()

    # Open the local file for reading in binary mode.
    with open(file=args.file_path, mode='rb') as f:
        # Call the method to perform the file upload operation.
        result = uploader.upload_from(
            oss.PutObjectRequest(
                bucket=args.bucket,  # Specify the destination bucket.
                key=args.key,        # Specify the name of the file in OSS.
            ),
            reader=f  # Pass in the file reader.
        )

        # Print information about the upload result, including the status code, request ID, and Content-MD5.
        print(f'status code: {result.status_code},'
              f' request id: {result.request_id},'
              f' content md5: {result.headers.get("Content-MD5")},'
              f' etag: {result.etag},'
              f' hash crc64: {result.hash_crc64},'
              f' version id: {result.version_id},'
              f' server time: {result.headers.get("x-oss-server-time")},'
              )

# When this script is executed directly, call the main function to start the processing logic.
if __name__ == "__main__":
    main()  # The entry point of the script. The program flow starts from here.
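
A stream does not always expose its total length, so parts must be cut while the stream is read. The following sketch shows this idea with an in-memory io.BytesIO object in place of a file. The read_exact and iter_parts functions are hypothetical helpers and are not part of the SDK.

```python
import io
from typing import IO, Iterator


def read_exact(reader: IO[bytes], size: int) -> bytes:
    """Read up to size bytes, retrying on short reads until the stream ends."""
    buffer = bytearray()
    while len(buffer) < size:
        chunk = reader.read(size - len(buffer))
        if not chunk:
            break  # End of stream.
        buffer.extend(chunk)
    return bytes(buffer)


def iter_parts(reader: IO[bytes], part_size: int) -> Iterator[bytes]:
    """Yield consecutive parts of at most part_size bytes from the stream."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    while True:
        part = read_exact(reader, part_size)
        if not part:
            break
        yield part


parts = list(iter_parts(io.BytesIO(b"abcdefghij"), 4))
print(parts)
```

Because upload_from accepts an IO[bytes] reader, an in-memory stream such as io.BytesIO should also work as the reader argument in place of an open file.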

Use the upload manager to set the part size and concurrency

You can use the following code to specify the part size and concurrency.

import argparse
import alibabacloud_oss_v2 as oss

# Create a command-line argument parser and describe the script's purpose: upload file sample
parser = argparse.ArgumentParser(description="upload file sample")

# Add the --region command-line argument, which specifies the region where the bucket is located. This is a required parameter.
parser.add_argument('--region', help='The region in which the bucket is located.', required=True)
# Add the --bucket command-line argument, which specifies the name of the bucket to which the file is uploaded. This is a required parameter.
parser.add_argument('--bucket', help='The name of the bucket.', required=True)
# Add the --endpoint command-line argument, which specifies the domain name that other services can use to access OSS. This is an optional parameter.
parser.add_argument('--endpoint', help='The domain names that other services can use to access OSS')
# Add the --key command-line argument, which specifies the key of the object (file) in OSS. This is a required parameter.
parser.add_argument('--key', help='The name of the object.', required=True)
# Add the --file_path command-line argument, which specifies the path of the local file to be uploaded. This is a required parameter, for example, "/Users/yourLocalPath/yourFileName".
parser.add_argument('--file_path', help='The path of the file to upload.', required=True)

def main():
    # Parse the command-line arguments to obtain the user-provided values.
    args = parser.parse_args()

    # Load the authentication information required to access OSS from environment variables for identity verification.
    credentials_provider = oss.credentials.EnvironmentVariableCredentialsProvider()

    # Create a configuration object using the default configurations of the SDK and set the credentials provider.
    cfg = oss.config.load_default()
    cfg.credentials_provider = credentials_provider
    
    # Set the region property of the configuration object based on the command-line arguments provided by the user.
    cfg.region = args.region

    # If a custom endpoint is provided, update the endpoint property in the configuration object.
    if args.endpoint is not None:
        cfg.endpoint = args.endpoint

    # Initialize the OSS client using the preceding configurations to prepare for interaction with OSS.
    client = oss.Client(cfg)

    # Create an object for uploading files and set the part size and concurrency.
    uploader = client.uploader(
        part_size=100 * 1024,  # Set the part size to 100 KB.
        parallel_num=5,        # Set the concurrency to 5.
        leave_parts_on_error=True  # Retain the uploaded parts in case of an error.
    )

    # Call the method to perform the file upload operation.
    result = uploader.upload_file(
        oss.PutObjectRequest(
            bucket=args.bucket,  # Specify the destination bucket.
            key=args.key,        # Specify the name of the file in OSS.
        ),
        filepath=args.file_path  # Specify the location of the local file.
    )

    # Print information about the upload result, including the status code, request ID, and Content-MD5.
    print(f'status code: {result.status_code},'
          f' request id: {result.request_id},'
          f' content md5: {result.headers.get("Content-MD5")},'
          f' etag: {result.etag},'
          f' hash crc64: {result.hash_crc64},'
          f' version id: {result.version_id},'
          f' server time: {result.headers.get("x-oss-server-time")},'
          )

# When this script is executed directly, call the main function to start the processing logic.
if __name__ == "__main__":
    main()  # The entry point of the script. The program flow starts from here.
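
When you choose a part size, keep the number of parts within the limit for multipart upload. The following sketch assumes the general OSS multipart upload limits of at most 10,000 parts per upload and a minimum part size of 100 KB. The suggest_part_size and part_count functions are hypothetical helpers. Check the multipart upload documentation for the limits that apply to you.

```python
MAX_PARTS = 10_000           # Assumed limit on the number of parts per upload.
MIN_PART_SIZE = 100 * 1024   # Assumed minimum part size in bytes.
DEFAULT_PART_SIZE = 6 * 1024 * 1024


def suggest_part_size(file_size: int, preferred: int = DEFAULT_PART_SIZE) -> int:
    """Return a part size that keeps the part count within MAX_PARTS."""
    # Smallest part size that fits the file into MAX_PARTS parts (ceiling division).
    needed = (file_size + MAX_PARTS - 1) // MAX_PARTS
    return max(preferred, needed, MIN_PART_SIZE)


def part_count(file_size: int, part_size: int) -> int:
    """Return the number of parts for the given sizes (ceiling division)."""
    return (file_size + part_size - 1) // part_size


size = 100 * 1024 ** 3  # 100 GiB
chosen = suggest_part_size(size)
print(chosen, part_count(size, chosen))
```

You can pass the result to client.uploader as part_size.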

Use the upload manager to configure an upload callback

If you want to notify an application server after a file is uploaded, use the following sample code.

import argparse
import base64
import alibabacloud_oss_v2 as oss

# Create a command-line argument parser and describe the script's purpose: upload file sample
parser = argparse.ArgumentParser(description="upload file sample")

# Add the --region command-line argument, which specifies the region where the bucket is located. This is a required parameter.
parser.add_argument('--region', help='The region in which the bucket is located.', required=True)
# Add the --bucket command-line argument, which specifies the name of the bucket to which the file is uploaded. This is a required parameter.
parser.add_argument('--bucket', help='The name of the bucket.', required=True)
# Add the --endpoint command-line argument, which specifies the domain name that other services can use to access OSS. This is an optional parameter.
parser.add_argument('--endpoint', help='The domain names that other services can use to access OSS')
# Add the --key command-line argument, which specifies the key of the object (file) in OSS. This is a required parameter.
parser.add_argument('--key', help='The name of the object.', required=True)
# Add the --file_path command-line argument, which specifies the path of the local file to be uploaded. This is a required parameter.
parser.add_argument('--file_path', help='The path of the file to upload.', required=True)

def main():
    # Parse the command-line arguments to obtain the user-provided values.
    args = parser.parse_args()

    # Load the authentication information required to access OSS from environment variables for identity verification.
    credentials_provider = oss.credentials.EnvironmentVariableCredentialsProvider()

    # Create a configuration object using the default configurations of the SDK and set the credentials provider.
    cfg = oss.config.load_default()
    cfg.credentials_provider = credentials_provider

    # Set the region property of the configuration object based on the command-line arguments provided by the user.
    cfg.region = args.region

    # If a custom endpoint is provided, update the endpoint property in the configuration object.
    if args.endpoint is not None:
        cfg.endpoint = args.endpoint

    # Initialize the OSS client using the preceding configurations to prepare for interaction with OSS.
    client = oss.Client(cfg)

    # Create an uploader object for uploading files.
    uploader = client.uploader()

    # Define the callback URL.
    call_back_url = "http://www.example.com/callback"
    # Construct the callback parameter (callback): specify the callback URL and the request body for the callback, and encode them in Base64.
    callback = base64.b64encode(('{"callbackUrl":"' + call_back_url + '","callbackBody":"bucket=${bucket}&object=${object}&my_var_1=${x:var1}&my_var_2=${x:var2}"}').encode()).decode()
    # Construct the custom variable (callback-var) and encode it in Base64.
    callback_var = base64.b64encode('{"x:var1":"value1","x:var2":"value2"}'.encode()).decode()

    # Call the method to perform the file upload operation.
    result = uploader.upload_file(
        oss.PutObjectRequest(
            bucket=args.bucket,  # Specify the destination bucket.
            key=args.key,        # Specify the name of the file in OSS.
            callback=callback,
            callback_var=callback_var,
        ),
        filepath=args.file_path,  # Specify the location of the local file.
    )

    # Print information about the upload result, including the status code, request ID, and Content-MD5.
    print(f'status code: {result.status_code},'
          f' request id: {result.request_id},'
          f' content md5: {result.headers.get("Content-MD5")},'
          f' etag: {result.etag},'
          f' hash crc64: {result.hash_crc64},'
          f' version id: {result.version_id},'
          f' server time: {result.headers.get("x-oss-server-time")},'
          )

# When this script is executed directly, call the main function to start the processing logic.
if __name__ == "__main__":
    main()  # The entry point of the script. The program flow starts from here.
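
As an alternative to concatenating strings by hand, the callback parameters can be built from dictionaries. The following sketch uses json.dumps and Base64 encoding from the standard library. The encode_callback function is a hypothetical helper. The resulting JSON contains extra spaces after separators compared with the sample above, which is still valid JSON.

```python
import base64
import json


def encode_callback(params: dict) -> str:
    """Serialize params to JSON and return the Base64-encoded text."""
    return base64.b64encode(json.dumps(params).encode("utf-8")).decode("ascii")


callback = encode_callback({
    "callbackUrl": "http://www.example.com/callback",
    "callbackBody": "bucket=${bucket}&object=${object}&my_var_1=${x:var1}&my_var_2=${x:var2}",
})
callback_var = encode_callback({"x:var1": "value1", "x:var2": "value2"})

# Decode again to confirm that the value round-trips.
decoded = json.loads(base64.b64decode(callback))
print(decoded["callbackUrl"])
```

You can pass callback and callback_var to PutObjectRequest in the same way as in the sample above.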

References

For more information about the upload manager, see Developer Guide.
For complete examples of the upload manager, see upload_file.py and upload_from.py.