This topic describes how to use the Copier module of OSS SDK for Python V2 to copy objects, including large objects.
Notes
The sample code in this topic uses the region ID cn-hangzhou for the China (Hangzhou) region. A public endpoint is used by default. If you access OSS from other Alibaba Cloud services in the same region, use an internal endpoint. For more information about the mappings between OSS regions and endpoints, see Regions and endpoints.
To copy an object, you must have read permission on the source object and read and write permissions on the destination bucket.
Cross-region copy is not supported. For example, you cannot copy an object from a bucket in the China (Hangzhou) region to a bucket in the China (Qingdao) region.
When you copy an object, ensure that no retention policies are configured for the source and destination buckets. Otherwise, the error "The object you specified is immutable." is returned.
Method definition
Introduction to the copy manager
To copy an object from one bucket to another or to modify the properties of an object, you can use the copy operation or the multipart copy operation. Each operation is suitable for a different scenario:
The copy operation (CopyObject) is suitable only for copying an object smaller than 5 GiB.
The multipart copy operation (UploadPartCopy) supports copying an object that is larger than 5 GiB. However, this operation does not support the x-oss-metadata-directive or x-oss-tagging-directive parameters, so you must explicitly specify the metadata and tags to be set on the destination object.
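The size rule behind these two operations can be sketched in a few lines. This is only an illustration of the 5 GiB limit stated above, not the SDK's internal logic. The function name pick_copy_operation and the constant COPY_OBJECT_LIMIT are hypothetical, and treating an object of exactly 5 GiB as a multipart case is an assumption.

```python
# Illustrative only: these names are hypothetical and not part of the SDK.
GIB = 1024 ** 3
COPY_OBJECT_LIMIT = 5 * GIB  # CopyObject suits objects smaller than 5 GiB


def pick_copy_operation(object_size: int) -> str:
    """Return the name of the OSS operation that fits an object of this size."""
    if object_size < 0:
        raise ValueError("object_size must be non-negative")
    if object_size < COPY_OBJECT_LIMIT:
        return "CopyObject"
    # Assumption: an object of exactly 5 GiB is handled as a multipart copy.
    return "UploadPartCopy"


print(pick_copy_operation(100 * 1024 * 1024))  # a 100 MiB object
print(pick_copy_operation(6 * GIB))            # a 6 GiB object
```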
The copy manager Copier is a new feature in OSS SDK for Python V2 that provides a universal copy interface. This interface abstracts the underlying implementation details and automatically selects an appropriate copy operation based on the request parameters. The following code shows the common methods of the Copier:
class CopyError(exceptions.BaseError):
    ...

def copier(self, **kwargs) -> Copier:
    ...

def copy(self, request: models.CopyObjectRequest, **kwargs: Any) -> CopyResult:
    ...

Request parameters
Parameter | Type | Description |
request | CopyObjectRequest | The request parameters of the operation. For more information, see CopyObjectRequest. |
**kwargs | Any | (Optional) Keyword arguments that set configuration options for this call. |
The following table describes the common parameters of CopyObjectRequest.
Parameter | Type | Description |
bucket | str | The name of the destination bucket. |
key | str | The name of the destination object. |
source_bucket | str | The name of the source bucket. |
source_key | str | The name of the source object. |
forbid_overwrite | str | Specifies whether to overwrite a destination object that has the same name during the CopyObject operation. |
tagging | str | The tags of the object. You can specify multiple tags at a time. Example: TagA=A&TagB=B. |
tagging_directive | str | Specifies how to set tags for the destination object. Valid values: |
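The tagging value shown in the table (TagA=A&TagB=B) has the shape of a URL query string, so it can be built from a dictionary with the standard library. The following is a minimal sketch. The helper name build_tagging is hypothetical, and percent-encoding special characters in keys and values is an assumption about what the service expects.

```python
from urllib.parse import quote, urlencode


def build_tagging(tags: dict) -> str:
    """Encode tag key-value pairs as a query-style string such as TagA=A&TagB=B."""
    # quote_via=quote percent-encodes spaces as %20 instead of '+'.
    return urlencode(tags, quote_via=quote)


print(build_tagging({"TagA": "A", "TagB": "B"}))  # TagA=A&TagB=B
```

The resulting string can then be passed as the tagging parameter of CopyObjectRequest.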
You can customize the copy behavior of objects by specifying configuration options when you initialize a copy manager instance using client.copier. You can also specify configuration options for each copy call to customize the behavior for a specific object.
Set the configuration parameters for the Copier
copier = client.copier(
    part_size=100 * 1024 * 1024,
)
Set the configuration parameters for each copy request
result = copier.copy(
    oss.CopyObjectRequest(
        bucket="example_bucket",
        key="example_key",
        source_bucket="example_source_bucket",
        source_key="example_source_key",
    ),
    part_size=100 * 1024 * 1024,
)
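The relationship between the two levels of configuration can be pictured as a simple dictionary merge. The sketch below is not SDK code. The function effective_options is hypothetical, and it assumes that an option passed to a single copy call takes precedence over the same option set on the Copier instance.

```python
from typing import Any, Dict

MIB = 1024 * 1024


def effective_options(copier_options: Dict[str, Any],
                      call_options: Dict[str, Any]) -> Dict[str, Any]:
    """Merge options; values passed to a single copy call win (assumed precedence)."""
    merged = dict(copier_options)  # start from the instance-level options
    merged.update(call_options)    # per-call options override them
    return merged


copier_options = {"part_size": 100 * MIB, "parallel_num": 3}
call_options = {"part_size": 32 * MIB}
print(effective_options(copier_options, call_options))
```

Under this assumption, one call uses a 32 MiB part size while other calls through the same Copier keep the 100 MiB setting.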
The following table describes the common configuration options.
Parameter | Type | Description |
part_size | int | The part size. The default value is 64 MiB. |
parallel_num | int | The number of concurrent copy tasks. Default value: 3. This parameter specifies the concurrency limit for a single call, not the global concurrency limit. |
multipart_copy_threshold | int | The object size threshold for using multipart copy. The default value is 200 MiB. |
leave_parts_on_error | bool | Specifies whether to retain the copied parts if the copy fails. By default, the copied parts are not retained. |
disable_shallow_copy | bool | Specifies whether to disable shallow copy. By default, shallow copy is used. |
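To see how part_size and multipart_copy_threshold interact, the following sketch plans the byte ranges of a multipart copy using the default values from the table. It is an illustration under stated assumptions, not the SDK's implementation. The function plan_copy is hypothetical, and the handling of an object whose size equals the threshold is assumed.

```python
from typing import List, Tuple

MIB = 1024 * 1024


def plan_copy(object_size: int,
              part_size: int = 64 * MIB,
              multipart_copy_threshold: int = 200 * MIB) -> List[Tuple[int, int, int]]:
    """Return (part_number, first_byte, last_byte) per part, or [] for a single-request copy."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    # Assumption: objects below the threshold are copied in one request.
    if object_size < multipart_copy_threshold:
        return []
    parts = []
    start = 0
    number = 1
    while start < object_size:
        end = min(start + part_size, object_size) - 1  # inclusive last byte
        parts.append((number, start, end))
        start = end + 1
        number += 1
    return parts


print(plan_copy(100 * MIB))       # below the threshold: no parts
print(len(plan_copy(250 * MIB)))  # 250 MiB in 64 MiB parts: 4 parts
```

With a concurrency limit such as parallel_num, at most that many of the planned parts would be in flight at once for a single call.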
Sample code
The following sample code shows how to copy an object from a source bucket to a destination bucket.
import argparse
import alibabacloud_oss_v2 as oss
# Create a command-line argument parser.
parser = argparse.ArgumentParser(description="copier sample")
# Add the command-line parameter: region (required), which specifies the region where the bucket is located.
parser.add_argument('--region', help='The region in which the bucket is located.', required=True)
# Add the command-line parameter: bucket (required), which specifies the name of the destination bucket.
parser.add_argument('--bucket', help='The name of the bucket.', required=True)
# Add the command-line parameter: endpoint (optional), which specifies the endpoint for accessing OSS.
parser.add_argument('--endpoint', help='The domain names that other services can use to access OSS')
# Add the command-line parameter: key (required), which specifies the name of the destination object.
parser.add_argument('--key', help='The name of the object.', required=True)
# Add the command-line parameter: source_key (required), which specifies the name of the source object.
parser.add_argument('--source_key', help='The name of the source address for object.', required=True)
# Add the command-line parameter: source_bucket (required), which specifies the name of the source bucket.
parser.add_argument('--source_bucket', help='The name of the source address for bucket.', required=True)

def main():
    # Parse the command-line arguments.
    args = parser.parse_args()
    # Load credentials from environment variables.
    # Use EnvironmentVariableCredentialsProvider to read the AccessKey ID and AccessKey secret from environment variables.
    credentials_provider = oss.credentials.EnvironmentVariableCredentialsProvider()
    # Use the default configurations of the SDK.
    cfg = oss.config.load_default()
    cfg.credentials_provider = credentials_provider  # Set the credential provider.
    cfg.region = args.region  # Set the region where the bucket is located.
    if args.endpoint is not None:
        cfg.endpoint = args.endpoint  # If an endpoint is provided, set a custom endpoint.
    # Create an OSS client instance.
    client = oss.Client(cfg)
    # Create a Copier instance and perform the object copy operation.
    copier = client.copier()
    # Perform the object copy operation.
    result = copier.copy(
        oss.CopyObjectRequest(
            bucket=args.bucket,  # The name of the destination bucket.
            key=args.key,  # The name of the destination object.
            source_bucket=args.source_bucket,  # The name of the source bucket.
            source_key=args.source_key  # The name of the source object.
        )
    )
    # Print the copy result.
    # Use vars(result) to convert the result object to the dictionary format and print the result.
    print(vars(result))


if __name__ == "__main__":
    main()
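Because the sample reads the AccessKey pair from environment variables, a quick pre-flight check can make a missing variable easier to diagnose. The sketch below uses only the standard library. The helper missing_credentials is hypothetical, and the variable names OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET are an assumption that should be verified against the SDK's credentials documentation.

```python
import os
from typing import List, Mapping

# Assumed variable names; confirm them in the SDK's credentials documentation.
REQUIRED_VARS = ("OSS_ACCESS_KEY_ID", "OSS_ACCESS_KEY_SECRET")


def missing_credentials(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required credential variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Set these environment variables before running the sample:", ", ".join(missing))
else:
    print("Credential environment variables are set.")
```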
References
For more information about the copy manager, see Developer Guide.
For the complete sample code for the copy manager, see copier.py.