Object Storage Service (OSS) SDK for Python uses MD5 validation and CRC-64 validation to ensure data integrity when you upload, download, and copy objects.
Usage notes
In this topic, the public endpoint of the China (Hangzhou) region is used. If you want to access OSS from other Alibaba Cloud services in the same region as OSS, use an internal endpoint. For more information about OSS regions and endpoints, see Regions and endpoints.
In this topic, access credentials are obtained from environment variables. For more information about how to configure access credentials, see Configure access credentials using OSS SDK for Python 1.0.
In this topic, an OSSClient instance is created by using an OSS endpoint. If you want to create an OSSClient instance by using custom domain names or Security Token Service (STS), see Initialization.
MD5 validation
If you configure Content-MD5 in an object upload request, OSS calculates the MD5 hash of the uploaded object. If the calculated MD5 hash is different from the MD5 hash configured in the upload request, InvalidDigest is returned. This allows OSS to ensure data integrity for object uploads. If InvalidDigest is returned, you need to upload the object again.
The following sample code shows how to configure MD5 validation in a put_object request:
# -*- coding: utf-8 -*-
import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider
# Obtain access credentials from environment variables. Before you run this sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are set.
auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())
# Specify the Endpoint for the region where the bucket is located. For example, if the bucket is in the China (Hangzhou) region, set the Endpoint to https://oss-cn-hangzhou.aliyuncs.com.
endpoint = "https://oss-cn-hangzhou.aliyuncs.com"
# Specify the region that corresponds to the Endpoint, for example, cn-hangzhou. Note that this parameter is required for v4 signatures.
region = "cn-hangzhou"
# Replace examplebucket with the name of your bucket.
bucket = oss2.Bucket(auth, endpoint, "examplebucket", region=region)
# Specify the full path of the object. The full path cannot include the bucket name. For example, exampledir/exampleobject.txt.
object_name = 'exampledir/exampleobject.txt'
# Specify the local path of the file to upload. The value of this variable is transferred to OSS as the content to upload. The file can be of any type, such as text, image, video, or audio.
with open('/Users/test/Desktop/demo.txt', 'rb') as file:
    content = file.read()
# Calculate the MD5 hash of the content to be uploaded.
content_md5 = oss2.utils.content_md5(content)
print('content_md5', content_md5)
# Include the 'Content-MD5' header in the upload request. The server verifies the MD5 hash of the uploaded content to ensure its integrity and correctness.
headers = dict()
headers['Content-MD5'] = content_md5
bucket.put_object(object_name, content, headers=headers)
MD5 validation is supported for put_object, append_object, post_object, and upload_part.
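The Content-MD5 value is the Base64 encoding of the raw 128-bit MD5 digest of the request body, not its hexadecimal string. The following is a minimal sketch, using only the Python standard library, of how such a value could be computed, including for a large file that should not be read into memory in one piece. The helper names md5_base64 and file_md5_base64 and the chunk size are illustrative assumptions and are not part of oss2.

```python
import base64
import hashlib


def md5_base64(data: bytes) -> str:
    """Return the Base64-encoded MD5 digest of data (the Content-MD5 format)."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def file_md5_base64(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hypothetical helper: compute the same value for a file, chunk by chunk."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return base64.b64encode(md5.digest()).decode("ascii")
```

The returned string can be assigned to headers['Content-MD5'] in the same way as the value produced by oss2.utils.content_md5 in the sample above.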
CRC-64 validation
When you use cyclic redundancy check (CRC) for data validation, note the following:
CRC-64 validation is supported for put_object, get_object, append_object, and upload_part. CRC validation is enabled by default for file uploads. If the CRC value calculated by the client does not match the CRC value returned by the server, an InconsistentError exception is raised.
Range downloads do not support CRC-64 validation.
CRC-64 validation consumes CPU resources and can affect upload and download speeds.
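To make the client-side calculation concrete, the following is a minimal pure-Python sketch of a table-driven CRC-64. It assumes the commonly used ECMA-182 parameterization (reflected polynomial, all-ones initial state, all-ones final XOR), which is an assumption about the variant rather than something stated in this topic, so verify it against a value returned by OSS before relying on it. The function name crc64 and its init_crc argument are illustrative and are not the SDK's implementation.

```python
# Reflected form of the ECMA-182 polynomial 0x42F0E1EBA9EA3693 (assumed variant).
_POLY = 0xC96C5795D7870F42
_MASK = 0xFFFFFFFFFFFFFFFF


def _build_table():
    """Precompute the CRC of every possible byte value."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ _POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _build_table()


def crc64(data: bytes, init_crc: int = 0) -> int:
    """Return the CRC-64 of data, continuing from a previous CRC if given."""
    crc = init_crc ^ _MASK  # undo the final XOR of the previous result
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ _MASK


# The CRC of a whole payload equals the CRC continued across its parts.
whole = crc64(b"123456789")
chained = crc64(b"56789", init_crc=crc64(b"1234"))
print(hex(whole), whole == chained)
```

Because every byte passes through this loop, the CPU cost grows linearly with the amount of data, which is the overhead the note above refers to.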
CRC-64 validation for downloads
The following code shows how to perform CRC-64 data integrity validation when you download a file:
# -*- coding: utf-8 -*-
import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider
# Obtain access credentials from environment variables. Before you run this sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are set.
auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())
# Specify the Endpoint for the region where the bucket is located. For example, if the bucket is in the China (Hangzhou) region, set the Endpoint to https://oss-cn-hangzhou.aliyuncs.com.
endpoint = "https://oss-cn-hangzhou.aliyuncs.com"
# Specify the region that corresponds to the Endpoint, for example, cn-hangzhou. Note that this parameter is required for v4 signatures.
region = "cn-hangzhou"
# Replace examplebucket with the name of your bucket.
bucket = oss2.Bucket(auth, endpoint, "examplebucket", region=region)
# Specify the full path of the object. The full path cannot include the bucket name.
object_name = 'yourObjectName'
# Check whether CRC validation is enabled by default.
print('bucket.enable-crc:', bucket.enable_crc)
# The return value of bucket.get_object is a file-like object and is also iterable.
object_stream = bucket.get_object(object_name)
print(object_stream.read())
# Because the get_object operation returns a stream, you must call read() before you can calculate the CRC checksum of the returned object data. Therefore, perform CRC validation after you call this operation.
if object_stream.client_crc != object_stream.server_crc:
    print("The CRC checksum between client and server is inconsistent!")
CRC-64 validation for append uploads
For append uploads, the SDK performs CRC-64 validation on the returned result when you specify the init_crc parameter.
# -*- coding: utf-8 -*-
import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider
# Obtain access credentials from environment variables. Before you run this sample code, make sure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are set.
auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())
# Specify the Endpoint for the region where the bucket is located. For example, if the bucket is in the China (Hangzhou) region, set the Endpoint to https://oss-cn-hangzhou.aliyuncs.com.
endpoint = "https://oss-cn-hangzhou.aliyuncs.com"
# Specify the region that corresponds to the Endpoint, for example, cn-hangzhou. Note that this parameter is required for v4 signatures.
region = "cn-hangzhou"
# Replace examplebucket with the name of your bucket.
bucket = oss2.Bucket(auth, endpoint, "examplebucket", region=region)
object_name = "yourAppendObjectName"
first_content = "yourFirstContent"
second_content = "yourSecondContent"
# First append upload.
# If init_crc is specified, the SDK performs CRC validation on the returned result by default.
result = bucket.append_object(object_name, 0, first_content, init_crc=0)
# Second append upload.
# Set init_crc to the CRC value of the uploaded data.
result = bucket.append_object(object_name, result.next_position, second_content, init_crc=result.crc)
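When more than two pieces are appended, the same chaining can be written as a loop. The following is a minimal sketch under stated assumptions: append_chunks is a hypothetical helper, not part of oss2, and it relies only on what the sample above shows, namely that append_object accepts a position and init_crc and returns a result that exposes next_position and crc. It also assumes that the object does not exist yet when position is 0.

```python
def append_chunks(bucket, object_name, chunks, position=0, init_crc=0):
    """Append each chunk in order, chaining position and CRC between calls.

    bucket is expected to behave like oss2.Bucket in the sample above.
    Returns the final (next_position, crc) pair.
    """
    crc = init_crc
    for chunk in chunks:
        # Passing init_crc lets the SDK validate the CRC of the returned result.
        result = bucket.append_object(object_name, position, chunk, init_crc=crc)
        position = result.next_position
        crc = result.crc
    return position, crc
```

A call such as append_chunks(bucket, "yourAppendObjectName", [first_content, second_content]) would perform the same two uploads as the sample. If any append fails validation, the exception propagates to the caller and the remaining chunks are not sent.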
References
For complete sample code for data validation, see the GitHub example.