
Replication Time Control - New Feature of Alibaba Cloud OSS

This article introduces the Replication Time Control (RTC) feature in Alibaba Cloud's Object Storage Service (OSS).

Alibaba Cloud Object Storage Service (OSS) provides the Cross-Region Replication (CRR) feature. However, depending on the data size, replication may take several hours. For more information about CRR, refer to the official CRR documentation.

A new feature called Replication Time Control (RTC) has been released for CRR. When RTC is enabled, the SLA guarantees that 99.99% of CRR data is replicated within 10 minutes. RTC also provides various metrics for monitoring the status of CRR. For more details about RTC, refer to the official RTC documentation. This article verifies how effective the feature is.

1. Overall Configuration Diagram

To confirm the effectiveness of RTC, we compare the time required for CRR with RTC enabled and with RTC disabled. Because RTC is not currently available in the Japan (Tokyo) region, the verification uses the China (Hangzhou) and China (Shanghai) regions. To find out which regions support RTC, visit the RTC introduction page.

For this verification, we use two replication patterns: multiple small files and a single large file.

[Figure 1: Overall configuration diagram]

2. Preparation in OSS

2-1. Create a source bucket in the China (Hangzhou) region.

・Click Create Bucket to display the creation screen.
・Configure the necessary settings, such as the bucket name and region.
・Click OK to create the bucket.


2-2. As in Step 2-1, create two destination buckets A and B in the China (Shanghai) region.


2-3. Verify that the buckets are created.


2-4. Create a cross-region replication job between the source bucket and the destination bucket A with RTC disabled.

・On the details page of the source bucket, click Cross-Region Replication to go to the creation page.

・Select the destination bucket A to be the replication destination.

・Configure the replication job:

Objects to Replicate: All Files in Source Bucket
Replication Policy: Add/Delete/Change
Replicate Historical Data: Yes
Replicate Objects Encrypted based on KMS: No
Replication Time Control (RTC): Disabled

・Click OK.

・Click Enable on the pop-up window to perform the operation.


2-5. Wait until the status of the created replication job becomes Enabled.


2-6. Create a cross-region replication job between the source bucket and the destination bucket B with RTC enabled.

・On the details page of the source bucket, click Cross-Region Replication to go to the creation page.

・Select the destination bucket B to be the replication destination.

・Configure the replication job:

Objects to Replicate: All Files in Source Bucket
Replication Policy: Add/Delete/Change
Replicate Historical Data: Yes
Replicate Objects Encrypted based on KMS: No
Replication Time Control (RTC): Enabled

・Click OK.

・Click Enable on the pop-up window to perform the operation.


2-7. Wait until both the created replication job and the RTC feature show the status Enabled.


3. Verification with Small File Replication

3-1. To measure the time required for cross-region replication, prepare a Python script that repeatedly counts the files in the destination buckets A and B.

import datetime

import oss2

auth = oss2.Auth('yourAccessKeyId', 'yourAccessKeySecret')
# Both handles point at the destination buckets in China (Shanghai).
bucket_a = oss2.Bucket(auth, 'yourEndpoint', 'bucket name')  # destination bucket A (RTC disabled)
bucket_b = oss2.Bucket(auth, 'yourEndpoint', 'bucket name')  # destination bucket B (RTC enabled)
print('Test start on {0}'.format(datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')))

# Poll both buckets until the script is interrupted (Ctrl+C).
while True:
    a_counts = 0
    b_counts = 0
    # Count every object currently in destination bucket A.
    for obj in oss2.ObjectIterator(bucket_a, prefix=''):
        a_counts += 1
    print('Get {0} files in noRTC bucket on {1}'.format(a_counts, datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')))
    # Count every object currently in destination bucket B.
    for obj in oss2.ObjectIterator(bucket_b, prefix=''):
        b_counts += 1
    print('Get {0} files in RTC bucket on {1}'.format(b_counts, datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')))

Output example

Test start on 2023-05-08 19:30:56
Get 0 files in noRTC bucket on 2023-05-08 19:30:57
Get 0 files in RTC bucket on 2023-05-08 19:30:57
Get 0 files in noRTC bucket on 2023-05-08 19:30:57
Get 0 files in RTC bucket on 2023-05-08 19:30:57
……

3-2. Prepare a Python script to generate 2,000 small files and upload them to the source bucket.

import random

import oss2
from faker import Faker

def send_testing_data_file_to_oss(oss_bucket, file_counts, name_pattern):
    """Generate file_counts small CSV files of fake data and upload each one."""
    fake = Faker(locale='ja_jp')
    for i in range(file_counts):
        data = []
        # Each file holds between 1 and 30 lines of fake name, address, and text.
        counts = random.randint(1, 30)
        print('Generate {0} lines as data.'.format(counts))
        for c in range(counts):
            data.append('{0},"{1}","{2}"'.format(fake.name(), fake.address(), fake.text()))
        tmp_file_name = name_pattern.format(i)
        # Upload the generated content directly from memory.
        oss_bucket.put_object(tmp_file_name, "\r\n".join(data))
        print('Upload testing data file: {0} successfully!'.format(tmp_file_name))

auth = oss2.Auth('yourAccessKeyId', 'yourAccessKeySecret')
bucket = oss2.Bucket(auth, 'yourEndpoint', 'bucket name')  # source bucket in China (Hangzhou)
send_testing_data_file_to_oss(bucket, 2000, 'data_file_{0}_historical_update.csv')

3-3. While running the Python script from Step 3-1, simultaneously execute the Python script from Step 3-2 to upload 2,000 small files to the source bucket.

3-4. After the upload is complete, review the output log generated by the Python script.

Test start on 2023-05-08 20:00:06
Get 0 files in noRTC bucket on 2023-05-08 20:00:06
Get 0 files in RTC bucket on 2023-05-08 20:00:07
……
Get 1923 files in noRTC bucket on 2023-05-08 20:03:00
Get 1996 files in RTC bucket on 2023-05-08 20:03:10
Get 2000 files in noRTC bucket on 2023-05-08 20:03:18
Get 2000 files in RTC bucket on 2023-05-08 20:03:26

3-5. Calculate the replication time of the small files.

・RTC-disabled replication time (small files) = Time when the destination bucket A first obtains 2,000 files - Execution start time

・RTC-enabled replication time (small files) = Time when the destination bucket B first obtains 2,000 files - Execution start time

Verification pattern             | RTC-disabled (s) | RTC-enabled (s)
Replication of 2,000 small files | 192              | 200
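
The calculation in Step 3-5 can also be done programmatically. The following is a minimal sketch that assumes the log format shown in Step 3-4; the function name replication_seconds and the regular expression are illustrative and were not part of the original verification.

```python
import re
from datetime import datetime

LOG_LINE = re.compile(
    r"Get (\d+) files in (noRTC|RTC) bucket on (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
)
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def replication_seconds(log_text, start, target):
    """Return {bucket label: seconds from `start` until the count first reaches `target`}."""
    started = datetime.strptime(start, TIME_FORMAT)
    result = {}
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip the "Test start" line and elided lines
        count, label, stamp = int(match.group(1)), match.group(2), match.group(3)
        if count >= target and label not in result:
            delta = datetime.strptime(stamp, TIME_FORMAT) - started
            result[label] = int(delta.total_seconds())
    return result


# Tail of the log from Step 3-4.
log = """Get 1923 files in noRTC bucket on 2023-05-08 20:03:00
Get 1996 files in RTC bucket on 2023-05-08 20:03:10
Get 2000 files in noRTC bucket on 2023-05-08 20:03:18
Get 2000 files in RTC bucket on 2023-05-08 20:03:26"""

print(replication_seconds(log, "2023-05-08 20:00:06", 2000))
# → {'noRTC': 192, 'RTC': 200}
```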

3-6. For multiple small files, the upload and the replication complete at approximately the same time. Because the upload speed is the bottleneck, this verification method cannot be used to confirm the effect of the RTC feature.

3-7. Delete all files from the source bucket to prepare for the next verification. Because the replication policy configured for the replication jobs in Steps 2-4 and 2-6 includes the Delete operation, the deletions are replicated to the destination buckets A and B, so all files there are deleted as well.

4. Verification with Large File Replication

4-1. To measure the upload speed with a large file, create an ECS instance with high network performance in the same region as the source bucket.

Region: China (Hangzhou)

Spec: ecs.c7.3xlarge

Disk size: 240 GB

OS: CentOS 7.9


4-3. After the ECS instance is started, create a 200 GB dummy file by running the following command:

# fallocate -l 214748364800 testfile

4-4. Use ossutil, an official tool provided by Alibaba Cloud, to upload the large file. The tool package can be downloaded from the URL below and unzipped.

https://gosspublic.alicdn.com/ossutil/1.7.15/ossutil-v1.7.15-linux-amd64.zip

4-5. While running the Python script from Step 3-1, simultaneously execute the following command to upload the 200 GB file to the source bucket through the intranet:

# ./ossutil64 cp testfile oss://[bucketname]/ -r -f -e oss-cn-hangzhou-internal.aliyuncs.com -i [accessid] -k [accesskey]

4-6. On the details pages of the destination buckets A and B in the OSS console, you can check the parts of the file that is being uploaded. This shows that replication starts before the upload of the single file to the source bucket has finished, regardless of whether RTC is enabled.


4-7. When the upload is complete, the processing time (approximately 345 seconds) and the average speed (approximately 600 MB/s) are displayed.
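
As a quick sanity check, the reported average speed is consistent with the file size and the processing time. The short calculation below uses only the figures from Steps 4-3 and 4-7; the variable names are illustrative.

```python
file_size_bytes = 200 * 1024**3  # 214,748,364,800 bytes, the size passed to fallocate in Step 4-3
upload_seconds = 345             # approximate processing time reported in Step 4-7

# Convert bytes per second to MiB per second.
speed_mib_per_s = file_size_bytes / upload_seconds / 1024**2
print(f"{speed_mib_per_s:.0f} MiB/s")
# → 594 MiB/s
```

The result of roughly 594 MiB/s agrees with the approximately 600 MB/s that was displayed.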


4-8. Review the output log generated by the Python script.

Test start on 2023-05-08 20:31:39
Get 0 files in noRTC bucket on 2023-05-08 20:31:40
Get 0 files in RTC bucket on 2023-05-08 20:31:40
……
Get 0 files in RTC bucket on 2023-05-08 20:37:50
Get 0 files in noRTC bucket on 2023-05-08 20:37:50
Get 1 files in RTC bucket on 2023-05-08 20:37:50
Get 0 files in noRTC bucket on 2023-05-08 20:37:50
……
Get 0 files in noRTC bucket on 2023-05-08 20:39:35
Get 1 files in RTC bucket on 2023-05-08 20:39:35
Get 1 files in noRTC bucket on 2023-05-08 20:39:35
Get 1 files in RTC bucket on 2023-05-08 20:39:35

4-9. Calculate the replication time of the large file.

・RTC-disabled replication time (large file) = Time when the destination bucket A first obtains 1 file - Execution start time

・RTC-enabled replication time (large file) = Time when the destination bucket B first obtains 1 file - Execution start time

Verification pattern                   | RTC-disabled (s) | RTC-enabled (s)
Replication of one large file (200 GB) | 476              | 371
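
The difference between the two measurements can be worked out as follows. This is a small illustrative calculation based only on the two values in the table above.

```python
rtc_disabled_s = 476  # large file, RTC disabled
rtc_enabled_s = 371   # large file, RTC enabled

saved_s = rtc_disabled_s - rtc_enabled_s
saved_pct = saved_s / rtc_disabled_s * 100
print(f"{saved_s} s saved ({saved_pct:.1f}% shorter)")
# → 105 s saved (22.1% shorter)
```

This reflects a single run, so it is best read as an indication rather than a benchmark.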

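4-10. For a large file, there is a time lag between the completion of the upload and the completion of replication, so the upload speed is not the bottleneck. The verification above therefore confirms the effect of RTC: replication finished in 371 seconds with RTC enabled, compared with 476 seconds with RTC disabled.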

5. Summary

As of the publication date (May 12, 2023), Alibaba Cloud has confirmed that, although the RTC feature is initially free of charge, it plans to introduce charges in the near future. Before using the feature, check the official OSS price list page.

This article verified the new Replication Time Control feature in Alibaba Cloud OSS. The effect of RTC could not be confirmed for multiple small files because the upload speed, rather than the replication speed, was the bottleneck. For a single large file (200 GB), however, replication with RTC enabled finished 105 seconds earlier than replication without it. If you need low-latency cross-region replication, consider trying the RTC feature.

This article is a translated piece of work from SoftBank: https://www.softbank.jp/biz/blog/cloud-technology/articles/202305/rtc/

Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.


H Ohara
