This topic describes how to use ossimport to migrate data from a third-party storage service (or OSS) to OSS.
- You have activated OSS and created a bucket in the China (Hangzhou) region.
- You have configured a RAM user and assigned OSS access permissions to the RAM user.
Create a RAM user in the RAM console, authorize the RAM user to access OSS, and save the AccessKey ID and AccessKey secret. For more information, see Create and grant permissions to a RAM user.
- Optional. You have purchased an ECS instance.
The ECS instance and the OSS bucket are located in the same region, which is China (Hangzhou). We recommend that you purchase a general-purpose ECS instance with 2 vCPUs and 4 GiB of memory. If you plan to release the ECS instance after the data is migrated, we recommend that you purchase a pay-as-you-go instance.
Note If you want to deploy ossimport on a small number of machines, you can deploy it on local machines. If you want to deploy ossimport on a large number of machines, we recommend that you deploy it on ECS instances. This example uses ECS instances to show you how to perform a migration task.
The number of required ECS instances is calculated by using the following formula: Number of required ECS instances = X/Y/(Z/100). In the formula, X indicates the amount of data to be migrated in TB, Y indicates the required migration duration in days, and Z indicates the migration speed of a single ECS instance in Mbit/s (an instance migrates about Z/100 TB of data each day). For example, to migrate 500 TB of data within 7 days when the migration speed of each ECS instance reaches 200 Mbit/s (about 2 TB of data is migrated each day), you need to purchase 36 ECS instances (500/7/2 ≈ 35.7, rounded up).
- ossimport has been configured.
To meet the large-scale migration requirements in this example, you must deploy ossimport in distributed mode on ECS. For more information about the configuration and definition of distributed deployment, such as conf/sys.properties and concurrency control, see Architectures and configurations. For more information about operations on distributed deployment, such as downloading ossimport and troubleshooting common errors during configuration, see Distributed deployment.
A user has 500 TB of data stored in the Guangzhou (South China) region of Tencent Cloud Object Storage (COS). The user wants to use ossimport to migrate the data to the China (Hangzhou) region of OSS within one week. The business must continue to run properly during the migration.
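The instance count formula given in the prerequisites can be checked against this scenario with a few lines of Python. This is a minimal sketch: the function name `required_ecs_instances` and its parameter names are illustrative only and are not part of ossimport, and rounding up to a whole instance is an assumption consistent with the figure of 36 instances quoted for this scenario.

```python
import math


def required_ecs_instances(data_tb: float, days: float, speed_mbit_s: float) -> int:
    """Estimate the number of ECS instances as X / Y / (Z / 100), rounded up.

    data_tb      -- X, amount of data to migrate, in TB
    days         -- Y, required migration duration, in days
    speed_mbit_s -- Z, migration speed of one instance, in Mbit/s
    """
    if data_tb <= 0 or days <= 0 or speed_mbit_s <= 0:
        raise ValueError("all inputs must be positive")
    # Rule of thumb from the text: an instance at Z Mbit/s moves about Z/100 TB per day.
    tb_per_instance_per_day = speed_mbit_s / 100
    return math.ceil(data_tb / days / tb_per_instance_per_day)


# Scenario from this topic: 500 TB in 7 days at 200 Mbit/s per instance.
print(required_ecs_instances(500, 7, 200))
```

Because the result is rounded up, the estimate never falls below the throughput needed to finish within the target duration.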
- Standalone mode is suitable for migrating less than 30 TB of data.
- Distributed mode is suitable for migrating more than 30 TB of data.
To migrate a large amount of data, deploy ossimport in distributed mode.
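The 30 TB guideline above can be written as a small helper. This is only a sketch: `recommend_mode` is a hypothetical name, and because the text does not say which mode applies at exactly 30 TB, the code assumes distributed mode for that boundary case.

```python
def recommend_mode(data_tb: float) -> str:
    """Return the deployment mode suggested by the 30 TB guideline."""
    if data_tb < 0:
        raise ValueError("data volume must not be negative")
    # Assumption: exactly 30 TB is treated as a large migration.
    return "standalone" if data_tb < 30 else "distributed"


# The 500 TB scenario in this topic falls well above the threshold.
print(recommend_mode(500))
```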
The process of migrating data from a third-party storage service to OSS in distributed mode is as follows:
- Migrate all data last modified before T1. For more information, see the Running section in Distributed deployment.
Notice T1 is a Unix timestamp, which is the number of seconds that have elapsed since January 1, 1970, 00:00:00 UTC. You can run the date +%s command to obtain the current timestamp in seconds.
- Go to the OSS console. Configure a mirroring-based back-to-origin rule. Set Origin URL to the URL of the source (third-party storage service). For more information, see Configure back-to-origin rules.
- Switch the read/write operations on the business system to OSS. At this point, the business system records the time as T2.
Data last modified before T1 is read from OSS. Data modified after T1 is read from the third-party storage service through OSS mirroring-based back-to-origin. Data can be written to OSS after T2.
- Open the job.cfg configuration file and set importSince=T1. Then restart the migration task to migrate the incremental data last modified between T1 and T2.
- After step 4 is complete, all read and write operations on your business system are switched to OSS. Data stored in the third-party storage service is only a copy of historical data, which can be retained or deleted as needed.
- ossimport only migrates and verifies data, but does not delete any data.
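The timestamp handling in the steps above can be sketched in Python. This is an illustration under stated assumptions: the helper names are hypothetical, the timestamp is taken in seconds (the same unit that date +%s prints), and the code only builds the `importSince` line rather than editing job.cfg or starting ossimport.

```python
import time


def current_unix_seconds() -> int:
    """Current Unix time in whole seconds, comparable to the output of `date +%s`."""
    return int(time.time())


def import_since_line(t1: int) -> str:
    """Build the job.cfg line that limits migration to data modified after T1."""
    if t1 < 0:
        raise ValueError("T1 must be a non-negative Unix timestamp")
    return f"importSince={t1}"


def check_window(t1: int, t2: int) -> None:
    """Sanity check: the switchover time T2 cannot precede the full-migration cutoff T1."""
    if t2 < t1:
        raise ValueError("T2 must not be earlier than T1")


# Record T1 before the full migration starts, and T2 when reads and writes switch to OSS.
t1 = current_unix_seconds()
t2 = current_unix_seconds()
check_window(t1, t2)
print(import_since_line(t1))
```

Recording T1 and T2 in one place makes it easier to confirm later that the incremental run covered the whole window between the full migration and the switchover.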
Costs involved in the migration process include the fees incurred when the source and destination buckets are accessed, outbound traffic fees for the source bucket, ECS instance fees, data storage fees, and time costs. If more than 1 TB of data is to be migrated, the storage cost is proportional to the migration period. ECS fees are low compared with the data transfer and storage fees, and using more ECS instances shortens the migration period.
For more information about ossimport, see the following topics: