This topic describes the notes, limitations, and procedure for migrating source data using an inventory.
Notes
Note the following when you migrate data using Data Online Migration:
Data Online Migration accesses the source data address by using the public interfaces provided by the storage service provider of the source data address. The access behavior depends on the interface implementation of the storage service provider.
When Data Online Migration is used for migration, it consumes resources at the source and destination data addresses. This may interrupt your business. To ensure business continuity, we recommend that you enable throttling for your migration tasks or run the migration tasks during off-peak hours after careful assessment.
Before a migration task starts, Data Online Migration checks the files at the source and destination data addresses. If a file at the source data address and a file at the destination data address have the same name, and the File Overwrite Method parameter of the migration task is set to Yes, the file at the destination data address is overwritten during migration. If the two files contain different information and the file at the destination data address needs to be retained, we recommend that you change the name of one file or back up the file at the destination data address.
The LastModifyTime attribute of the source file is retained after the file is migrated to the destination bucket. If a lifecycle rule is configured for the destination bucket and takes effect, the migrated file whose last modification time is within the specified time period of the lifecycle rule may be deleted or archived in specific storage types.
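Because a migrated file keeps its original LastModifyTime, a destination lifecycle rule that expires objects by age can act on the file soon after migration. The following sketch (illustrative only, not an actual OSS API; the function name is hypothetical) models the check that an age-based expiration rule performs:

```python
from datetime import datetime, timezone, timedelta
from typing import Optional

def lifecycle_matches(last_modify_time: datetime, expire_after_days: int,
                      now: Optional[datetime] = None) -> bool:
    """Return True if a rule that expires objects expire_after_days days
    after their last modification would act on this object."""
    now = now or datetime.now(timezone.utc)
    return now - last_modify_time >= timedelta(days=expire_after_days)

# A file last modified 400 days ago keeps that timestamp after migration,
# so a 365-day expiration rule acts on it shortly after it arrives.
old_file = datetime.now(timezone.utc) - timedelta(days=400)
print(lifecycle_matches(old_file, 365))  # True
```

If such a rule exists, consider adjusting the rule or excluding the migrated prefix before you start the migration task.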
Migration limits
Data Online Migration allows you to migrate only the data of a single bucket in a task. You cannot migrate all data that belongs to your account in a single task.
The properties of data migrated using a generic inventory are as follows:
The properties that can be migrated depend on the specific data type, such as OSS or S3. For more information, see the migration tutorial for the corresponding data source.
Unsupported properties: to be determined. The content that is actually migrated is considered final.
Step 1: Select a region
Log on to the Data Online Migration console as the Resource Access Management (RAM) user that you created for data migration.
In the upper-left corner of the top navigation bar, select the region in which the source data address resides or select the region that is closest to the region in which the source data address resides.
The region that you select is the region in which your Data Online Migration is deployed. Supported regions inside China include China (Beijing), China (Shanghai), China (Hangzhou), China (Shenzhen), China (Ulanqab), and China (Hong Kong), and supported regions outside China include Singapore, Germany (Frankfurt), and US (Virginia).
Important: The data addresses and migration tasks that you create in a region cannot be used in another region. Select the region with caution.
We recommend that you select the region in which the source data address resides. If the region in which the source data address resides is not supported by Data Online Migration, select the region that is closest to the region in which the source data address resides.
To speed up cross-border data migration, we recommend that you enable transfer acceleration. If you enable transfer acceleration for OSS buckets, you are charged for transfer acceleration fees. For more information, see Access OSS using transfer acceleration.
Step 2: Create a source data address
In the left-side navigation pane, choose Data Online Migration > Address Management. On the Address Management page, click Create Address.
In the New Address panel, configure the following parameters and then click OK.
Parameter
Required
Description
Name
Yes
The name of the source data address. The name must meet the following requirements:
The name is 3 to 63 characters in length.
The name is case-sensitive and can contain lowercase letters, digits, hyphens (-), and underscores (_).
The name is encoded in the UTF-8 format and cannot start with a hyphen (-) or an underscore (_).
Type
Yes
Select Inventory.
Data Storage Type
Yes
Select the specific source for migration as needed.
Domain Name
Depends on the data storage type
The endpoint of the source storage service. For example, the endpoint of AWS S3.
Region
Yes (if Data Storage Type is set to Alibaba OSS)
Select the region where the source data address is located, for example, China (Hangzhou).
Authorization Role
Yes (if Data Storage Type is set to Alibaba OSS)
Select the option that matches the ownership of the source bucket:
The source bucket belongs to the account used to log on to the Data Online Migration console
The source bucket does not belong to the account used to log on to the Data Online Migration console
Bucket Key
Yes
Enter the AccessKey pair (including the AccessKeyId and SecretAccessKey) of the migration data source account to verify your identity and confirm that you have the permissions to read the source data.
Bucket
Yes
Enter the name of the bucket that contains the data to be migrated.
Prefix
No
You can specify a prefix to migrate source data to a specific directory. The prefix cannot start with a forward slash (/) but must end with a forward slash (/). Example: data/to/oss/.
Specify a prefix: For example, if the source prefix is example/src/, which contains the file example.jpg, and you set the destination prefix to example/dest/, the full path of the migrated file is example/dest/example.jpg.
Do not specify a prefix: If you do not specify a prefix, the source data is migrated to the root directory of the destination bucket.
List Location
Yes
The bucket where the manifest file is located. You can select Alibaba OSS or Third-party Source.
List Path
Yes
Enter the path where the manifest.json file is located.
List Domain Name
Yes (if List Location is not set to Alibaba OSS)
If you set List Location to Third-party Source, enter the specific endpoint for accessing the inventory.
List Region
Depends on the data storage type
If you set the List Location parameter to Alibaba OSS, specify the region in which the OSS inventory resides.
Authorization Role
Yes (if List Location is set to Alibaba OSS)
Select the option that matches the ownership of the inventory bucket:
The inventory bucket belongs to the account used to log on to the Data Online Migration console
The inventory bucket does not belong to the account used to log on to the Data Online Migration console
List Bucket
Yes
Enter the name of the bucket that contains the migration inventory and belongs to the current console account.
List Bucket Key
Yes (if List Location is not set to Alibaba OSS)
If you set List Location to Third-party Source, enter the AccessKey pair (including the AccessKeyId and SecretAccessKey) used to access the source manifest. The pair is used to verify your identity and confirm that you have the permissions to read the manifest file.
Channel
No
The name of the tunnel that you want to use.
Important: This parameter is required only when you migrate data to the cloud by using Express Connect circuits or VPN gateways or migrate data from self-managed databases to the cloud.
If data at the destination data address is stored in a local file system or you need to migrate data over an Express Connect circuit in an environment such as Alibaba Finance Cloud or Apsara Stack, you must create and deploy an agent.
Proxy
No
The name of the agent that you want to use.
Important: This parameter is required only when you migrate data to the cloud by using Express Connect circuits or VPN gateways or migrate data from self-managed databases to the cloud.
You can select up to 200 agents at a time for a specific tunnel.
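The Name and Prefix rules above are mechanical enough to check programmatically. The following sketch (illustrative helpers, not part of Data Online Migration) applies the documented rules: names are 3 to 63 characters of lowercase letters, digits, hyphens, and underscores, not starting with a hyphen or underscore; prefixes must not start with, but must end with, a forward slash:

```python
import re

# Documented rules for the address Name parameter: 3-63 characters;
# lowercase letters, digits, hyphens (-), underscores (_); must not
# start with a hyphen or an underscore.
NAME_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{2,62}$")

def is_valid_name(name: str) -> bool:
    return bool(NAME_RE.fullmatch(name))

# Documented rules for the Prefix parameter: must not start with a
# forward slash (/) and must end with one (empty = migrate everything).
def is_valid_prefix(prefix: str) -> bool:
    return prefix == "" or (not prefix.startswith("/") and prefix.endswith("/"))

print(is_valid_name("my-migration-src"))  # True
print(is_valid_name("_bad"))              # False
print(is_valid_prefix("data/to/oss/"))    # True
print(is_valid_prefix("/data/"))          # False
```

Checking these rules before you submit the form avoids a round trip through the console's validation errors.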
Step 3: Create a destination data address
In the left-side navigation pane, choose Data Online Migration > Address Management. On the Address Management page, click Create Address.
In the New Address panel, configure the following parameters and then click OK.
Parameter
Required
Description
Name
Yes
The name of the destination data address. The name must meet the following requirements:
The name is 3 to 63 characters in length.
The name is case-sensitive and can contain lowercase letters, digits, hyphens (-), and underscores (_).
The name is encoded in the UTF-8 format and cannot start with a hyphen (-) or an underscore (_).
Type
Yes
Select Alibaba OSS.
Custom Domain Name
No
A user-defined custom domain name, if any.
Region
Yes
Select the region where the destination data address is located, for example, China (Hangzhou).
Authorization Role
Yes
Select the option that matches the ownership of the destination bucket:
The destination bucket belongs to the account used to log on to the Data Online Migration console
The destination bucket does not belong to the account used to log on to the Data Online Migration console
Bucket
Yes
Enter the name of the destination bucket that belongs to the current console account.
Prefix
No
You can specify a prefix to migrate source data to a specific directory. The prefix cannot start with a forward slash (/) but must end with a forward slash (/). Example: data/to/oss/.
Specify a prefix: For example, if the source prefix is example/src/, which contains the file example.jpg, and you set the destination prefix to example/dest/, the full path of the migrated file is example/dest/example.jpg.
Do not specify a prefix: If you do not specify a prefix, the source data is migrated to the root directory of the destination bucket.
Channel
No
The name of the tunnel that you want to use.
Important: This parameter is required only when you migrate data to the cloud by using Express Connect circuits or VPN gateways or migrate data from self-managed databases to the cloud.
If data at the destination data address is stored in a local file system or you need to migrate data over an Express Connect circuit in an environment such as Alibaba Finance Cloud or Apsara Stack, you must create and deploy an agent.
Proxy
No
The name of the agent that you want to use.
Important: This parameter is required only when you migrate data to the cloud by using Express Connect circuits or VPN gateways or migrate data from self-managed databases to the cloud.
You can select up to 200 agents at a time for a specific tunnel.
Step 4: Create a migration task
In the left-side navigation pane, choose Data Online Migration > Migration Tasks. On the Migration Tasks page, click Create Task.
In the Select Address step, configure the parameters. The following table describes the parameters.
Parameter
Required
Description
Name
Yes
The name of the migration task. The name must meet the following requirements:
The name is 3 to 63 characters in length.
The name is case-sensitive and can contain lowercase letters, digits, hyphens (-), and underscores (_).
The name is encoded in the UTF-8 format and cannot start with a hyphen (-) or an underscore (_).
Source Address
Yes
The source data address that you created.
Destination Address
Yes
The destination data address that you created.
On the Configure Task page, configure the following parameters.
Parameter
Required
Description
Migration Bandwidth
No
The maximum bandwidth that is available to the migration task. Valid values:
Default: Use the default upper limit for the migration bandwidth. The actual migration bandwidth depends on the file size and the number of files.
Specify an upper limit: Specify a custom upper limit for the migration bandwidth as prompted.
Important: The actual migration speed depends on multiple factors, such as the source data address, network, throttling at the destination data address, and file size. Therefore, the actual migration speed may not reach the specified upper limit.
Specify a reasonable value for the upper limit of the migration bandwidth based on the evaluation of the source data address, migration purpose, business situation, and network bandwidth. Inappropriate throttling may affect business performance.
Files Migrated Per Second
No
The maximum number of files that can be migrated per second. Valid values:
Default: Use the default upper limit for the number of files that can be migrated per second.
Specify an upper limit: Specify a custom upper limit as prompted for the number of files that can be migrated per second.
Important: The actual migration speed depends on multiple factors, such as the source data address, network, throttling at the destination data address, and file size. Therefore, the actual migration speed may not reach the specified upper limit.
Specify a reasonable value for the upper limit of the number of files migrated per second based on the evaluation of the source data address, migration purpose, business situation, and network bandwidth. Inappropriate throttling may affect business performance.
Overwrite Rule
No
Specifies whether to overwrite a file at the destination data address if the file has the same name as a file at the source data address. Valid values:
Do not overwrite: does not migrate the file at the source data address.
Overwrite All: overwrites the file at the destination data address.
Overwrite based on the last modification time:
If the last modification time of the file at the source data address is later than that of the file at the destination data address, the file at the destination data address is overwritten.
If the last modification time of the file at the source data address is the same as that of the file at the destination data address, the file at the destination data address is overwritten if the files differ in size or in the Content-Type header.
Warning: If you select Overwrite based on the last modification time, there is no guarantee that newer files are not overwritten by older ones, which creates a risk of losing recent updates.
If you select Overwrite based on the last modification time, make sure that the file at the source data address contains information such as the last modification time, size, and Content-Type header. Otherwise, the overwrite policy may become invalid and unexpected migration results may occur.
If you select Do not overwrite or Overwrite based on the last modification time, the system sends a request to the source and destination data addresses to obtain the meta information and determines whether to overwrite a file. Therefore, request fees are generated for the source and destination data addresses.
Migration Report
Yes
The method for pushing the migration report.
Do Not Push (default): Do not push the migration report to the destination bucket.
Push: Push the migration report to the destination bucket. For the detailed path, see What to do next.
Important: Pushing migration reports consumes storage space at the destination.
There may be a time latency in pushing the migration report. Wait for the report to be generated.
Each task execution record has a unique ID. The migration report is pushed only once. Delete it with caution.
Migration Log
Yes
Specifies whether to push migration logs to Simple Log Service (SLS). Valid values:
Do not push (default): Does not push migration logs.
Push: Pushes migration logs to SLS. View the migration logs in the SLS console.
Push only file error logs: Pushes only error migration logs to SLS. View the error migration logs in the SLS console.
If you select Push or Push only file error logs, Data Online Migration creates a project in SLS. The project name is in the aliyun-oss-import-log-{Alibaba Cloud account ID}-{region of the Data Online Migration console} format. Example: aliyun-oss-import-log-137918634953****-cn-hangzhou.
Important: To prevent errors in the migration task, make sure that the following requirements are met before you select Push or Push only file error logs:
SLS is activated.
You have confirmed the authorization on the Authorize page.
Simple Log Service Authorization
No
This parameter is displayed if you set the Migration Log parameter to Push or Push only file error logs.
Click Authorize to go to the Cloud Resource Access Authorization page. On this page, click Confirm Authorization Policy. The RAM role AliyunOSSImportSlsAuditRole is created and permissions are granted to the RAM role.
Running Time
No
Important: If the current execution of a migration task is not complete by the next scheduled start time, the task starts its next execution at the subsequent scheduled start time after the current migration is complete. This process continues until the task is run the specified number of times.
If Data Online Migration is deployed in the China (Hong Kong) region or the regions in the Chinese mainland, up to 10 concurrent migration tasks are supported. If Data Online Migration is deployed in regions outside China, up to five concurrent migration tasks are supported. If the number of concurrent tasks exceeds the limit, executions of tasks may not be complete as scheduled.
The time when the migration task is run. Valid values:
Immediately: The task is immediately run.
Scheduled Task: The task is run within the specified time period every day. By default, the task is started at the specified start time and stopped at the specified stop time.
Periodic Scheduling: The task is run based on the execution frequency and number of execution times that you specify.
Execution Frequency: Specify the execution frequency of the task. Valid values: Every Hour, Every Day, Every Week, Certain Days of the Week, and Custom. For more information, see the Supported execution frequencies section of this topic.
Executions: Specify the maximum number of execution times of the task as prompted. By default, if you do not specify this parameter, the task is run once.
Important: You can manually start and stop tasks at any point in time. This is not affected by the custom execution time of tasks.
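The Overwrite Rule options above amount to a small decision procedure. The following sketch is an illustrative model of the documented behavior, not Data Online Migration's actual implementation, and the rule names and metadata parameters are assumptions:

```python
from typing import Optional

def should_overwrite(rule: str, src_mtime: float, dst_mtime: float,
                     src_size: Optional[int] = None, dst_size: Optional[int] = None,
                     src_type: Optional[str] = None, dst_type: Optional[str] = None) -> bool:
    """Illustrative model of the documented Overwrite Rule options for a
    destination file that has the same name as a source file."""
    if rule == "do_not_overwrite":
        return False                        # skip: the source file is not migrated
    if rule == "overwrite_all":
        return True                         # always overwrite the destination file
    if rule == "overwrite_by_mtime":
        if src_mtime > dst_mtime:           # source is newer: overwrite
            return True
        if src_mtime == dst_mtime:          # same mtime: overwrite only if size
            return src_size != dst_size or src_type != dst_type  # or Content-Type differs
        return False                        # source is older: keep the destination file
    raise ValueError(f"unknown rule: {rule}")

print(should_overwrite("overwrite_by_mtime", 100, 100, 10, 20))  # True (sizes differ)
print(should_overwrite("overwrite_by_mtime", 50, 100))           # False (source is older)
```

The model also makes the billing note above concrete: the mtime-based and do-not-overwrite rules need the metadata of both files, which is why those options generate metadata requests against both data addresses.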
Read the Data Online Migration Agreement. Select I have read and agree to the Alibaba Cloud International Website Product Terms of Service and I have understood that when the migration task is complete, the migrated data may be different from the source data. Therefore, I have the obligation and responsibility to confirm the consistency between the migrated data and source data. Alibaba Cloud is not responsible for the confirmation of the consistency between the migrated data and source data. Then, click Next.
Verify that the configurations are correct and click OK. The migration task is created.
Execution frequency reference
Frequency | Description | Example |
Hourly | Select an hourly frequency. You can use this option with the number of executions. | The current time is 8:05. If you specify an hourly frequency and 3 executions, the first task starts at the next hour, which is 9:00. |
Daily | When you select a daily frequency, you must set a start time on the hour (0 to 23) for the task. You can use this option with the number of executions. | The current time is 8:05. If you specify a daily run at 10:00 and 5 executions, the first task starts at 10:00 on the same day. |
Weekly | When you select a weekly frequency, you must specify a day of the week and a start time on the hour (0 to 23). You can use this option with the number of executions. | The current time is 8:05 on Monday. If you specify a weekly run on Monday at 10:00 and 10 executions, the first task starts at 10:00 on the same day. |
Specific days of the week | When you select specific days of the week, you can choose any days of the week and set a start time on the hour (0 to 23). | The current time is 8:05 on Wednesday. If you specify a run on Monday, Wednesday, and Friday at 10:00, the first task starts at 10:00 on the same day. |
Custom | Use a cron expression to set a custom start time for the task. | Note: A cron expression consists of six fields separated by spaces that represent the execution time rule for the task: Second Minute Hour Day Month Week. The following cron expression examples are for reference only. For more information, we recommend that you use a cron expression generator. |
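The six-field format described above (Second Minute Hour Day Month Week) can be sketched with a minimal field splitter. This is an illustrative helper, not part of the service; it checks only the field count, and the sample expression's field syntax (such as "?") follows common six-field cron conventions and should be confirmed with a cron expression generator:

```python
def split_cron(expr: str) -> dict:
    """Split a six-field cron expression (Second Minute Hour Day Month Week)
    into named fields. Validates only the field count, not field syntax."""
    fields = expr.split()
    if len(fields) != 6:
        raise ValueError("expected 6 space-separated fields")
    names = ["second", "minute", "hour", "day", "month", "week"]
    return dict(zip(names, fields))

# Illustrative expression: run at 02:00:00 every day.
print(split_cron("0 0 2 * * ?"))
```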
Step 5: Verify data
Data Online Migration solely handles the migration of data and does not ensure data consistency or integrity. After a migration task is complete, you must review all the migrated data and verify the data consistency between the source and destination data addresses.
Make sure that you verify the migrated data at the destination data address after a migration task is complete. If you delete the data at the source data address before you verify the migrated data at the destination data address, you are liable for the losses and consequences caused by any data loss.