This topic describes the precautions and procedure for data migration.

Precautions

When you create a migration job, note the following:
  • A migration job occupies the network resources of the source data address and destination data address. To ensure business continuity, we recommend that you specify a speed limit for a migration job or perform the migration job during off-peak hours.
  • Before a migration job is performed, files at the source data address and the destination data address are checked. A destination file is overwritten if a source file has the same name and a later modification time. If two files have the same name but different content and you need to keep both, rename one of the files or back up the destination file before you start the migration.
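
The overwrite check can be sketched as a small predicate. This is only an illustration, not the Data Transport implementation: FileInfo and should_overwrite are hypothetical names, and the rule combines the modification-time condition above with the size condition listed in the Notice under Migration Type in Step 3.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical record for a file at a data address."""
    name: str
    size: int            # size in bytes
    modified: datetime   # last modification time


def should_overwrite(source: FileInfo, destination: FileInfo) -> bool:
    """Return True if the destination file would be overwritten.

    Assumed rule, taken from the text: the names match, and the source
    file is either newer or has a different size.
    """
    if source.name != destination.name:
        return False  # files with different names never collide
    newer = source.modified > destination.modified
    resized = source.size != destination.size
    return newer or resized


src = FileInfo("docs/a.txt", 120, datetime(2019, 3, 10, 8, 0))
dst = FileInfo("docs/a.txt", 120, datetime(2019, 3, 5, 8, 0))
print(should_overwrite(src, dst))  # True: same name, source is newer
```

A check like this can be run over a listing of both addresses before migration to find the destination files that need a backup.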

Step 1: Create a source data address

  1. Log on to the Data Transport console.
  2. Choose Data Online Migration > Data Address, and click Create Data Address.
  3. In the Create Data Address dialog box, configure the parameters and click OK.
    Parameter Required Description
    Data Type Yes Select OSS.
    Data Region Yes Select the region where the source data address is located. For example, China (Zhangjiakou-Beijing Winter Olympics).
    Data Name Yes A data name is 3 to 63 characters in length. Special characters are not supported, except for hyphens (-) and underscores (_).
    OSS Endpoint Yes Select an endpoint based on the region where your data is located.
    • You can use an HTTP endpoint to access OSS from the Internet, for example, http://oss-cn-endpoint.aliyuncs.com.
    • You can also use an HTTPS endpoint to access OSS from the Internet, for example, https://oss-cn-endpoint.aliyuncs.com.
    • You can use an internal HTTP endpoint to access OSS from the internal network, for example, http://oss-cn-qingdao-internal.aliyuncs.com.
    • You can use an internal HTTPS endpoint to access OSS from the internal network, for example, https://oss-cn-qingdao-internal.aliyuncs.com.
    For more information about OSS endpoints, see Regions and endpoints.
    Notice When you create a migration job, you can use an OSS bucket that is deployed in an internal network as the source data address. In this case, the destination data address must be a network-attached storage (NAS) file system or an OSS bucket that is deployed in the same region.
    AccessKey Id and AccessKey Secret Yes Enter an AccessKey pair that is used to migrate data. For more information, see Create and authorize a RAM user.
    OSS Bucket Yes Select a bucket where data to be migrated is stored.
    OSS Prefix Yes An OSS prefix cannot start with a forward slash (/) and must end with a forward slash (/). For example, data/to/oss/.
  4. Apply for permission to use this feature. This step is required because the feature is in public preview. Click Application.
  5. Enter the required information and submit the application. After the application is approved, you receive a short message service (SMS) notification.
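
The Data Name and OSS Prefix rules in the table above can be checked before you fill in the dialog box. The following is a minimal sketch with hypothetical function names. It assumes that "special characters" means anything other than ASCII letters, digits, hyphens, and underscores, which the console may interpret differently.

```python
import re

# Assumption: only ASCII letters, digits, hyphens, and underscores are
# allowed in a data name. The 3-to-63 length limit comes from the text.
_DATA_NAME_RE = re.compile(r"[A-Za-z0-9_-]{3,63}")


def is_valid_data_name(name: str) -> bool:
    """Check the length and character rules for a data name."""
    return _DATA_NAME_RE.fullmatch(name) is not None


def is_valid_oss_prefix(prefix: str) -> bool:
    """Check that a prefix has no leading slash and ends with a slash.

    An empty string is rejected here because it does not end with a
    slash; whether the console accepts an empty prefix is not covered.
    """
    return not prefix.startswith("/") and prefix.endswith("/")


print(is_valid_data_name("oss-source_01"))   # True
print(is_valid_oss_prefix("data/to/oss/"))   # True
print(is_valid_oss_prefix("/data/to/oss/"))  # False: leading slash
```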

Step 2: Create a destination data address

  1. Choose Data Online Migration > Data Address, and click Create Data Address.
  2. In the Create Data Address dialog box, configure the parameters and click OK.
    Parameter Required Description
    Data Type Yes Select NAS.
    Data Region Yes Select the region where the NAS file system is located.
    Data Name Yes A data name is 3 to 63 characters in length. Special characters are not supported, except for hyphens (-) and underscores (_).
    NAS Type Yes Select Alibaba Cloud.
    File System Yes Select the destination NAS file system.
    Mount Point Yes Select the mount point of the destination NAS file system.
    Notice You can only mount a NAS file system on an ECS instance that is located in a VPC. The classic network is not supported.
    Sub Folder No Select a subdirectory to store migrated data. If you leave this field blank, migrated data is stored in the root directory (/).
    Notice Ensure that the specified subdirectory exists on the NAS server. Otherwise, the data address fails to be created.
    Note For more information about the status of a new data address, see Data address status.
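
Because a missing subdirectory causes the data address to fail, it can help to confirm that the subdirectory exists before you enter it in the Sub Folder field. The sketch below assumes that the NAS file system is already mounted at a local path, for example on an ECS instance. The function name and the mount path are hypothetical.

```python
import os


def subfolder_exists(mount_root: str, sub_folder: str = "") -> bool:
    """Check that sub_folder exists under a locally mounted NAS file system.

    mount_root is a hypothetical local path where the NAS file system is
    mounted. An empty sub_folder stands for the root directory (/), which
    mirrors leaving the Sub Folder field blank.
    """
    target = os.path.join(mount_root, sub_folder.strip("/"))
    return os.path.isdir(target)


if __name__ == "__main__":
    # "/mnt/nas" and "backup/2019" are placeholder values for illustration.
    print(subfolder_exists("/mnt/nas", "backup/2019"))
```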

Step 3: Create a migration job

  1. Choose Data Online Migration > Migration Jobs, and click Create Job.
  2. In the Create Job dialog box, read the Terms of Migration Service, select I understand the above terms and conditions, and apply for opening Data Transport, and click Next.
    Then, the Fee Reminder dialog box appears.
  3. In the Create Job dialog box, configure the parameters and click Next.
    Parameter Required Description
    Job Name Yes A job name is 3 to 63 characters in length and can contain lowercase letters, digits, and hyphens (-). A job name cannot start or end with a hyphen (-).
    Source Data Address Yes Select the source data address that you have created.
    Destination Data Address Yes Select the destination data address that you have created.
    Notice If the source data address and the destination data address are located in different countries, you can submit a ticket to request permissions to create a cross-national migration job. You must ensure that your business is legitimate, data transit conforms to local rules and regulations, and data does not include illegal information.
    Specified Directory No
    • Do not filter: All data at the source data address is migrated.
    • Exclude: During migration, the files and subdirectories under the specified directory are not migrated.
    • Contain: During migration, only the files and subdirectories under the specified directory are migrated.
    Note
    • A directory cannot start with a forward slash (/) or a backslash (\), and cannot contain double slashes (//), double periods (..), or double quotation marks ("). The total length of all directories that you enter cannot exceed 10 KB.
    • A directory must end with a forward slash (/), for example, docs/.
    • You can specify a maximum of 20 directories of the Exclude or Contain type.
    Migration Type Yes
    • Full: You can specify the Start Time Point of File parameter. Files with the last modification time later than the specified start time point will be migrated. After the files are migrated, the migration job is closed. You can submit the job again if the data at the source data address changes. In this case, Data Transport only migrates the data that is changed after the previous job.
    • Incremental: You must specify the Start Time Point of File, Migration Interval, and Migration Times parameters. The first migration is a full migration of files whose last modification time is later than the specified start time point. After the first migration is complete, incremental migrations are performed based on the migration interval. Each incremental migration only migrates files that are created or modified after the previous migration started and before the current migration starts. If you set the migration times to N, one full migration and (N-1) incremental migrations are performed. For example, assume that the current time is 2019/03/10 08:00 and you set the migration interval to 1 hour, the migration times to 5, and the start time point to 2019/03/05 08:00. The first migration migrates files that were last modified between 2019/03/05 08:00 and 2019/03/10 08:00. If the first migration takes one hour to complete, the second migration starts at 2019/03/10 10:00, which is two hours after 2019/03/10 08:00: one hour for the first migration and one hour for the migration interval. The second migration migrates files that were last modified between 2019/03/10 08:00 and 2019/03/10 10:00. In total, the job performs one full migration and four incremental migrations.
    • Sync: You can synchronize data from the source data address to the destination data address. A synchronization job continues to run based on the specified synchronization interval until you manually stop the job. When a synchronization job is performed for the first time, files are synchronized based on the specified start time point. After the first synchronization is complete, files that are created or modified after the start time of the last synchronization will be synchronized when the specified synchronization interval ends. For example, the first synchronization is performed at 2018/11/01 08:00. For the second synchronization, files that are created or modified after 2018/11/01 08:00 are synchronized.
    Notice
    • You can select Sync only if the source data address and the destination data address are located in the same region.
    • Before you start a migration job of the Full, Incremental, or Sync type, Data Transport compares files of the source data address with those of the destination data address. If a source file has the same name as a destination file, the destination file is overwritten when either of the following conditions is met:
      • The source file has a later modification time.
      • The size of the source file is different from that of the destination file.
    Start Time Point of File Yes (only for full and incremental migration)
    • All: All files are migrated.
    • Assign: Files that are created or modified after the specified time are migrated. For example, when you set the start time point to 2018/11/01 08:00:00, only files that are created or modified after 2018/11/01 08:00:00 are migrated. Files that are created or modified before the specified time are skipped.
    Migration Interval Yes (only for incremental migration) The default value is 1 hour and the maximum value is 24 hours.
    Migration Times Yes (only for incremental migration) The default value is 1 time and the maximum value is 30 times.
    Start Time Point of File Yes (only for synchronization)
    • All: All files are synchronized.
    • Assign: Files that are created or modified after the specified time are synchronized. For example, when you set the start time point to 2018/11/01 08:00:00, only files that are created or modified after 2018/11/01 08:00:00 are synchronized. Files that are created or modified before the specified time are skipped.
    Start Time of Job Yes (only for synchronization)
    • Immediately: A synchronization job immediately runs after a migration job is complete.
    • Schedule: You can set the scheduled time and synchronize data at the specified time.
    Job Period Yes (only for synchronization) The time interval between two synchronizations. A synchronization job starts each time an interval ends. Valid units: hour, day, and week.
    Don't trigger new task if another task running Yes (only for synchronization) Specifies whether to skip the next synchronization job if the previous synchronization job is still running when the synchronization interval ends. This parameter is used together with Job Period. For example, if you set Job Period to 1 hour and clear this option, the next synchronization job starts when the hour ends, regardless of whether the previous synchronization job is complete. By default, this option is selected.
  4. Click Next to go to the Performance tab.
    • If you select Full or Incremental for the job type, specify the Data Size and File Count parameters.
      Note To ensure a successful migration, you must estimate the amount of data to be migrated. For more information, see Estimate the amount of data to be migrated.
    • If you select Sync for the job type, specify the Subtask File Count and Subtask File Size parameters.
      • Subtask File Count: You can separate a migration job into multiple subtasks based on the number of files that you specify. A maximum of 20 subtasks can run at a time. Set an appropriate number of files for each subtask to reduce the time of a migration job. The default value is 1000. Assume that you need to migrate 10,000 files. If you set Subtask File Count to 500, the migration job is separated into 20 subtasks that run at the same time. If you set Subtask File Count to 100, the migration job is separated into 100 subtasks. Only 20 subtasks run at a time, and the remaining subtasks are queued.
      • Subtask File Size: You can separate a migration job into multiple subtasks based on the total size of files that you specify. A maximum of 20 subtasks can run at a time. Set an appropriate file size for each subtask to reduce the time of a migration job. The default value is 1 GB. Assume that you need to migrate files with a total size of 40 GB. If you set Subtask File Size to 2 GB, the migration job is separated into 20 subtasks that run at the same time. If you set Subtask File Size to 1 GB, the migration job is separated into 40 subtasks. Only 20 subtasks run at a time, and the remaining subtasks are queued.
      Note A subtask is generated as soon as either the Subtask File Count value or the Subtask File Size value is reached. For example, assume that you set Subtask File Count to 1000 and Subtask File Size to 1 GB. If the number of files reaches 1,000 before the total file size reaches 1 GB, subtasks are generated based on the number of files. If the total file size reaches 1 GB before the number of files reaches 1,000, subtasks are generated based on the file size.
  5. Optional. On the Performance tab, navigate to the Flow Control section, specify the Time Range and Max Flow parameters, and then click Add.
    Note To ensure business continuity, we recommend that you specify the Time Range and Max Flow parameters based on the fluctuation of workloads.
  6. Click Create and wait until the migration job is complete.
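
The time windows of an incremental migration job, as described for the Incremental migration type in step 3, can be worked out with a short calculation. This is a sketch of the schedule in the example, not the scheduler that Data Transport uses. The function name is hypothetical, and the one-hour duration of every run is an assumption, because the example only states the duration of the first run.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# (files modified after, files modified up to); None stands for "All".
Window = Tuple[Optional[datetime], datetime]


def migration_windows(
    file_start: Optional[datetime],
    first_run: datetime,
    interval: timedelta,
    durations: List[timedelta],
) -> List[Window]:
    """Return the modification-time window covered by each run.

    durations[i] is how long run i takes. Its length plays the role of
    Migration Times (N): one full run followed by N-1 incremental runs.
    """
    windows: List[Window] = []
    lower, run_start = file_start, first_run
    for duration in durations:
        windows.append((lower, run_start))
        lower = run_start  # the next run picks up where this one started
        run_start = run_start + duration + interval  # finish, then wait one interval
    return windows


windows = migration_windows(
    file_start=datetime(2019, 3, 5, 8, 0),
    first_run=datetime(2019, 3, 10, 8, 0),
    interval=timedelta(hours=1),
    durations=[timedelta(hours=1)] * 5,  # assumption: every run takes one hour
)
for after, up_to in windows:
    print(after, "->", up_to)
```

With these inputs, the first window runs from 2019/03/05 08:00 to 2019/03/10 08:00 and the second from 2019/03/10 08:00 to 2019/03/10 10:00, which matches the example in the Incremental description.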
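
The way Subtask File Count and Subtask File Size interact in step 4 can be illustrated with a simple batching loop: a subtask is closed as soon as either limit is reached. This is an approximation for estimating subtask counts, not the actual splitting algorithm. The function name is hypothetical, and treating 1 GB as 1024^3 bytes is an assumption.

```python
import math
from typing import Iterable, List

GIB = 1024 ** 3  # assumption: 1 GB in the console means 1024^3 bytes
MAX_CONCURRENT_SUBTASKS = 20  # from the text: at most 20 subtasks run at a time


def split_into_subtasks(
    file_sizes: Iterable[int],
    max_files: int = 1000,     # Subtask File Count default
    max_bytes: int = 1 * GIB,  # Subtask File Size default
) -> List[List[int]]:
    """Group file sizes into subtasks, closing a subtask when either limit is hit."""
    subtasks: List[List[int]] = []
    current: List[int] = []
    current_bytes = 0
    for size in file_sizes:
        current.append(size)
        current_bytes += size
        if len(current) >= max_files or current_bytes >= max_bytes:
            subtasks.append(current)
            current, current_bytes = [], 0
    if current:
        subtasks.append(current)  # leftover files form the last subtask
    return subtasks


small_files = [1024] * 10_000  # 10,000 files of 1 KiB each
subtasks = split_into_subtasks(small_files, max_files=100)
rounds = math.ceil(len(subtasks) / MAX_CONCURRENT_SUBTASKS)
print(len(subtasks), "subtasks in", rounds, "rounds")  # 100 subtasks in 5 rounds
```

In this example the file count limit is reached long before the size limit, so the subtasks are generated based on the number of files.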

View the status of a data address

After you create a data address, it is displayed in one of the following states:
  • Normal: The data address is created.
  • Creating: The data address is being created. Creating the first NAS data address takes about three minutes. If a data address remains in the Creating state for a long time, click Refresh in the upper-right corner to update the status.
  • Invalid: An error occurred while the data address was being created. Verify that the configuration is correct and that Data Transport is allowed to access the shared files of an ECS instance. If the issue persists, contact Alibaba Cloud technical support.