Learn the usage notes, limitations, and procedures for migrating data from a local file system to OSS.
Notes
Keep the following in mind when using Data Online Migration:
-
When creating a source data address, specify an absolute path for the Directory To Be Migrated. This path must begin and end with a forward slash (/) and cannot contain environment variables or special characters.
-
When creating a source data address, ensure the specified Directory To Be Migrated exists and is valid.
-
Online migration consumes resources at both the source and the destination, potentially impacting your business operations. If your services are critical, evaluate the potential impact and configure a speed limit for the migration task, or run the task during off-peak hours.
-
Before a migration begins, the online migration service checks files at both the source and the destination. If a source file has the same name as a destination file and the migration task is configured to overwrite existing files, the service overwrites the destination file. If the two files contain different data, rename one of them or back up the destination file before starting the migration.
-
The online migration service preserves the last modified time of source files. If a lifecycle rule is configured for the destination bucket, a migrated file whose last modified time matches the rule's conditions may be deleted or transitioned to the specified archive storage class.
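The path rules for the Directory To Be Migrated can be checked before you create the data address. The following is a minimal sketch, not part of the service: the function name is hypothetical, and because this document does not define "special characters", the sketch assumes that only letters, digits, periods, underscores, and hyphens are safe in each path segment.

```python
import re

# Assumption: "special characters" is not defined in this document, so this
# sketch only accepts letters, digits, '.', '_' and '-' in each path segment.
_SEGMENT = re.compile(r"[A-Za-z0-9._-]+")


def check_migration_directory(path: str) -> list[str]:
    """Return the rule violations found in a Directory To Be Migrated value."""
    problems = []
    if not path.startswith("/"):
        problems.append("must be absolute and start with /")
    if not path.endswith("/"):
        problems.append("must end with /")
    # $VAR and %VAR% are the usual environment variable notations.
    if "$" in path or "%" in path:
        problems.append("must not contain environment variables")
    segments = [s for s in path.split("/") if s]
    if any(not _SEGMENT.fullmatch(s) for s in segments):
        problems.append("contains characters outside the assumed safe set")
    return problems


print(check_migration_directory("/example/src/"))  # prints []
```

An empty list means the value passes these checks. It does not confirm that the directory exists on the agent machine.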
Migration limitations
-
The migration excludes the following file types in the source data address: empty directories, symbolic links (files or directories), character device files, block device files, socket files, and pipe (FIFO) files.
-
The migration converts hard links in the source data address to regular files, without preserving the link relationship.
-
The migration does not migrate parent directory attributes.
-
The migration does not migrate special file permissions, such as SUID, SGID, and the sticky bit (SBIT).
-
The following attribute limitations apply when migrating data from a local file system to OSS:
-
Supported attributes: The migration maps ModifyTime to X-Oss-Meta-Mtime, Permissions to X-Oss-Meta-Perms, and Uid:Gid to X-Oss-Meta-Owner.
Note
-
Permissions: Includes nine permission bits for read, write, and execute access.
-
Uid:Gid: Represents the user ID and group ID, separated by a colon (:).
-
-
Unsupported attributes: Examples include AccessTime, ChangeTime, Attr, and Acl.
Note: The migration behavior for unlisted attributes is not guaranteed, and you must verify them after the migration completes.
-
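To estimate how these limitations affect a source directory, you can inspect its entries before migrating. The following sketch is illustrative only. The function names are hypothetical, and the value encodings for X-Oss-Meta-Mtime (epoch seconds) and X-Oss-Meta-Perms (three octal digits) are assumptions, because this document states which attributes are mapped but not how the values are formatted.

```python
import os
import stat


def classify_entry(path: str) -> str:
    """Describe how the limitations above treat one source entry."""
    st = os.lstat(path)  # lstat inspects the link itself, not its target
    mode = st.st_mode
    if stat.S_ISLNK(mode):
        return "excluded: symbolic link"
    if stat.S_ISDIR(mode):
        return "directory" if os.listdir(path) else "excluded: empty directory"
    if stat.S_ISCHR(mode) or stat.S_ISBLK(mode):
        return "excluded: device file"
    if stat.S_ISSOCK(mode):
        return "excluded: socket file"
    if stat.S_ISFIFO(mode):
        return "excluded: pipe file"
    if st.st_nlink > 1:
        return "regular file (hard link relationship is not preserved)"
    return "regular file"


def oss_user_metadata(st: os.stat_result) -> dict[str, str]:
    """Build the three supported attributes from a stat result."""
    perms = stat.S_IMODE(st.st_mode) & 0o777  # nine bits; drops SUID, SGID, sticky
    return {
        "X-Oss-Meta-Mtime": str(int(st.st_mtime)),  # assumed encoding
        "X-Oss-Meta-Perms": format(perms, "03o"),   # assumed encoding
        "X-Oss-Meta-Owner": f"{st.st_uid}:{st.st_gid}",
    }
```

For example, `oss_user_metadata(os.stat("/example/src/example.jpg"))` returns the values this sketch would expect for that file. Compare them with the user metadata on the migrated object rather than treating them as authoritative.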
Step 1: Select a region
-
Log in to the Data Online Migration console as the RAM user you created.
-
In the upper-left corner of the top navigation bar, select the region where your agent is located.
Important
-
Tunnels, agents, data addresses, and migration tasks created in one region cannot be used in another. Choose your region carefully.
-
Select the region where your agent is located. If that region is not available, select the closest region to create the migration task.
-
For cross-border migration, we recommend that you enable transfer acceleration to increase migration speed. If you enable transfer acceleration for a bucket, transfer acceleration fees apply. For more information, see Access OSS using transfer acceleration.
-
Step 2: Create a tunnel
-
In the left navigation pane, go to Data Online Migration > Channel Management and click Create Tunnel.
-
In the Create Tunnel dialog box, configure the following parameters and click OK.
Parameter
Required
Description
Name
Yes
The name of the tunnel.
-
The name cannot be empty and can be up to 100 characters in length.
-
The name can contain letters, digits, hyphens (-), and underscores (_).
Maximum Bandwidth
Yes
The maximum bandwidth that the tunnel can use.
-
If you do not configure this parameter, the default value 0 is used, which indicates that the bandwidth for the tunnel is not limited.
-
If you configure this parameter, enter a value based on the note in the console.
Important: The bandwidth that is available for the tunnel depends on the actual bandwidth of the network connection.
Requests/s
Yes
The maximum number of requests per second over the tunnel.
-
If you do not configure this parameter, the default value 0 is used, which indicates that the number of requests per second over the tunnel is not limited.
-
If you configure this parameter, enter a value based on the note in the console.
Warning: Evaluate the capacity of the source storage system before you configure this parameter. A value that is too high can overload the source and affect your business. Enter a value based on the note in the console.
-
To learn more about tunnels, see Tunnel Management.
Step 3: Create an agent
-
If LocalFS is a local file system, you can deploy only one agent.
-
If LocalFS is a remote file system such as Network Attached Storage (NAS), you can deploy multiple agents. Mount the NAS at the same directory path on every machine that hosts an agent.
-
In the left-side navigation pane, choose Data Online Migration > Agent Management, and click New Agent.
-
In the New Agent dialog box, configure the following parameters and click OK.
Parameter
Required
Description
Name
Yes
The agent's name.
-
The name must be 3 to 63 characters long.
-
The name can contain lowercase letters, digits, hyphens (-), and underscores (_). The name is case-sensitive.
-
The name must be UTF-8 encoded and cannot start with a hyphen (-) or an underscore (_).
-
Network Type
Yes
The network connection type for the agent. The following options are available:
-
VPC (Recommended): The agent connects to the Data Online Migration service over a VPC. This method requires the machine that hosts the agent to access the internal endpoint of Data Online Migration in the corresponding region. For example, if you use Data Online Migration in the China (Beijing) region, the agent machine must be able to access the internal endpoint {TunnelId}.cn-beijing.mgw-tc-internal.aliyuncs.com. We recommend using an ECS instance in the same region as the Data Online Migration console to deploy the agent.
-
Public network: The agent connects to the Data Online Migration service over the public network. This method requires the machine that hosts the agent to access the public endpoint of Data Online Migration in the corresponding region. For example, if you use Data Online Migration in the China (Beijing) region, the agent machine must be able to access the public endpoint {TunnelId}.cn-beijing.mgw-tc.aliyuncs.com.
Note
-
{TunnelId} is a placeholder for the tunnel ID.
-
You can use the ping command to test the network connectivity between the agent and the Data Online Migration service.
Deployment method
Yes
The agent's deployment method. Currently, only standalone process deployment is supported.
Tunnel
Yes
The tunnel to associate with the agent. An agent can be associated with only one tunnel. The bandwidth of the agent is limited by the total bandwidth of the tunnel.
For example, a tunnel named tunnel-1 is configured with a maximum bandwidth of 10 Gbit/s. tunnel-1 is associated with three agents: agent-1, agent-2, and agent-3. The combined bandwidth of these three agents cannot exceed 10 Gbit/s. If agent-1 is allocated 3 Gbit/s of bandwidth, only 7 Gbit/s of bandwidth remains available for agent-2 and agent-3. Plan and allocate your bandwidth carefully.
-
-
Generate the agent deployment script. For more information, see Generate an agent deployment script.
For more information about agents, see Agent Management.
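The endpoint that an agent must reach can be assembled from the tunnel ID, the region, and the network type, and then tested with ping as described in the Network Type row. The following is a minimal sketch with hypothetical function names. It assumes that regions other than China (Beijing) follow the same host name pattern as the cn-beijing examples, which this document does not confirm, and that the ping command accepts the -c flag (Linux and macOS).

```python
import subprocess


def tunnel_endpoint(tunnel_id: str, region_id: str, network: str) -> str:
    """Build the Data Online Migration endpoint for an agent.

    network is "vpc" for the internal endpoint or "public" for the
    public endpoint. The pattern is generalized from the cn-beijing
    examples, which is an assumption.
    """
    if network == "vpc":
        service = "mgw-tc-internal"
    elif network == "public":
        service = "mgw-tc"
    else:
        raise ValueError("network must be 'vpc' or 'public'")
    return f"{tunnel_id}.{region_id}.{service}.aliyuncs.com"


def can_ping(host: str, count: int = 3) -> bool:
    """Return True if ping reports success for the host."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Run `can_ping(tunnel_endpoint("<your tunnel ID>", "cn-beijing", "vpc"))` on the machine that hosts the agent. A failure may also mean that ICMP is blocked on the path, so treat it as a hint rather than proof that the endpoint is unreachable.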
Step 4: Create a source address
-
If the LocalFS source is a local file system on a single machine, you can deploy only one agent.
-
If the LocalFS source is a NAS file system mounted on multiple machines, ensure that the NAS is mounted to the same directory path on each machine. When you create the data address, enter this mount directory for the Directory to be migrated parameter.
-
In the left-side navigation pane, choose Data Online Migration > Address management, and then click Create address.
-
In the Create address panel, configure the following parameters, and then click OK.
Parameter
Required
Description
Name
Yes
-
The name must be 3 to 63 characters in length.
-
The name can contain lowercase letters, digits, hyphens (-), and underscores (_). The name is case-sensitive.
-
The name cannot start with a hyphen (-) or an underscore (_).
Type
Yes
Select LocalFS.
Directory To Be Migrated
Yes
Specify the path of the directory to be migrated. The path must be absolute, start and end with a forward slash (/), and contain no environment variables or special characters.
For example, if the source prefix is /example/src/ and the destination prefix is example/dest/, a source file such as example.jpg is migrated to example/dest/example.jpg.
Important
-
If multiple agents are associated with this data address, ensure each agent can access this directory. Otherwise, some data may fail to migrate.
-
If the LocalFS source is a NAS file system mounted as a local directory on multiple machines, ensure that the mount directory has the same name on each machine. When you create the data address, specify this local directory name as the directory to be migrated.
Tunnel
Yes
Select the name of the tunnel to use.
Important
-
This parameter is required only when migrating data over an Express Connect or VPN connection, or when migrating data from a self-managed storage system.
-
You must associate an agent if the destination is a local file system or if the migration uses a dedicated connection, for example, to a financial cloud or Apsara Stack environment.
Agent
Yes
Select the name of the agent to use.
Important
-
This parameter is required only when migrating data over an Express Connect or VPN connection, or when migrating data from a self-managed storage system.
-
You can select a maximum of 200 agents for a specified tunnel.
-
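The naming rules in the Name row above are the same ones this guide lists for agents, destination addresses, and migration tasks. They can be expressed as a single regular expression. The following sketch uses a hypothetical function name and does not apply to tunnel names, which have different limits.

```python
import re

# 3 to 63 characters; lowercase letters, digits, hyphens, and underscores;
# the first character cannot be a hyphen or an underscore.
_NAME = re.compile(r"[a-z0-9][a-z0-9_-]{2,62}")


def is_valid_name(name: str) -> bool:
    """Check a data address, agent, or task name against the rules above."""
    return _NAME.fullmatch(name) is not None


print(is_valid_name("localfs-src_01"))  # prints True
print(is_valid_name("_localfs"))        # prints False
```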
Step 5: Create a destination address
-
In the navigation pane on the left, choose Data Online Migration > Address Management, and then click Create Address.
-
In the Create Address panel, configure the following parameters, and then click OK.
Parameter
Required
Description
Name
Yes
Enter a name for the destination address. The name must meet the following requirements:
-
The name must be 3 to 63 characters in length.
-
The name can contain lowercase letters, digits, hyphens (-), and underscores (_). The name is case-sensitive.
-
The name cannot start with a hyphen (-) or an underscore (_).
Type
Yes
Select OSS.
Region
No
Select the destination region. For example, China (Hangzhou).
Authorize role
Yes
-
If the destination bucket belongs to the Alibaba Cloud account used for Data Online Migration:
-
If the destination bucket belongs to a different Alibaba Cloud account:
Bucket
Yes
Enter the name of the destination OSS bucket in the current account.
Agent
No
Select the name of the agent to use.
Important
-
This parameter is required only when migrating data over an Express Connect or VPN connection, or when migrating data from a self-managed storage system.
-
You can select a maximum of 200 agents for a specified tunnel.
-
Step 6: Create a migration task
-
In the left-side navigation pane, choose Data Online Migration > Migration Tasks, and then click Create Task.
-
On the Select Address page, configure the following parameters, and then click Next.
Parameter
Required
Description
Name
Yes
Enter a name for the migration task. The name must meet the following requirements:
-
The name must be 3 to 63 characters in length.
-
The name can contain lowercase letters, digits, hyphens (-), and underscores (_). The name is case-sensitive.
-
The name cannot start with a hyphen (-) or an underscore (_).
Source Address
Yes
Select an existing source address.
Destination Address
Yes
Select an existing destination address.
-
-
On the Task Configurations page, configure the following parameters.
Parameter
Required
Description
Basic configurations
Migration Bandwidth
No
Select the migration bandwidth.
-
Default: The maximum available bandwidth. The actual speed depends on the file size and the number of files.
-
Specify an upper limit: Specify a maximum bandwidth limit.
Important
-
The actual migration bandwidth depends on factors such as the data source, network conditions, destination throttling, and file size, and may not reach the specified upper limit.
-
Select a value based on your data source, destination, business requirements, and network bandwidth. Improper throttling can disrupt your business operations.
Files Migrated Per Second
No
Select the number of files migrated per second.
-
Default: Uses the default rate for files migrated per second.
-
Specify an upper limit: Specify a maximum number of files migrated per second.
Important
-
The actual number of files migrated per second is affected by factors such as the data source, network conditions, destination throttling, and file size, and may not reach the specified upper limit.
-
Select a value based on your data source, destination, business requirements, and network bandwidth. Improper throttling can disrupt your business operations.
Overwrite Mode
No
Specifies how to handle files with the same name.
-
Do not overwrite: Skips the file.
-
Overwrite All: The source file overwrites the destination file.
-
Overwrite based on the last modification time:
-
If the last modified time of the source file is later than that of the destination file, the destination file is overwritten.
-
If the last modified time of the source and destination files are the same, the destination file is overwritten if their Size or Content-Type differs.
-
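The three overwrite modes can be summarized as a decision function. This is a sketch of the rules as described in this section, not the service's implementation. The type, function, and mode names are hypothetical, and skipping a source file that is older than the destination file is an assumption implied by the rules rather than stated by them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileMeta:
    mtime: int         # last modified time, in seconds since the epoch
    size: int          # size in bytes
    content_type: str  # Content-Type value


def should_write(mode: str, src: FileMeta, dst: Optional[FileMeta]) -> bool:
    """Decide whether the source file is written to the destination."""
    if dst is None:  # no file with the same name at the destination
        return True
    if mode == "do_not_overwrite":
        return False
    if mode == "overwrite_all":
        return True
    if mode == "overwrite_by_mtime":
        if src.mtime > dst.mtime:
            return True
        if src.mtime == dst.mtime:
            return src.size != dst.size or src.content_type != dst.content_type
        return False  # assumption: an older source file is skipped
    raise ValueError(f"unknown overwrite mode: {mode}")
```

For instance, with equal last modified times and equal sizes but different Content-Type values, `should_write("overwrite_by_mtime", src, dst)` returns True.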
Warning
-
With the Overwrite based on the last modification time policy, an older file may overwrite a newer file, because the policy does not strictly guarantee that newer files are preserved.
-
If you select the Overwrite based on the last modification time policy, ensure that the source files can return metadata such as last modified time, Size, and Content-Type. Otherwise, the overwrite policy may not work as expected.
-
When you select Do not overwrite or Overwrite based on the last modification time, the service sends one request to the source and one to the destination to retrieve metadata for the overwrite check. This incurs request fees at both the source and the destination.
Auditing method
Migration Report
Yes
Specifies how to push the migration report.
-
Do not push (Default): Does not push the migration report to the destination bucket.
-
Push: Pushes the migration report to the destination bucket. For path details, see Next steps.
Important
-
Pushing the migration report consumes storage space at the destination.
-
Pushing the migration report may be delayed.
-
Each task execution record has a unique ID. The migration report is pushed only once. Do not delete it unless necessary.
Migration Logs
Yes
Specifies how to push the migration log.
-
Do not push (Default): Does not push the migration log.
-
Push: Pushes the migration log to Log Service (SLS). You can view the migration log in SLS.
-
Push only file error logs: Pushes only file error logs to Log Service (SLS). You can view these logs in SLS.
When you select Push or Push only file error logs, Online Migration Service creates a project in Log Service (SLS) named aliyun-oss-import-log-<Alibaba Cloud account ID>-<current deployment region>, for example, aliyun-oss-import-log-137918634953****-cn-hangzhou.
Important: Before you select Push or Push only file error logs, ensure you have completed the following operations. Otherwise, the migration task may fail.
-
You have activated Log Service (SLS).
-
You have granted the required permissions on the Authorize page.
Authorize
No
This option appears only when you set Migration Logs to Push or Push only file error logs.
Click Authorize to go to the Cloud Resource Access Authorization page. The system creates the AliyunOSSImportSlsAuditRole role and grants it the necessary permissions. Click Agree to Authorization to complete the authorization.
Filter
File Name
No
A filter based on filenames.
Supports Include and Exclude rules using regular expressions based on the RE2 library (only a subset of the syntax is supported). For example:
-
.*\.jpg$ matches all files that end with .jpg.
-
By default, ^file.* matches all files in the root directory whose names start with file.
If you set a prefix for the source address, such as data/to/oss/, you must use ^data/to/oss/file.* to match all files that start with file under that prefix.
-
.*/picture/.* matches any subdirectory named picture.
Important
-
When you use an Include rule, all matching files are migrated. If there are multiple Include rules, any file that matches at least one rule is migrated.
For example, if you have two files, picture.jpg and picture.png, and you set an Include rule for .*\.jpg$, only picture.jpg is migrated. If you also add an Include rule for .*\.png$, both files are migrated.
-
When you use an Exclude rule, no matching files are migrated. If there are multiple Exclude rules, any file that matches at least one rule is not migrated.
For example, if you have two files, picture.jpg and picture.png, and you set an Exclude rule for .*\.jpg$, only picture.png is migrated. If you also add an Exclude rule for .*\.png$, neither file is migrated.
-
Exclude rules take precedence. If a file matches both an Exclude rule and an Include rule, the file is not migrated.
For example, for a file named file.txt, if you set an Exclude rule for .*\.txt$ and an Include rule for file.*, the file file.txt is not migrated.
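The interaction of Include and Exclude rules can be sketched as follows. The function name is hypothetical. Python's re module is used as a stand-in for the RE2 library; the two differ in general, although the example patterns in this section behave the same in both. Unanchored matching with re.search is also an assumption.

```python
import re


def is_selected(path: str, include: list[str], exclude: list[str]) -> bool:
    """Apply the filename filter rules described above to one path."""
    # Exclude rules take precedence over Include rules.
    if any(re.search(pattern, path) for pattern in exclude):
        return False
    # With Include rules, a file must match at least one of them.
    if include:
        return any(re.search(pattern, path) for pattern in include)
    # With no Include rules, every file that is not excluded is migrated.
    return True


print(is_selected("picture.jpg", [r".*\.jpg$"], []))            # prints True
print(is_selected("picture.png", [r".*\.jpg$"], []))            # prints False
print(is_selected("file.txt", [r"file.*"], [r".*\.txt$"]))      # prints False
```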
File Modification Time
No
Filters files based on their last modified time.
You can specify a time range to migrate only files whose last modified time falls within that range. The rules are as follows:
-
If you specify only a start time of January 1, 2019, the task migrates only files last modified on or after that date.
-
If you specify only an end time of January 1, 2022, the task migrates only files last modified on or before that date.
-
If you specify a start time of January 1, 2019 and an end time of January 1, 2022, the task migrates only files last modified on or between these two dates.
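The time range rules above amount to an inclusive comparison with optional bounds. The following sketch uses a hypothetical function name and compares full datetime values; how the service treats the time-of-day portion of a bound is not specified in this document.

```python
from datetime import datetime
from typing import Optional


def in_time_range(
    mtime: datetime,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> bool:
    """Return True if a last modified time passes the filter.

    Both bounds are inclusive, and a bound that is None is not applied.
    """
    if start is not None and mtime < start:
        return False
    if end is not None and mtime > end:
        return False
    return True


start = datetime(2019, 1, 1)
end = datetime(2022, 1, 1)
print(in_time_range(datetime(2020, 6, 1), start, end))   # prints True
print(in_time_range(datetime(2018, 12, 31), start, end)) # prints False
```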
Migrate special entities
No
Controls migration of special entity types. Select the checkbox to enable, or clear it to disable.
Directory:
-
Enable: Directories at the source address are added to the migration queue and are included in the task's file count and storage volume statistics. Corresponding empty objects ending with a forward slash (/) are created at the destination, and the attributes of the source directory (if supported) are set as user metadata on the destination object.
-
Disable: Directories at the source address are ignored and are not included in the task's file count and storage volume statistics. No corresponding empty objects are created at the destination.
Symbolic link:
-
Enable: Symbolic links at the source address are added to the migration queue and are included in the task's file count and storage volume statistics. Corresponding symbolic link objects are created at the destination, and the attributes of the source symbolic link (if supported) are set as user metadata on the symbolic link object. The Target attribute of the symbolic link object depends on the Convert target setting.
-
Disable: Symbolic links at the source address are ignored and are not included in the task's file count and storage volume statistics.
Important: The service does not migrate the target files or directories that symbolic links point to, unless those targets are also within the migration scope.
Migration configurations
Convert target
No
Converts the Target attribute of a source symbolic link so the destination symbolic link points to the correct target object. Select the checkbox to enable, or clear it to disable.
Important
-
This option is effective only when symbolic link migration is enabled.
-
Regardless of this setting, the migration does not check whether the target object exists, if its type is valid, or if you have access permissions for it.
Enable: The service first resolves the Target attribute into the shortest absolute path (AbsTarget), relative to the symbolic link's directory. It then performs a string replacement, replacing the SrcPrefix (if it matches) in the AbsTarget with the DestPrefix. The resulting value is set as the Target attribute of the destination symbolic link object.
Note
Example: Assume the migration task is configured with SrcPrefix="/mnt/nas1/" and DestPrefix="cloud_base/". A symbolic link /mnt/nas1/links/a.lnk exists at the source. Consider the following Target attributes:
-
If the Target is "../data/./a.txt", the resolved shortest absolute path is "/mnt/nas1/data/a.txt". The final replaced Target value becomes "cloud_base/data/a.txt".
-
If the Target is "/mnt/nas1/verbose/../data/./a.txt", the resolved shortest absolute path is "/mnt/nas1/data/a.txt". The final replaced Target value becomes "cloud_base/data/a.txt".
-
If the Target is "/root/outer/../data/./a.txt", the resolved shortest absolute path is "/root/data/a.txt". No prefix match is found, so the final Target value remains "/root/data/a.txt".
Disable: No conversion is performed. The original Target attribute value of the source symbolic link is set on the destination symbolic link object.
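The conversion performed when this option is enabled can be reproduced with Python's posixpath module. The following is a sketch of the described behavior, not the service's code. The function name is hypothetical, and treating the SrcPrefix replacement as a match at the start of the path is an assumption.

```python
import posixpath


def convert_target(link_path: str, target: str,
                   src_prefix: str, dest_prefix: str) -> str:
    """Resolve a symbolic link Target and swap the source prefix."""
    # Resolve relative to the directory that contains the symbolic link.
    # posixpath.join discards the directory when target is absolute.
    link_dir = posixpath.dirname(link_path)
    abs_target = posixpath.normpath(posixpath.join(link_dir, target))
    if abs_target.startswith(src_prefix):
        return dest_prefix + abs_target[len(src_prefix):]
    return abs_target  # no prefix match: keep the resolved absolute path


link = "/mnt/nas1/links/a.lnk"
print(convert_target(link, "../data/./a.txt", "/mnt/nas1/", "cloud_base/"))
# prints cloud_base/data/a.txt
```

The function only rewrites strings. Like the service, it does not check whether the target exists.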
Preserve last modified time
Yes
Controls whether the last modified time of the source file is preserved.
-
Preserve (Default): The last modified time of the source file is set on the destination object.
-
Do not preserve: The last modified time is not set.
Destination storage class
No
Controls the storage class assigned to destination objects.
-
Specify: Migrated objects are set to the specified storage class. The following storage classes are available:
-
Standard
-
Infrequent Access
-
Archive
-
Cold Archive
-
Deep Cold Archive
-
-
Not specified (default): The storage class is not set explicitly. Migrated objects use the default storage class of the destination.
Important
-
This option is displayed only if your account is added to the allowlist for this feature.
-
This option is currently supported only for tasks where the destination is OSS.
Task scheduling
Execution time
No
Important
-
If a migration task is still running at its next scheduled execution time, that execution is postponed until the next scheduled time after the current run completes. This continues until the specified number of executions is reached.
-
Concurrency limit for migration tasks: Deployment regions in the Chinese mainland and China (Hong Kong) support a maximum of 10 concurrent tasks. Overseas regions support a maximum of 5 concurrent tasks. Exceeding this limit may cause scheduled tasks to fail.
Defines when to run the migration task.
-
Immediately: Runs the task as soon as it is created.
-
At the Specified Time: Specifies a daily time window for the task to run. By default, the task starts at the specified start time and pauses at the specified stop time.
-
Periodic Scheduling: Runs the task based on a specified frequency and number of executions.
-
Execution frequency: Supports five types of frequencies: Hourly, Daily, Weekly, Specific days of the week, and Custom. For more information, see Execution frequency reference.
-
Number of executions: Specifies how many times the task runs. If not set, the task runs once by default. For the maximum number of executions, refer to the console prompt.
-
Important: You can manually start and stop tasks at any time, regardless of the custom execution time settings.
-
-
Read the Online Migration Service Agreement, select I understand and confirm the compliance commitment statement and acknowledge my responsibility to verify data consistency after the migration task is complete, and then click Next.
-
Review your configuration settings. If everything is correct, click OK and wait for the migration task to run.
Execution frequency
Execution frequency | Description | Example |
Hourly | Run the task once every hour. You can use this option with the maximum number of runs. | The current time is 8:05. The frequency is set to hourly with a maximum of 3 runs. The first run starts at the next hour, 9:00. |
Daily | Run the task once a day. You must specify an hour (0-23) for the task to start. You can use this option with the maximum number of runs. | The current time is 8:05. The task is scheduled to run daily at 10:00, with a maximum of 5 runs. The first run starts at 10:00 today. |
Weekly | Run the task once a week. You must specify a day of the week and an hour (0-23) for the task to start. You can use this option with the maximum number of runs. | The current time is Monday, 8:05. The task is scheduled to run every Monday at 10:00, with a maximum of 10 runs. The first run starts at 10:00 today. |
Specific days of the week | Run the task on selected days of the week. You must specify the days and an hour (0-23) for the task to start. | The current time is Wednesday, 8:05. The task is scheduled to run on Mondays, Wednesdays, and Fridays at 10:00. The first run starts at 10:00 today. |
Custom | Use a cron expression to define a custom schedule for the task start time. | Note: A cron expression consists of six space-separated fields that define the execution schedule: second, minute, hour, day of the month, month, and day of the week. The minimum interval is 1 hour. The following cron expression examples are for reference only. For more options, use a cron expression generator. |
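The first-run times in the examples above can be reproduced with a short calculation. The following sketch uses hypothetical function names. It assumes that a start time that has already passed today moves to the next matching day, which the table does not state. Weekdays are numbered from 0 (Monday) to 6 (Sunday), following Python's datetime convention.

```python
from datetime import datetime, timedelta
from typing import Optional


def first_hourly_run(now: datetime) -> datetime:
    """Return the start of the next full hour."""
    return now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)


def first_scheduled_run(now: datetime, hour: int,
                        weekdays: Optional[set[int]] = None) -> datetime:
    """Return the first start time at the given hour (0-23).

    weekdays limits the run to certain days; None means every day.
    """
    for offset in range(8):  # today plus the next seven days
        day = now + timedelta(days=offset)
        candidate = day.replace(hour=hour, minute=0, second=0, microsecond=0)
        if candidate > now and (weekdays is None or candidate.weekday() in weekdays):
            return candidate
    raise ValueError("weekdays must contain values from 0 (Monday) to 6 (Sunday)")


monday = datetime(2024, 1, 1, 8, 5)  # a Monday at 8:05
print(first_hourly_run(monday))             # prints 2024-01-01 09:00:00
print(first_scheduled_run(monday, 10))      # prints 2024-01-01 10:00:00
print(first_scheduled_run(monday, 10, {0})) # prints 2024-01-01 10:00:00
```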
Step 7: Verify data
The migration service only transfers your data and does not guarantee data consistency or integrity. After the migration task completes, you must verify that the data at the destination is consistent with the source.
If you delete the source data before confirming a successful migration, you are solely responsible for any resulting data loss.
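One basic check is to compare the names and sizes of the source files with a listing of the destination objects. The following sketch uses hypothetical function names and expects the destination listing as a dictionary that maps object keys to sizes. How you obtain that listing is outside the scope of this sketch. Matching sizes do not prove that the contents are identical, so treat this as a first pass only.

```python
import os


def local_inventory(src_dir: str) -> dict[str, int]:
    """Map each regular file's path, relative to src_dir, to its size in bytes."""
    sizes = {}
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            full = os.path.join(root, name)
            # Skip symbolic links and other non-regular files.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            rel = os.path.relpath(full, src_dir).replace(os.sep, "/")
            sizes[rel] = os.path.getsize(full)
    return sizes


def compare(local: dict[str, int], remote: dict[str, int],
            dest_prefix: str) -> tuple[list[str], list[str]]:
    """Return (missing, size_mismatch) lists of source-relative paths."""
    missing = sorted(k for k in local if dest_prefix + k not in remote)
    size_mismatch = sorted(
        k for k in local
        if dest_prefix + k in remote and remote[dest_prefix + k] != local[k]
    )
    return missing, size_mismatch
```

For example, `compare(local_inventory("/example/src/"), remote, "example/dest/")` reports the source files that have no destination object or whose sizes differ.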