This topic describes how to migrate data from Google Cloud Storage to Alibaba Cloud Object Storage Service (OSS) by using Data Online Migration. It covers usage notes, limitations, step-by-step procedures, and data verification.
Usage notes
When you migrate data by using Data Online Migration, take note of the following items:
Data Online Migration accesses the source data address by using the public APIs provided by the storage service provider. The access behavior depends on the API implementation of the storage service provider.
Migration consumes resources at both the source and destination data addresses, which may affect your business. To ensure business continuity, we recommend that you enable throttling for migration tasks or run them during off-peak hours.
Before a migration task starts, Data Online Migration checks the files at both addresses. If a source file and a destination file have the same name and the Overwrite Method is set to overwrite, the destination file is overwritten during migration. If the files contain different information and you need to keep the destination file, rename one of the files or back up the destination file before migration.
The LastModified property of the source file is retained after migration. If a lifecycle rule is configured for the destination bucket, migrated files whose last modification time falls within the lifecycle rule's time period may be deleted or transitioned to a different storage class.
Limitations
You can migrate data from only a single bucket per task. You cannot migrate all data that belongs to your account in a single task.
Only specific attributes can be migrated from Google Cloud Storage to OSS:
Migratable attributes: x-amz-meta-\*, LastModifyTime, Content-Type, Cache-Control, Content-Encoding, Content-Disposition, and Content-Language.
Non-migratable attributes: These include but are not limited to Expires, StorageClass, ACL, server-side encryption, and Tagging.
Whether other attributes can be migrated is not guaranteed; the actual migration results prevail.
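The attribute rules above can be sketched as a simple filter. The `migratable_metadata` helper below is illustrative only, not part of any Data Online Migration tooling:

```python
# Standard headers that Data Online Migration can carry over, per the list above.
MIGRATABLE = {"Content-Type", "Cache-Control", "Content-Encoding",
              "Content-Disposition", "Content-Language"}

def migratable_metadata(headers: dict[str, str]) -> dict[str, str]:
    """Return the subset of object metadata that survives migration.

    Custom metadata (x-amz-meta-*) is also retained; attributes such as
    Expires, StorageClass, ACL, and Tagging are dropped. This is a sketch
    of the documented behavior, not actual migration code.
    """
    return {k: v for k, v in headers.items()
            if k in MIGRATABLE or k.lower().startswith("x-amz-meta-")}

print(migratable_metadata({
    "Content-Type": "image/jpeg",
    "x-amz-meta-owner": "alice",
    "Expires": "Wed, 01 Jan 2025 00:00:00 GMT",   # dropped: not migratable
}))
# {'Content-Type': 'image/jpeg', 'x-amz-meta-owner': 'alice'}
```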
Step 1: Select a region
Log on to the Data Online Migration console as the Resource Access Management (RAM) user that you created for data migration.
In the upper-left corner of the top navigation bar, select the region in which the source data address resides, or select the closest supported region. The region you select determines where Data Online Migration is deployed. Supported regions inside China: China (Beijing), China (Shanghai), China (Hangzhou), China (Shenzhen), China (Ulanqab), and China (Hong Kong). Supported regions outside China: Singapore, Germany (Frankfurt), and US (Virginia).

Data addresses and migration tasks created in a region cannot be used in another region. Select the region carefully.
We recommend that you select the region where the source data address resides. If that region is not supported by Data Online Migration, select the closest supported region.
To speed up cross-border data migration, consider enabling transfer acceleration for the destination OSS bucket. Transfer acceleration incurs additional fees. For more information, see Transfer acceleration.
Step 2: Create a source data address
In the left-side navigation pane, choose Data Online Migration > Address Management. On the Address Management page, click Create Address.
In the Create Address panel, configure the following parameters and click OK.
- **Name** (required): The name of the source data address. The name must be 3 to 63 characters in length, is case-sensitive, and can contain lowercase letters, digits, hyphens (-), and underscores (\_). The name must be encoded in UTF-8 and cannot start with a hyphen (-) or an underscore (\_).
- **Type** (required): The type of the source data address. Select Google Cloud Storage.
- **Request Endpoint** (required): The endpoint of the Google Cloud Storage bucket. You cannot specify a custom endpoint or Content Delivery Network (CDN) endpoint. To find the endpoint, log on to the Cloud Storage console and go to the Interoperability tab of the Settings page.
- **AccessKey ID** (required): The Access Key of the account that is used to read the source data. To find the Access Key, log on to the Cloud Storage console and go to the Interoperability tab of the Settings page. Note: To ensure data security, we recommend that you create an access key pair that has read-only permissions for Data Online Migration. After the migration task is complete, you can delete the access key pair.
- **AccessKey Secret** (required): The Secret Key of the account that is used to read the source data. To find the Secret Key, log on to the Cloud Storage console and go to the Interoperability tab of the Settings page. The same read-only recommendation as for AccessKey ID applies.
- **Bucket** (required): The name of the Google Cloud Storage bucket that contains the data to migrate. Note: Make sure that the bucket name does not have leading or trailing spaces, line feeds, or tab characters.
- **Prefix** (optional): The prefix of the source data address. Specify a prefix to migrate only the data in a specific directory. A prefix cannot start with a forward slash (/) and must end with a forward slash (/). If you do not specify a prefix, all data in the bucket is migrated.
- **Tunnel** (optional): The name of the tunnel to use. Important: This parameter is required only when you migrate data to the cloud over leased lines or VPN gateways, or migrate data from self-managed databases to the cloud. If the data is stored in a local file system or you need to migrate data over a leased line in an environment such as Alibaba Finance Cloud or Apsara Stack, you must create and deploy an agent.
- **Agent** (optional): The name of the agent to use. Important: This parameter is required only when you migrate data to the cloud over leased lines or VPN gateways, or migrate data from self-managed databases to the cloud. You can select up to 30 agents for a specific tunnel.
Step 3: Create a destination data address
In the left-side navigation pane, choose Data Online Migration > Address Management. On the Address Management page, click Create Address.
In the Create Address panel, configure the following parameters and click OK.
- **Name** (required): The name of the destination data address. The name must be 3 to 63 characters in length, is case-sensitive, and can contain lowercase letters, digits, hyphens (-), and underscores (\_). The name must be encoded in UTF-8 and cannot start with a hyphen (-) or an underscore (\_).
- **Type** (required): The type of the destination data address. Select Alibaba OSS.
- **Custom Domain Name** (optional): Specifies whether custom domain names are supported.
- **Region** (required): The region in which the destination data address resides. Example: China (Hangzhou).
- **Authorize Role** (required): The RAM role authorization for accessing the destination bucket.
  - If the destination bucket belongs to the current Alibaba Cloud account: we recommend that you create and authorize a RAM role in the Data Online Migration console. For more information, see Authorize a RAM role in the Data Online Migration console. You can also manually attach policies to a RAM role in the RAM console. For more information, see the "Step 4: Grant permissions on the destination bucket to a RAM role" section of the Preparations topic.
  - If the destination bucket belongs to a different Alibaba Cloud account: attach policies to a RAM role in the OSS console. For more information, see the "Step 4: Grant permissions on the destination bucket to a RAM role" section of the Preparations topic.
- **Bucket** (required): The name of the OSS bucket to which data is migrated.
- **Prefix** (optional): The prefix of the destination data address. The prefix cannot start with a forward slash (/) and must end with a forward slash (/). Example: data/to/oss/.
  - With a prefix: for example, if the source prefix is example/src/, a file named example.jpg is stored in example/src/, and the destination prefix is example/dest/, the full path of the migrated file is example/dest/example.jpg.
  - Without a prefix: source data is migrated to the root directory of the destination bucket.
- **Tunnel** (optional): The name of the tunnel to use. Important: This parameter is required only when you migrate data to the cloud over leased lines or VPN gateways, or migrate data from self-managed databases to the cloud. If data at the destination data address is stored in a local file system or you need to migrate data over a leased line in an environment such as Alibaba Finance Cloud or Apsara Stack, you must create and deploy an agent.
- **Agent** (optional): The name of the agent to use. Important: This parameter is required only when you migrate data to the cloud over leased lines or VPN gateways, or migrate data from self-managed databases to the cloud. You can select up to 30 agents for a specific tunnel.
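The prefix mapping described above can be sketched in a few lines. The `map_key` function is a hypothetical helper for illustration, not part of any Data Online Migration tooling:

```python
def map_key(source_key: str, source_prefix: str, dest_prefix: str) -> str:
    """Map a source object key to its destination key by swapping prefixes.

    Mirrors the documented behavior: the source prefix is stripped and the
    destination prefix is prepended. Non-empty prefixes must end with "/"
    and must not start with "/", matching the console's validation rules.
    """
    for p in (source_prefix, dest_prefix):
        if p and (p.startswith("/") or not p.endswith("/")):
            raise ValueError(f"invalid prefix: {p!r}")
    if not source_key.startswith(source_prefix):
        raise ValueError("key does not match the source prefix")
    return dest_prefix + source_key[len(source_prefix):]

# The example from the parameter description:
# example/src/example.jpg with destination prefix example/dest/
print(map_key("example/src/example.jpg", "example/src/", "example/dest/"))
# example/dest/example.jpg
```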
Step 4: Create a migration task
The number of concurrent migration tasks per region is limited: up to 10 in Chinese mainland regions and the China (Hong Kong) region, and up to 5 in regions outside China. If the number of concurrent migration tasks in a region exceeds the limit, periodic task scheduling may not work as expected.
In the left-side navigation pane, choose Data Online Migration > Migration Tasks. On the Migration Tasks page, click Create Task.
In the Select Address step, configure the following parameters and click Next.
- **Name** (required): The name of the migration task. The name must be 3 to 63 characters in length, is case-sensitive, and can contain lowercase letters, digits, hyphens (-), and underscores (\_). The name must be encoded in UTF-8 and cannot start with a hyphen (-) or an underscore (\_).
- **Source Address** (required): The source data address that you created.
- **Destination Address** (required): The destination data address that you created.

In the Task Configurations step, configure the following parameters.
- **Migration Bandwidth** (optional): The maximum bandwidth available to the migration task. Valid values:
  - Default: Use the default upper limit for migration bandwidth. The actual bandwidth depends on the file size and the number of files.
  - Specify an upper limit: Set a custom upper limit for migration bandwidth. Important: The actual migration bandwidth depends on multiple factors, such as the source data address, network conditions, throttling at the destination data address, and file size, and may not reach the specified upper limit. Set a reasonable upper limit based on your source data address, migration purpose, business situation, and network bandwidth. Inappropriate throttling may affect business performance.
- **Files Migrated Per Second** (optional): The maximum number of files that can be migrated per second. Valid values:
  - Default: Use the default upper limit.
  - Specify an upper limit: Set a custom upper limit. Important: The actual migration speed depends on multiple factors, such as the source data address, network conditions, throttling at the destination data address, and file size, and may not reach the specified upper limit. Set a reasonable upper limit based on your source data address, migration purpose, business situation, and network bandwidth. Inappropriate throttling may affect business performance.
- **Overwrite Method** (optional): Specifies whether to overwrite a file at the destination if it has the same name as a file at the source. Valid values:
  - Do not overwrite: The source file is not migrated.
  - Overwrite All: The destination file is overwritten.
  - Overwrite based on the last modification time: The destination file is overwritten if the source file was modified more recently. If the source and destination files have the same last modification time, the destination file is overwritten if the files differ in size or Content-Type header. Warning: With this setting, a newer file may be overwritten by an older file that has the same name. Make sure that the source file carries last modification time, size, and Content-Type information; otherwise, the overwrite policy may not work as expected. If you select Do not overwrite or Overwrite based on the last modification time, the system sends requests to the source and destination data addresses to obtain metadata, which generates request fees at both addresses.
- **Migration Report** (required): Specifies whether to deliver a migration report to the destination bucket. Valid values:
  - Do not push (default): Does not deliver the migration report.
  - Push: Delivers the migration report to the destination bucket. For more information, see Subsequent operations. Important: The migration report occupies storage space at the destination data address and may be delivered with a delay; wait until the report is generated. A unique ID is generated for each execution of a task, and a migration report is delivered only once. We recommend that you do not delete the migration report.
- **Migration Logs** (required): Specifies whether to deliver migration logs to Simple Log Service. Valid values:
  - Do not push (default): Does not deliver migration logs.
  - Push: Delivers migration logs to Simple Log Service. You can view the logs in the Simple Log Service console.
  - Push only file error logs: Delivers only error logs to Simple Log Service.

  If you select Push or Push only file error logs, Data Online Migration creates a project in Simple Log Service. The project name follows the format aliyun-oss-import-log-{Alibaba Cloud account ID}-{Region}. Example: aliyun-oss-import-log-137918634953\*\*\*\*-cn-hangzhou. Important: Before you select Push or Push only file error logs, make sure that Simple Log Service is activated and that you have confirmed the authorization on the Authorize page.
- **Authorize** (optional): Displayed only when you set Migration Logs to Push or Push only file error logs. Click Authorize to go to the Cloud Resource Access Authorization page, and then click Confirm Authorization Policy. The RAM role AliyunOSSImportSlsAuditRole is created and granted the required permissions.
- **File Name** (optional): A filter based on file names. Both inclusion and exclusion rules are supported. The filter uses re2 regular expression syntax. For more information, see re2. Examples:
  - `.*\.jpg$` matches all files whose names end with .jpg.
  - `^file.*` matches all files whose names start with file in the root directory. If the source data address has a prefix of data/to/oss/, use `^data/to/oss/file.*` to match files whose names start with file in that directory.
  - `.*/picture/.*` matches files whose paths contain a subdirectory named picture.

  Important: If an inclusion rule is configured, only files that match the rule are migrated. If multiple inclusion rules are configured, files that match any of the rules are migrated. If an exclusion rule is configured, files that match the rule are not migrated. If multiple exclusion rules are configured, files that match any of the rules are excluded. Exclusion rules take precedence over inclusion rules: if a file matches both an exclusion rule and an inclusion rule, the file is not migrated.
- **File Modification Time** (optional): A filter based on the last modification time of files. If you specify a time period, only files whose last modification time falls within that period are migrated. Examples:
  - Start time January 1, 2019, no end time: only files modified on or after January 1, 2019 are migrated.
  - End time January 1, 2022, no start time: only files modified on or before January 1, 2022 are migrated.
  - Start time January 1, 2019 and end time January 1, 2022: only files modified on or after January 1, 2019 and on or before January 1, 2022 are migrated.
- **Execution Time** (optional): The schedule for running the migration task. Valid values:
  - Immediately: Runs the task right away.
  - Scheduled Task: Runs the task within a specified time period every day. The task starts at the specified start time and stops at the specified stop time.
  - Periodic Scheduling: Runs the task based on the execution frequency and the number of executions that you specify.
    - Execution Frequency: The frequency at which the task runs. Valid values: Every Hour, Every Day, Every Week, Certain Days of the Week, and Custom. For more information, see the Supported execution frequencies section.
    - Executions: The maximum number of times the task runs. By default, the task runs once. Important: If the current execution is not complete by the next scheduled start time, the next execution starts at the first scheduled time after the current run finishes, until the task has run the specified number of times. If Data Online Migration is deployed in a Chinese mainland region or the China (Hong Kong) region, up to 10 concurrent migration tasks are supported; in regions outside China, up to 5 are supported. If concurrent tasks exceed the limit, executions may not complete on schedule. You can manually start and stop tasks at any time, regardless of the scheduled execution time.
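The file-name filter semantics above can be sketched as follows. `should_migrate` is a hypothetical helper; note that the console uses re2 syntax, while this sketch uses Python's `re` module, which accepts these simple patterns but is not a full re2 implementation:

```python
import re

def should_migrate(key: str, includes: list[str], excludes: list[str]) -> bool:
    """Apply the documented filter semantics to one object key.

    Exclusion rules take precedence over inclusion rules; with no inclusion
    rules, every non-excluded file is migrated. Sketch only: the service
    itself evaluates the rules with re2, not Python's re module.
    """
    if any(re.search(p, key) for p in excludes):
        return False          # matching any exclusion rule blocks migration
    if includes:
        return any(re.search(p, key) for p in includes)
    return True               # no inclusion rules: everything else migrates

# Patterns from the examples above
includes = [r".*\.jpg$"]
excludes = [r".*/picture/.*"]
print(should_migrate("data/to/oss/photo.jpg", includes, excludes))   # True
print(should_migrate("data/picture/photo.jpg", includes, excludes))  # False
```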
Read the Data Online Migration Agreement. Select I have read and agree to the Alibaba Cloud International Website Product Terms of Service. and I have understood that when the migration task is complete, the migrated data may be different from the source data. Therefore, I have the obligation and responsibility to confirm the consistency between the migrated data and source data. Alibaba Cloud is not responsible for the confirmation of the consistency between the migrated data and source data. Then, click Next.
Verify that the configurations are correct and click OK. The migration task is created.
Supported execution frequencies
| Frequency | Description | Example |
|---|---|---|
| Every Hour | Runs the migration task every hour. You can also specify the maximum number of times the task runs. | Schedule a task to run every hour for three times. If the current time is 08:05, the task starts at 09:00. If the first run finishes before 10:00, the second run starts at 10:00, and so on. If the first run finishes at 12:30, the second run starts at 13:00. |
| Every Day | Runs the migration task every day at a full hour between 00:00 and 23:00. You can also specify the maximum number of times the task runs. | Schedule a task to run at 10:00 every day for five times. If the current time is 08:05, the task starts at 10:00. If the first run finishes before 10:00 the next day, the second run starts at 10:00 that day. If the first run finishes at 12:05 the next day, the second run starts at 10:00 the day after. |
| Every Week | Runs the migration task every week on a specified day at a full hour between 00:00 and 23:00. You can also specify the maximum number of times the task runs. | Schedule a task to run at 10:00 every Monday for 10 times. If the current time is 08:05 on Monday, the task starts at 10:00. If the first run finishes before 10:00 the next Monday, the second run starts at 10:00 that Monday. If the first run finishes at 12:05 the next Monday, the second run starts at 10:00 the Monday after. |
| Certain Days of the Week | Runs the migration task on specific days of the week at a full hour between 00:00 and 23:00. | Schedule a task to run at 10:00 every Monday, Wednesday, and Friday. If the current time is 08:05 on Wednesday, the task starts at 10:00. If the first run finishes before 10:00 on Friday, the second run starts at 10:00 on Friday. If the first run finishes at 12:05 the next Monday, the second run starts at 10:00 the next Wednesday. |
| Custom | Uses a cron expression to specify a custom schedule for the migration task. | A cron expression consists of six fields separated by spaces, specifying the schedule in the following order: second, minute, hour, day of the month, month, and day of the week. Examples: `0 * * * * *`: Runs every minute, at second 0. `0 0 0/1 * * ?`: Runs every hour, on the hour. `0 0 12 * * MON-FRI`: Runs at 12:00 every Monday through Friday. `0 30 8 1,15 * *`: Runs at 08:30 on the 1st and 15th of each month. |
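As a quick sanity check on the six-field layout, the fields can be split into named positions. The `split_cron` helper below is illustrative only; it validates the field count, not the syntax of each field:

```python
def split_cron(expr: str) -> dict:
    """Split a six-field cron expression into named fields.

    A minimal sketch of the field order documented above (second, minute,
    hour, day of the month, month, day of the week). It does not validate
    the values inside each field.
    """
    names = ["second", "minute", "hour", "day_of_month", "month", "day_of_week"]
    fields = expr.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    return dict(zip(names, fields))

print(split_cron("0 0 12 * * MON-FRI"))
# {'second': '0', 'minute': '0', 'hour': '12', 'day_of_month': '*',
#  'month': '*', 'day_of_week': 'MON-FRI'}
```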
Step 5: Verify data
Data Online Migration handles data migration only and does not guarantee data consistency or integrity. After a migration task is complete, you must review all migrated data and verify data consistency between the source and destination data addresses.
Verify the migrated data at the destination data address before deleting data at the source data address. If you delete source data before verification, you are liable for any losses or consequences caused by data loss.
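One way to approach verification is to compare object listings from both sides by name, size, and checksum. The sketch below assumes you have already exported both listings into dictionaries (for example, by parsing the output of listing tools or SDK calls); `diff_listings` is a hypothetical helper. Note that ETags are not directly comparable across services for multipart uploads, so comparing sizes together with checksums you compute yourself is safer:

```python
def diff_listings(source: dict[str, tuple[int, str]],
                  dest: dict[str, tuple[int, str]]) -> dict[str, list[str]]:
    """Compare two object listings keyed by object name.

    Each value is a (size, checksum) tuple. How you obtain the listings is
    up to you; this sketch only performs the comparison and reports objects
    that are missing at the destination or differ in size/checksum.
    """
    report = {"missing": [], "mismatched": []}
    for key, meta in source.items():
        if key not in dest:
            report["missing"].append(key)
        elif dest[key] != meta:
            report["mismatched"].append(key)
    return report

source = {"a.jpg": (1024, "d41d8cd9"), "b.txt": (10, "9e107d9d")}
dest   = {"a.jpg": (1024, "d41d8cd9")}
print(diff_listings(source, dest))   # {'missing': ['b.txt'], 'mismatched': []}
```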