This topic describes how to migrate all data in a MySQL database to MaxCompute.

Background information

The database migration feature allows you to migrate all tables in a MySQL database to MaxCompute in an efficient and cost-effective manner. This saves the time and cost that are required to migrate your initial data to the cloud. For more information, see Rules and restrictions.

Procedure

  1. Go to the Data Source page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where your workspace resides. Find the workspace and click Data Integration in the Actions column.
    4. In the left-side navigation pane of the page that appears, click Connection. The Data Source page appears.
  2. On the Data Source page, click New data source in the upper-right corner. In the Add data source dialog box, create a MySQL connection named clone_database for database migration. For more information, see Configure a MySQL data source.
  3. Click the Icon icon in the upper-left corner and choose All Products > Data Aggregation > Data Integration.
  4. On the Data Integration page, click Migrate Database in the left-side navigation pane.
  5. On the Migrate Database page, find the created MySQL connection and click Migration in the Operation column.
    The database migration settings page consists of three sections. Database migration
    No. Section Description
    1 Tables to migrate This section lists all the tables in the MySQL database that is connected based on the clone_database connection. You can select the tables to migrate based on your needs.
    2 Advanced settings You can configure the rules for converting the table names, field names, and data types between MySQL and MaxCompute.
    3 Basic settings You can specify whether to synchronize full or incremental data on a daily basis, whether to upload data in one or more batches, and the synchronization frequency and efficiency. You can also view the migration progress and results of each table after you commit sync nodes.
    If you select Synchronize Incremental Data Daily, you must specify the Incremental configuration mode parameter. Valid values:
    • Incremental field: Data Integration automatically generates a WHERE clause based on the specified field to read incremental data.
    • Where condition for incremental extraction?: Data Integration uses the specified WHERE clause to read incremental data.
  6. Click Advanced Settings. In the Advanced Settings dialog box, configure conversion rules based on your needs. For example, you can add an ods_ prefix to the name of each MaxCompute table.
    Advanced Settings dialog box
  7. Specify basic settings. In this example, set the Sync Method parameter to Synchronize Incremental Data Daily and configure the incremental data to be determined based on the gmt_modified field. Data Integration automatically generates a WHERE clause for each sync node to read incremental data based on the specified field and DataWorks scheduling parameters such as ${bdp.system.bizdate}.
    Data Integration reads data from MySQL tables by connecting to a remote MySQL database over Java Database Connectivity (JDBC) and executing SELECT statements. Data Integration uses standard SQL statements, and therefore you can configure WHERE clauses to filter data. The following code shows an example of the WHERE clause:
    STR_TO_DATE('${bdp.system.bizdate}', '%Y%m%d') <= gmt_modified AND gmt_modified < DATE_ADD(STR_TO_DATE('${bdp.system.bizdate}', '%Y%m%d'), interval 1 day)

    Set the Sync Mode parameter to Upload in Batches to protect the MySQL database from being overloaded. Configure Data Integration to start data synchronization for three tables every one hour from 00:00 each day.

  8. Click Commit Sync Node. Then, you can view the migration progress and results of each table.
  9. Find Table a1 and click View Node to view the migration results.

Result

After you complete the preceding steps, sync nodes are configured to migrate all data in the MySQL database to MaxCompute by using the clone_database connection. These sync nodes are run based on the specified schedule, daily by default. You can also create retroactive node instances to synchronize historical data.
The logs in the following figure show that Table a1 has been migrated. View logs