This topic describes how to use the task orchestration feature of Data Management (DMS) to regularly migrate historical data from an ApsaraDB RDS for MySQL database that uses the InnoDB engine to an ApsaraDB RDS for MySQL database that uses X-Engine. This is a cost-effective storage solution.

Prerequisites

Database instances are registered with DMS and meet the following conditions:
  • The source database is an ApsaraDB RDS for MySQL database that uses the InnoDB engine.
  • The destination database is an ApsaraDB RDS for MySQL database that uses X-Engine. For information about how to create an ApsaraDB RDS for MySQL instance, see Create an ApsaraDB RDS for MySQL instance.
  • Both the source and destination database instances are managed in Security Collaboration mode.
  • Cross-database query is enabled for both the source and destination database instances. For information about how to enable this feature, see Modify an instance.
    Note In this example, the database link of the source instance is dblink_src_rds, and that of the destination instance is dblink_target_rds.

Background information

As business develops and data accumulates in a database, various problems may arise, such as surging storage costs and reduced performance of databases. Therefore, Alibaba Cloud provides a cost-effective storage solution based on X-Engine. If your business is supported by an ApsaraDB RDS for MySQL database that uses the InnoDB engine, you can migrate historical data in this database to an ApsaraDB RDS for MySQL database that uses X-Engine. This way, only hot data is stored in the business database. This improves data security and availability and reduces storage costs.
Note X-Engine is an online transaction processing (OLTP) storage engine that is developed by Alibaba Cloud. This storage engine is widely used in the business systems of Alibaba Group, such as the historical databases related to transactions and DingTalk services. This significantly reduces business costs. In addition, X-Engine is a crucial database technology that empowers Alibaba Group to cope with bursts of traffic that may surge to hundreds of times greater than usual during massive online promotions. X-Engine provides similar performance as the InnoDB engine, but costs far less than the InnoDB engine. Therefore, X-Engine is highly cost-effective. For more information, see Introduction to X-Engine.
The following figure shows the details of the storage solution that uses X-Engine.Flowchart

You can use the task orchestration feature of DMS to automatically and regularly migrate historical data to a database that uses X-Engine. This helps reduce labor costs.

In this example, a task flow that consists of four tasks is created by using the task orchestration feature. The first task in the task flow creates the destination table in the ApsaraDB RDS for MySQL database that uses X-Engine. The second task migrates historical data from the source table to the destination table on a daily basis. The historical data refers to the data that has been stored in the source table for more than a month. The third task deletes the historical data from the source table. The last task executes an OPTIMIZE TABLE statement to optimize the source table. This task flow is run at a specific point in time every day. After the task flow is run, the source database stores only the data that is generated over the last month. Large amounts of historical data are migrated to the destination database that uses X-Engine, which is more cost-effective.

Create a task flow

  1. Log on to the DMS console V5.0.
  2. In the top navigation bar, move the pointer over the More icon and choose Data Factory > Task Orchestration.
  3. In the Free orchestration tasks section of the Task Orchestration tab, click New task flow.
  4. In the New Task Flow dialog box, set the Task Flow Name parameter to Rds_innodb_to_X-Engine and the Description parameter to Rds_innodb_to_X-Engine demo. Then, click OK. The Task Orchestration tab appears.
    New Task Flow

Create tasks

Create the following tasks in the task flow:
  • Create a destination table: This task creates a destination table in the destination database. This table will be used to store the historical data from the source table.
  • Migrate historical data: This task uses the cross-database query feature to migrate historical data from a source table in the source database to the destination table.
  • Delete the historical data from the source table: After the historical data in the source table is migrated to the destination table, this task deletes the historical data from the source table.
  • Optimize the source table: After the historical data in the source table is deleted, this task executes an OPTIMIZE TABLE statement to optimize the source table and save storage space.
  1. On the Task Orchestration tab, create and configure the following tasks:
    Create a destination table
    1. From the node list on the left side of the Task Orchestration tab, drag the MySQL task node to the blank area of the canvas.
    2. On the canvas, right-click this task node, select Rename, rename the task node "Create a destination table", and then press Enter.
    3. Click this task node. The Content tab appears on the right.
    4. On the Content tab, select the destination database from the Database drop-down list. In this example, select the ApsaraDB RDS for MySQL database that uses X-Engine.
    5. In the code editor on the Content tab, enter an SQL statement for creating a destination table in the destination database and click Save. In this example, enter the following SQL statement:
      CREATE TABLE IF NOT EXISTS `target_xengine_tbl` (
        `id` BIGINT,
        `price` DECIMAL(10,2),
        `count` INT,
        `trx_time` DATETIME,
        PRIMARY KEY (`id`)
      ) ENGINE=XENGINE DEFAULT CHARSET=utf8;
      Note In this example, the name of the destination table is target_xengine_tbl. The schema of this table must be the same as that of the source table named src_innodb_tbl.
      SQL statement for creating a destination table in the destination database
    Migrate historical data
    1. From the node list on the left side of the Task Orchestration tab, drag the Cross Database SQL task node to the blank area of the canvas.
    2. On the canvas, right-click this task node, select Rename, rename the task node "Migrate historical data", and then press Enter.
    3. Click the blank area on the canvas. The Scheduling Settings tab appears on the right. Click the Variables tab, configure one or more variables as needed, and then click Save. In this example, configure two variables: thirty_one_days_ago and thirty_days_ago, as shown in the following figure.

      After you configure the variables, you can use ${Variable name} in SQL statements to specify a point in time or time period. For information about the rules and usage of variables, see Task.

      • The thirty_one_days_ago variable is added in the format of yyyy-MM-dd with an offset of -30 days.

        The value of this variable is accurate to the day and indicates the day that is 30 days ahead of the current day. For example, today is January 15, 2021. The variable value indicates December 15, 2020.

      • The thirty_days_ago variable is added in the format of yyyy-MM-dd with an offset of -29 days.

        The value of this variable is accurate to the day and indicates the day that is 29 days ahead of the current day. For example, today is January 15, 2021. The variable value indicates December 16, 2020.

      Variables tab
    4. Click this task node. The Content tab appears on the right.
    5. In the code editor on the Content tab, enter an SQL statement for migrating historical data from the source table to the destination table and click Save. In this example, enter the following SQL statement:
      INSERT INTO `dblink_target_rds`.`target_db`.`target_xengine_tbl`
       (`id`, `price`, `count`, `trx_time`)
      SELECT `id`, `price`, `count`, `trx_time`
      FROM `dblink_src_rds`.`src_db`.`src_innodb_tbl`
      WHERE `trx_time` >= '${thirty_one_days_ago}'
      AND `trx_time` < '${thirty_days_ago}';
      Note This SQL statement is used to migrate historical data from the source table in the source database to the destination table in the destination database.
      • In this example, the database link dblink_src_rds refers to the source ApsaraDB RDS for MySQL instance that uses the InnoDB engine. The database link dblink_target_rds refers to the destination ApsaraDB RDS for MySQL instance that uses X-Engine.
      • The following two variables are used in the WHERE clause to specify the range of historical data to be migrated: ${thirty_one_days_ago} and ${thirty_days_ago}.
      SQL statement for migrating historical data from the source table to the destination table
    Delete the historical data from the source table
    1. From the node list on the left side of the Task Orchestration tab, drag the Lock-Free data modify task node to the blank area of the canvas.
    2. On the canvas, right-click this task node, select Rename, rename the node "Delete the historical data from the source table", and then press Enter.
    3. Click this task node. The Content tab appears on the right.
    4. On the Content tab, select the source database from the Database drop-down list. In this example, select the ApsaraDB RDS for MySQL database that uses the InnoDB engine.
      Note This task is used to delete the historical data that has been migrated to the destination table from the source table.
    5. In the code editor on the Content tab, enter an SQL statement for deleting the historical data from the source table and click Save. In this example, enter the following SQL statement:
      DELETE FROM `src_innodb_tbl`
        WHERE `trx_time` >= '${thirty_one_days_ago}'
        AND `trx_time` < '${thirty_days_ago}';
      SQL statement for deleting the historical data from the source table
    Optimize the source table
    1. From the node list on the left side of the Task Orchestration tab, drag the Lock-Free data modify task node to the blank area of the canvas.
    2. On the canvas, right-click this task node, select Rename, rename the node "Optimize the source table", and then press Enter.
    3. Click this task node. The Content tab appears on the right.
    4. On the Content tab, select the source database from the Database drop-down list. In this example, select the ApsaraDB RDS for MySQL database that uses the InnoDB engine.
    5. In the code editor on the Content tab, enter an SQL statement for optimizing the source table and click Save. In this example, enter the following SQL statement:
      OPTIMIZE TABLE `src_innodb_tbl`;
      SQL statement for optimizing the source table
      Note To prevent the OPTIMIZE TABLE statement from affecting online business, we recommend that you enable the Open (DMS OnlineDDL first) feature for the source ApsaraDB RDS for MySQL instance, as shown in the following figure. For more information, see Modify an instance. Edit dialog box
  2. On the canvas, connect the four task nodes in the order that the task nodes are created to form a task flow.
    Connect the task nodes

Configure scheduling properties

  1. Click the blank area on the canvas. The Scheduling Settings tab appears on the right. On this tab, turn on Enable Scheduling and set the parameters as described in the following figure.
    Scheduling Settings tab
    Note In this example, the task flow is scheduled to be run at 01:00 every day during off-peak hours. You can customize the scheduling properties as needed.
  2. Click Save.