This topic describes how to use the task orchestration feature of Data Management Service (DMS) to regularly migrate historical data from an ApsaraDB RDS for MySQL database that uses the InnoDB engine to an ApsaraDB RDS for MySQL database that uses X-Engine. This is a cost-effective storage solution.

Prerequisites

Database instances are registered in DMS and meet the following conditions:
  • The source database is an ApsaraDB RDS for MySQL database that uses the InnoDB engine.
  • The destination database is an ApsaraDB RDS for MySQL database that uses X-Engine. For information about how to create an ApsaraDB RDS for MySQL instance, see Create an ApsaraDB RDS for MySQL instance.
  • Both the source and destination instances are in Secure Collaboration mode.
  • Cross-database query is enabled for both the source and destination instances. For information about how to enable this feature, see Modify an instance.
    Note In this example, the database link of the source instance is dblink_src_rds, and that of the destination instance is dblink_target_rds.

Background information

As business develops and data accumulates in a database, various problems may arise, such as surging storage costs and reduced performance of the database. Therefore, Alibaba Cloud provides a cost-effective storage solution based on X-Engine, a storage engine that is developed by Alibaba Cloud. Assume that your business is supported by an ApsaraDB RDS for MySQL database that uses the InnoDB engine. You can migrate historical data in this database to an ApsaraDB RDS for MySQL database that uses X-Engine, where data can be stored at lower costs. In this way, only hot data is stored in the business database. This improves the security and availability of your business and reduces storage costs.
Note X-Engine is an online transaction processing (OLTP) storage engine that is developed by Alibaba Cloud. This storage engine is widely used in many business systems of Alibaba Group, including the transaction history database and DingTalk chat history database. This significantly reduces business costs. In addition, X-Engine is a crucial database technology that empowers Alibaba Group to cope with bursts of traffic that may surge to hundreds of times greater than usual during Double 11. X-Engine provides similar performance as the InnoDB engine, but costs far less than the InnoDB engine. Therefore, X-Engine is a highly cost-effective storage engine. For more information, see X-Engine overview.

You can use the task orchestration feature of DMS to automatically and regularly migrate historical data to a database that uses X-Engine. This helps reduce labor costs.

In this example, a task flow that consists of four tasks is created by using the task orchestration feature in DMS. The first task in the task flow creates a destination table in the ApsaraDB RDS for MySQL database that uses X-Engine. The second task backs up historical data in a source table, namely, data that has been stored in the source table for more than a month, to the destination table on a daily basis. The third task deletes the historical data in the source table. Then, the last task executes an OPTIMIZE TABLE statement to optimize the source table. This task flow is run at a specified time point every day. The source database only stores data that is generated over the last month. Large amounts of historical data are migrated to the destination database that uses X-Engine, which is more cost-effective.

Create a task flow

  1. Log on to the DMS console.
  2. In the top navigation bar, choose Data Factory > Task Orchestration. The Home tab of the Task Orchestration page appears.
  3. In the Free orchestration tasks section, click New task flow.
  4. In the New Task Flow dialog box, enter relevant information in the Task Flow Name and Description fields and click OK. In this example, set the task flow name to Rds_innodb_to_X-Engine and enter Rds_innodb_to_X-Engine demo in the Description field. The Task Orchestration page appears.
    New Task Flow dialog box

Create tasks

Create the following tasks in the task flow:
  • Create a destination table: This task creates a destination table in the destination database. This table will be used to store historical data of the source table.
  • Back up historical data: This task uses cross-database query to back up historical data from a source table in the source database to the destination table.
  • Delete the historical data in the source table: After the historical data in the source table is backed up to the destination table, this task deletes the historical data in the source table.
  • Optimize the source table: After the historical data in the source table is deleted, this task executes an OPTIMIZE TABLE statement to optimize the source table and save storage space.
  1. On the Task Orchestration page, create and configure the following tasks:
    Create a destination table
    1. In the navigation tree, find the MySQL task node and drag the task node to the canvas.
    2. Double-click this task node on the canvas, modify its name, and then press Enter. In this example, rename the task node as Create a destination table.
    3. Click this task node. The Content tab appears on the right.
    4. On the Content tab, select the destination database from the drop-down list. In this example, select the ApsaraDB RDS for MySQL database that uses X-Engine.
    5. Enter an SQL statement for creating a destination table in the destination database and click Save. In this example, enter the following SQL statement:
      CREATE TABLE IF NOT EXISTS `target_xengine_tbl` (
        `id` BIGINT,
        `price` DECIMAL(10,2),
        `count` INT,
        `trx_time` DATETIME,
        PRIMARY KEY (`id`)
      ) ENGINE=XENGINE DEFAULT CHARSET=utf8;
      Note In this example, the name of the destination table is target_xengine_tbl. The schema of this table must be the same as that of the source table, which is named src_innodb_tbl.
      Content tab
    Back up historical data
    1. In the navigation tree, find the Cross Database SQL task node and drag the task node to the canvas.
    2. Double-click this task node on the canvas, modify its name, and then press Enter. In this example, rename the task node as Back up historical data.
    3. Click this task node. The Content tab appears on the right.
    4. On the Content tab, enter an SQL statement for backing up historical data from the source table to the destination table and click Save. In this example, enter the following SQL statement:
      INSERT INTO `dblink_target_rds`.`target_db`.`target_xengine_tbl`
       (`id`, `price`, `count`, `trx_time`)
      SELECT `id`, `price`, `count`, `trx_time`
      FROM `dblink_src_rds`.`src_db`.`src_innodb_tbl`
      WHERE `trx_time` >= '${thirty_one_days_ago}'
      AND `trx_time` < '${thirty_days_ago}';
      Note This SQL statement is used to back up historical data from the source table in the source database to the destination table in the destination database.
      • In this example, the database link dblink_src_rds refers to the source ApsaraDB RDS for MySQL instance that uses the InnoDB engine. The database link dblink_target_rds refers to the destination ApsaraDB RDS for MySQL instance that uses X-Engine.
      • The following two variables are used in the WHERE clause to specify the range of historical data to be backed up: ${thirty_one_days_ago} and ${thirty_days_ago}.
      Content tab
    5. Click the blank area on the canvas. The Scheduling tab appears on the right. Click the Variables tab, configure one or more variables as needed, and then click Save. In this example, configure two variables: thirty_one_days_ago and thirty_days_ago, as shown in the following figure.
      Variables tab
    Delete the historical data in the source table
    1. In the navigation tree, find the MySQL task node and drag the task node to the canvas.
    2. Double-click this task node on the canvas, modify its name, and then press Enter. In this example, rename the task node as Delete the historical data in the source table.
    3. Click this task node. The Content tab appears on the right.
    4. On the Content tab, select the source database from the drop-down list. In this example, select the ApsaraDB RDS for MySQL database that uses the InnoDB engine.
      Note This task is designed to delete the historical data, which has been backed up to the destination table, in the source table.
    5. Enter an SQL statement for deleting the historical data in the source table and click Save. In this example, enter the following SQL statement:
      DELETE FROM `src_innodb_tbl`
        WHERE `trx_time` >= '${thirty_one_days_ago}'
        AND `trx_time` < '${thirty_days_ago}';
      Content tab
    Optimize the source table
    1. In the navigation tree, find the MySQL task node and drag the task node to the canvas.
    2. Double-click this task node on the canvas, modify its name, and then press Enter. In this example, rename the task node as Optimize the source table.
    3. Click this task node. The Content tab appears on the right.
    4. On the Content tab, select the source database from the drop-down list. In this example, select the ApsaraDB RDS for MySQL database that uses the InnoDB engine.
    5. Enter an SQL statement for optimizing the source table and click Save. In this example, enter the following SQL statement:
      OPTIMIZE TABLE `src_innodb_tbl`;
      Content tab
      Note To prevent the OPTIMIZE TABLE statement from affecting online business, we recommend that you allow changing schemas without locking tables for the source ApsaraDB RDS for MySQL instance, as shown in the following figure. For more information, see Modify an instance.Edit instance dialog box
  2. On the canvas, connect the four task nodes to form a task flow. Draw lines between the task nodes in the order that the task nodes are created.
    Connect the task nodes

Configure scheduling properties

  1. Click the blank area on the canvas. The Scheduling tab appears on the right. Turn on the switch at the top of the Scheduling tab and complete the configurations. In this example, configure the task flow to be run at 01:00 every day, as shown in the following figure.
    Scheduling tab
    Note In this example, to avoid business peak hours, the task flow is configured to be run at 01:00 every day. You can customize the scheduling properties based on your actual needs.
  2. Click Save.