You can use Data Transmission Service (DTS) to track data changes from databases in real time. Then, you can consume the tracked data and write data to a destination database. You can use the change tracking feature in the following scenarios: cache updates, asynchronous business decoupling, data synchronization between heterogeneous data sources, and data synchronization with extract, transform, and load (ETL) operations. This topic describes how to create a change tracking task for an ApsaraDB RDS for MySQL instance in a DTS dedicated cluster.

Prerequisites

Usage notes

Category | Description
Limits on the source database
  • The source tables must have PRIMARY KEY or UNIQUE constraints, and all fields must be unique. Otherwise, some of the tracked data changes may be duplicated.
  • If you select tables as the objects to be tracked, up to 500 tables can be tracked in a single change tracking task. If you run a change tracking task to track more than 500 tables, a request error occurs. In this case, we recommend that you configure multiple tasks to track the tables in batches or configure a change tracking task for the entire database.
  • The following requirements for binary logs must be met:
    • The binlog_row_image parameter must be set to full. Otherwise, error messages are returned during the precheck and the change tracking task cannot be started. For more information, see View the parameters of an ApsaraDB RDS for MySQL instance.

    • The binary logs of the source database must be stored for more than 24 hours. Otherwise, DTS may fail to obtain the binary logs and the task may fail. In exceptional circumstances, data inconsistency or loss may occur. Make sure that you set the retention period of binary logs based on the preceding requirements. Otherwise, the service reliability and performance stated in the Service Level Agreement (SLA) of DTS may not be guaranteed.
  • A read-only instance or temporary instance cannot be used as the source instance for change tracking.
Other limits
  • You must make sure that the precision settings for columns of the FLOAT or DOUBLE data type meet your business requirements. DTS uses the ROUND(COLUMN,PRECISION) function to retrieve values from columns of the FLOAT or DOUBLE data type. If you do not specify a precision, DTS sets the precision for the FLOAT data type to 38 digits and the precision for the DOUBLE data type to 308 digits.
  • DTS does not track the DDL operations that are performed by using gh-ost or pt-online-schema-change. Therefore, the change tracking client may fail to write the consumed data to the destination tables due to schema conflicts.
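The source-side limits above lend themselves to a quick pre-flight check before you create the task. The following Python sketch is illustrative only and is not part of DTS. The function names check_source_settings and split_into_batches are hypothetical, and the caller is assumed to supply the current binlog_row_image value (for example, read with SHOW VARIABLES LIKE 'binlog_row_image') and the binary log retention period in hours.

```python
from typing import List, Sequence

MAX_TABLES_PER_TASK = 500        # per-task table limit stated in the limits above
MIN_BINLOG_RETENTION_HOURS = 24  # binary logs must be stored for more than this


def check_source_settings(binlog_row_image: str,
                          binlog_retention_hours: float) -> List[str]:
    """Return a list of problems; an empty list means both checks passed.

    Hypothetical helper: the caller reads the two values from the source
    database or console and passes them in.
    """
    problems: List[str] = []
    # MySQL may report the value in upper case, so compare case-insensitively.
    if binlog_row_image.strip().lower() != "full":
        problems.append(
            f"binlog_row_image is {binlog_row_image!r}; it must be set to full"
        )
    # "More than 24 hours" is read strictly here: exactly 24 is flagged.
    if binlog_retention_hours <= MIN_BINLOG_RETENTION_HOURS:
        problems.append(
            f"binary logs are kept for {binlog_retention_hours} hours; "
            f"more than {MIN_BINLOG_RETENTION_HOURS} hours is required"
        )
    return problems


def split_into_batches(tables: Sequence[str],
                       limit: int = MAX_TABLES_PER_TASK) -> List[List[str]]:
    """Split table names into groups of at most `limit`, one group per task."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return [list(tables[i:i + limit]) for i in range(0, len(tables), limit)]
```

If split_into_batches returns more than one group, each group corresponds to one change tracking task. Selecting the entire database as the tracked object is the alternative that avoids batching.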

Procedure

  1. Go to the Dedicated Cluster page.
  2. In the top navigation bar, select the region in which you want to create a DTS dedicated cluster.
  3. Find the DTS dedicated cluster for which you want to configure a change tracking task. In the Actions column, choose Configure Task > Configure Change Tracking Task.
  4. Configure parameters in the Source Database and Consumer Network Type sections.
    Warning: After you select the source instance, we recommend that you read the limits displayed in the upper part of the page. This helps you create and run the change tracking task.
    Section | Parameter | Description
    Task Name
    N/A

    The task name that DTS automatically generates. We recommend that you specify a descriptive name that makes it easy to identify the task. You do not need to use a unique task name.

    Source Database
    Select an existing database connection. (Optional. If you have not created a database connection, ignore this option and configure database settings in the section below.)
    The database instance that you want to use. You can choose whether to use an existing instance based on your business requirements.
    • If you use an existing instance, DTS automatically applies the parameter settings of the instance.
    • If you do not use an existing instance, you must configure the parameters for the database.
    Database Type
    The type of the source database. Select MySQL.
    Access Method
    The access method of the source database. Select Alibaba Cloud Instance.
    Instance Region
    The region in which the source ApsaraDB RDS for MySQL instance resides. This parameter is set to the region that was specified when the DTS dedicated cluster was created and cannot be changed.
    Replicate Data Across Alibaba Cloud Accounts

    Specifies whether data is replicated across multiple Alibaba Cloud accounts. Set the value to No.

    RDS Instance ID
    The ID of the source ApsaraDB RDS for MySQL instance.
    Database Account
    Enter a database account that has read-only permissions on the ApsaraDB RDS for MySQL instance, or a custom account that has the REPLICATION CLIENT, REPLICATION SLAVE, SHOW VIEW, and SELECT permissions.
    Database Password

    The password of the database account.

    Encryption

    Specifies whether to encrypt the connection to the database. You can select Non-encrypted or SSL-encrypted based on your business requirements. If you select SSL-encrypted, you must enable SSL encryption for the source instance before you configure the change tracking task. For more information, see Configure SSL encryption for an ApsaraDB RDS for MySQL instance.

    Consumer Network Type
    Network Type
    The network type over which data changes are tracked and consumed. Only the Virtual Private Cloud (VPC) network type is supported. Select a VPC and a vSwitch.
    Note: After a change tracking task is configured, you cannot change the network type. Data changes must be tracked and consumed over the specified network type.
  5. In the lower part of the page, click Test Connectivity and Proceed.
    Warning
    • How the CIDR blocks of DTS servers are added depends on where the source database is deployed. For more information, see Add the CIDR blocks of DTS servers to the security settings of on-premises databases.
      • If the source database is an Alibaba Cloud database instance, such as an ApsaraDB RDS for MySQL or ApsaraDB for MongoDB instance, DTS automatically adds the CIDR blocks of DTS servers to the whitelist of the database instance.
      • If the source database is a self-managed database hosted on an ECS instance, DTS automatically adds the CIDR blocks of DTS servers to the security group rules of the ECS instance. To allow DTS to access the database, you must also manually add the CIDR blocks of DTS servers to the security settings of the database.
      • If the source database is a self-managed database that is deployed in a data center or provided by a third-party cloud service provider, you must manually add the CIDR blocks of DTS servers to the security settings of the database to allow DTS to access the database.
    • If the CIDR blocks of DTS servers are automatically or manually added to the whitelist of the database, Alibaba Cloud database instance, or ECS security group rules, security risks may arise. Therefore, before you use DTS to track data changes, you must understand and acknowledge the potential risks and take preventive measures, including but not limited to the following measures: enhancing the security of your username and password, limiting the ports that are exposed, authenticating API calls, regularly checking the whitelist or ECS security group rules and forbidding unauthorized CIDR blocks, or connecting the database to DTS by using Express Connect, VPN Gateway, or Smart Access Gateway.
    • After the DTS task is complete or released, we recommend that you manually detect and remove the added CIDR blocks from the whitelist or ECS security group rules.
  6. Configure the objects to track.
    Parameter | Description
    Data Change Types
    • Data Update

      DTS tracks data updates of the selected objects, including the INSERT, DELETE, and UPDATE operations.

    • Schema Updates

      DTS tracks the create, delete, and modify operations that are performed on all object schemas of the source instance. You must use the change tracking client to filter the required data.

    Source Objects
    Select one or more objects from the Source Objects section and click the rightwards arrow icon to add the objects to the Selected Objects section.
    Note: You can select tables or databases as the objects for change tracking.
    • If you select a database as the object, DTS tracks data changes of all objects in the database, including objects that are added to the database later.
    • If you select a table as the object, DTS tracks only data changes of this table. In this case, if you want to track data changes of another table, you must add the table to the selected objects. For more information, see Modify the objects for change tracking.
  7. Click Next: Advanced Settings to configure advanced settings.
    Parameter | Description
    Select the dedicated cluster used to schedule the task
    Your DTS dedicated cluster is selected by default.
    Set Alerts
    Specifies whether to set alerts for the change tracking task. If the task fails or the latency exceeds the threshold, the alert contacts receive notifications. Valid values:
    Retry Time for Failed Connections
    The retry time range for failed connections. If the source or destination database fails to be connected after the change tracking task is started, DTS immediately retries a connection within the time range. Valid values: 10 to 1440. Unit: minutes. Default value: 720. We recommend that you set the parameter to a value greater than 30. If DTS reconnects to the source and destination databases within the specified time range, DTS resumes the change tracking task. Otherwise, the change tracking task fails.
    Note
    • If you set different retry time ranges for multiple data migration tasks that share the same source or destination database, the value that is set later takes precedence.
    • If DTS retries a connection, you are charged for the operation of the DTS instance. We recommend that you specify the retry time based on your business needs and release the DTS instance at your earliest opportunity after the source and destination instances are released.
    The wait time before a retry when other issues occur in the source and destination databases.
    The retry time range for other issues. For example, if the DDL or DML operations fail to be performed after the change tracking task is started, DTS immediately retries the operations within the time range. Valid values: 1 to 1440. Unit: minutes. Default value: 10. We recommend that you set the parameter to a value greater than 10. If the failed operations are successfully performed within the specified time range, DTS resumes the change tracking task. Otherwise, the change tracking task fails.
    Important: The value of the parameter "The wait time before a retry when other issues occur in the source and destination databases" must be smaller than the value of the Retry Time for Failed Connections parameter.
  8. In the lower part of the page, click Next: Save Task Settings and Precheck.
    Note
    • Before you can start the change tracking task, DTS performs a precheck. You can start the change tracking task only after the task passes the precheck.
    • If the task fails to pass the precheck, click the Info icon next to each failed item to view details.
      • After you troubleshoot the issues based on the causes, run a precheck again.
      • If you do not need to troubleshoot the issues, ignore failed items and run a precheck again.
  9. Wait until the success rate becomes 100%. Then, click Next: Select DTS Instance Type.
  10. Configure Instance Class for the task in the New Instance Class section. You can configure a minimum of one DTS unit (DU) and a maximum of the remaining available DUs.
  11. Read the Data Transmission Service (Pay-as-you-go) Service Terms and select the check box to agree to them.
  12. Click Start Task to start the change tracking task. You can go to the cluster details page and click Cluster Task List in the left-side navigation pane to view the task progress.
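
The two retry parameters in the advanced settings step have interdependent constraints. The following Python sketch, which is not part of DTS, restates those rules as a small validator that you could run before entering values in the console. The function name validate_retry_settings and its argument names are hypothetical. The ranges, defaults, and recommendations are the ones given in the parameter descriptions above.

```python
from typing import List, Tuple


def validate_retry_settings(
    connection_retry_minutes: int = 720,
    other_issue_retry_minutes: int = 10,
) -> Tuple[List[str], List[str]]:
    """Check the two retry settings and return (errors, warnings).

    Errors are violations of the documented valid ranges or ordering rule.
    Warnings mark values that are valid but below the recommended minimum.
    """
    errors: List[str] = []
    warnings: List[str] = []

    # Retry Time for Failed Connections: 10 to 1440 minutes.
    if not 10 <= connection_retry_minutes <= 1440:
        errors.append("connection retry time must be between 10 and 1440 minutes")
    elif connection_retry_minutes <= 30:
        warnings.append("a connection retry time greater than 30 minutes is recommended")

    # Wait time before a retry when other issues occur: 1 to 1440 minutes.
    if not 1 <= other_issue_retry_minutes <= 1440:
        errors.append("wait time for other issues must be between 1 and 1440 minutes")
    elif other_issue_retry_minutes <= 10:
        warnings.append("a wait time for other issues greater than 10 minutes is recommended")

    # The wait time for other issues must be smaller than the connection retry time.
    if other_issue_retry_minutes >= connection_retry_minutes:
        errors.append("wait time for other issues must be smaller than the connection retry time")

    return errors, warnings
```

With the documented defaults (720 and 10), the sketch reports no errors and one warning, because the default wait time for other issues equals 10 minutes while a value greater than 10 is recommended.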

What to do next

When the change tracking task is running, you can create consumer groups based on the downstream client to consume the tracked data.
  1. Create and manage consumer groups. For more information, see Create consumer groups.
  2. Use one of the following methods to consume the tracked data: