After the data sources, network environments, and resource groups are configured, you can create a real-time sync node to synchronize data to an AnalyticDB for MySQL data source. This topic describes how to create a real-time sync node and view the status of the node.

Limits

  • You can run a real-time sync node to synchronize data to an AnalyticDB for MySQL data source only from a PolarDB, a MySQL, or an ApsaraDB for OceanBase data source.

Create a real-time sync node

  1. Log on to the DataWorks console.
  2. In the left-side navigation pane, click Workspaces.
  3. Select the region in which the workspace that you want to manage resides, find the workspace, and then click Data Analytics in the Actions column.
  4. Create a workflow.
    If you have a workflow, skip this step.
    1. Move the pointer over the Create icon and select Workflow.
    2. In the Create Workflow dialog box, set the Workflow Name parameter.
    3. Click Create.
  5. Create a real-time sync node.
    1. On the DataStudio page, create a real-time sync node. In the Create Node dialog box, set the parameters as required.
      The parameters are described as follows:
      • Node Type: the type of the node. Default value: Real-time synchronization.
      • Sync Method: set this parameter to Migration to AnalyticDB MySQL 3.0 in realtime mode. This setting is used to synchronize data from specified or all tables in a database to an AnalyticDB for MySQL data source.
      • Node Name: the name of the node. The name must be 1 to 128 characters in length and can contain letters, digits, underscores (_), and periods (.).
      • Location: the directory in which the real-time sync node is stored.
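The Node Name constraint can be checked before you open the Create Node dialog box. The following Python sketch is illustrative only: the function name is_valid_node_name is hypothetical, and the sketch assumes that "letters" means ASCII letters. The console may accept a wider character set.

```python
import re

# Assumption: "letters" means ASCII letters; the console may accept more.
_NODE_NAME_PATTERN = re.compile(r"[A-Za-z0-9_.]{1,128}")


def is_valid_node_name(name: str) -> bool:
    """Return True if name is 1 to 128 characters long and contains only
    letters, digits, underscores (_), and periods (.)."""
    return _NODE_NAME_PATTERN.fullmatch(name) is not None
```

For example, is_valid_node_name("sync_orders.v1") passes the check, whereas an empty string, a 129-character name, or a name that contains a hyphen does not.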
  6. Select a source and configure synchronization rules.
    1. In the Data Source section, specify the Type and Data source parameters.
      Note You can set the Type parameter only to MySQL, PolarDB, or OceanBase.
    2. In the Source Table section, select the tables whose data you want to synchronize from the Source Table list. Then, click the Add icon to add the tables to the Selected Source Table list.
      The Source Table section displays all the tables in the source. You can select all or specific tables.
      Notice If a selected table does not have a primary key, the table cannot be synchronized in real time.
    3. In the Mapping Rule for Table Name section, click Add rule to select a rule.
      Supported options are Conversion Rule for Table Name and Rule for Destination Table name.
      • Conversion Rule for Table Name: the rule that is used to convert the names of source tables to those of destination tables.
      • Rule for Destination Table name: the rule that is used to add a prefix or a suffix to the converted names of destination tables.
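The two rule types are applied in sequence: the source name is converted first, and then a prefix or suffix is added to the converted name. The following Python sketch illustrates that order. It is not the DataWorks implementation: the function map_table_name and its parameters are hypothetical, and the sketch assumes that a conversion rule behaves like a regular-expression search and replace.

```python
import re
from typing import Optional


def map_table_name(
    source_name: str,
    pattern: Optional[str] = None,  # hypothetical conversion rule: regex to search for
    replacement: str = "",          # hypothetical conversion rule: replacement text
    prefix: str = "",               # destination rule: prefix to add
    suffix: str = "",               # destination rule: suffix to add
) -> str:
    """Derive a destination table name from a source table name."""
    # Step 1: convert the source table name (skipped if no rule is given).
    converted = re.sub(pattern, replacement, source_name) if pattern else source_name
    # Step 2: add the prefix and suffix to the converted name.
    return f"{prefix}{converted}{suffix}"


# Strip a trailing year from the source name, then add the prefix "ods_".
print(map_table_name("orders_2023", pattern=r"_\d{4}$", prefix="ods_"))  # ods_orders
```

If no conversion rule is configured, only the prefix and suffix are applied to the original source table name.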
    4. Click Next Step.
  7. Select a data source as the destination and configure the formats for the destination tables.
    1. In the Set Destination Table step, specify the Target AnalyticDB for MySQL 3.0 data source parameter.
    2. Click Refresh source table and AnalyticDB MySQL 3.0 Table Mapping to configure the mappings between the source tables to be synchronized and the destination AnalyticDB for MySQL tables.
    3. View the mapping progress, source tables, and mapped destination tables.
      The numbered areas on the page are described as follows:
      • 1: the progress of mapping the source tables to the destination tables.
        Note The mapping can take a long time if you synchronize data from a large number of tables.
      • 2: how duplicate data is removed based on primary keys.
          • If the tables in the source database contain primary keys, the system removes duplicate data based on the primary keys during the synchronization.
          • If the tables in the source database do not contain primary keys, you can click the Edit icon to customize primary keys. You can use one field or a combination of several fields as the primary key of a table. This way, the system removes duplicate data based on the primary keys during the synchronization.
        Note A real-time sync node cannot be used to synchronize a table that has no primary key.
      • 3: the method that is used to create a table. Valid values:
          • If you set the Table creation method parameter to Use existing Table, the names of the automatically created AnalyticDB for MySQL tables are displayed in the AnalyticDB for MySQL 3.0 Table name column. You can also select the table name that you want to use from the drop-down list.
          • If you set the Table creation method parameter to Create Table, the names of the automatically created AnalyticDB for MySQL tables are displayed. To view and modify the SQL statements that are used to create a table, click the name of the table.
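Primary-key-based deduplication can be pictured as keeping one row per primary key value. The following Python sketch shows the idea for a single-field key or a combination of fields. It assumes that the most recent row wins when two rows share a key. This topic does not state which row is kept, so that behavior is an assumption, and the function name deduplicate_by_primary_key is hypothetical.

```python
from typing import Any, Dict, List, Sequence, Tuple

Row = Dict[str, Any]


def deduplicate_by_primary_key(rows: List[Row], key_fields: Sequence[str]) -> List[Row]:
    """Keep one row per primary key value.

    Assumption: a later row replaces an earlier row that has the same key.
    """
    latest: Dict[Tuple[Any, ...], Row] = {}
    for row in rows:
        # Build the key from one field or a combination of several fields.
        key = tuple(row[field] for field in key_fields)
        latest[key] = row
    return list(latest.values())


rows = [
    {"id": 1, "region": "eu", "amount": 10},
    {"id": 1, "region": "eu", "amount": 15},  # same key as the first row
    {"id": 2, "region": "eu", "amount": 7},
]
result = deduplicate_by_primary_key(rows, key_fields=("id", "region"))
```

In this example, result contains two rows: the row with amount 15 for the key (1, "eu") and the row with amount 7 for the key (2, "eu").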
    4. Click Next Step.
      If you set the Table creation method parameter to Create Table, you must click Start table building in the Create tables automatically dialog box to create destination AnalyticDB for MySQL tables.
  8. Configure rules for processing DDL messages.
    DDL operations can occur in the source. Before you synchronize data, you can configure a processing rule for each type of DDL statement based on your business requirements.
    Note The rules apply when a real-time sync node is run for the first time. If you want to modify the rules in subsequent operations, go to the configuration page of the real-time sync node. For more information, see the "Start the real-time sync node" section of this topic.
    1. Click Next Step.

Start the real-time sync node

  1. Go back to the previous page and click Start in the Operation column of the node that you want to start.
  2. In the Start dialog box, set the parameters as required.
    The parameters are described as follows:
    • Whether to reset the site: specifies whether to set the time point for the next startup. If you select Reset site, the Start time point and Time zone parameters are required.
    • Start time point: the date and time for starting the real-time sync node.
    • Time zone: the time zone in which the real-time sync node is run. You can select a time zone from the Time zone drop-down list.
    • Failover: the maximum number of failovers allowed within the specified time range.
      Note If this parameter is not specified, the system automatically stops the node if the number of failovers exceeds 100 within 5 minutes. This avoids excessive resource consumption caused by the frequent starting of the node.
    • Dirty data policy: the policy that is used to handle dirty data. Valid values:
        • Zero tolerance, not allowed: The real-time sync node is automatically stopped if the node contains dirty data.
        • No limit: The real-time sync node can normally run regardless of whether the node contains dirty data.
        • Limited control: The real-time sync node is automatically stopped if the amount of dirty data contained in the node exceeds a specified value.
    • Processing Policy for DDL Messages in Real-time Sync: you can modify the configured rules that are used to process DDL messages based on your business requirements. For more information, see Step 8 in the "Create a real-time sync node" section of this topic.
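The Failover and Dirty data policy settings both define conditions under which a running node is stopped. The following Python sketch models those conditions to make the thresholds concrete. It is not the DataWorks implementation: the policy identifiers, the FailoverWindow class, and the sliding-window counting method are assumptions made for illustration. Only the default values (more than 100 failovers within 5 minutes) come from this topic.

```python
from collections import deque


def should_stop_for_dirty_data(policy: str, dirty_count: int, limit: int = 0) -> bool:
    """Decide whether the node stops, given the amount of dirty data seen.

    The policy identifiers below are hypothetical names for the three options.
    """
    if policy == "zero_tolerance":
        return dirty_count > 0      # any dirty data stops the node
    if policy == "no_limit":
        return False                # dirty data never stops the node
    if policy == "limited_control":
        return dirty_count > limit  # stop only above the specified value
    raise ValueError(f"unknown policy: {policy}")


class FailoverWindow:
    """Count failovers in a sliding time window (assumed counting method)."""

    def __init__(self, max_failovers: int = 100, window_seconds: float = 300.0):
        self.max_failovers = max_failovers
        self.window_seconds = window_seconds
        self._events = deque()

    def record(self, timestamp: float) -> bool:
        """Record a failover at timestamp (in seconds).

        Return True if the number of failovers in the window exceeds the limit.
        """
        self._events.append(timestamp)
        # Drop failovers that are older than the window.
        while timestamp - self._events[0] > self.window_seconds:
            self._events.popleft()
        return len(self._events) > self.max_failovers
```

With the default values, record returns True on the 101st failover that falls within a 300-second window.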