After you prepare data sources, network environments, and resources, you can create a real-time synchronization node to synchronize data to Hologres. This topic describes how to create a real-time synchronization node and view the status of the node.

Prerequisites

  1. The data sources that you want to use are prepared. Before you configure a data synchronization node, prepare the source from which you want to read data and the destination to which you want to write data so that you can select them during configuration. For information about the data source types, readers, and writers that are supported by real-time synchronization, see Data source types that support real-time synchronization.
    Note For information about the items that you need to understand before you prepare a data source, see Overview.
  2. An exclusive resource group for Data Integration that meets your business requirements is purchased. For more information, see Create and use an exclusive resource group for Data Integration.
  3. Network connections are established between the exclusive resource group for Data Integration and the data sources. For more information, see Establish a network connection between a resource group and a data source.
  4. The data source environments are prepared. You must create an account that can be used to access a database in the source and an account that can be used to access a database in the destination. You must also grant the accounts the permissions required to perform specific operations on the databases based on your configurations for data synchronization. For more information, see Overview.

Limits

  • You can use only exclusive resource groups for Data Integration to run real-time synchronization nodes.

  • You can use a real-time synchronization node to synchronize data to a Hologres data source only from a PolarDB, Oracle, or MySQL data source.
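  • A real-time synchronization node cannot synchronize data from a table that has no primary key. For such a table, you must specify a primary key in the Synchronized Primary Key column when you configure the destination tables.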

Create a real-time synchronization node

  1. Create a real-time synchronization node to synchronize all data in a database.
  2. Configure an exclusive resource group for Data Integration.
  3. Configure the source and mapping rules.
    1. In the Data Source section of the Configure Source and Synchronization Rules step, configure the Type and Data source parameters.
    2. Select the tables from which you want to read data.
      The Source Table list in the Source Table section displays all tables in the selected data source. Select all or some of the tables, and then click the icon to move them to the Selected Source Table list.
      Important If a selected table does not have a primary key, the table cannot be synchronized in real time.
    3. In the Conversion Rule for Table Name section, click Add Rule, select a rule type, and then configure a mapping rule based on the rule type that you selected.
      By default, data in each source table is written to the destination Hologres schema or table that has the same name as the source. A mapping rule lets you change the destination names in the following ways:
      • Specify a destination schema name and a destination table name to write data from multiple source tables to the same Hologres table.
      • Specify prefixes to write data from source tables whose names start with one prefix to Hologres tables whose names start with another prefix.
      • Use a regular expression to specify the names of the destination Hologres schemas or tables to which you want to write data.
      • Concatenate built-in variables to specify the names of the destination Hologres tables.
      For more information about the configuration logic, see Configure the source and mapping rules.
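
The naming behavior described in this step can be pictured with a minimal Python sketch. It is not how Data Integration implements mapping rules. The `PrefixRule` and `RegexRule` classes, the `map_table_name` function, and the "first matching rule wins" order are all assumptions made for illustration. Only the default (same name as the source) and the prefix and regular-expression ideas come from the text above.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PrefixRule:
    """Hypothetical rule: replace one table-name prefix with another."""
    source_prefix: str
    dest_prefix: str

    def apply(self, table: str) -> Optional[str]:
        if table.startswith(self.source_prefix):
            return self.dest_prefix + table[len(self.source_prefix):]
        return None  # rule does not apply to this table


@dataclass
class RegexRule:
    """Hypothetical rule: rename tables whose full name matches a pattern."""
    pattern: str
    replacement: str

    def apply(self, table: str) -> Optional[str]:
        if re.fullmatch(self.pattern, table):
            return re.sub(self.pattern, self.replacement, table)
        return None


def map_table_name(table: str, rules: List[object]) -> str:
    """Return the destination table name for a source table.

    Assumption: the first rule that applies wins. If no rule applies,
    the destination table is named the same as the source table.
    """
    for rule in rules:
        dest = rule.apply(table)
        if dest is not None:
            return dest
    return table


rules = [
    RegexRule(r"orders_\d+", "orders_all"),  # many source tables -> one table
    PrefixRule("src_", "ods_"),              # swap one prefix for another
]

for name in ["orders_01", "orders_02", "src_users", "products"]:
    print(name, "->", map_table_name(name, rules))
```

In this sketch, `orders_01` and `orders_02` both map to `orders_all`, `src_users` maps to `ods_users`, and `products` keeps its name because no rule applies.
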
  4. Configure the destination tables.
    1. Configure the Policy for Writing to Hologres parameter.
      You can set the Policy for Writing to Hologres parameter only to Replay. This value indicates that the operations performed on the source are also performed on the destination by Hologres Writer. The operations include INSERT, UPDATE, and DELETE.
    2. Refresh mappings between source tables and destination Hologres tables.
      Click Refresh source table and Hologres Table mapping to map the source tables to destination Hologres tables based on the mapping rules that you configured in the Conversion Rule for Table Name section. If no mapping rule is configured, data in each source table is written to the Hologres table that has the same name as the source table. If that Hologres table does not exist, the system automatically creates it. You can change the table generation method and add additional fields to the destination Hologres tables.
      Operations and their descriptions:
      • Synchronize a source table that does not have a primary key: The current solution cannot synchronize data from a source table that does not have a primary key. To synchronize such a table, click the icon in the Synchronized Primary Key column of the table to specify a primary key for the source table. You can use a field or a combination of multiple fields in the source table as the primary key. The system removes duplicate data based on the primary key during data synchronization.
      • Select a table generation method: Select Create Table or Use Existing Table from the drop-down list in the Table creation method column.
        • If you select Use Existing Table, you can select a destination table from the drop-down list in the Hologres Table column.
        • If you select Create Table, the name of the table that is automatically created appears in the Hologres Table column. You can click the table name to view and modify the table creation statement.
      • Add additional fields to a destination Hologres table and assign values to the fields: Click Edit additional fields in the Actions column of a destination Hologres table to add additional fields to the table and assign values to the fields. You can manually assign constants and variables to the additional fields.
        Note You can add additional fields to a destination Hologres table only if you select Create Table from the drop-down list in the Table creation method column of the table.
        The following additional variable fields are supported by Data Integration:
        • EXECUTE_TIME: the execution time
        • UPDATE_TIME: the update time
        • DB_NAME_SRC: the name of the original database
        • DB_NAME_SRC_TRANSED: the converted name of the database
        • DATASOURCE_NAME_SRC: the name of the source data source
        • DATASOURCE_NAME_DEST: the name of the destination data source
        • DB_NAME_DEST: the name of the destination database
        • TABLE_NAME_DEST: the name of the destination table
        • TABLE_NAME_SRC: the name of the source table
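
As a rough illustration of assigning constants and variables to additional fields, the following Python sketch appends extra columns to a record. The `add_additional_fields` function, the `("const", ...)` and `("var", ...)` notation, and the `context` dictionary are hypothetical and do not reflect the product's configuration format. Only the variable names are taken from the list above.

```python
# Variable names listed in this topic. How the product resolves them at run
# time is not shown here; the values below are supplied by hand.
SUPPORTED_VARIABLES = {
    "EXECUTE_TIME", "UPDATE_TIME", "DB_NAME_SRC", "DB_NAME_SRC_TRANSED",
    "DATASOURCE_NAME_SRC", "DATASOURCE_NAME_DEST", "DB_NAME_DEST",
    "TABLE_NAME_DEST", "TABLE_NAME_SRC",
}


def add_additional_fields(row, field_spec, context):
    """Return a copy of `row` with the additional fields appended.

    field_spec maps a new column name to ("const", value) or ("var", NAME).
    context maps variable names to their values for the current record.
    """
    enriched = dict(row)
    for column, (kind, value) in field_spec.items():
        if kind == "const":
            enriched[column] = value
        elif kind == "var":
            if value not in SUPPORTED_VARIABLES:
                raise KeyError(f"unsupported variable: {value}")
            enriched[column] = context[value]
        else:
            raise ValueError(f"unknown assignment kind: {kind}")
    return enriched


# Hypothetical values for one record.
context = {"TABLE_NAME_SRC": "orders_01", "DB_NAME_SRC": "shop"}
field_spec = {
    "src_table": ("var", "TABLE_NAME_SRC"),  # variable assignment
    "env": ("const", "prod"),                # constant assignment
}

enriched = add_additional_fields({"id": 1}, field_spec, context)
print(enriched)
```

Here the record gains a `src_table` column that holds the source table name and an `env` column that holds a constant.
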
    3. Click Next.
      If you select Create Table from the drop-down list in the Table creation method column of a destination table, you must click Start table building in the Create Table dialog box to create destination Hologres tables.
  5. Configure rules for processing DML messages.
    You can configure rules for processing DML messages generated for insert, update, and delete operations that are performed on the source.
    • Normal: A DML message from the source is sent to the destination. Then, the system performs the same operation on the destination as the operation that is performed on the source.
    • Ignore: A DML message from the source is ignored and is not sent to the destination. The system does not perform any operation on the destination.
    • Conditionally Normal Processing: You specify a filter condition to filter source data. The related operation is performed only on data that meets the filter condition. Data that does not meet the filter condition is ignored.
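
The three rule types can be sketched as a small Python function that applies DML messages to an in-memory destination keyed by primary key. This is an illustration under assumptions, not the behavior of Hologres Writer: the `apply_dml` function, the rule names `"normal"`, `"ignore"`, and `"conditional"`, and the treatment of INSERT and UPDATE as an upsert on the primary key are all hypothetical simplifications.

```python
def apply_dml(dest, op, row, key_fields, rule="normal", condition=None):
    """Apply one DML message to `dest`, a dict keyed by primary-key tuple.

    rule:
      "normal"      - perform the same operation on the destination
      "ignore"      - drop the message
      "conditional" - perform the operation only if condition(row) is true
    """
    if rule == "ignore":
        return
    if rule == "conditional" and not (condition is not None and condition(row)):
        return
    if rule not in ("normal", "conditional"):
        raise ValueError(f"unknown rule: {rule}")

    key = tuple(row[field] for field in key_fields)
    if op in ("INSERT", "UPDATE"):
        # Simplification: both become an upsert, so rows that share a
        # primary key collapse into a single destination row.
        dest[key] = dict(row)
    elif op == "DELETE":
        dest.pop(key, None)
    else:
        raise ValueError(f"unsupported operation: {op}")


# Hypothetical rule configuration: one rule per operation type.
rules = {
    "INSERT": ("normal", None),
    "UPDATE": ("conditional", lambda r: r["status"] == "paid"),
    "DELETE": ("ignore", None),
}

messages = [
    ("INSERT", {"id": 1, "status": "new"}),
    ("INSERT", {"id": 2, "status": "new"}),
    ("UPDATE", {"id": 1, "status": "paid"}),       # meets the condition
    ("UPDATE", {"id": 2, "status": "cancelled"}),  # filtered out
    ("DELETE", {"id": 2, "status": "new"}),        # ignored
]

dest = {}
for op, row in messages:
    rule, condition = rules[op]
    apply_dml(dest, op, row, ("id",), rule, condition)

print(dest)
```

With this configuration, row 1 ends up with the status `paid`, and row 2 remains in the destination with the status `new` because its update is filtered out and its delete is ignored.
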
  6. Configure rules for processing DDL messages.

    DDL operations may be performed on the source during synchronization. Data Integration provides default rules to process the resulting DDL messages. You can also configure processing rules for different DDL messages based on your business requirements. For more information, see Rules for processing DDL messages.

  7. Configure the resources required to run the data synchronization node.
    1. In the Configure Resources step, configure the parameters.
      Parameters and their descriptions:
      • Maximum number of connections supported by source read: The maximum number of Java Database Connectivity (JDBC) connections that are allowed for the source. Configure this parameter based on the resources of the source database. Default value: 15.
      • Maximum number of parallel threads allowed to read by destination: The maximum number of parallel threads that the synchronization node uses to read data from the source table or write data to the destination. Maximum value: 32. Specify an appropriate number based on the specifications of the exclusive resource group for Data Integration and the data write capabilities of the destination.
    2. Click Complete Configuration.

Commit and deploy the real-time synchronization node

  1. Click the Save icon in the top toolbar to save the node.
  2. Click the Submit icon in the top toolbar to commit the node.
  3. In the Commit Node dialog box, configure the Change description parameter.
  4. Click Confirm.
    If you use a workspace in standard mode, you must deploy the node in the production environment after you commit the node. On the left side of the top navigation bar, click Deploy. For more information, see Deploy nodes.

What to do next

After the real-time synchronization node is configured, you can start and manage the node on the Real Time DI page in Operation Center. To go to the Real Time DI page, perform the following operations:

  1. Log on to the DataWorks console and go to the Operation Center page.
  2. In the left-side navigation pane of the Operation Center page, choose RealTime Task > RealTime DI.

For more information, see O&M for real-time synchronization nodes.