Hologres Writer writes data that is obtained from various data sources to Hologres and helps you analyze data in real time.

Limits

How it works

Hologres Writer obtains data from a reader and writes data to Hologres based on the values that you specify for the writeMode and conflictMode parameters when you configure a data synchronization node that uses Hologres Writer in the code editor.
  • If you set the writeMode parameter to SQL (INSERT INTO), Hologres Writer writes data to Hologres by using Java Database Connectivity (JDBC) to execute the INSERT INTO statement that is provided by PostgreSQL. This write mode provides high data synchronization performance. We recommend that you use this write mode.
  • If you set the writeMode parameter to SDK, Hologres Writer writes data to Hologres by using the API operation for data write provided by Hologres. This write mode will be deprecated in the future.
If a primary key conflict occurs, you can use the conflictMode parameter to specify how to process conflicting data.
  • If you set the conflictMode parameter to Replace, new data overwrites existing data.
  • If you set the conflictMode parameter to Ignore, existing data is retained, and new data is ignored.
In different write modes, different methods are used to specify how to process conflicting data. If you set the writeMode parameter to SDK, you must configure the properties of the destination Hologres table to specify how to process the conflicting data.
Notice The conflictMode parameter is suitable only for tables that have primary keys.

Parameters

Parameter Description Required Default value
endpoint The endpoint of the destination Hologres instance. Specify the value in the format of instance-id-region-endpoint.hologres.aliyuncs.com:Port number. You can obtain the endpoint of a Hologres instance on the details page of the instance in the Hologres console.
The endpoint of a Hologres instance varies based on the network type. Network types include the Internet, classic network, and virtual private cloud (VPC). Configure this parameter based on the type of the network in which your exclusive resource group for Data Integration and the Hologres instance reside. If an invalid endpoint is specified, the network connection between the exclusive resource group for Data Integration and the Hologres instance may fail, or data synchronization performance may be degraded. The endpoints for the three network types are in the following formats:
  • Public endpoint: instance-id-region-endpoint.hologres.aliyuncs.com:Port number
  • Classic network endpoint: instance-id-region-endpoint-internal.hologres.aliyuncs.com:Port number
  • VPC endpoint: instance-id-region-endpoint-vpc.hologres.aliyuncs.com:Port number

We recommend that you deploy the exclusive resource group for Data Integration and the Hologres instance in the same zone of the same region. This helps ensure a successful network connection and optimize data synchronization performance.

Yes No default value
accessId The AccessKey ID of the Alibaba Cloud account that you use to connect to the Hologres instance. Yes No default value
accessKey The AccessKey secret of the Alibaba Cloud account that you use to connect to the Hologres instance. Make sure that the account is authorized to write data to the destination Hologres table. Yes No default value
database The name of the destination database in the Hologres instance. Yes No default value
table The name of the destination Hologres table. You can specify the table name in the format of Schema_name.Table_name. Yes No default value
writeMode The write mode. Valid values: SDK and SQL (INSERT INTO). For more information, see How it works.
In the code editor, you can configure the following parameters if you use the SDK mode:
  • maxCommitSize: the maximum size of data that can be written by Hologres Writer to Hologres at a time. Unit: bytes. Default value: 1048576. This parameter is optional.
  • maxRetryCount: the maximum number of retries that is allowed if a data write error occurs. Default value: 500. This parameter is optional.
  • retryInterval: the interval at which Hologres Writer performs retries. Unit: milliseconds. Default value: 1000. This parameter is optional.
Yes No default value
conflictMode The mode in which the conflicting data is processed. Valid values: Replace and Ignore. For more information, see How it works. Yes No default value
column The names of the columns to which you want to write data. The names of the primary key columns in the destination Hologres table must be included. If you want to write data to all the columns in the destination Hologres table, set this parameter to ["*"]. Yes No default value
partition The partition key column and the related value of the destination Hologres table. Configure this parameter in the format of column=value. This parameter is valid only for partitioned tables.
Note
  • Hologres supports only list partitioning. You can specify only one column as the partition key column. The data type of the partition key column must be INT4 or TEXT.
  • The value of this parameter must match the partition expression in the data definition language (DDL) statements that are used to create the destination Hologres table.
No Left empty, which indicates that the destination Hologres table is a non-partitioned table
truncate Specifies whether to clear the existing data in the destination Hologres table before Hologres Writer writes data to the table.
Note This parameter is supported only if you configure a data synchronization node that uses Hologres Writer in the code editor. For more information about how to use the code editor to configure a data synchronization node, see Configure Hologres Writer by using the code editor.
  • true: Clear the existing data in the destination Hologres table.
    Note The existing data can be cleared only if the destination Hologres table is a non-partitioned table or a table that contains static partitions. The existing data in the table that contains dynamic partitions cannot be cleared. If your destination Hologres table is a table that contains dynamic partitions and you set the truncate parameter to true, the data synchronization node exits due to exceptions.
  • false: Retain the existing data in the destination Hologres table.
No false

Configure Hologres Writer by using the codeless UI

  1. Configure data sources.

    Configure Source and Target for the synchronization node.

    Parameter Description
    Connection The name of the data source to which you want to write data. You must add a Hologres data source before you can select the data source. For more information, see Add a Hologres data source.
    Table The name of the table to which you want to write data. You can create a table in the destination database in advance. You can also use the automatic table creation feature to create a table when you configure the Table parameter. This parameter is equivalent to the table parameter that is described in the preceding section.
    Note
    • The table that is created by using the automatic table creation feature is only a basic table. You must modify the statement that is used to create the table based on your business requirements and create indexes for the table. This enables the table to provide higher performance. For more information about how to create a table, see CREATE TABLE.
    • Automatic routing to partitions is supported. If you use the automatic table creation feature to create a table, make sure that the created table is a partitioned table. This way, automatic routing to partitions can be achieved. If the created partitioned table has no child partitioned tables, the system creates child partitioned tables for the partitioned table. For more information about how to create a partitioned table, see CREATE TABLE.
    Partition Key Column The partition key of the destination Hologres table.
    Write Mode The write mode. This parameter is equivalent to the writeMode parameter that is described in the preceding section. Valid values:
    • SQL(INSERT INTO): This is the default write mode. This write mode provides excellent write performance. We recommend that you use this write mode.
    • SDK: This mode will be deprecated in the future.
    Write conflict strategy The mode in which conflicting data is processed. This parameter is equivalent to the conflictMode parameter that is described in the preceding section.
    Max Connection Count The maximum number of JDBC connections that are used for data write. This parameter takes effect only if the Write Mode parameter is set to SQL (INSERT INTO). Before you run the data synchronization node, make sure that sufficient JDBC connections are available. The default value of this parameter is 9, which indicates that a maximum of nine JDBC connections are used for one data synchronization node.
  2. Configure field mappings. This operation is equivalent to setting the column parameter that is described in the preceding section. Fields in the source on the left have a one-to-one mapping with fields in the destination on the right. Configure field mappings
    Operation Description
    Map Fields with the Same Name Click Map Fields with the Same Name to establish mappings between fields with the same name. The data types of the fields must match.
    Map Fields in the Same Line Click Map Fields in the Same Line to establish mappings between fields in the same row. The data types of the fields must match.
    Delete All Mappings Click Delete All Mappings to remove the mappings that are established.
    Auto Layout Click Auto Layout. Then, the system automatically sorts the fields based on specific rules.

Configure Hologres Writer by using the code editor

For more information about how to configure a synchronization node by using the code editor, see Create a synchronization node by using the code editor. The following sample code provides examples on how to configure a synchronization node to write data to a non-partitioned table and how to configure a synchronization node to write data to a partitioned table.
  • Write data to a non-partitioned table
    • In the following code, a synchronization node is configured to write the data obtained from the cache to a non-partitioned Hologres table in SQL (INSERT INTO) mode:
      {
          "type": "job",
          "version": "2.0",
          "steps": [
              {
                  "stepType": "mysql",
                  "parameter": {
                      "envType": 0,
                          "datasource": "<mysql_source_name>",
                      "column": [
                          "<column1>",
                          "<column2>",
                          ......,
                          "<columnN>"
                      ],
                      "connection": [
                          {
                              "datasource": "<mysql_source_name>",//The name of the MySQL data source from which you want to read data.
                              "table": [
                                  "<mysql_table_name>"
                              ]
                          }
                      ],
                      "where": "",
                      "splitPk": "",
                      "encoding": "UTF-8"
                  },
                  "name": "Reader",
                  "category": "reader"
              },
              {
                  "stepType": "holo",
                  "parameter": {
                      "maxConnectionCount": 9,
                      "datasource": "<holo_sink_name>",//The name of the Hologres data source to which you want to write data.
                      "truncate":true,// Specifies whether to clear the existing data in the destination Hologres table before Hologres Writer writes data to the table. 
                      "conflictMode": "ignore",
                      "envType": 0,
                      "column": [
                          "<column1>",
                          "<column2>",
                          ......,
                          "<columnN>"
                      ],
                      "writeMode": "insert",
                      "table": "<holo_table_name>"
                  },
                  "name": "Writer",
                  "category": "writer"
              }
          ],
          "setting": {
              "executeMode": null,
              "errorLimit": {
                  "record": ""
              },
              "speed": {
                  "concurrent": 2,// The maximum number of parallel threads.
                  "throttle": false//Specifies whether to enable throttling. The value false indicates that throttling is disabled, and the value true indicates that throttling is enabled.
              }
          },
          "order": {
              "hops": [
                  {
                      "from": "Reader",
                      "to": "Writer"
                  }
              ]
          }
      }
    • The following sample DDL statements are used to create a non-partitioned Hologres table:
      begin;
      drop table if exists mysql_to_holo_test;
      create table mysql_to_holo_test(
        tag text not null,
        id int not null,
        body text not null,
        brrth date,
        primary key (tag, id));
        call set_table_property('mysql_to_holo_test', 'orientation', 'column');
        call set_table_property('mysql_to_holo_test', 'distribution_key', 'id');
        call set_table_property('mysql_to_holo_test', 'clustering_key', 'birth');
      commit;
  • Write data to a partition in a partitioned table
    Note
    • Hologres supports only list partitioning. You can specify only one column as the partition key column. The data type of the partition key column must be INT4 or TEXT.
    • The value of the partition parameter must match the partition filter expression in the DDL statements that are used to create the destination Hologres table.
    • In the following code, a synchronization node is configured to write the data obtained from the cache to a child partitioned table of the partitioned Hologres table in SQL (INSERT INTO) mode.
      {
        "type": "job",
        "version": "2.0",
        "steps": [
          {
            "stepType": "mysql",
            "parameter": {
              "envType": 0,
              "datasource": "<mysql_source_name>",
              "column": [
                "<column1>",
                "<column2>",
                  ......,
                "<columnN>"
              ],
              "connection": [
                {
                  "datasource": "<mysql_source_name>",
                  "table": [
                    "<mysql_table_name>"
                  ]
                }
              ],
              "where": "",
              "splitPk": "<mysql_pk>",// The field that is used for data sharding when MySQL Reader reads data.
              "encoding": "UTF-8"
            },
            "name": "Reader",
            "category": "reader"
          },
          {
            "stepType": "holo",
            "parameter": {
              "maxConnectionCount": 9,
              "partition": "<partition_key>",// The partition key of the destination Hologres table.
              "datasource": "<holo_sink_name>",// The name of the Hologres data source to which you want to write data.
              "conflictMode": "ignore",
              "envType": 0,
              "column": [
                "<column1>",
                "<column2>",
                  ......,
                "<columnN>"
              ],
              "writeMode": "insert",
              "table": "<holo_table_name>"
            },
            "name": "Writer",
            "category": "writer"
          }
        ],
        "setting": {
          "executeMode": null,
          "errorLimit": {
            "record": ""
          },
          "speed": {
            "concurrent": 2,// The maximum number of parallel threads.
            "throttle": false//Specifies whether to enable throttling. The value false indicates that throttling is disabled, and the value true indicates that throttling is enabled.
          }
        },
        "order": {
          "hops": [
            {
              "from": "Reader",
              "to": "Writer"
            }
          ]
        }
      }
    • The following sample DDL statements are used to create a partitioned Hologres table:
      BEGIN;
      CREATE TABLE public.hologres_parent_table(
        a text ,
        b int,
        c timestamp,
        d text,
        ds text,
        primary key(ds,b)
        )
        PARTITION BY LIST(ds);
      CALL set_table_property('public.hologres_parent_table', 'orientation', 'column');
      CREATE TABLE public.holo_child_1 PARTITION OF public.hologres_parent_table FOR VALUES IN('20201215');
      CREATE TABLE public.holo_child_2 PARTITION OF public.hologres_parent_table FOR VALUES IN('20201216');
      CREATE TABLE public.holo_child_3 PARTITION OF public.hologres_parent_table FOR VALUES IN('20201217');
      COMMIT;