Hologres Writer allows you to write data to Hologres. You can import data from multiple data stores to Hologres and use Hologres to analyze data in real time.

Notice Hologres Writer currently supports only exclusive resource groups for Data Integration. The default resource group and custom resource groups are not supported. For more information, see Use exclusive resource groups for data integration and Add a custom resource group.

How it works

Hologres Writer obtains data from a Data Integration reader, and writes data to the destination database based on the values of the writeMode and conflictMode parameters.
  • If the writeMode parameter is set to SDK (Fast Write), Hologres Writer writes data to Hologres through the HoloHub API. This mode provides the best write performance.
  • If the writeMode parameter is set to SQL (INSERT INTO), Hologres Writer writes data to Hologres through the INSERT INTO statement provided by PostgreSQL.
You can use the conflictMode parameter to specify how to process the conflicting data when a primary key conflict occurs.
  • If the conflictMode parameter is set to Replace, the new data overwrites the existing data.
  • If the conflictMode parameter is set to Ignore, the existing data is retained and the new data is ignored.
The way you specify conflict handling depends on the write mode. If the writeMode parameter is set to SDK (Fast Write), you must configure the properties of the Hologres table to specify how conflicting data is processed.
Notice The conflictMode parameter is available only when the table has a primary key.
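
The two conflict-handling behaviors can be illustrated with standard PostgreSQL INSERT ... ON CONFLICT syntax. This is only a sketch of the semantics: the source does not show the statements that Hologres Writer actually issues, and the table name demo_conflict and its rows are hypothetical.

```sql
-- Hypothetical demo table with a primary key; conflict handling requires one.
create table demo_conflict (
  tag  text not null,
  id   int  not null,
  body text,
  primary key (tag, id)
);

-- Seed one row so that the next two statements hit a primary key conflict.
insert into demo_conflict (tag, id, body) values ('foo', 1, 'old');

-- Replace semantics: the new row overwrites the existing row.
insert into demo_conflict (tag, id, body) values ('foo', 1, 'new')
on conflict (tag, id) do update set body = excluded.body;

-- Ignore semantics: the existing row is kept and the new row is dropped.
insert into demo_conflict (tag, id, body) values ('foo', 1, 'ignored')
on conflict (tag, id) do nothing;
```

In PostgreSQL, the row with the key ('foo', 1) holds body = 'new' after these statements: the second insert overwrites it and the third leaves it untouched.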

Parameters

Parameter Description Required Default value
endpoint The endpoint used to connect to the destination Hologres instance, in the format of instance-id-region-endpoint.hologres.aliyuncs.com:port. You can view the endpoints of a Hologres instance on the configuration page of the instance in the Hologres console.
The endpoint of a Hologres instance varies with the network type: classic network, Internet, or Virtual Private Cloud (VPC). Select the endpoint that matches the network in which the resource group for Data Integration and the Hologres instance reside. Otherwise, the connection may fail or performance may be poor. The three endpoint formats are as follows:
  • Public endpoint: instance-id-region-endpoint.hologres.aliyuncs.com:port
  • Classic network endpoint: instance-id-region-endpoint-internal.hologres.aliyuncs.com:port
  • VPC endpoint: instance-id-region-endpoint-vpc.hologres.aliyuncs.com:port

We recommend that you deploy the resource group for Data Integration and the Hologres instance in the same zone of the same region to ensure network connectivity and optimal performance.

Yes None
accessId The AccessKey ID of the account used to access Hologres. Yes None
accessKey The AccessKey secret of the account used to access Hologres. Specify an AccessKey secret of an account that is authorized to write data to the destination table. Yes None
database The name of the destination database in the Hologres instance. Yes None
table The name of the destination Hologres table. You can specify the table name in the format of Schema name.Table name. Yes None
writeMode The write mode. Valid values: SDK (Fast Write) and SQL (INSERT INTO). For more information, see How it works.
In the code editor, you can set the following parameters if you use the SDK (Fast Write) mode.
  • maxCommitSize: specifies the maximum size of data that can be written to Hologres at a time. Default value: 1048576. This parameter is optional.
  • maxRetryCount: specifies the maximum number of retries for writing data to Hologres in the SDK (Fast Write) mode. Default value: 500. This parameter is optional.
  • retryInterval: specifies the interval between retries when data is written in the SDK (Fast Write) mode. Unit: milliseconds. Default value: 1000. This parameter is optional.
Yes None
conflictMode The mode in which the conflicting data is processed. Valid values: Replace and Ignore. For more information, see How it works. Yes None
column The columns in the destination Hologres table to which data is written. The primary key columns of the destination table must be included. Set the value to an asterisk (*) if data is written to all the columns in the destination table. That is, set the column parameter as follows: "column":["*"]. Yes None
partition The partition key column and the corresponding value of the destination Hologres table, in the format of column=value. This parameter is valid only for partitioned tables.
Note
  • Hologres currently supports only list partitioning, and only a single column can be specified as the partition key column. The data type of the partition key column must be INT4 or TEXT.
  • The parameter value must match the partition expression in the data definition language (DDL) statements used to create the destination table.
No Null, indicating a non-partitioned table.

Configure Hologres Writer by using the codeless UI

  1. Configure the connections.
    Configure the source and destination connections for the sync node.
    Parameter Description
    Connection The name of the connection.
    Table The table parameter in the preceding parameter description.
    Write Mode The writeMode parameter in the preceding parameter description.
    Solution to Data Write Conflicts The conflictMode parameter in the preceding parameter description.
  2. Configure field mapping, that is, the column parameter in the preceding parameter description. Fields in the source table on the left have a one-to-one mapping with fields in the destination table on the right.
    GUI element Description
    Map Fields with the Same Name Click Map Fields with the Same Name to establish a mapping between fields with the same name. Note that the data types of the fields must match.
    Map Fields in the Same Line Click Map Fields in the Same Line to establish a mapping for fields in the same row. Note that the data types of the fields must match.
    Delete All Mappings Click Delete All Mappings to remove mappings that have been established.
    Auto Layout Click Auto Layout. The fields are automatically sorted based on specified rules.
  3. Configure channel control policies.
    Parameter Description
    Expected Maximum Concurrency The maximum number of concurrent threads that the sync node uses to read data from the source or write data to the destination. You can configure the concurrency for a node on the codeless user interface (UI).
    Bandwidth Throttling Specifies whether to enable bandwidth throttling. You can enable bandwidth throttling and set a maximum transmission rate to avoid placing a heavy read load on the source. We recommend that you enable bandwidth throttling and set the maximum transmission rate to a proper value.
    Dirty Data Records Allowed The maximum number of dirty data records allowed.

Configure Hologres Writer by using the code editor

For more information about how to use the code editor, see Create a sync node by using the code editor.

  • Use a non-partitioned table as the destination table
    • In the following code, a node is configured to synchronize data from the memory to a non-partitioned Hologres table in the SDK (Fast Write) mode.
      {
          "type":"job",
          "version":"2.0",// The version number.
          "steps":[
              {
                  "stepType":"stream",
                  "parameter":{},
                  "name":"Reader",
                  "category":"reader"
              },
              {
                  "stepType":"holo",
                  "parameter":{
                      "endpoint": "instance-id-region-endpoint.hologres.aliyuncs.com:port",
                      "accessId": "<yourAccessKeyId>", // The AccessKey ID of the account used to access Hologres.
                      "accessKey": "<yourAccessKeySecret>", // The AccessKey secret of the account used to access Hologres.
                      "database": "postgres",
                      "table": "<yourTableName>",
                      "writeMode": "sdk",
                      "conflictMode": "replace",
                      "column" : [
                          "tag",
                          "id",
                          "title"
                      ],
                      "maxCommitSize": 1048576,
                      "maxRetryCount": 500
                  },
                  "name":"Writer",
                  "category":"writer"
              }
          ],
          "setting":{
              "errorLimit":{
                  "record":"0"// The maximum number of dirty data records allowed.
              },
              "speed":{
                  "throttle":false,// Specifies whether to enable bandwidth throttling. A value of false indicates that the bandwidth is not throttled. A value of true indicates that the bandwidth is throttled. The maximum transmission rate takes effect only if you set this parameter to true.
                  "concurrent":1// The maximum number of concurrent threads.
              }
          },
          "order":{
              "hops":[
                  {
                      "from":"Reader",
                      "to":"Writer"
                  }
              ]
          }
      }
    • The following section provides the sample DDL statements used to create a non-partitioned Hologres table.
      begin;
      drop table if exists test_holowriter_sdk_replace;
      create table test_holowriter_sdk_replace(
        tag text not null, 
        id int not null, 
        body text not null,
        primary key (tag, id));
        call set_table_property('test_holowriter_sdk_replace', 'orientation', 'column');
        call set_table_property('test_holowriter_sdk_replace', 'shard_count', '3');
      commit;
  • Use a partitioned table as the destination table
    • In the following code, a node is configured to synchronize data from the memory to a child partition of a partitioned Hologres table in the SDK (Fast Write) mode.
      Note The value of the partition parameter must match the partition expression in the DDL statements used to create the destination table.
      {
          "type":"job",
          "version":"2.0",// The version number.
          "steps":[
              {
                  "stepType":"stream",
                  "parameter":{},
                  "name":"Reader",
                  "category":"reader"
              },
              {
                  "stepType":"holo",
                  "parameter":{
                      "endpoint": "instance-id-region-endpoint.hologres.aliyuncs.com:port",
                      "accessId": "<yourAccessKeyId>", // The AccessKey ID of the account used to access Hologres.
                      "accessKey": "<yourAccessKeySecret>", // The AccessKey secret of the account used to access Hologres.
                      "database": "postgres",
                      "table": "<yourTableName>",
                      "writeMode": "sdk",
                      "conflictMode": "ignore",
                      "column" : [
                          "*"
                      ],
                      "partition": "tag=foo",
                      "maxCommitSize": 1048576,
                      "maxRetryCount": 500
                  },
                  "name":"Writer",
                  "category":"writer"
              }
          ],
          "setting":{
              "errorLimit":{
                  "record":"0"// The maximum number of dirty data records allowed.
              },
              "speed":{
                  "throttle":false,// Specifies whether to enable bandwidth throttling. A value of false indicates that the bandwidth is not throttled. A value of true indicates that the bandwidth is throttled. The maximum transmission rate takes effect only if you set this parameter to true.
                  "concurrent":1// The maximum number of concurrent threads.
              }
          },
          "order":{
              "hops":[
                  {
                      "from":"Reader",
                      "to":"Writer"
                  }
              ]
          }
      }
    • The following section provides the sample DDL statements used to create a partitioned Hologres table.
      begin;
      drop table if exists test_holowriter_part_table_sdk_ignore;
      create table test_holowriter_part_table_sdk_ignore(
        tag text not null, 
        id int not null, 
        title text not null, 
        body text, 
        primary key (tag, id))
        partition by list( tag );
        call set_table_property('test_holowriter_part_table_sdk_ignore', 'orientation', 'column');
        call set_table_property('test_holowriter_part_table_sdk_ignore', 'shard_count', '3');
      commit;
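
The DDL above creates only the parent table, while the sample configuration writes to the partition tag=foo. A minimal sketch of creating that child partition with standard PostgreSQL list-partition syntax follows. The child table name test_holowriter_part_table_sdk_ignore_foo is hypothetical, and whether the child table must exist before the node runs is an assumption that the source does not confirm.

```sql
-- Hypothetical child table that holds rows whose partition key tag equals 'foo'.
-- The value in the list must agree with the "partition": "tag=foo" setting.
create table test_holowriter_part_table_sdk_ignore_foo
  partition of test_holowriter_part_table_sdk_ignore
  for values in ('foo');
```

The partition key column tag is of the TEXT type here, which is one of the two types (INT4 or TEXT) that the parameter description allows.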