DataWorks Data Integration supports Hologres as both a source and a destination. Use the Hologres data source to run offline batch sync tasks, real-time single-table sync, and full-database real-time sync into Hologres.
Supported sync task types
| Task type | Supported sources | Configuration guide |
|---|---|---|
| Single-table offline | All Data Integration sources | Codeless UI / Script Mode |
| Single-table real-time | DataHub, Hologres, Kafka, LogHub | Configure a real-time synchronization task for a single table |
| Full-database offline | AnalyticDB for MySQL 3.0, ClickHouse, Doris, Hologres, Oracle, PolarDB, SQL Server | Configure a full-database offline synchronization task |
| Full-database real-time | AnalyticDB for OceanBase, MongoDB, MySQL, Oracle, PolarDB, PolarDB-X 2.0, PostgreSQL | Configure a real-time synchronization task for an entire database |
| Serverless full-database real-time | MySQL | Configure a Serverless synchronization task |
For frequently asked questions about real-time sync, see FAQ about real-time synchronization to Hologres.
Limitations
Offline read and write
-
The Hologres writer does not support writing to Hologres foreign tables.
-
The endpoint selection logic depends on whether the Hologres instance is in the same region as the task:
-
Same region: any Tunnel > Single Tunnel > Public
-
Different region: Public > Single Tunnel
-
Single-table real-time read
-
Hologres version 2.1 or later is required.
-
Incremental synchronization is not supported for partitioned tables.
-
Data Definition Language (DDL) change messages from Hologres tables are not supported.
-
Enable Hologres Binlog on the source table before configuring a real-time sync task. See Subscribe to Hologres Binlog.
-
Supported data types for incremental sync:
INTEGER,BIGINT,TEXT,CHAR(n),VARCHAR(n),REAL,JSON,SERIAL,OID,INT4[],INT8[],FLOAT8[],BOOLEAN[],TEXT[].
Full-database real-time write
-
Tables without a primary key are not supported.
-
When syncing from MySQL to Hologres, write data only to child tables of a partitioned table, not the parent table.
Supported field types
UUID is not supported for offline read, offline write, or real-time write. All other types listed below are fully supported across all three operations.
| Type | Offline read | Offline write | Real-time write |
|---|---|---|---|
| UUID | Not supported | Not supported | Not supported |
| CHAR | Supported | Supported | Supported |
| NCHAR | Supported | Supported | Supported |
| VARCHAR | Supported | Supported | Supported |
| LONGVARCHAR | Supported | Supported | Supported |
| NVARCHAR | Supported | Supported | Supported |
| LONGNVARCHAR | Supported | Supported | Supported |
| CLOB | Supported | Supported | Supported |
| NCLOB | Supported | Supported | Supported |
| SMALLINT | Supported | Supported | Supported |
| TINYINT | Supported | Supported | Supported |
| INTEGER | Supported | Supported | Supported |
| BIGINT | Supported | Supported | Supported |
| NUMERIC | Supported | Supported | Supported |
| DECIMAL | Supported | Supported | Supported |
| FLOAT | Supported | Supported | Supported |
| REAL | Supported | Supported | Supported |
| DOUBLE | Supported | Supported | Supported |
| TIME | Supported | Supported | Supported |
| DATE | Supported | Supported | Supported |
| TIMESTAMP | Supported | Supported | Supported |
| BINARY | Supported | Supported | Supported |
| VARBINARY | Supported | Supported | Supported |
| BLOB | Supported | Supported | Supported |
| LONGVARBINARY | Supported | Supported | Supported |
| BOOLEAN | Supported | Supported | Supported |
| BIT | Supported | Supported | Supported |
| JSON | Supported | Supported | Supported |
| JSONB | Supported | Supported | Supported |
How it works
Offline read
The Hologres reader uses PSQL to read data from a Hologres table. It launches multiple concurrent SELECT tasks based on the table's shard count, filtering rows per shard using the built-in hg_shard_id column.
The shard count is set when the table is created:
CALL set_table_property('table_name', 'shard_count', 'xx');
Run this command within the same CREATE TABLE transaction. If not specified, the table inherits the database's default shard count, which depends on your instance configuration.
Offline write
The Hologres writer uses the conflictMode parameter to resolve primary key conflicts during writes. conflictMode applies only to tables with a primary key.
conflictMode value |
Behavior |
|---|---|
Replace |
Overwrites the entire row. Columns not mapped in the task are set to NULL. |
Update |
Updates only the mapped columns. Unmapped columns retain their existing values. |
Ignore |
Discards the incoming row if a conflict exists. |
For details on write performance, see Technical principles.
Appendix: Script Mode parameters and examples
Use Script Mode to configure a batch synchronization task with the code editor. For general Script Mode setup, see Use the code editor.
Reader script
Both examples use the holo step type with the Reader category. The key difference is the partition parameter, which is required for partitioned tables.
Non-partitioned table
{
"transform": false,
"type": "job",
"version": "2.0",
"steps": [
{
"stepType": "holo",
"parameter": {
"datasource": "holo_db",
"envType": 1,
"column": [
"tag",
"id",
"title",
"body"
],
"where": "",
"table": "holo_reader_basic_src"
},
"name": "Reader",
"category": "reader"
},
{
"stepType": "stream",
"parameter": {
"print": false,
"fieldDelimiter": ","
},
"name": "Writer",
"category": "writer"
}
],
"setting": {
"executeMode": null,
"failoverEnable": null,
"errorLimit": {
"record": "0"
},
"speed": {
"concurrent": 2,
"throttle": false
}
},
"order": {
"hops": [
{
"from": "Reader",
"to": "Writer"
}
]
}
}
Corresponding Hologres DDL:
BEGIN;
DROP TABLE IF EXISTS holo_reader_basic_src;
CREATE TABLE holo_reader_basic_src(
tag text NOT NULL,
id int NOT NULL,
title text NOT NULL,
body text,
PRIMARY KEY (tag, id));
CALL set_table_property('holo_reader_basic_src', 'orientation', 'column');
CALL set_table_property('holo_reader_basic_src', 'shard_count', '3');
COMMIT;
Partitioned table
Set the partition parameter to specify which child table to read from. The format is column=value and must match the partition key defined in the DDL.
{
"transform": false,
"type": "job",
"version": "2.0",
"steps": [
{
"stepType": "holo",
"parameter": {
"selectedDatabase": "public",
"partition": "tag=foo",
"datasource": "holo_db",
"envType": 1,
"column": [
"tag",
"id",
"title",
"body"
],
"tableComment": "",
"where": "",
"table": "public.holo_reader_basic_part_src"
},
"name": "Reader",
"category": "reader"
},
{
"stepType": "stream",
"parameter": {},
"name": "Writer",
"category": "writer"
}
],
"setting": {
"errorLimit": {
"record": "0"
},
"speed": {
"throttle": true,
"concurrent": 1,
"mbps": "12"
}
},
"order": {
"hops": [
{
"from": "Reader",
"to": "Writer"
}
]
}
}
Corresponding Hologres DDL:
BEGIN;
DROP TABLE IF EXISTS holo_reader_basic_part_src;
CREATE TABLE holo_reader_basic_part_src(
tag text NOT NULL,
id int NOT NULL,
title text NOT NULL,
body text,
PRIMARY KEY (tag, id))
PARTITION BY LIST(tag);
CALL set_table_property('holo_reader_basic_part_src', 'orientation', 'column');
CALL set_table_property('holo_reader_basic_part_src', 'shard_count', '3');
COMMIT;
CREATE TABLE holo_reader_basic_part_src_1583161774228
PARTITION OF holo_reader_basic_part_src FOR VALUES IN ('foo');
Reader script parameters
| Parameter | Required | Default | Description |
|---|---|---|---|
database |
Yes | — | The Hologres database name. |
table |
Yes | — | The table name. For a partitioned table, specify the parent table name. |
column |
Yes | — | The columns to read. Use ["*"] to read all columns. |
partition |
No | Empty (non-partitioned) | The partition filter in column=value format. Hologres supports only list partitioning; the partition key must be a single INT4 or TEXT column. The specified child table must exist and contain data. |
Writer script
Both examples write to Hologres using the holo step type with the Writer category. The key difference is the partition parameter for partitioned tables.
Non-partitioned table
This example writes from a MySQL database to a non-partitioned Hologres table using JDBC mode.
{
"type": "job",
"version": "2.0",
"steps": [
{
"stepType": "mysql",
"parameter": {
"envType": 0,
"useSpecialSecret": false,
"column": [
"<column1>",
"<column2>",
"<columnN>"
],
"tableComment": "",
"connection": [
{
"datasource": "<mysql_source_name>",
"table": [
"<mysql_table_name>"
]
}
],
"where": "",
"splitPk": "",
"encoding": "UTF-8"
},
"name": "Reader",
"category": "reader"
},
{
"stepType": "holo",
"parameter": {
"selectedDatabase": "public",
"schema": "public",
"maxConnectionCount": 9,
"truncate": true,
"datasource": "<holo_sink_name>",
"conflictMode": "ignore",
"envType": 0,
"column": [
"<column1>",
"<column2>",
"<columnN>"
],
"tableComment": "",
"table": "<holo_table_name>",
"reShuffleByDistributionKey": false
},
"name": "Writer",
"category": "writer"
}
],
"setting": {
"executeMode": null,
"errorLimit": {
"record": "0"
},
"locale": "zh_CN",
"speed": {
"concurrent": 2,
"throttle": false
}
},
"order": {
"hops": [
{
"from": "Reader",
"to": "Writer"
}
]
}
}
Replace the placeholders with actual values:
| Placeholder | Description |
|---|---|
<mysql_source_name> |
The name of the MySQL data source in DataWorks. |
<mysql_table_name> |
The source MySQL table name. |
<holo_sink_name> |
The name of the Hologres data source in DataWorks. |
<holo_table_name> |
The destination Hologres table name. |
<column1> ... <columnN> |
The column names to sync. |
Corresponding Hologres DDL:
BEGIN;
DROP TABLE IF EXISTS mysql_to_holo_test;
CREATE TABLE mysql_to_holo_test(
tag text NOT NULL,
id int NOT NULL,
body text NOT NULL,
birth date,
PRIMARY KEY (tag, id));
CALL set_table_property('mysql_to_holo_test', 'orientation', 'column');
CALL set_table_property('mysql_to_holo_test', 'distribution_key', 'id');
CALL set_table_property('mysql_to_holo_test', 'clustering_key', 'birth');
COMMIT;
Partitioned table
Hologres supports only list partitioning. The partition key must be a single INT4 or TEXT column. The partition value must match the partition configuration in the table's DDL.
{
"type": "job",
"version": "2.0",
"steps": [
{
"stepType": "mysql",
"parameter": {
"envType": 0,
"useSpecialSecret": false,
"column": [
"<column1>",
"<column2>",
"<columnN>"
],
"tableComment": "",
"connection": [
{
"datasource": "<mysql_source_name>",
"table": [
"<mysql_table_name>"
]
}
],
"where": "",
"splitPk": "<mysql_pk>",
"encoding": "UTF-8"
},
"name": "Reader",
"category": "reader"
},
{
"stepType": "holo",
"parameter": {
"selectedDatabase": "public",
"maxConnectionCount": 9,
"partition": "<partition_key>",
"truncate": "false",
"datasource": "<holo_sink_name>",
"conflictMode": "ignore",
"envType": 0,
"column": [
"<column1>",
"<column2>",
"<columnN>"
],
"tableComment": "",
"table": "<holo_table_name>",
"reShuffleByDistributionKey": false
},
"name": "Writer",
"category": "writer"
}
],
"setting": {
"executeMode": null,
"failoverEnable": null,
"errorLimit": {
"record": "0"
},
"speed": {
"concurrent": 2,
"throttle": false
}
},
"order": {
"hops": [
{
"from": "Reader",
"to": "Writer"
}
]
}
}
Replace the placeholders with actual values:
| Placeholder | Description |
|---|---|
<mysql_source_name> |
The name of the MySQL data source in DataWorks. |
<mysql_table_name> |
The source MySQL table name. |
<mysql_pk> |
The primary key column of the MySQL table, used for data splitting. |
<partition_key> |
The partition key column of the Hologres table. |
<holo_sink_name> |
The name of the Hologres data source in DataWorks. |
<holo_table_name> |
The destination Hologres parent table name. |
<column1> ... <columnN> |
The column names to sync. |
Corresponding Hologres DDL:
BEGIN;
CREATE TABLE public.hologres_parent_table(
a text,
b int,
c timestamp,
d text,
ds text,
PRIMARY KEY(ds, b))
PARTITION BY LIST(ds);
CALL set_table_property('public.hologres_parent_table', 'orientation', 'column');
CREATE TABLE public.holo_child_1 PARTITION OF public.hologres_parent_table FOR VALUES IN('20201215');
CREATE TABLE public.holo_child_2 PARTITION OF public.hologres_parent_table FOR VALUES IN('20201216');
CREATE TABLE public.holo_child_3 PARTITION OF public.hologres_parent_table FOR VALUES IN('20201217');
COMMIT;
Writer script parameters
| Parameter | Required | Default | Description |
|---|---|---|---|
database |
Yes | — | The Hologres database name. |
table |
Yes | — | The table name. Include the schema if needed: schema_name.table_name. |
conflictMode |
Yes | — | How to handle primary key conflicts: Replace, Update, or Ignore. See Offline write for details. |
column |
Yes | — | The destination columns to write to. Must include all primary key columns. Use ["*"] to write to all columns. |
partition |
No | Empty (non-partitioned) | The partition filter in column=value format. Hologres supports only list partitioning; the partition key must be a single INT4 or TEXT column. |
reShuffleByDistributionKey |
No | false |
Routes data to specific shards based on the distribution key, enabling lock-free concurrent writes. Available only with a Serverless resource group. |
truncate |
No | false |
Whether to truncate the destination table before writing. Set to true to truncate. For statically partitioned tables, only the specified child table is truncated; the parent table is not affected. Truncating dynamically partitioned tables is not supported and causes the task to fail. |