The AnalyticDB for PostgreSQL data source lets you read data from and write data to AnalyticDB for PostgreSQL in DataWorks offline synchronization tasks.
What this connector supports:
- Read from tables and views (offline synchronization)
- Write data in insert, copy, or upsert mode
- Concurrent reads using a split key for improved throughput
- Incremental reads using a where filter or custom SQL
Supported versions
Versions up to and including 7.0 are supported.
Supported field types
Offline read
AnalyticDB for PostgreSQL Reader supports most, but not all, AnalyticDB for PostgreSQL data types. Before you proceed, verify that your data types are supported. Types that require explicit conversion are noted in the table.
| Category | AnalyticDB for PostgreSQL data type | Notes |
|---|---|---|
| Integer types | BIGINT, BIGSERIAL, INTEGER, SMALLINT, SERIAL | — |
| Floating-point types | DOUBLE PRECISION, MONEY, NUMERIC, REAL | MONEY requires explicit type conversion, using cast syntax such as a_inet::varchar. |
| String types | VARCHAR, CHAR, TEXT, BIT, INET, GEOMETRY | BIT and INET require explicit type conversion, using cast syntax such as a_inet::varchar. |
| Date and time types | DATE, TIME, TIMESTAMP | — |
| Boolean type | BOOL | — |
| Binary types | BYTEA | — |
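For the types that need explicit conversion, you can cast the columns in a custom query instead of listing them in column. A minimal sketch, assuming a table public.person with an INET column a_inet and a BIT column a_bit (both column names are illustrative, not from the source):

```sql
-- Cast BIT and INET columns to varchar so the reader can handle them.
-- a_inet and a_bit are hypothetical column names used for illustration.
SELECT id, name, a_inet::varchar, a_bit::varchar
FROM public.person;
```

A query like this can be supplied through the reader's querySql parameter, which is described later in this topic.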
Offline write
AnalyticDB for PostgreSQL Writer supports most, but not all, AnalyticDB for PostgreSQL data types. Before you proceed, verify that your data types are supported. Types that require explicit conversion are noted in the table.
| Category | AnalyticDB for PostgreSQL data type | Notes |
|---|---|---|
| LONG | BIGINT, BIGSERIAL, INTEGER, SMALLINT, SERIAL | — |
| DOUBLE | DOUBLE PRECISION, MONEY, NUMERIC, REAL | MONEY requires explicit type conversion, using cast syntax such as a_inet::varchar. |
| STRING | VARCHAR, CHAR, TEXT, BIT, INET, GEOMETRY | BIT and INET require explicit type conversion, using cast syntax such as a_inet::varchar. |
| DATE | DATE, TIME, TIMESTAMP | — |
| BOOLEAN | BOOL | — |
| BYTES | BYTEA | — |
Limits
Offline synchronization tasks support reading data from views.
Prerequisites
Before you begin, ensure that you have:
- Added the AnalyticDB for PostgreSQL data source to DataWorks. For instructions, see Data source management. Parameter descriptions are available in the DataWorks console when you add the data source.
Configure a data synchronization task
Choose a configuration guide based on your task type.
Offline sync task for a single table
- Use the codeless UI or code editor to configure the task.
- For all parameters and a script demo for the code editor, see Script demo and parameter descriptions.
Offline sync task for an entire database
Script demo and parameter descriptions
Reader script demo
```json
{
  "type": "job",
  "steps": [
    {
      "parameter": {
        "datasource": "test_004",
        "column": [
          "id",
          "name",
          "sex",
          "salary",
          "age"
        ],
        "where": "id=1001",
        "splitPk": "id",
        "table": "public.person"
      },
      "name": "Reader",
      "category": "reader"
    },
    {
      "parameter": {},
      "name": "Writer",
      "category": "writer"
    }
  ],
  "version": "2.0",
  "order": {
    "hops": [
      {
        "from": "Reader",
        "to": "Writer"
      }
    ]
  },
  "setting": {
    "errorLimit": {
      "record": ""
    },
    "speed": {
      "concurrent": 6,
      "throttle": true,
      "mbps": "12"
    }
  }
}
```
Reader parameters
| Parameter | Description | Required | Default |
|---|---|---|---|
| datasource | The name of the data source. The value must match the name of the data source added in DataWorks. | Yes | None |
| table | The name of the table to read data from. | Yes | None |
| column | The columns to synchronize, specified as a JSON array. To synchronize all columns, use ["*"]. Supports column pruning, column reordering, constants, and functions. Example: ["id", "table", "1", "'mingya.wmy'", "to_char(a+1)", "2.3", "true"]. | Yes | None |
| splitPk | The split key for concurrent reads. Data Integration partitions data by this column and runs concurrent tasks to improve performance. Use the table's primary key for even distribution. Supports integer columns only; strings, floating-point numbers, and dates are not supported. If left blank, data is read in a single channel. | No | None |
| where | A filter condition used to construct the SQL WHERE clause. Use this for incremental synchronization, for example: id>2 and sex=1. If left blank, all rows are read. | No | None |
| querySql | A custom SQL query that overrides column, table, and where. Use this for complex scenarios such as multi-table joins, for example: select a,b from table_a join table_b on table_a.id = table_b.id. Available in the code editor only. | No | None |
| fetchSize | The number of records fetched per batch from the database. A larger value reduces network round trips and improves throughput, but values greater than 2048 may cause an out-of-memory (OOM) error in the Data Integration process. | No | 512 |
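As a concrete illustration of querySql, the reader step from the demo above could be rewritten to use a custom query instead of the column, table, and where parameters. This is a sketch of the reader step only; the query text is an example, not from the source:

```json
{
  "parameter": {
    "datasource": "test_004",
    "querySql": "select id, name, sex, salary, age from public.person where id > 1000",
    "fetchSize": 512
  },
  "name": "Reader",
  "category": "reader"
}
```

Because querySql overrides column, table, and where, those parameters are omitted here. Note that splitPk does not apply when querySql is used, so the read runs in a single channel.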
Writer script demo
```json
{
  "type": "job",
  "steps": [
    {
      "parameter": {},
      "name": "Reader",
      "category": "reader"
    },
    {
      "parameter": {
        "postSql": [],
        "datasource": "test_004",
        "column": [
          "id",
          "name",
          "sex",
          "salary",
          "age"
        ],
        "table": "public.person",
        "preSql": []
      },
      "name": "Writer",
      "category": "writer"
    }
  ],
  "version": "2.0",
  "order": {
    "hops": [
      {
        "from": "Reader",
        "to": "Writer"
      }
    ]
  },
  "setting": {
    "errorLimit": {
      "record": ""
    },
    "speed": {
      "throttle": true,
      "concurrent": 6,
      "mbps": "12"
    }
  }
}
```
Writer parameters
| Parameter | Description | Required | Default |
|---|---|---|---|
| datasource | The name of the data source. The value must match the name of the data source added in DataWorks. | Yes | None |
| table | The name of the table to write data to. | Yes | None |
| writeMode | The write mode. See the write mode comparison table below. | No | insert |
| conflictMode | The conflict resolution policy when writeMode is upsert and a primary key or unique index conflict occurs. Valid values: replace (new data overwrites existing data) and ignore (new data is discarded). Configurable in the code editor only. | No | replace |
| column | The target columns to write data to, specified as a JSON array. To write to all columns in order, use ["*"]. Example: ["id", "name", "age"]. | Yes | None |
| preSql | SQL statements to run before the synchronization task starts, such as truncating the target table. The codeless UI supports one statement; the code editor supports multiple. | No | None |
| postSql | SQL statements to run after the synchronization task completes, such as adding a timestamp column. The codeless UI supports one statement; the code editor supports multiple. | No | None |
| batchSize | The number of records written per batch. A larger value reduces network round trips and improves throughput, but values that are too large may cause an OOM error in the Data Integration process. | No | 1024 |
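Putting these parameters together, a writer step that upserts rows and clears a staging table first might look like the following sketch. The staging table name in preSql is illustrative, not from the source:

```json
{
  "parameter": {
    "datasource": "test_004",
    "table": "public.person",
    "column": ["id", "name", "sex", "salary", "age"],
    "writeMode": "upsert",
    "conflictMode": "replace",
    "preSql": ["truncate table public.person_staging"],
    "postSql": [],
    "batchSize": 1024
  },
  "name": "Writer",
  "category": "writer"
}
```

With writeMode set to upsert and conflictMode set to replace, rows that hit a primary key or unique index conflict overwrite the existing data instead of failing the task.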
Write mode comparison
| Write mode | How it works | Use when |
|---|---|---|
| insert | Runs INSERT INTO ... VALUES ... to write rows. | No conflicts are expected; recommended for most cases. |
| copy | Uses the PostgreSQL COPY FROM command to bulk-load data. | Performance is a concern and you need higher write throughput. |
| upsert | Handles primary key or unique index conflicts based on conflictMode. | Data may already exist and you need to update or ignore duplicates. |
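Conceptually, the two conflictMode values behave like PostgreSQL's INSERT ... ON CONFLICT clauses. This is a sketch of the equivalent semantics, not the exact SQL the connector issues; the table and columns are taken from the demos above:

```sql
-- conflictMode = replace: on a primary key conflict, the new data
-- overwrites the existing row.
INSERT INTO public.person (id, name) VALUES (1001, 'alice')
ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name;

-- conflictMode = ignore: on a primary key conflict, the new data
-- is discarded and the existing row is kept.
INSERT INTO public.person (id, name) VALUES (1001, 'alice')
ON CONFLICT (id) DO NOTHING;
```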