DataWorks: PolarDB-X 2.0

Last Updated: Mar 26, 2026

DataWorks Data Integration supports PolarDB-X 2.0 as both a source and a destination for offline (batch) synchronization tasks. This page covers the supported capabilities, the prerequisites, and the script parameter reference for the PolarDB-X 2.0 Reader and Writer.

Setup overview

To synchronize data between PolarDB-X 2.0 and other systems, complete these steps:

  1. Confirm you are using PolarDB-X 2.0 (not PolarDB-X 1.0).

  2. Grant the required permissions to the database account DataWorks will use.

  3. Add the PolarDB-X 2.0 data source in DataWorks.

  4. Configure and run an offline synchronization task.

Supported versions

Offline read and write are supported for PolarDB-X 2.0. Offline synchronization can also read data from views.

Limits

PolarDB-X 2.0 data sources support serverless resource groups (recommended) and exclusive resource groups for Data Integration.

Supported field types

For a complete list of PolarDB-X 2.0 field types, see Data types. The table below lists the major field types and their support status.

| Field type | Offline read (PolarDB-X 2.0 Reader) | Offline write (PolarDB-X 2.0 Writer) |
| --- | --- | --- |
| TINYINT | Supported | Supported |
| SMALLINT | Supported | Supported |
| INTEGER | Supported | Supported |
| BIGINT | Supported | Supported |
| FLOAT | Supported | Supported |
| DOUBLE | Supported | Supported |
| DECIMAL/NUMERIC | Supported | Supported |
| REAL | Not supported | Not supported |
| VARCHAR | Supported | Supported |
| JSON | Supported | Supported |
| TEXT | Supported | Supported |
| MEDIUMTEXT | Supported | Supported |
| LONGTEXT | Supported | Supported |
| VARBINARY | Supported | Supported |
| BINARY | Supported | Supported |
| TINYBLOB | Supported | Supported |
| MEDIUMBLOB | Supported | Supported |
| LONGBLOB | Supported | Supported |
| ENUM | Supported | Supported |
| SET | Supported | Supported |
| BOOLEAN | Supported | Supported |
| BIT | Supported | Supported |
| DATE | Supported | Supported |
| DATETIME | Supported | Supported |
| TIMESTAMP | Supported | Supported |
| TIME | Supported | Supported |
| YEAR | Supported | Supported |
| LINESTRING | Not supported | Not supported |
| POLYGON | Not supported | Not supported |
| MULTIPOINT | Not supported | Not supported |
| MULTILINESTRING | Not supported | Not supported |
| MULTIPOLYGON | Not supported | Not supported |
| GEOMETRYCOLLECTION | Not supported | Not supported |

Prerequisites

Before you begin, ensure that you have:

  • Confirmed you are running PolarDB-X 2.0. For PolarDB-X 1.0, use the DRDS data source instead.

  • A PolarDB-X 2.0 account with the permissions described below.

Grant account permissions

Create a dedicated PolarDB-X 2.0 account for DataWorks access, then grant the appropriate permissions based on your synchronization scenario.

Offline read (SELECT permission on source table)

The account must have the SELECT permission on the source table.

Offline write (write permissions on destination table)

The account must have INSERT, DELETE, and UPDATE permissions on the destination table.

Real-time synchronization — full database (binary logging access)

  • Privileged account: Can read binary logging (binlog) data by default.

  • Standard account: Grant SELECT, REPLICATION SLAVE, and REPLICATION CLIENT permissions using a privileged account:

-- Optional: create a sync account that can log in from any host (% matches any host).
-- Uncomment the next line if the account does not exist yet.
-- CREATE USER 'sync_account'@'%' IDENTIFIED BY 'password';

-- Grant permissions for real-time (CDC) synchronization
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'sync_account'@'%';

Add a data source

Add the PolarDB-X 2.0 data source to DataWorks before configuring any synchronization task. Follow the instructions in Data source management. Parameter descriptions are available in the DataWorks console when you add the data source.

Configure an offline synchronization task

For the entry point and configuration procedure, see Configure an offline sync task in the code editor.

For the script format and all available parameters, see Appendix: Script demo and parameter descriptions below.

Appendix: Script demo and parameter descriptions

Use the code editor to configure batch synchronization tasks in JSON format. For the unified script format requirements, see Configure a task in the code editor.

All examples use "type": "job" and "version": "2.0" at the top level.

Reader script demo

{
    "type": "job",
    "version": "2.0",
    "steps": [
        {
            "stepType": "polardbx20",
            "parameter": {
                "connection": [
                    {
                        "datasource": "",
                        "table": [
                            "t1"
                        ]
                    }
                ],
                "column": [
                    "c1",
                    "c2",
                    "'const'"
                ],
                "where": "",
                "splitPk": "",
                "checkSlave": "true",
                "slaveDelayLimit": "300"
            },
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "stream",
            "parameter": {},
            "name": "Writer",
            "category": "writer"
        }
    ],
    "setting": {
        "errorLimit": {
            "record": "0"
        },
        "speed": {
            "throttle": true,
            "concurrent": 1,
            "mbps": "12"
        }
    },
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    }
}

Reader script parameters

| Parameter | Description | Required | Default |
| --- | --- | --- | --- |
| datasource | The data source name. Must match the name configured on the Data Source Management page. | Yes | None |
| table | The table to synchronize. Only a single table is supported per connection block. | Yes | None |
| column | The columns to synchronize, as a JSON array. Use ["*"] to include all columns. Cannot be blank. Supports column pruning (select specific columns), column reordering (order need not match the table schema), and constants that follow PolarDB-X 2.0 SQL syntax. Example: ["id", "table", "1", "'mingya.wmy'", "'null'", "to_char(a+1)", "2.3", "true"]. | Yes | None |
| splitPk | The column used to partition the data for concurrent reads. Use the primary key so that shards are evenly sized. Only integer columns are supported: a string, floating-point, or date column is ignored, and the data is read through a single channel. If blank or omitted, the data is also read through a single channel. | No | None |
| where | A SQL WHERE condition that filters the source rows, typically for incremental synchronization. For example, gmt_create>$bizdate synchronizes only the current day's data. A LIMIT clause such as limit 10 is not a valid WHERE condition and cannot be used. If omitted, all data is synchronized. | No | None |
| checkSlave | When the data source is a read-only instance, checks replication lag before the task starts to prevent data loss. | No | true |
| slaveDelayLimit | The maximum allowed replication lag, in seconds. If the actual lag exceeds this value, the task fails. | No | 30 |
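
As an illustration of how these parameters combine, the following Reader step is a minimal sketch of an incremental read that filters on a date column and splits the read by primary key. The data source name my_polardbx_source, the table orders, and the columns id, gmt_create, and amount are hypothetical placeholders. The where expression reuses the gmt_create>$bizdate example from the parameter description. JSON does not permit inline comments, so the assumptions are listed here rather than inside the block.

```json
{
    "stepType": "polardbx20",
    "parameter": {
        "connection": [
            {
                "datasource": "my_polardbx_source",
                "table": [
                    "orders"
                ]
            }
        ],
        "column": [
            "id",
            "gmt_create",
            "amount"
        ],
        "where": "gmt_create>$bizdate",
        "splitPk": "id",
        "checkSlave": "true",
        "slaveDelayLimit": "30"
    },
    "name": "Reader",
    "category": "reader"
}
```

Because splitPk accepts only integer columns, this sketch assumes that id is an integer primary key. With a non-integer column, the read would fall back to a single channel.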

Writer script demo

{
    "type": "job",
    "version": "2.0",
    "steps": [
        {
            "stepType": "stream",
            "parameter": {},
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "polardbx20",
            "parameter": {
                "postSql": [],
                "datasource": "",
                "column": [
                    "id",
                    "value"
                ],
                "writeMode": "insert",
                "batchSize": 1024,
                "table": "",
                "preSql": [
                    "delete from XXX;"
                ]
            },
            "name": "Writer",
            "category": "writer"
        }
    ],
    "setting": {
        "errorLimit": {
            "record": "0"
        },
        "speed": {
            "throttle": true,
            "concurrent": 1,
            "mbps": "12"
        }
    },
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    }
}

Writer script parameters

| Parameter | Description | Required | Default |
| --- | --- | --- | --- |
| datasource | The data source name. Must match the name configured on the Data Source Management page. | Yes | None |
| table | The destination table name. | Yes | None |
| column | The destination columns to write to, as a JSON array. Example: ["id", "name", "age"]. Use ["*"] to write to all columns in schema order. | Yes | None |
| writeMode | The write conflict mode. Set to insert (insert into) or replace (replace into). See Write modes below. | No | insert |
| preSql | One or more SQL statements to run before the task starts, for example truncate table tablename;. The codeless UI allows only one statement; the code editor supports multiple statements. Multiple statements are not executed as a single transaction. | No | None |
| postSql | One or more SQL statements to run after the task completes, for example a statement that adds a timestamp column. The codeless UI allows only one statement; the code editor supports multiple statements. Multiple statements are not executed as a single transaction. | No | None |
| batchSize | The number of records submitted per batch. A larger value reduces network round trips and improves throughput, but a value that is too high may cause an out-of-memory error. | No | 256 |
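
The following parameter block is a minimal sketch of a Writer configuration that deletes one day's rows before loading and stamps the loaded rows afterwards. The data source my_polardbx_target, the table orders_copy, the columns ds and synced_at, and the date literal are all hypothetical; only the parameter names come from the table above.

```json
{
    "parameter": {
        "datasource": "my_polardbx_target",
        "table": "orders_copy",
        "column": [
            "id",
            "value",
            "ds"
        ],
        "writeMode": "insert",
        "batchSize": 512,
        "preSql": [
            "delete from orders_copy where ds = '20260325';"
        ],
        "postSql": [
            "update orders_copy set synced_at = now() where synced_at is null;"
        ]
    }
}
```

The sketch uses delete and update rather than truncate so that the statements stay within the INSERT, DELETE, and UPDATE permissions listed for offline write. Because multiple statements are not wrapped in a transaction, it is probably safest to write each statement so that it can be rerun on its own if the task is retried.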

Write modes

| Mode | Script value | Behavior on conflict |
| --- | --- | --- |
| insert into | insert | If a primary key or unique index conflict occurs, the conflicting row is skipped and recorded as dirty data. |
| replace into | replace | If no conflict occurs, behaves the same as insert into. If a conflict occurs, the existing row is deleted and the new row is inserted, replacing all fields. |
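
To make the difference concrete, the sketch below switches the Writer to replace mode so that a rerun overwrites rows with the same key instead of reporting them as dirty data. The data source and table names are hypothetical placeholders.

```json
{
    "parameter": {
        "datasource": "my_polardbx_target",
        "table": "orders_copy",
        "column": [
            "id",
            "value"
        ],
        "writeMode": "replace"
    }
}
```

With errorLimit.record set to "0", as in the script demos, a single key conflict in insert mode would be recorded as dirty data and would likely fail the task, whereas replace mode overwrites the existing row instead.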

Job-level settings

| Parameter | Description | Default |
| --- | --- | --- |
| errorLimit.record | The number of error records allowed before the task fails. | "0" |
| speed.throttle | Whether to apply a rate limit. true enables the limit; false disables it, in which case the mbps parameter has no effect. | true |
| speed.concurrent | The number of concurrent channels. | 1 |
| speed.mbps | The maximum synchronization rate, where 1 mbps equals 1 MB/s in DataWorks. Controls read/write pressure on the source and destination. Takes effect only when throttle is true. | "12" |
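
As a sketch of how these settings interact, the setting block below disables throttling, raises concurrency, and tolerates a small number of error records. The values 3 and "10" are illustrative assumptions, not recommendations.

```json
{
    "setting": {
        "errorLimit": {
            "record": "10"
        },
        "speed": {
            "throttle": false,
            "concurrent": 3
        }
    }
}
```

mbps is omitted here because it has no effect when throttle is false. For a PolarDB-X 2.0 Reader, concurrency above 1 helps only when splitPk is set to an integer column; otherwise the data is read through a single channel.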