A DRDS (PolarDB-X 1.0) data source lets you read data from and write data to DRDS (PolarDB-X 1.0). This topic describes the data synchronization capabilities of DataWorks for this data source.
Get started
To start synchronizing data with DRDS (PolarDB-X 1.0), complete these steps in order:
1. Create a database account with the `replace into` permission. See Prerequisites.
2. Add DRDS (PolarDB-X 1.0) as a data source in DataWorks. See Add a data source.
3. Configure and run a synchronization task. See Develop a data synchronization task.
Prerequisites
Before you begin, ensure that you have:
- A DRDS (PolarDB-X 1.0) database account with the `replace into` permission. For setup instructions, see Create an account.
Limitations
Offline read and write
- The DRDS (PolarDB-X 1.0) plugin is compatible only with the MySQL engine. DRDS (PolarDB-X 1.0) is a distributed MySQL database, and most of its communication protocols follow MySQL standards.
- MySQL 8.0 in DRDS (PolarDB-X 1.0) supports Serverless resource groups (recommended) and exclusive resource groups for Data Integration.
- DRDS (PolarDB-X 1.0) Writer connects to the proxy of a remote DRDS (PolarDB-X 1.0) database through Java Database Connectivity (JDBC) and runs `replace into` statements to write data. The destination table must have a primary key or a unique index to prevent duplicate writes.
- DRDS (PolarDB-X 1.0) Writer retrieves data from a Reader through the data synchronization framework, then writes it using `replace into` statements. The Writer accumulates data and commits it to the DRDS (PolarDB-X 1.0) proxy, which determines whether to route the data to one or more tables:
  - If no primary key or unique index conflict occurs, the behavior is equivalent to `insert into`.
  - If a conflict occurs, the new row replaces all fields of the existing row.

  Note: The task requires at least the `replace into` permission. Additional permissions depend on the SQL statements specified in `preSql` and `postSql`.
- Reading from views is supported.
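The `replace into` conflict behavior described above can be illustrated with a short Python sketch. This simulates the semantics only (a conflicting row is fully replaced, including fields the new row does not supply); it is not the actual Writer implementation:

```python
# Simulate the semantics of MySQL's "replace into" on a table with a
# primary key: an illustration of the behavior, not the DRDS Writer code.

def replace_into(table: dict, row: dict, pk: str = "id") -> None:
    """Insert `row` into `table` (a dict keyed by primary key).

    If a row with the same primary key already exists, ALL of its fields
    are replaced by the new row - fields absent from the new row are lost.
    """
    table[row[pk]] = dict(row)  # replace the entire row on conflict

table = {}
replace_into(table, {"id": 1, "name": "alice", "age": 30})  # plain insert
replace_into(table, {"id": 1, "name": "bob"})               # conflict: full replace
print(table[1])  # {'id': 1, 'name': 'bob'} - the 'age' field is gone
```

This is why the destination table needs a primary key or unique index: without one, no conflict can occur, and re-running a task would insert duplicate rows.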
Supported field types
DRDS (PolarDB-X 1.0) Reader and Writer support most data types. Verify that your data types are in the following list before configuring a synchronization task.
| Type category | DRDS (PolarDB-X 1.0) data types |
|---|---|
| Integer types | INT, TINYINT, SMALLINT, MEDIUMINT, BIGINT |
| Floating-point types | FLOAT, DOUBLE, DECIMAL |
| String types | VARCHAR, CHAR, TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT |
| Date and time types | DATE, DATETIME, TIMESTAMP, TIME, YEAR |
| Boolean types | BIT, BOOL |
| Binary types | TINYBLOB, MEDIUMBLOB, BLOB, LONGBLOB, VARBINARY |
Add a data source
Before developing a synchronization task in DataWorks, add DRDS (PolarDB-X 1.0) as a data source. Follow the instructions in Data source management. Parameter descriptions are available in the DataWorks console when you add the data source.
Develop a data synchronization task
Configure an offline sync task for a single table
- Configure using the codeless UI or the code editor. See Configure a task in the codeless UI and Configure a task in the code editor.
- For a full parameter reference and script examples, see Appendix: Script demo and parameters.
Configure an offline sync task for an entire database
FAQ
Why can't DRDS (PolarDB-X 1.0) Reader guarantee data consistency across shards?
DRDS (PolarDB-X 1.0) is a distributed database. When the Reader extracts data from different underlying sharded tables, it captures snapshots at different points in time — not from a single consistent time slice. Strong consistency across sharded databases and tables cannot be guaranteed.
How does encoding affect data synchronization?
DRDS (PolarDB-X 1.0) supports encoding settings at the field, table, database, and instance levels. The priority of encoding settings, from highest to lowest, is field, table, database, and then instance. Set the encoding to UTF-8 at the database level to avoid issues.
DRDS (PolarDB-X 1.0) Reader uses JDBC for data extraction, which handles encoding conversion automatically. However, if the encoding used when writing data to the underlying layer differs from the declared encoding, the Reader cannot detect the mismatch and the synchronization result may contain garbled characters.
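The garbling described above is easy to reproduce outside the database. In this sketch, text is physically stored as GBK bytes but read back under a declared UTF-8 encoding (the encodings are illustrative; any mismatch between written and declared encodings behaves the same way):

```python
# Demonstrate encoding-mismatch garbling: bytes written with one encoding
# cannot be decoded correctly under a different declared encoding.

original = "数据同步"                      # "data synchronization"
stored_bytes = original.encode("gbk")      # data physically written as GBK

# A reader that trusts a declared UTF-8 encoding cannot recover the text:
garbled = stored_bytes.decode("utf-8", errors="replace")
print(garbled)                             # replacement characters, not the original

# Decoding with the encoding actually used when writing round-trips cleanly:
print(stored_bytes.decode("gbk"))          # 数据同步
```

JDBC performs the conversion mechanically, so it cannot flag this situation; keeping the declared and actual encodings consistent (ideally UTF-8 end to end) is the only reliable fix.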
How do I configure incremental data synchronization?
DRDS (PolarDB-X 1.0) Reader uses JDBC SELECT statements for data extraction. Use a WHERE clause to filter for only the records added or changed since the last sync:
- Timestamp-based: If your application writes a `modify` field with a change timestamp for each record (covering additions, updates, or logical deletions), add a `WHERE` clause using the timestamp of the last synchronization run.
- Auto-increment ID-based: For append-only data streams, add a `WHERE` clause using the maximum auto-increment ID from the previous sync.
If your data has no field that distinguishes new or modified records from existing ones, the Reader cannot support incremental extraction — only full data synchronization is available.
Filter conditions based on physical table names are not supported in the WHERE clause.
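The two incremental strategies above amount to generating a filter string for the Reader's `where` parameter. A minimal sketch, assuming illustrative column names (`gmt_modified`, `id`) and a checkpoint value saved from the previous run:

```python
# Build incremental-sync filter strings for the Reader's `where` parameter.
# Column names and checkpoint handling are illustrative assumptions.

def timestamp_where(last_sync: str, column: str = "gmt_modified") -> str:
    """Filter for rows added or changed since the previous run."""
    return f"{column} >= '{last_sync}'"

def auto_increment_where(last_max_id: int, column: str = "id") -> str:
    """Filter for an append-only stream using the last synced max ID."""
    return f"{column} > {last_max_id}"

print(timestamp_where("2024-05-01 00:00:00"))
# gmt_modified >= '2024-05-01 00:00:00'
print(auto_increment_where(10000))
# id > 10000
```

The resulting string is what you would paste into the `where` field of the Reader configuration; the checkpoint itself (last timestamp or max ID) must be persisted by your scheduling logic between runs.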
Appendix: Script demo and parameters
To configure a batch synchronization task using the code editor, follow the unified script format described in Configure a task in the code editor. The following sections describe the parameters and provide script examples.
Reader script demo
```json
{
  "type": "job",
  "version": "2.0",
  "steps": [
    {
      "stepType": "drds",
      "parameter": {
        "datasource": "",
        "column": [
          "id",
          "name"
        ],
        "where": "",
        "table": "",
        "splitPk": ""
      },
      "name": "Reader",
      "category": "reader"
    },
    {
      "stepType": "stream",
      "parameter": {},
      "name": "Writer",
      "category": "writer"
    }
  ],
  "setting": {
    "errorLimit": {
      "record": "0"
    },
    "speed": {
      "throttle": true,
      "concurrent": 1,
      "mbps": "12"
    }
  },
  "order": {
    "hops": [
      {
        "from": "Reader",
        "to": "Writer"
      }
    ]
  }
}
```
Reader script parameters
| Parameter | Description | Required | Default value |
|---|---|---|---|
| `datasource` | The name of the data source. The value must match the name of the data source added in DataWorks. | Yes | None |
| `table` | The table from which to synchronize data. | Yes | None |
| `column` | The columns to synchronize, specified as a JSON array. Use `["*"]` to select all columns. Supports column pruning, reordering, constants, and MySQL function expressions. For example: `["id", "table", "1", "'bazhen.csy'", "null", "to_char(a + 1)", "2.3", "true"]`. | Yes | None |
| `where` | The filter condition used to construct the SELECT statement. If left blank, the entire table is synchronized. Supports incremental synchronization via date or ID conditions. For example: `STRTODATE('${bdp.system.bizdate}','%Y%m%d') <= today AND today < DATEADD(STRTODATE('${bdp.system.bizdate}', '%Y%m%d'), interval 1 day)`. | No | None |
| `splitPk` | The shard key used for parallel reads. | No | None |
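The effect of `splitPk` can be sketched as follows: the shard key's value range is divided into contiguous chunks, and each chunk becomes an independent SELECT that can run concurrently. This is a simplified illustration of range splitting, not the Reader's internal algorithm; table and column names are assumptions:

```python
# Sketch of parallel reads via a shard key: split the key's value range
# into non-overlapping chunks, one concurrent SELECT per chunk.

def split_ranges(min_pk: int, max_pk: int, chunks: int):
    """Split [min_pk, max_pk] into up to `chunks` contiguous ranges."""
    step = (max_pk - min_pk + 1 + chunks - 1) // chunks  # ceiling division
    ranges = []
    lo = min_pk
    while lo <= max_pk:
        hi = min(lo + step - 1, max_pk)
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges

for lo, hi in split_ranges(1, 100, 4):
    print(f"SELECT id, name FROM t WHERE id BETWEEN {lo} AND {hi}")
```

An evenly distributed integer key (typically the primary key) yields chunks of similar size, which is what makes the parallelism effective.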
Writer script demo
```json
{
  "type": "job",
  "version": "2.0",
  "steps": [
    {
      "stepType": "stream",
      "parameter": {},
      "name": "Reader",
      "category": "reader"
    },
    {
      "stepType": "drds",
      "parameter": {
        "postSql": [],
        "datasource": "",
        "column": [
          "id"
        ],
        "writeMode": "insert ignore",
        "batchSize": "1024",
        "table": "test",
        "preSql": []
      },
      "name": "Writer",
      "category": "writer"
    }
  ],
  "setting": {
    "errorLimit": {
      "record": "0"
    },
    "speed": {
      "throttle": true,
      "concurrent": 1,
      "mbps": "12"
    }
  },
  "order": {
    "hops": [
      {
        "from": "Reader",
        "to": "Writer"
      }
    ]
  }
}
```
Writer script parameters
| Parameter | Description | Required | Default value |
|---|---|---|---|
| `datasource` | The name of the data source. The value must match the name of the data source added in DataWorks. | Yes | None |
| `table` | The destination table to which data is written. | Yes | None |
| `writeMode` | The write mode. Valid values: `insert ignore` (ignore rows that violate primary key or unique constraints) and `replace into` (replace existing rows on conflict). | No | insert ignore |
| `column` | The destination columns to write data to, specified as a JSON array. Example: `["id", "name", "age"]`. Use `["*"]` to write to all columns in order. | Yes | None |
| `preSql` | SQL statements to run before the synchronization task starts. In the codeless UI, only one statement is supported. In the code editor, multiple statements are supported. Example: `delete from table_name;`. | No | None |
| `postSql` | SQL statements to run after the synchronization task completes. In the codeless UI, only one statement is supported. In the code editor, multiple statements are supported. Example: `delete from table_name where xx=xx;`. | No | None |
| `batchSize` | The number of records to commit per batch. Larger values reduce network round trips and improve throughput, but very large values may cause out-of-memory (OOM) errors. | No | 1024 |