All Products
Search
Document Center

DataWorks:HybridDB for MySQL

Last Updated:Mar 26, 2026

DataWorks provides HybridDB for MySQL Reader and HybridDB for MySQL Writer to synchronize data between HybridDB for MySQL and other data sources.

Limits

  • HybridDB for MySQL Reader can read tables and views. When reading a table, you can select all or a subset of columns, reorder columns, add constant fields, and use HybridDB for MySQL functions such as now().

  • Views are supported in batch synchronization.

Data type mappings

HybridDB for MySQL Reader and Writer map data types as follows.

Category HybridDB for MySQL data types Notes
Integer INT, TINYINT, SMALLINT, MEDIUMINT, BIGINT, YEAR TINYINT(1) is treated as an integer data type
Floating point FLOAT, DOUBLE, DECIMAL
String VARCHAR, CHAR, TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT
Date and time DATE, DATETIME, TIMESTAMP, TIME
Boolean BOOL
Binary TINYBLOB, MEDIUMBLOB, BLOB, LONGBLOB, VARBINARY

Data types not listed above are not supported.

Add a data source

Add the HybridDB for MySQL data source to DataWorks before creating a synchronization task. For instructions, see Data source management. Parameter descriptions are available in the DataWorks console when you add the data source.

Develop a data synchronization task

Synchronize data from a single table

Use either of the following methods:

For the full list of parameters and sample code for the code editor, see Appendix: Code and parameters.

Synchronize all data from a database

See Configure a real-time full-database synchronization task.

Appendix: Code and parameters

HybridDB for MySQL Reader

Sample code

{
    "type": "job",
    "steps": [
        {
            "parameter": {
                "datasource": "px_aliyun_hymysql",
                "column": [
                    "id",
                    "name",
                    "sex",
                    "salary",
                    "age",
                    "pt"
                ],
                "where": "id=10001",
                "splitPk": "id",
                "table": "person"
            },
            "name": "Reader",
            "category": "reader"
        },
        {
            "parameter": {}
        }
    ],
    "version": "2.0",
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    },
    "setting": {
        "errorLimit": {
            "record": ""
        },
        "speed": {
            "concurrent": 7,
            "throttle": true,
            "mbps": 1
        }
    }
}

Parameters

Parameter Description Required Default
datasource Name of the data source. Must match the name configured in DataWorks. Yes None
table Name of the source table. Each synchronization task reads from one table. Yes None
column Columns to read, specified as a JSON array. Use ["*"] to read all columns. Supports reordering, constant values, and expressions — for example, ["id", "table", "1", "'mingya.wmy'", "'null'", "to_char(a+1)", "2.3", "true"]. Yes None
splitPk Column used to shard data for parallel reads. Set to a primary key column for the most even distribution. Only integer columns are supported — string, floating-point, and date columns are ignored, and the reader falls back to a single thread. No None
where WHERE clause to filter rows, for example gmt_create > $bizdate to read only today's data. Leave blank to read all rows. Do not use LIMIT clauses. No None
querySql Custom SQL for advanced filtering or multi-table joins. Overrides table, column, where, and splitPk when set. No None
singleOrMulti Required for sharded tables. Set to multi. Yes (sharded tables only) multi
Important

When configuring a sharded table in the code editor directly (without switching from the codeless UI), "singleOrMulti": "multi" is not generated automatically. Add it manually. Without this parameter, the reader reads data only from the first shard.

When querySql is set, the reader ignores table, column, where, and splitPk. Connection credentials are still read from datasource.

HybridDB for MySQL Writer

Sample code

{
    "type": "job",
    "steps": [
        {
            "parameter": {}
        },
        {
            "parameter": {
                "postSql": [],
                "datasource": "px_aliyun_hy***",
                "column": [
                    "id",
                    "name",
                    "sex",
                    "salary",
                    "age",
                    "pt"
                ],
                "writeMode": "insert",
                "batchSize": 256,
                "encoding": "UTF-8",
                "table": "person_copy",
                "preSql": []
            },
            "name": "Writer",
            "category": "writer"
        }
    ],
    "version": "2.0",
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    },
    "setting": {
        "errorLimit": {
            "record": ""
        },
        "speed": {
            "concurrent": 7,
            "throttle": true,
            "mbps": 1
        }
    }
}

Write modes

Mode Behavior when no conflict Behavior when primary key or unique index conflict occurs
insert (default) Writes data directly Cannot write to conflicting rows; conflicting data is treated as dirty data
replace Writes data directly Updates the specified fields in the conflicting rows

Parameters

Parameter Description Required Default
datasource Name of the data source. Must match the name configured in DataWorks. Yes None
table Name of the destination table. Yes None
writeMode Write mode. See Write modes above. No insert
column Columns to write, specified as a JSON array — for example, ["id", "name", "age"]. Use ["*"] to write to all columns. Yes None
preSql SQL statement to run before the synchronization task, for example to delete outdated data. The codeless UI supports one statement; the code editor supports multiple statements. No None
postSql SQL statement to run after the synchronization task, for example to add a timestamp. The codeless UI supports one statement; the code editor supports multiple statements. No None
batchSize Number of records to write per batch. Tune based on your workload. A value that is too large may cause out of memory (OOM) errors. No 1,024