The MariaDB data source connects DataWorks to MariaDB for bidirectional offline synchronization—reading from and writing to MariaDB databases and views. This page covers supported versions, field types, account setup, and script parameters for offline synchronization tasks.
Before you begin
To connect DataWorks to a MariaDB database, complete the following steps in order:
Supported MariaDB versions
Supported versions: 5.5.x, 10.0.x, 10.1.x, 10.2.x, and 10.3.x.
Offline synchronization also supports reading from views.
Supported field types
The following table lists the supported field types for MariaDB 10.3.x. For other versions, see the official MariaDB documentation.
| Field type | Offline read (MariaDB Reader) | Offline write (MariaDB Writer) |
|---|---|---|
| TINYINT | Supported | Supported |
| SMALLINT | Supported | Supported |
| INTEGER | Supported | Supported |
| BIGINT | Supported | Supported |
| FLOAT | Supported | Supported |
| DOUBLE | Supported | Supported |
| DECIMAL/NUMERIC | Supported | Supported |
| REAL | Not supported | Not supported |
| VARCHAR | Supported | Supported |
| JSON | Supported | Supported |
| TEXT | Supported | Supported |
| MEDIUMTEXT | Supported | Supported |
| LONGTEXT | Supported | Supported |
| VARBINARY | Supported | Supported |
| BINARY | Supported | Supported |
| TINYBLOB | Supported | Supported |
| MEDIUMBLOB | Supported | Supported |
| LONGBLOB | Supported | Supported |
| ENUM | Supported | Supported |
| SET | Supported | Supported |
| BOOLEAN | Supported | Supported |
| BIT | Supported | Supported |
| DATE | Supported | Supported |
| DATETIME | Supported | Supported |
| TIMESTAMP | Supported | Supported |
| TIME | Supported | Supported |
| YEAR | Supported | Supported |
| LINESTRING | Not supported | Not supported |
| POLYGON | Not supported | Not supported |
| MULTIPOINT | Not supported | Not supported |
| MULTILINESTRING | Not supported | Not supported |
| MULTIPOLYGON | Not supported | Not supported |
| GEOMETRYCOLLECTION | Not supported | Not supported |
Configure account permissions
Create a dedicated MariaDB account for DataWorks to access the data source.
-
Run the following statement to check the MariaDB version:
SELECT VERSION(); -
(Optional) Create a sync account. See Create a MariaDB user.
-
Grant the required permissions based on your sync direction:
-
Offline read: The account needs the SELECT permission on the tables to be synchronized.
-
Offline write: The account needs the INSERT, DELETE, and UPDATE permissions on the tables to be synchronized.
-
-
Run the following command to grant permissions. Replace
sync_accountwith your account name.-- CREATE USER 'sync_account'@'%' IDENTIFIED BY 'password'; GRANT SELECT, INSERT, DELETE, UPDATE ON *.* TO 'sync_account'@'%';Alternatively, grant the SUPER permission directly.
Add a data source
Add the MariaDB data source in DataWorks before developing a synchronization task. Follow the instructions in Data source management. Parameter descriptions are available in the DataWorks console when you add the data source.
Develop a data synchronization task
Offline synchronization for a single table
To configure the task in the code editor, see Configure a task in the code editor. For all script parameters and examples, see the Script parameters and examples section.
Script parameters and examples
All examples use the code editor format with "version": "2.0". Configure the script based on the unified format requirements described in Configure a task in the code editor.
Reader
Script example
{
"type": "job",
"version": "2.0",
"steps": [
{
"stepType": "mariadb",
"parameter": {
"column": [
"id"
],
"connection": [
{
"querySql": ["select a,b from join1 c join join2 d on c.id = d.id;"],
"datasource": "",
"table": [
"xxx"
]
}
],
"where": "",
"splitPk": "",
"encoding": "UTF-8"
},
"name": "Reader",
"category": "reader"
},
{
"stepType": "stream",
"parameter": {},
"name": "Writer",
"category": "writer"
}
],
"setting": {
"errorLimit": {
"record": "0"
},
"speed": {
"throttle": true,
"concurrent": 1,
"mbps": "12"
}
},
"order": {
"hops": [
{
"from": "Reader",
"to": "Writer"
}
]
}
}
Reader parameters
| Parameter | Description | Required | Default |
|---|---|---|---|
datasource |
The data source name. Must match the name of the data source added in DataWorks. | Yes | None |
table |
The table to read from. Supports range syntax for sharded tables—for example, table_[0-99] reads from table_0 through table_99. For fixed-length numeric suffixes, use ["table_00[0-9]", "table_0[10-99]", "table_[100-999]"]. The task reads from all matched tables and fails if a table or specified column does not exist. |
Yes | None |
column |
The columns to synchronize, as a JSON array. Defaults to all columns (["*"]). Supports column pruning, column reordering, and constant values following MariaDB SQL syntax. Must not be blank. |
Yes | None |
splitPk |
The column used to partition data for concurrent reads. Use the primary key for even distribution. Only integer types are supported—unsupported types fall back to single-channel sync. Leave blank to use a single channel. | No | None |
where |
A SQL WHERE filter condition, for example gmt_create>$bizdate for incremental daily sync. Omit the parameter or leave it blank to sync all data. LIMIT clauses are not supported. |
No | None |
querySql |
A custom SQL query that overrides table, column, where, and splitPk. Use this for multi-table joins, for example select a,b from table_a join table_b on table_a.id = table_b.id. The datasource parameter is still used for credentials. This parameter is case-sensitive (querySql, not querysql) and is not supported in the codeless UI. |
No | None |
Writer
Script example
{
"type": "job",
"version": "2.0",
"steps": [
{
"stepType": "stream",
"parameter": {},
"name": "Reader",
"category": "reader"
},
{
"stepType": "mariadb",
"parameter": {
"postSql": [],
"datasource": "",
"column": [
"id",
"value"
],
"writeMode": "insert",
"batchSize": 1024,
"table": "",
"preSql": [
"delete from XXX;"
]
},
"name": "Writer",
"category": "writer"
}
],
"setting": {
"errorLimit": {
"record": "0"
},
"speed": {
"throttle": true,
"concurrent": 1,
"mbps": "12"
}
},
"order": {
"hops": [
{
"from": "Reader",
"to": "Writer"
}
]
}
}
Writer parameters
| Parameter | Description | Required | Default |
|---|---|---|---|
datasource |
The data source name. Must match the name of the data source added in DataWorks. | Yes | None |
table |
The destination table to write to. | Yes | None |
column |
The destination columns to write to, as a JSON array—for example, ["id", "name", "age"]. To write to all columns in order, use ["*"]. |
Yes | None |
writeMode |
The write mode. Three modes are supported: insert (INSERT INTO—conflicts are treated as dirty data), update (ON DUPLICATE KEY UPDATE—conflicts update the fields specified in updateColumn), and replace (REPLACE INTO—conflicts delete the existing row and insert the new row). |
No | insert into |
updateColumn |
The columns to update when a primary key or unique index conflict occurs. This parameter is used when writeMode is set to update. Separate columns with commas—for example, ["name", "age"]. |
No | None |
preSql |
An SQL statement to run before the sync task. The codeless UI supports one statement; the code editor supports multiple. Example: truncate table tablename. Transactions are not supported for multiple statements. |
No | None |
postSql |
An SQL statement to run after the sync task. The codeless UI supports one statement; the code editor supports multiple. Example: ALTER TABLE tablename ADD colname TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP. Transactions are not supported for multiple statements. |
No | None |
batchSize |
The number of records submitted per batch. Larger values reduce network round trips and improve throughput, but very large values may cause out-of-memory (OOM) errors. | No | 256 |