DataWorks Data Integration supports DM (Dameng) as both a source and a destination for offline batch synchronization. Use DM Reader to extract data from DM databases and DM Writer to load data into them.
Capabilities
| Capability | DM Reader | DM Writer |
|---|---|---|
| Offline (batch) sync | Yes | Yes |
| Read from views | Yes | — |
| Column pruning | Yes | — |
| Column reordering | Yes | — |
| Parallel read (splitPk) | Yes (integer columns only) | — |
| Incremental sync (where) | Yes | — |
| Pre SQL execution | Yes | Yes |
| Post SQL execution | — | Yes |
| Serverless resource groups | Yes | Yes |
| Exclusive resource groups for Data Integration | Yes | Yes |
Supported field types
DM Reader and DM Writer support most common relational database data types. The following table lists the DM data types that DM Reader can convert. Unsupported types cause a read error—verify your schema before configuring a sync task.
| Category | DM data types |
|---|---|
| Integer | INT, TINYINT, SMALLINT, BIGINT |
| Floating-point | REAL, FLOAT, DOUBLE, NUMBER, DECIMAL |
| String | CHAR, VARCHAR, LONGVARCHAR, TEXT |
| Date and time | DATE, DATETIME, TIMESTAMP, TIME |
| Boolean | BIT |
| Binary | BINARY, VARBINARY, BLOB |
Configure a sync task
Single-table offline sync
Configure a task using either the codeless UI or the code editor:
- Codeless UI: Configure a task in the codeless UI
- Code editor: Configure a task in the code editor
For the full script reference and parameter descriptions, see Script reference.
Entire-database offline sync
Script reference
When configuring a batch synchronization task in the code editor, use the unified script format. The following sections cover the DM-specific parameters for Reader and Writer.
Reader
Script example
{
"type": "job",
"version": "2.0",
"order": {
"hops": [
{ "from": "Reader", "to": "Writer" }
]
},
"setting": {
"errorLimit": { "record": "0" },
"speed": {
"throttle": true,
"concurrent": 1,
"mbps": "12"
}
},
"steps": [
{
"category": "reader",
"name": "Reader",
"stepType": "dm",
"parameter": {
"datasource": "dm_datasource",
"table": "table",
"column": ["*"],
"preSql": ["delete from XXX;"],
"fetchSize": 2048
}
},
{
"category": "writer",
"name": "Writer",
"stepType": "stream",
"parameter": {}
}
]
}
Reader parameters
| Parameter | Required | Default | Description |
|---|---|---|---|
| datasource | Yes | — | Name of the DM data source. See Configure a DM data source. |
| table | Yes | — | Table to read data from. |
| column | Yes | — | Columns to sync, as a JSON array. Use ["*"] for all columns. Supports column pruning, column reordering, and constants (integer, string, null, function expression, floating-point, Boolean). |
| splitPk | No | Empty | Column used to split data for parallel reads. Set this to the primary key for even distribution and to avoid data hot spots. Only integer columns are supported; floating-point, string, date, and other types cause an error. If not set, the table is read with a single channel. |
| where | No | — | SQL filter condition appended to the query, for example gmt_create>$bizdate for incremental sync, or limit 10 for testing. If not set, all rows are read. |
| querySql | No | — | Custom SQL query, for example select a,b from table_a join table_b on table_a.id = table_b.id. When set, the column, table, and where parameters are ignored. |
| fetchSize | No | 1,024 | Number of rows fetched per batch. Higher values reduce network round-trips and improve throughput. Values above 2,048 may cause an out-of-memory (OOM) error. |
| preSql | No | — | SQL statement executed before the sync task starts. Only one statement is supported. |
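As a sketch of how these Reader parameters combine, the following step enables parallel reads on an integer primary key and filters for incremental data. The table name orders, the columns id, name, and gmt_create, and the data source name dm_datasource are placeholders; substitute your own schema.

```json
{
  "category": "reader",
  "name": "Reader",
  "stepType": "dm",
  "parameter": {
    "datasource": "dm_datasource",
    "table": "orders",
    "column": ["id", "name", "gmt_create"],
    "splitPk": "id",
    "where": "gmt_create > '$bizdate'",
    "fetchSize": 1024
  }
}
```

Because splitPk is set, the task can split the table across the concurrent channels configured in the speed setting; omitting it falls back to a single channel.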
Writer
Script example
{
"type": "job",
"version": "2.0",
"order": {
"hops": [
{ "from": "Reader", "to": "Writer" }
]
},
"setting": {
"errorLimit": { "record": "" },
"speed": {
"throttle": true,
"concurrent": 2,
"mbps": "12"
}
},
"steps": [
{
"category": "reader",
"name": "Reader",
"stepType": "oracle",
"parameter": {
"datasource": "aaa",
"column": ["PROD_ID", "name"],
"where": "",
"splitPk": "",
"encoding": "UTF-8",
"table": "PENGXI.SALES"
}
},
{
"category": "writer",
"name": "Writer",
"stepType": "dm",
"parameter": {
"datasource": "dm_datasource",
"table": "table",
"column": ["id", "name"],
"preSql": ["delete from XXX;"]
}
}
]
}
Writer parameters
| Parameter | Required | Default | Description |
|---|---|---|---|
| datasource | Yes | — | Name of the DM data source. See Configure a DM data source. |
| table | Yes | — | Destination table. If the table's schema differs from the username in the data source configuration, use the schema.table format. |
| column | Yes | — | Destination columns to write to, as a JSON array. List the columns explicitly; do not rely on a default. |
| preSql | No | — | SQL statement executed before the sync task starts, for example to purge old data. Only one statement is supported; multiple statements disable transaction support. |
| postSql | No | — | SQL statement executed after the sync task completes, for example to add a timestamp. Only one statement is supported; multiple statements disable transaction support. |
| batchSize | No | 1,024 | Number of rows written per batch. Higher values reduce network interactions and improve throughput. Values that are too large may cause an OOM error. |
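Putting the Writer parameters together, a step that targets a table under a different schema and cleans the partition before loading might look like the following sketch. SALES.ORDERS, the ds partition column, and the SALES.SYNC_LOG table are hypothetical names used only for illustration.

```json
{
  "category": "writer",
  "name": "Writer",
  "stepType": "dm",
  "parameter": {
    "datasource": "dm_datasource",
    "table": "SALES.ORDERS",
    "column": ["id", "name", "gmt_modified"],
    "preSql": ["delete from SALES.ORDERS where ds = '$bizdate';"],
    "postSql": ["insert into SALES.SYNC_LOG (ds) values ('$bizdate');"],
    "batchSize": 1024
  }
}
```

Note that preSql and postSql each hold a single statement here; adding more statements to either array disables transaction support, as described in the table above.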