The Code Editor gives you full control over offline sync task configuration by letting you write a JSON script directly. Use it to sync full or incremental data from a single source table or sharded tables to a destination table, with support for DataWorks scheduling parameters to automate periodic runs.
For data-source-specific configuration details, see Data source list.
When to use the Code Editor
Use the Code Editor instead of the codeless UI in these situations:
The data source does not support codeless UI configuration.
The data source page in the DataWorks UI indicates whether a data source supports the codeless UI.

The data source has configuration parameters that are only available in the Code Editor.
The data source cannot be created in the DataWorks UI.
Prerequisites
Before you begin, make sure that:
The required source and destination data sources are configured on the Data Source page of the DataWorks console. See Data source list, Supported data sources and sync solutions, and Data Source Configuration.
A resource group with a suitable specification is purchased and attached to the workspace. See Use a Serverless resource group for Data Integration and Use exclusive resource groups for Data Integration.
Network connectivity between the resource group and the data source is established. See Configure network connections.
Step 1: Create a batch synchronization node
Data Studio (new version)
Log on to the DataWorks console. Switch to the destination region. In the left navigation pane, choose Data Development & O&M > Data Development. Select the desired workspace from the drop-down list and click Go to Data Studio.
Create a workflow. See Orchestrate workflows.
Create a batch synchronization node using one of these methods:
Method 1: Click the icon in the upper-right corner of the workflow list and choose Create Node > Data Integration > Batch Synchronization.
Method 2: Double-click the workflow name and drag the Batch Synchronization node from the Data Integration directory to the workflow editor.
Configure the basic information, source, and destination, then click OK.
DataStudio (legacy version)
Log on to the DataWorks console. Switch to the destination region. In the left navigation pane, choose Data Development & O&M > Data Development. Select the desired workspace from the drop-down list and click Go to Data Development.
Create a workflow. See Create a workflow.
Create a batch synchronization node using one of these methods:
Method 1: Expand the workflow, right-click Data Integration, and select Create Node > Batch Synchronization.
Method 2: Double-click the workflow name and drag the Batch Synchronization node from the Data Integration directory to the workflow editor.
Create the node as prompted.
Step 2: Configure the data source and resource group
You can switch from the codeless UI to the Code Editor at any step. To obtain a fully populated JSON script, follow this order:
Select the data source and resource group in the codeless UI and test network connectivity. The system automatically populates the generated JSON script with this information.
Switch to the Code Editor.
Alternatively, switch to the Code Editor directly, specify the data source in the JSON code, and set the resource group and required resources in the Advanced Settings panel on the right.
If a resource group is not displayed, check whether it is attached to the workspace. See Use a Serverless resource group and Use exclusive resource groups for Data Integration. For recommended resource specifications, see Resource group performance metrics - Data Integration.
Step 3: Switch to the Code Editor and import a template
In the toolbar, click the Code Editor icon.

If the script is not yet configured, click the Import Template icon in the toolbar and follow the on-screen instructions to import a script template.
Step 4: Edit the script
The sync task script has the following top-level structure:
```json
{
    "type": "job",
    "version": "2.0",
    "steps": [
        {
            "stepType": "<reader-plugin>",
            "parameter": {
                "column": [],
                "where": "",
                "splitPk": ""
            },
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "<writer-plugin>",
            "parameter": {
                "writeMode": "",
                "preSql": [],
                "postSql": []
            },
            "name": "Writer",
            "category": "writer"
        }
    ],
    "setting": {
        "executeMode": null,
        "speed": {
            "concurrent": 1,
            "throttle": false
        }
    },
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    }
}
```
The type and version fields have default values and cannot be changed. You can ignore processor-related configurations.
The script has three functional sections: reader, writer, and channel control (in the setting section). Configuration details vary by plug-in. For plug-in-specific parameters, see the Reader Script Demo and Writer Script Demo sections for each data source in the Data source list.
Reader parameters
Configure the basic information and field mappings for reading source data.
| Parameter | Description | Required? |
|---|---|---|
| where | A filter condition (a WHERE clause without the where keyword) to limit which source data is synced. Combine with scheduling parameters for incremental sync — for example, gmt_create >= '${bizdate}' syncs only records created on the current business date. If not set, all data is synced. See Scenario: Configure a batch synchronization task for incremental data and Supported formats of scheduling parameters. | No |
| splitPk | The field used to split source data into shards for concurrent reading. Use the table's primary key — primary keys are typically evenly distributed, which prevents data hot spots. Only integer fields are supported; strings, floating-point numbers, and dates are not. If not set or left blank, data is synced through a single channel. Not all plug-ins support this parameter. | No |
| column | An array of source fields to sync. Supports constants, variables (for example, ${variable_name}), and functions (for example, now()). | Yes |
The incremental sync method varies by data source (plug-in).
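As an illustrative sketch only, a MySQL Reader step that combines these parameters might look like the following. The data source name, table name, and column names are hypothetical placeholders, and the exact parameter nesting varies by plug-in and version — always check the Reader Script Demo for your data source:

```json
{
    "stepType": "mysql",
    "parameter": {
        "datasource": "my_mysql_source",
        "table": ["orders"],
        "column": ["id", "buyer_name", "gmt_create"],
        "where": "gmt_create >= '${bizdate}'",
        "splitPk": "id"
    },
    "name": "Reader",
    "category": "reader"
}
```

Here, splitPk is set to the integer primary key id so the read can be sharded across concurrent channels, and the where condition restricts each run to the current business date for incremental sync.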
Writer parameters
Configure how data is written to the destination.
| Parameter | Description | Required? |
|---|---|---|
| preSql | SQL statements to run on the destination before data is written. For example, configure truncate table tablename in MySQL Writer to clear existing data before the sync starts. | No |
| postSql | SQL statements to run on the destination after data is written. | No |
| writeMode | Defines how to write data when conflicts occur, such as path or primary key conflicts. The behavior and available values vary by data source and writer plug-in. | Yes |
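A hedged sketch of a MySQL Writer step using these parameters follows. The data source name, table name, and writeMode value are illustrative assumptions — the valid writeMode values and parameter layout depend on the writer plug-in, so confirm them in the Writer Script Demo for your data source:

```json
{
    "stepType": "mysql",
    "parameter": {
        "datasource": "my_mysql_dest",
        "table": "orders_copy",
        "column": ["id", "buyer_name", "gmt_create"],
        "preSql": ["truncate table orders_copy"],
        "postSql": [],
        "writeMode": "insert"
    },
    "name": "Writer",
    "category": "writer"
}
```

In this sketch, the preSql statement truncates the destination table before each run, turning the task into a full overwrite rather than an append.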
Channel control parameters
Configure performance settings in the setting section.
| Parameter | Description | Required? |
|---|---|---|
| executeMode | Controls distributed processing. Set to distribute to split the task into shards and distribute them across multiple execution nodes for concurrent execution — this allows sync speed to scale horizontally with the cluster size. Set to null for single-node mode, where concurrency is limited to a single machine. A concurrency of 8 or more is required to enable distributed mode. If an out-of-memory (OOM) error occurs at runtime, disable distributed mode. | No |
| concurrent | The maximum number of threads for parallel reading from the source or writing to the destination. The actual concurrency at runtime may be less than or equal to the configured value, depending on resource specifications. See Performance metrics. | No |
| throttle | Controls the sync rate. Set to true to enable throttling and protect the source database from excessive extraction load — also set the mbps parameter to define the rate (minimum 1 MB/s). Set to false to use the maximum transfer performance within the configured concurrency limits. | No |
| errorLimit | The threshold for dirty data records. If not set, dirty data is allowed and the task continues running. Set to 0 to fail the task on any dirty data. Set a positive integer to allow dirty data up to that count — the task fails if the count is exceeded. | No |
For executeMode: If the exclusive resource group has only one machine, distributed mode cannot leverage multi-machine resources. If a single machine meets your speed requirements, use single-node mode to simplify task execution.
For throttle: The traffic metric is internal to Data Integration and does not represent actual network interface card (NIC) traffic. NIC traffic is typically 1–2 times the channel traffic.
Dirty data is any record that fails to write to the destination due to errors such as type mismatches (for example, writing a VARCHAR value to an INT column). An excessive amount of dirty data can reduce overall sync speed.
Overall sync speed is also affected by the source data source performance and the network environment. See Optimize an offline sync task.
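For example, a setting section that caps concurrency at 8, throttles the rate to 5 MB/s, and fails the task after more than 10 dirty records could look like the following sketch. The values are illustrative, and the exact errorLimit structure should be confirmed against your plug-in's script reference:

```json
"setting": {
    "executeMode": null,
    "errorLimit": {
        "record": "10"
    },
    "speed": {
        "concurrent": 8,
        "throttle": true,
        "mbps": "5"
    }
}
```

Note that because throttle is true, the mbps parameter must also be set; with throttle set to false, mbps is not used.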
Step 5: Configure scheduling properties
For periodically scheduled batch synchronization, configure the scheduling properties. On the node's edit page, click Scheduling on the right to configure scheduling parameters, a scheduling policy, a scheduling time, and scheduling dependencies.
For Data Studio (new version): see Node scheduling (new version)
For DataStudio (legacy version): see Node scheduling configuration (legacy version)
For scheduling parameter usage examples, see Common scenarios of scheduling parameters in Data Integration.
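To connect the script to scheduling, each variable referenced in the JSON must be assigned in the node's scheduling parameters. As an illustration, if the reader's where condition references ${bizdate}, an assignment following the common DataWorks convention would be:

```
bizdate=$bizdate
```

With this assignment, each scheduled instance substitutes its business date into the script, so the where condition reads only that day's records. Confirm the exact assignment syntax in Supported formats of scheduling parameters.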
Step 6: Submit and publish the task
Configure test parameters
On the batch synchronization task configuration page, click Debugging Configurations on the right and set the following:
| Configuration item | Description |
|---|---|
| Resource Group | Select a resource group that is connected to the data source. |
| Script Parameters | Assign values to placeholder parameters in the sync script. For example, if the script uses ${bizdate}, enter a date in yyyymmdd format. |
Run the task
Click the Run icon in the toolbar. After the task completes, create a node of the destination table type to query the destination table and verify that the synced data meets your expectations.
Publish the task
After the task runs successfully, click the publish icon in the toolbar to publish the task to the production environment. See Publish tasks.
What's next
After publishing, go to Operation Center in the production environment to view and manage the scheduled task. See O&M for batch synchronization tasks.