DataWorks:CreateDIJob

Last Updated: Mar 27, 2026

Creates a new-version Data Integration synchronization task.

Operation description

  • To use this API, you must purchase DataWorks Basic Edition or a higher edition.

  • This API creates a data integration synchronization task. Key parameters include the source configuration SourceDataSourceSettings, the destination configuration DestinationDataSourceSettings, and the supported migration type MigrationType. The TransformationRules parameter defines transformation rules for synchronized tables, such as adding columns and replacing table names. The TableMappings parameter specifies the tables to synchronize and their corresponding mapping rules. The JobSettings parameter defines task settings, including column mapping and scheduling.
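As a rough illustration of how the parameters described above fit together, the following Python sketch assembles a request body as a plain dictionary. It uses only the standard library and makes no SDK or HTTP call. The helper name build_create_di_job_request is hypothetical, the values are the example values from the parameter table, and the choice of which resource groups to include is an assumption rather than a documented requirement.

```python
import json


def build_create_di_job_request(project_id, name):
    """Assemble a CreateDIJob request body as a plain dict (hypothetical helper)."""
    resource_group = {
        "ResourceGroupIdentifier": "S_res_group_111_222",
        "RequestedCu": 2,
    }
    return {
        "ProjectId": project_id,
        "Name": name,
        "SourceDataSourceType": "MySQL",
        "DestinationDataSourceType": "Hologres",
        "MigrationType": "FullAndRealtimeIncremental",
        "SourceDataSourceSettings": [{"DataSourceName": "mysql_datasource_1"}],
        "DestinationDataSourceSettings": [{"DataSourceName": "holo_datasource_1"}],
        # Assumption: both batch and real-time resources are supplied here.
        "ResourceSettings": {
            "OfflineResourceSettings": dict(resource_group),
            "RealtimeResourceSettings": dict(resource_group),
        },
        "TableMappings": [
            {
                "SourceObjectSelectionRules": [
                    {
                        "ObjectType": "Table",
                        "Action": "Include",
                        "ExpressionType": "Exact",
                        "Expression": "mysql_table_1",
                    }
                ]
            }
        ],
    }


request = build_create_di_job_request(10000, "mysql_to_holo_sync_8772")
print(json.dumps(request, indent=2))
```

How the body is sent (SDK, OpenAPI Explorer, or a signed HTTP request) is outside the scope of this sketch.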

RAM authorization

No authorization for this operation. If you encounter issues with this operation, contact technical support.

Request parameters

Parameter

Type

Required

Description

Example

DestinationDataSourceType

string

Yes

The type of the destination data source. Valid values: Hologres, OSS-HDFS, OSS, MaxCompute, LogHub, StarRocks, DataHub, AnalyticDB for MySQL, Kafka, and Hive.

Hologres

Description

string

No

The description of the job.

DI Job Demo

SourceDataSourceType

string

Yes

The type of the source data source. Valid values: PolarDB, MySQL, Kafka, LogHub, Hologres, Oracle, OceanBase, MongoDB, Redshift, Hive, SQL Server, Doris, and ClickHouse.

MySQL

ProjectId

integer

No

The ID of the DataWorks workspace. You can log on to the DataWorks console and go to the Workspace Management page to obtain the workspace ID.

This parameter is used to specify the DataWorks workspace for this API call.

10000

Name

string

No

The name of the job.

mysql_to_holo_sync_8772

MigrationType

string

Yes

The synchronization type. Valid values:

  • FullAndRealtimeIncremental: one-time and real-time synchronization for an entire database.

  • RealtimeIncremental: real-time synchronization for a single table.

  • Full: one-time synchronization for an entire database in batch mode.

  • OfflineIncremental: incremental synchronization for an entire database in batch mode.

  • FullAndOfflineIncremental: one-time and incremental synchronization for an entire database in batch mode.

FullAndRealtimeIncremental

JobType

string

No

The job type. Valid values:

  • DatabaseRealtimeMigration: synchronizes multiple tables from multiple source databases in real time. This job type supports one-time, incremental, or one-time and incremental synchronization.

  • DatabaseOfflineMigration: synchronizes multiple tables from multiple source databases in batches. This job type supports one-time, incremental, or one-time and incremental synchronization.

  • SingleTableRealtimeMigration: synchronizes a single source table in real time.

DatabaseRealtimeMigration
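Because MigrationType and JobType accept only the values listed above, a client may want to check them before sending a request. The following is a minimal sketch of such a check; check_enum is a hypothetical helper, and the sketch does not attempt to verify which MigrationType values are compatible with which JobType values.

```python
# Values copied from the MigrationType and JobType descriptions.
MIGRATION_TYPES = {
    "FullAndRealtimeIncremental",
    "RealtimeIncremental",
    "Full",
    "OfflineIncremental",
    "FullAndOfflineIncremental",
}
JOB_TYPES = {
    "DatabaseRealtimeMigration",
    "DatabaseOfflineMigration",
    "SingleTableRealtimeMigration",
}


def check_enum(field, value, allowed):
    """Return value unchanged, or raise ValueError if it is not a listed value."""
    if value not in allowed:
        raise ValueError(f"{field}={value!r} is not one of {sorted(allowed)}")
    return value


check_enum("MigrationType", "FullAndRealtimeIncremental", MIGRATION_TYPES)
check_enum("JobType", "DatabaseRealtimeMigration", JOB_TYPES)
```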

SourceDataSourceSettings

array<object>

Yes

The settings for the source data source.

object

No

The settings for a single source data source.

DataSourceName

string

No

The name of the data source.

mysql_datasource_1

DataSourceProperties

object

No

The properties of the data source.

Encoding

string

No

The database encoding format.

UTF-8

Timezone

string

No

The time zone.

Asia/Shanghai

DestinationDataSourceSettings

array<object>

Yes

The settings for the destination data source.

object

No

The settings for a single destination data source.

DataSourceName

string

No

The name of the data source.

holo_datasource_1

ResourceSettings

object

Yes

The resource settings.

OfflineResourceSettings

object

No

The resources for batch synchronization.

RequestedCu

number

No

The computing units (CUs) of the resource group for data integration that is used for batch synchronization.

2

ResourceGroupIdentifier

string

No

The identifier of the resource group for data integration that is used for batch synchronization.

S_res_group_111_222

RealtimeResourceSettings

object

No

The resources for real-time synchronization.

RequestedCu

number

No

The CUs of the resource group for data integration that is used for real-time synchronization.

2

ResourceGroupIdentifier

string

No

The identifier of the resource group for data integration that is used for real-time synchronization.

S_res_group_111_222

ScheduleResourceSettings

object

No

The scheduling resources.

RequestedCu

number

No

The CUs of the scheduling resource group that is used for batch synchronization jobs.

2

ResourceGroupIdentifier

string

No

The identifier of the scheduling resource group that is used for batch synchronization jobs.

S_res_group_222_333
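The three resource blocks share the same shape (a resource group identifier and a CU count), so they can be built with one small helper. This is a sketch using the example values above; resource_group is a hypothetical helper name, and whether all three blocks are needed depends on the job, which this page does not specify.

```python
def resource_group(identifier, requested_cu):
    """Build one *ResourceSettings block (hypothetical helper)."""
    return {
        "ResourceGroupIdentifier": identifier,
        "RequestedCu": requested_cu,
    }


resource_settings = {
    # Resource group for Data Integration, batch synchronization.
    "OfflineResourceSettings": resource_group("S_res_group_111_222", 2),
    # Resource group for Data Integration, real-time synchronization.
    "RealtimeResourceSettings": resource_group("S_res_group_111_222", 2),
    # Scheduling resource group used by batch synchronization jobs.
    "ScheduleResourceSettings": resource_group("S_res_group_222_333", 2),
}
print(resource_settings)
```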

TransformationRules

array<object>

No

The transformation rules for the objects to be synchronized.

Note

[ { "RuleName":"my_database_rename_rule", "RuleActionType":"Rename", "RuleTargetType":"Schema", "RuleExpression":"{\"expression\":\"${srcDatasourceName}_${srcDatabaseName}\"}" } ]

object

No

A transformation rule.

RuleActionType

string

No

The action type. Valid values:

  • DefinePrimaryKey: defines a primary key.

  • Rename: renames an object.

  • AddColumn: adds a column.

  • HandleDml: handles DML operations.

  • DefineIncrementalCondition: defines an incremental condition.

  • DefineCycleScheduleSettings: defines periodic scheduling settings.

  • DefinePartitionKey: defines a partition key.

Rename

RuleExpression

string

No

The rule expression, which must be a JSON string.

  1. Renaming rule (Rename)

  • Example: {"expression":"${srcDatasourceName}_${srcDatabaseName}_0922"}

  • expression: the expression of the renaming rule. The expression can contain the following variables: ${srcDatasourceName} (the name of the source data source), ${srcDatabaseName} (the name of the source database), and ${srcTableName} (the name of the source table).

  2. Rule for adding a column (AddColumn)

  • Example: {"columns":[{"columnName":"my_add_column","columnValueType":"Constant","columnValue":"123"}]}

  • If this rule is not specified, no columns are added.

  • columnName: the name of the column to be added.

  • columnValueType: the value type of the column to be added. Valid values: Constant and Variable.

  • columnValue: the value of the column to be added. If columnValueType is Constant, the value is a custom constant of the String type. If columnValueType is Variable, the value is a built-in variable. Valid values of the built-in variable: EXECUTE_TIME (execution time, which is of the Long type), DB_NAME_SRC (name of the source database, which is of the String type), DATASOURCE_NAME_SRC (name of the source data source, which is of the String type), TABLE_NAME_SRC (name of the source table, which is of the String type), DB_NAME_DEST (name of the destination database, which is of the String type), DATASOURCE_NAME_DEST (name of the destination data source, which is of the String type), TABLE_NAME_DEST (name of the destination table, which is of the String type), and DB_NAME_SRC_TRANSED (name of the transformed database, which is of the String type).

  3. Rule for specifying the primary key columns of the destination table (DefinePrimaryKey)

  • Example: {"columns":["ukcolumn1","ukcolumn2"]}

  • If this rule is not specified, the primary key columns of the source table are used by default.

  • If the destination is an existing table, Data Integration does not modify the schema of the destination table. If the specified primary key column does not exist in the destination table, an error is reported when the job starts.

  • If the destination is a table that is automatically created, Data Integration automatically creates the schema of the destination table that contains the defined primary key columns. If the specified primary key column does not exist in the source table, an error is reported when the job starts.

  4. DML handling rule (HandleDml)

  • Example: {"dmlPolicies":[{"dmlType":"Delete","dmlAction":"Filter","filterCondition":"id > 1"}]}

  • If this rule is not specified, the default action is Normal for Insert, Update, and Delete operations.

  • dmlType: the DML operation type. Valid values: Insert, Update, and Delete.

  • dmlAction: the DML handling policy. Valid values: Normal (normal processing), Ignore, Filter (conditional processing, which is used when dmlType is Update or Delete), and LogicalDelete (logical deletion).

  • filterCondition: the DML filter condition, which is used when dmlAction is Filter.

  5. Incremental condition (DefineIncrementalCondition)

  • Example: {"where":"id > 0"}

  • Specifies the filter condition for incremental synchronization.

  6. Parameters for periodic scheduling (DefineCycleScheduleSettings)

  • Example: {"cronExpress":" * * * * * *", "cycleType":"1"}

  • Specifies the parameters for periodic scheduling of a job.

  7. Rule for specifying a partition key (DefinePartitionKey)

  • Example: {"columns":["id"]}

  • Specifies a partition key.

{ "expression": "${srcDatasourceName}_${srcDatabaseName}" }
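Because RuleExpression must be a JSON string rather than a nested object, it is easy to get the quoting wrong when writing it by hand. One way to avoid this is to build the expression as an object and serialize it, as in the sketch below. The make_rule helper and the rule names add_column_rule_1 and dml_rule_1 are hypothetical, and the RuleTargetType chosen for each rule is illustrative only.

```python
import json


def make_rule(name, action_type, target_type, expression):
    """Return one TransformationRules entry (hypothetical helper)."""
    return {
        "RuleName": name,
        "RuleActionType": action_type,
        "RuleTargetType": target_type,
        # The API expects a JSON string here, so the object is serialized.
        "RuleExpression": json.dumps(expression),
    }


rules = [
    make_rule(
        "rename_rule_1", "Rename", "Schema",
        {"expression": "${srcDatasourceName}_${srcDatabaseName}_0922"},
    ),
    make_rule(
        "add_column_rule_1", "AddColumn", "Table",
        {"columns": [{"columnName": "my_add_column",
                      "columnValueType": "Constant",
                      "columnValue": "123"}]},
    ),
    make_rule(
        "dml_rule_1", "HandleDml", "Table",
        {"dmlPolicies": [{"dmlType": "Delete",
                          "dmlAction": "Filter",
                          "filterCondition": "id > 1"}]},
    ),
]

# Each RuleExpression decodes back to the object it was built from.
for rule in rules:
    print(rule["RuleName"], json.loads(rule["RuleExpression"]))
```

Serializing with json.dumps also takes care of escaping when the whole rule list is later embedded in a larger JSON request body.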

RuleName

string

No

The name of the rule. If the action type and the object type to which the action applies are the same, the rule name must be unique. The name can be up to 50 characters in length.

rename_rule_1

RuleTargetType

string

No

The type of the object to which the action applies. Valid values:

  • Table

  • Schema

  • Database

Table

TableMappings

array<object>

Yes

The transformation mappings for the objects to be synchronized. Each element in the list describes a group of selection rules for source objects and the transformation rules that are applied to this group of objects.

Note

[ { "SourceObjectSelectionRules":[ { "ObjectType":"Database", "Action":"Include", "ExpressionType":"Exact", "Expression":"biz_db" }, { "ObjectType":"Schema", "Action":"Include", "ExpressionType":"Exact", "Expression":"s1" }, { "ObjectType":"Table", "Action":"Include", "ExpressionType":"Exact", "Expression":"table1" } ], "TransformationRuleNames":[ { "RuleName":"my_database_rename_rule", "RuleActionType":"Rename", "RuleTargetType":"Schema" } ] } ]

object

No

Each rule selects a table to be synchronized.

SourceObjectSelectionRules

array<object>

No

Each rule can select a set of source objects to be synchronized. Multiple rules are combined to select a table.

object

No

Each rule can select different object types of the source objects to be synchronized, such as the source database and source data table.

Action

string

No

The selection action. Valid values: Include and Exclude.

Include

Expression

string

No

The expression.

mysql_table_1

ExpressionType

string

No

The expression type. Valid values: Exact and Regex.

Exact

ObjectType

string

No

The object type. Valid values:

  • Table

  • Schema

  • Database

Table

TransformationRules

array<object>

No

The transformation rules. Each element in the list is a transformation rule.

object

No

The transformation rule that is applied to the source object.

RuleName

string

No

The name of the transformation rule. The rule name must be unique for a specific action type and object type. The name can be up to 50 characters in length.

rename_rule_1

RuleActionType

string

No

The action type. Valid values:

  • DefinePrimaryKey: defines a primary key.

  • Rename: renames an object.

  • AddColumn: adds a column.

  • HandleDml: handles DML operations.

  • DefineIncrementalCondition: defines an incremental condition.

  • DefineCycleScheduleSettings: defines periodic scheduling settings.

  • DefinePartitionKey: defines a partition key.

Rename

RuleTargetType

string

No

The type of the object on which the action is performed. Valid values:

  • Table

  • Schema

  • Database

Table
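The entries under TableMappings carry only RuleName, RuleActionType, and RuleTargetType, which suggests that they refer to rules defined in the top-level TransformationRules parameter. Under that assumption, the sketch below builds one mapping and checks that every referenced rule is defined. The helpers selection_rule and rule_key are hypothetical, and the rename expression is an illustrative value built from the documented ${srcTableName} variable.

```python
def selection_rule(object_type, expression, action="Include", expression_type="Exact"):
    """Build one SourceObjectSelectionRules entry (hypothetical helper)."""
    return {
        "ObjectType": object_type,
        "Action": action,
        "ExpressionType": expression_type,
        "Expression": expression,
    }


def rule_key(rule):
    """Identify a rule by name, action type, and target type."""
    return (rule["RuleName"], rule["RuleActionType"], rule["RuleTargetType"])


# Top-level TransformationRules: the full definitions, including the expression.
top_level_rules = [
    {
        "RuleName": "rename_rule_1",
        "RuleActionType": "Rename",
        "RuleTargetType": "Table",
        "RuleExpression": '{"expression":"${srcTableName}_copy"}',
    }
]

# TableMappings: select biz_db.table1 and apply rename_rule_1 to it.
table_mappings = [
    {
        "SourceObjectSelectionRules": [
            selection_rule("Database", "biz_db"),
            selection_rule("Table", "table1"),
        ],
        "TransformationRules": [
            {
                "RuleName": "rename_rule_1",
                "RuleActionType": "Rename",
                "RuleTargetType": "Table",
            }
        ],
    }
]

defined = {rule_key(rule) for rule in top_level_rules}
missing = [
    rule_key(ref)
    for mapping in table_mappings
    for ref in mapping.get("TransformationRules", [])
    if rule_key(ref) not in defined
]
print("unresolved rule references:", missing)
```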

JobSettings

object

No

The settings of the synchronization job, including DDL processing policies, mapping policies for data types of columns in the source and destination, and runtime parameters.

ChannelSettings

string

No

The settings of the channel in the synchronization job. You can configure special settings for specific channels. The following channels are supported: Holo2Holo (synchronization from Hologres to Hologres) and Holo2Kafka (synchronization from Hologres to Kafka).

  1. Holo2Kafka

  • Example: {"destinationChannelSettings":{"kafkaClientProperties":[{"key":"linger.ms","value":"100"}],"keyColumns":["col3"],"writeMode":"canal"}}

  • kafkaClientProperties: the parameters of the Kafka producer, which are used when data is written to Kafka.

  • keyColumns: the value of the Kafka column to which data is to be written.

  • writeMode: the format of data written to Kafka. Valid values: json and canal.

  2. Holo2Holo

  • Example: {"destinationChannelSettings":{"conflictMode":"replace","dynamicColumnAction":"replay","writeMode":"replay"}}

  • conflictMode: the conflict handling policy for writing data to Hologres. Valid values: replace (overwrite) and ignore.

  • writeMode: the method of writing data to Hologres. Valid values: replay and insert.

  • dynamicColumnAction: the method of handling dynamic columns when data is written to Hologres. Valid values: replay, insert, and ignore.

{ "structInfo": "MANAGED", "storageType": "TEXTFILE", "writeMode": "APPEND", "partitionColumns": [ { "columnName": "pt", "columnType": "STRING", "comment": "" } ], "fieldDelimiter": "" }
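ChannelSettings is typed as a string, so the channel configuration has to be serialized before it is placed in JobSettings. The following sketch does this for the Holo2Kafka example above, using only the standard library. The variable names are illustrative.

```python
import json

# Holo2Kafka channel configuration, taken from the example above.
holo2kafka = {
    "destinationChannelSettings": {
        "kafkaClientProperties": [{"key": "linger.ms", "value": "100"}],
        "keyColumns": ["col3"],
        "writeMode": "canal",
    }
}

job_settings = {
    # Serialize to a compact JSON string because the parameter type is string.
    "ChannelSettings": json.dumps(holo2kafka, separators=(",", ":")),
}
print(job_settings["ChannelSettings"])
```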

ColumnDataTypeSettings

array<object>

No

The array of data type mappings for columns.

Note

"ColumnDataTypeSettings":[ { "SourceDataType":"Bigint", "DestinationDataType":"Text" } ]

object

No

A data type mapping for a single column.

DestinationDataType

string

No

The data type in the destination, such as bigint, boolean, string, text, datetime, timestamp, decimal, and binary. Data types vary with data sources.

text

SourceDataType

string

No

The data type in the source, such as bigint, boolean, string, text, datetime, timestamp, decimal, and binary. Data types vary with data sources.

bigint

CycleScheduleSettings

object

No

The settings for periodic scheduling.

CycleMigrationType

string

No

The synchronization type that requires periodic scheduling. Valid values:

  • Full: one-time synchronization.

  • OfflineIncremental: incremental synchronization in batch mode.

Full

ScheduleParameters

string

No

The scheduling parameters.

bizdate=$bizdate

DdlHandlingSettings

array<object>

No

The array of DDL handling settings.

Note

"DdlHandlingSettings":[ { "Type":"AddColumn", "Action":"Normal" } ]

object

No

A DDL handling setting.

Action

string

No

The handling action. Valid values:

  • Ignore: ignores the DDL message.

  • Critical: reports an error.

  • Normal: processes the DDL message.

Critical

Type

string

No

The DDL type. Valid values:

  • RenameColumn: renames a column.

  • ModifyColumn: modifies a column.

  • CreateTable: creates a table.

  • TruncateTable: truncates a table.

  • DropTable: drops a table.

  • DropColumn: drops a column.

  • AddColumn: adds a column.

AddColumn
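A DDL policy is often easier to maintain as a mapping from DDL type to action. The sketch below converts such a mapping into the DdlHandlingSettings array and rejects values that are not listed above. The ddl_settings helper is hypothetical, and the policy shown is an arbitrary example, not a recommended default.

```python
# Values copied from the Type and Action descriptions.
DDL_TYPES = {
    "RenameColumn", "ModifyColumn", "CreateTable", "TruncateTable",
    "DropTable", "DropColumn", "AddColumn",
}
DDL_ACTIONS = {"Ignore", "Critical", "Normal"}


def ddl_settings(policy):
    """Turn a {Type: Action} mapping into a DdlHandlingSettings list."""
    settings = []
    for ddl_type, action in policy.items():
        if ddl_type not in DDL_TYPES:
            raise ValueError(f"unknown DDL type: {ddl_type}")
        if action not in DDL_ACTIONS:
            raise ValueError(f"unknown DDL action: {action}")
        settings.append({"Type": ddl_type, "Action": action})
    return settings


ddl_handling_settings = ddl_settings({
    "AddColumn": "Normal",      # process the DDL message
    "DropTable": "Critical",    # report an error
    "TruncateTable": "Ignore",  # ignore the DDL message
})
print(ddl_handling_settings)
```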

RuntimeSettings

array<object>

No

The runtime settings.

object

No

A runtime setting.

Name

string

No

The name of the setting. Valid values:

  • src.offline.datasource.max.connection: the maximum number of connections to the source of a batch synchronization job.

  • dst.offline.truncate: specifies whether to clear the destination table before a batch synchronization job starts.

  • runtime.offline.speed.limit.enable: specifies whether to enable throttling for a batch synchronization job.

  • runtime.offline.concurrent: the concurrency level of a batch synchronization job.

  • runtime.enable.auto.create.schema: specifies whether to automatically create a schema in the destination.

  • runtime.realtime.concurrent: the concurrency level of a real-time synchronization job.

  • runtime.realtime.failover.minute.dataxcdc: the amount of time to wait for a failover restart. Unit: minutes.

  • runtime.realtime.failover.times.dataxcdc: the number of failover restart attempts.

runtime.offline.concurrent

Value

string

No

The value of the setting.

1
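Each runtime setting is a Name and Value pair, and Value is typed as a string even for numeric settings. The sketch below builds the RuntimeSettings array from an ordinary mapping and converts values to strings. The runtime_settings helper is hypothetical, and encoding booleans as "true" and "false" is an assumption that this page does not confirm.

```python
# Names copied from the Name description.
RUNTIME_SETTING_NAMES = {
    "src.offline.datasource.max.connection",
    "dst.offline.truncate",
    "runtime.offline.speed.limit.enable",
    "runtime.offline.concurrent",
    "runtime.enable.auto.create.schema",
    "runtime.realtime.concurrent",
    "runtime.realtime.failover.minute.dataxcdc",
    "runtime.realtime.failover.times.dataxcdc",
}


def runtime_settings(options):
    """Turn a {name: value} mapping into a RuntimeSettings list."""
    result = []
    for name, value in options.items():
        if name not in RUNTIME_SETTING_NAMES:
            raise ValueError(f"unknown runtime setting: {name}")
        if isinstance(value, bool):
            # Assumption: booleans are passed as lowercase strings.
            value = "true" if value else "false"
        result.append({"Name": name, "Value": str(value)})
    return result


settings = runtime_settings({
    "runtime.offline.concurrent": 1,
    "dst.offline.truncate": True,
})
print(settings)
```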

JobName deprecated

string

No

This parameter is deprecated. Use the Name parameter instead.

mysql_to_holo_sync_8772

Owner

string

No

The owner of the job.

3726346

Response elements

Element

Type

Description

Example

object

The response schema.

Id

integer

The data integration job ID.

11792

RequestId

string

The request ID. Use it to locate logs and troubleshoot issues.

4F6AB6B3-41FB-5EBB-AFB2-0C98D49DA2BB

DIJobId deprecated

integer

This field is deprecated. Use the Id field instead.

11792

Examples

Success response

JSON format

{
  "Id": 11792,
  "RequestId": "4F6AB6B3-41FB-5EBB-AFB2-0C98D49DA2BB",
  "DIJobId": 11792
}
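Since DIJobId is deprecated in favor of Id, client code can read Id first and fall back to DIJobId only if Id is missing. The following sketch parses the sample response above with the standard library. The job_id_from_response helper is hypothetical.

```python
import json

# Sample success response from this page.
raw = """{
  "Id": 11792,
  "RequestId": "4F6AB6B3-41FB-5EBB-AFB2-0C98D49DA2BB",
  "DIJobId": 11792
}"""


def job_id_from_response(body):
    """Prefer Id; fall back to the deprecated DIJobId if Id is absent."""
    if "Id" in body:
        return body["Id"]
    return body["DIJobId"]


response = json.loads(raw)
job_id = job_id_from_response(response)
# Keep the RequestId with the job ID so that logs can be located later.
print(f"job {job_id} created (request {response['RequestId']})")
```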

Error codes

See Error Codes for a complete list.

Release notes

See Release Notes for a complete list.