Real-time integration lets you collect data from multiple sources and combine it in a single destination, establishing a real-time synchronization link between them. This topic describes how to create a real-time integration task.
Prerequisites
You must configure at least one data source before creating a real-time integration task. This lets you select the source and destination data sources when configuring the task. For more information, see Supported data sources for real-time integration.
Background information
If the destination data source is Oracle or MySQL, data is written over the Java Database Connectivity (JDBC) protocol, and messages are processed according to the following policies:
If the sink table does not have a primary key:
When an INSERT message is received, it is appended directly.
When an UPDATE_BEFORE message is received, it is discarded. When an UPDATE_AFTER message is received, it is appended directly.
When a DELETE message is received, it is discarded.
If the sink table has a primary key:
When an INSERT message is received, it is processed as an UPSERT message.
When an UPDATE_BEFORE message is received, it is discarded. When an UPDATE_AFTER message is received, it is processed as an UPSERT message.
When a DELETE message is received, it is processed as a DELETE message.
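The message-handling policies above can be summarized as a lookup. The sketch below is illustrative only, not Dataphin's actual implementation:

```python
def sink_action(message_type, has_primary_key):
    """Return the action the JDBC sink applies to a change message,
    mirroring the policy list above."""
    if has_primary_key:
        policy = {
            "INSERT": "UPSERT",
            "UPDATE_BEFORE": "DISCARD",
            "UPDATE_AFTER": "UPSERT",
            "DELETE": "DELETE",
        }
    else:
        # Without a primary key the sink can only append; deletes and
        # pre-update images are dropped.
        policy = {
            "INSERT": "APPEND",
            "UPDATE_BEFORE": "DISCARD",
            "UPDATE_AFTER": "APPEND",
            "DELETE": "DISCARD",
        }
    return policy[message_type]
```

This is why a sink table without a primary key can accumulate duplicates after a failover: every replayed INSERT or UPDATE_AFTER is appended again rather than upserted.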
Because the JDBC protocol writes data immediately, a task failover can produce duplicate data in a sink table that has no primary key. Exactly-once delivery is not guaranteed.
The JDBC protocol supports only DDL statements for creating tables and adding fields. DDL messages of other types are discarded.
Oracle supports only basic data types. The INTERVAL YEAR, INTERVAL DAY, BFILE, SYS.ANY, XML, map, ROWID, and UROWID data types are not supported.
MySQL supports only basic data types. The map data type is not supported.
To prevent data inconsistency caused by out-of-order data, the task supports only a concurrency of 1.
The Oracle data source supports Oracle Database 11g, Oracle Database 19c, and Oracle Database 21c.
The MySQL data source supports MySQL 5.7, MySQL 8.0, and MySQL 8.4.
Step 1: Create a real-time integration task
In the top menu bar of the Dataphin homepage, choose Develop > Data Integration.
In the top menu bar, select a project. If you are in Dev-Prod mode, select an environment.
In the left navigation pane, select Integration > Real-time Integration.
In the real-time integration list, click the icon and select Real-time Integration Task to open the Create Real-time Integration Task dialog box. In the Create Real-time Integration Task dialog box, configure the following parameters.
Parameter
Description
Task Name
Enter a name for the real-time task.
The name must start with a letter, contain only lowercase letters, digits, and underscores (_), and be 4 to 63 characters in length.
Production/Development environment queue resource
You can select all resource groups that are configured for real-time tasks.
Note This configuration item is supported only when the compute source used by the project is a Flink compute source in Kubernetes deployment mode.
Description
Enter a brief description of the task. The description can be up to 1,000 characters in length.
Select Directory
Select the folder where the real-time task is stored.
If no folder is created, you can create one as follows:
Above the real-time task list on the left, click the icon to open the New Folder dialog box. In the New Folder dialog box, enter a folder Name and set Select Directory as needed.
Click OK.
After you complete the configuration, click OK.
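The task naming rule above can be expressed as a regular expression. The sketch below is illustrative; the exact validation Dataphin applies may differ (for example, whether the leading letter must be lowercase is an assumption here):

```python
import re

# 4 to 63 characters total: a leading letter followed by lowercase
# letters, digits, and underscores. Assumes the leading letter must
# also be lowercase.
TASK_NAME_RE = re.compile(r"^[a-z][a-z0-9_]{3,62}$")

def is_valid_task_name(name):
    """Check a candidate task name against the documented naming rule."""
    return TASK_NAME_RE.fullmatch(name) is not None
```

For example, `rt_sync_orders` passes, while `1orders` (leading digit) and `abc` (too short) do not.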
Step 2: Configure the real-time integration task
The supported source and destination data sources depend on the real-time computing engine. For more information, see Supported data sources for real-time integration.
Source data source
If the source data source is an external data source and you select Entire Database or Select Tables with Batch Select, the table names are retrieved from the Metadata Center. If no metadata acquisition task is configured for the data source, go to Metadata > Acquisition Task to create one.
MySQL
Parameter | Description | |
Data Source Configuration | Data Source Type | Select MySQL. |
Datasource | Select a data source. You can also click New to create a data source on the Datasource page. For more information, see Create a MySQL data source. Important Enable logging for the data source and make sure that the configured account has permissions to read logs. Otherwise, the system cannot synchronize data from this data source in real time. | |
Time Zone | The time zone configured for the selected data source. | |
Sync Rule Configuration | Sync Solution | Select Real-time Incremental or Real-time Incremental + Full. The default value is Real-time Incremental.
Note If the destination data source is Hive (Hudi table format), MaxCompute, or Databricks, you can set Sync Solution to Real-time Incremental + Full. |
Selection Method | You can select Entire Database, Select Tables, or Exclude Tables.
| |
Microsoft SQL Server
Parameter | Description | |
Data Source Configuration | Data Source Type | Select Microsoft SQL Server. |
Datasource | Select a data source. You can also click New to create a data source on the Datasource page. For more information, see Create a Microsoft SQL Server data source. Important Enable logging for the data source and make sure that the configured account has permissions to read logs. Otherwise, the system cannot synchronize data from this data source in real time. | |
Time Zone | The time zone configured for the selected data source. | |
Sync Rule Configuration | Sync Solution | Only Real-time Incremental is supported. Collects incremental changes from the source database and writes them to the downstream destination database in real time in the order they occur. |
Selection Method | You can select Entire Database, Select Tables, or Exclude Tables.
| |
PostgreSQL
Parameter | Description | |
Data Source Configuration | Data Source Type | Select PostgreSQL. |
Datasource | Select a data source. You can also click New to create a data source on the Datasource page. For more information, see Create a PostgreSQL data source. Important Enable logging for the data source and make sure that the configured account has permissions to read logs. Otherwise, the system cannot synchronize data from this data source in real time. | |
Time Zone | The time zone configured for the selected data source. | |
Sync Rule Configuration | Sync Solution | Only Real-time Incremental is supported. Collects incremental changes from the source database and writes them to the downstream destination database in real time in the order they occur. |
Selection Method | You can select Entire Database or Select Tables.
| |
Oracle
Parameter | Description | |
Data Source Configuration | Data Source Type | Select Oracle. |
Datasource | Select a data source. You can also click New to create a data source on the Datasource page. For more information, see Create an Oracle data source. Important Enable logging for the data source and make sure that the configured account has permissions to read logs. Otherwise, the system cannot synchronize data from this data source in real time. | |
Time Zone | The time zone configured for the selected data source. | |
Sync Rule Configuration | Sync Solution | Only Real-time Incremental is supported. Collects incremental changes from the source database and writes them to the downstream destination database in real time in the order they occur. |
Selection Method | You can select Entire Database, Select Tables, or Exclude Tables.
| |
IBM DB2
Parameter | Description | |
Data Source Configuration | Data Source Type | Select IBM DB2. |
Datasource | Select a data source. You can also click New to create a data source on the Datasource page. For more information, see Create an IBM DB2 data source. Important Enable logging for the data source and make sure that the configured account has permissions to read logs. Otherwise, the system cannot synchronize data from this data source in real time. | |
Sync Rule Configuration | Sync Solution | Only Real-time Incremental is supported. Collects incremental changes from the source database and writes them to the downstream destination database in real time in the order they occur. |
Selection Method | You can select Entire Database, Select Tables, or Exclude Tables.
| |
Kafka
Parameter | Description | |
Data Source Configuration | Data Source Type | Select Kafka. |
Datasource | Select a data source. You can also click New to create a data source on the Datasource page. For more information, see Create a Kafka data source. Important Enable logging for the data source and make sure that the configured account has permissions to read logs. Otherwise, the system cannot synchronize data from this data source in real time. | |
Source topic | Select the topic of the source data. You can enter a keyword in the topic name to perform a fuzzy search. | |
Data format | Only Canal JSON is supported. Canal JSON is a Canal-compatible format in which change data is stored. | |
Key Type | The key type for Kafka, which determines the key.deserializer configuration when initializing KafkaConsumer. Only STRING is supported. | |
Value Type | The value type for Kafka, which determines the value.deserializer configuration when initializing KafkaConsumer. Only STRING is supported. | |
Consumer Group ID (optional) | Enter the ID of the consumer group. The consumer group ID is used to report consumption offsets. | |
Sync Rule Configuration | Table List | Enter the names of the tables to be synchronized. Separate multiple table names with line breaks. The value can be up to 1,024 characters in length. Table names can be in one of the following three formats: |
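For reference, a Canal JSON change message carries the change type, the affected database and table, and the row data. The example below is illustrative; the field values and the exact field set are assumptions, not taken from Dataphin:

```python
import json

# Illustrative Canal JSON INSERT message; values are hypothetical.
raw = """
{
  "database": "demo_db",
  "table": "orders",
  "type": "INSERT",
  "isDdl": false,
  "ts": 1700000000000,
  "data": [{"id": "1", "amount": "9.99"}],
  "old": null
}
"""

msg = json.loads(raw)
# The sync task routes each message by its change type and table.
print(msg["type"], msg["database"], msg["table"])
```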
Hive (Hudi table format)
You can select Hive (Hudi data source) as the source data source only when the real-time engine is Apache Flink and the compute source is a Flink on YARN deployment.
Parameter | Description | |
Data Source Configuration | Data Source Type | Select Hive. |
Datasource | You can only select a Hive data source in Hudi table format. You can also click New to create a data source on the Datasource page. For more information, see Create a Hive data source. Important Enable logging for the data source and make sure that the configured account has permissions to read logs. Otherwise, the system cannot synchronize data from this data source in real time. | |
Sync Rule Configuration | Sync Solution | Only Real-time Incremental is supported. Collects incremental changes from the source database and writes them to the downstream destination database in real time in the order they occur. |
Select Table | Select a single table for real-time synchronization. | |
PolarDB (MySQL database type)
Parameter | Description | |
Data Source Configuration | Data Source Type | Select PolarDB. |
Datasource | You can only select a PolarDB data source of the MySQL database type. You can also click New to create a data source on the Datasource page. For more information, see Create a PolarDB data source. Important Enable logging for the data source and make sure that the configured account has permissions to read logs. Otherwise, the system cannot synchronize data from this data source in real time. | |
Time Zone | The time zone configured for the selected data source. | |
Sync Rule Configuration | Sync Solution | Select Real-time Incremental or Real-time Incremental + Full. The default value is Real-time Incremental.
Note If the destination data source is Hive (Hudi table format), MaxCompute, or Databricks, you can set Sync Solution to Real-time Incremental + Full. |
Selection Method | You can select Entire Database, Select Tables, or Exclude Tables.
| |
Destination data source
MaxCompute
Parameter | Description | |
Data Source Configuration | Data Source Type | Select MaxCompute. |
Datasource | Select a destination data source. You can select a MaxCompute data source and project. You can also click New to create a data source on the data source page. For more information, see Create a MaxCompute data source. | |
Sink Table Creation Configuration | New Table Type | Select Standard Table or Delta Table. The default value is Standard Table. If you select Delta Table and set the sink table creation method to Auto-create table, a MaxCompute Delta table is created. Additional fields are not used when creating a Delta table. Note After you configure the sink table, if you change the new table type, the system asks for confirmation. If you click OK in the dialog box, the sink table configuration is cleared and you must re-enter it. |
Table Name Transform | Sink table names can contain only letters, digits, and underscores (_). If a source table name contains other characters, you must configure a table name transform rule. Click Configure Table Name Transform to open the Configure Table Name Transform Rules dialog box.
Note
| |
Partition Format | If you set the table type to Standard Table, the partition format supports only Multi-partition. If you set it to Delta Table, the partition format supports No Partition or Multi-partition. | |
Partition Interval | If you set Partition Format to No Partition, you cannot configure the partition interval. If you set Partition Format to Multiple Partitions, you can set the partition interval to hour or day. Note
| |
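A table name transform of the kind described above can be sketched as follows. This is illustrative only; the hypothetical `transform_table_name` helper is not part of Dataphin, which applies the rules you configure in the dialog box:

```python
import re

def transform_table_name(source_name, prefix="", suffix=""):
    """Derive a legal sink table name from a source table name.

    Replaces any character that is not a letter, digit, or underscore
    with an underscore, then applies an optional prefix and suffix.
    """
    base = re.sub(r"[^A-Za-z0-9_]", "_", source_name)
    return prefix + base + suffix
```

For example, a source table named `order-2024` with an `ods_` prefix maps to `ods_order_2024`.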
MySQL
Parameter | Description | |
Data Source Configuration | Data Source Type | Select MySQL. |
Datasource | Select a data source. You can also click New to create a data source on the Datasource page. For more information, see Create a MySQL data source. | |
Time Zone | The time zone configured for the selected data source. | |
Sink Table Creation Configuration | Table Name Transform | Sink table names can contain only letters, digits, and underscores (_). If a source table name contains other characters, you must configure a table name transform rule. Click Configure Table Name Transform to open the Configure Table Name Transform Rules dialog box.
Note
|
Microsoft SQL Server
Parameter | Description | |
Data Source Configuration | Data Source Type | Select Microsoft SQL Server. |
Datasource | Select a data source. You can also click New to create a data source on the Datasource page. For more information, see Create a Microsoft SQL Server data source. | |
Time Zone | The time zone configured for the selected data source. | |
Sink Table Creation Configuration | Table Name Transform | Sink table names can contain only letters, digits, and underscores (_). If a source table name contains other characters, you must configure a table name transform rule. Click Configure Table Name Transform to open the Configure Table Name Transform Rules dialog box.
Note
|
Oracle
Parameter | Description | |
Data Source Configuration | Data Source Type | Select Oracle. |
Datasource | Select a data source. You can also click New to create a data source on the Datasource page. For more information, see Create an Oracle data source. | |
Time Zone | The time zone configured for the selected data source. | |
Sink Table Creation Configuration | Table Name Transform | Sink table names can contain only letters, digits, and underscores (_). If a source table name contains other characters, you must configure a table name transform rule. Click Configure Table Name Transform to open the Configure Table Name Transform Rules dialog box.
Note
|
Kafka
Parameter | Description | |
Data Source Configuration | Data Source Type | Select Kafka. |
Datasource | Select a data source. You can also click New to create a data source on the Datasource page. For more information, see Create a Kafka data source. | |
Destination Topic | The topic for the destination data. You can select Single Topic or Multiple Topics. If you select Single Topic, select a destination topic. You can enter a keyword in the topic name to search. If you select Multiple Topics, you can configure topic name transform and topic parameters.
| |
Data format | Set the storage format for the written data. Supported formats include DTS Avro and Canal JSON.
Note If you set Destination Topic to Multiple Topics, you can only set Data format to Canal JSON. | |
Destination topic configuration | Topic Name Transform | Click Configure Topic Name Transform. In the Configure Topic Name Transform Rules dialog box, configure Topic Name Transform Rules and a prefix and suffix for the topic name.
Note
|
Topic Parameters | Additional parameters for creating a topic. The format is Note This item can be configured only when Destination Topic is set to Multiple Topics. | |
DataHub
Parameter | Description | |
Destination Data | Data Source Type | Select DataHub. |
Datasource | Select a destination data source. The system provides a shortcut to create a data source. You can click New to create a DataHub data source on the data source page. For more information, see Create a DataHub data source. | |
Destination Topic Creation Method | You can select New Topic or Use Existing Topic.
| |
Destination Topic |
| |
Databricks
Parameter | Description | |
Data Source Configuration | Data Source Type | Select Databricks. |
Datasource | Select a destination data source. You can select a Databricks data source and project. You can also click New to create a data source on the data source page. For more information, see Create a Databricks data source. | |
Time Zone | Time-formatted data is processed based on the current time zone. By default, this is the time zone configured in the selected data source and cannot be modified. Note Time zone conversion is supported only when the source data source type is MySQL or PostgreSQL and the destination data source type is Databricks. | |
Sink Table Creation Configuration | Table Name Transform | Sink table names can contain only letters, digits, and underscores (_). If a source table name contains other characters, you must configure a table name transform rule. Click Configure Table Name Transform to open the Configure Table Name Transform Rules dialog box.
Note
|
Partition Format | You can select No Partition or Multiple Partitions. | |
Partition Interval | If you set Partition Format to No Partition, you cannot configure the partition interval. If you set Partition Format to Multiple Partitions, you can set the partition interval to hour or day. Note
| |
SelectDB
Parameter | Description | |
Data Source Configuration | Data Source Type | Select SelectDB. |
Datasource | Select a data source. You can also click New to create a data source on the Datasource page. For more information, see Create a SelectDB data source. | |
Sink Table Creation Configuration | Table Name Transform | Sink table names can contain only letters, digits, and underscores (_). If a source table name contains other characters, you must configure a table name transform rule. Click Configure Table Name Transform to open the Configure Table Name Transform Rules dialog box.
Note
|
Hive
Parameter | Description | |
Data Source Configuration | Data Source Type | Select Hive. |
Datasource | Select a data source. You can also click New to create a data source on the Datasource page. For more information, see Create a Hive data source. | |
Sink Table Creation Configuration | Data lake table format | You can select None, Hudi, Iceberg, or Paimon.
Note This item can be configured only when Data lake table format configuration is enabled for the selected Hive data source. |
Hudi Table Type/Paimon Table Type | For Hudi Table Type, you can select MOR (merge on read) or COW (copy on write). For Paimon Table Type, you can select MOR (merge on read), COW (copy on write), or MOW (merge on write). Note This item can be configured only when Data lake table format is set to Hudi or Paimon. | |
Table Creation Execution Engine | You can select Hive or Spark. If you select a data lake table format, Spark is selected by default.
| |
Table Name Transform | Sink table names can contain only letters, digits, and underscores (_). If a source table name contains other characters, you must configure a table name transform rule. Click Configure Table Name Transform to open the Configure Table Name Transform Rules dialog box.
Note
| |
Partition Format | You can select Single Partition, Multiple Partitions, or Fixed Partition. Note If you select Single Partition or Fixed Partition, the default partition field name is | |
Partition Interval | The default value is hour. You can also select day.
Note This configuration item is supported only when Partition Format is set to Single Partition or Multiple Partitions. | |
Partition Value | Enter a fixed partition value, for example, 20250101. Note This configuration item is supported only when Partition Format is set to Fixed Partition. | |
Hologres
Parameter | Description | |
Data Source Configuration | Data Source Type | Select Hologres. |
Datasource | Select a destination data source. You can select a Hologres data source and project. You can also click New to create a data source on the data source page. For more information, see Create a Hologres data source. | |
Schema | Select a destination schema. | |
Sink Table Creation Configuration | Table Name Transform | Sink table names can contain only letters, digits, and underscores (_). If a source table name contains other characters, you must configure a table name transform rule. Click Configure Table Name Transform to open the Configure Table Name Transform Rules dialog box.
Note
|
StarRocks
Parameter | Description | |
Data Source Configuration | Data Source Type | Select StarRocks. |
Datasource | Select a data source. You can also click New to create a data source on the Datasource page. For more information, see Create a StarRocks data source. | |
Sink Table Creation Configuration | Table Name Transform | Sink table names can contain only letters, digits, and underscores (_). If a source table name contains other characters, you must configure a table name transform rule. Click Configure Table Name Transform to open the Configure Table Name Transform Rules dialog box.
Note
|
Mapping configuration
Mapping configuration is not supported if the destination data source is DataHub or Kafka (with a single destination topic).
If the destination data source is an external data source, the sink table names in the mapping configuration are retrieved from the Metadata Center. In this case, the sink table creation method does not support auto-create table. You must manually create the sink table in the database.
Destination data source is not Kafka

Block | Description |
① View additional fields | During real-time incremental synchronization, additional fields are automatically added when a table is auto-created to facilitate data use. Click View additional fields to view the fields. In the Additional Fields dialog box, you can view the currently added fields. Important
Click View DDL for Adding Fields to view the DDL statement for adding the additional fields. Note Viewing additional fields is not supported when the source data source type is Kafka. |
② Search and filter area | Search by Source Table and Sink Table Name. To quickly filter sink tables, click the |
③ Add global fields, Refresh mapping |
|
④ Destination database list | The destination database list includes Serial Number, Source Table, Mapping Status, Sink Table Creation Method, and Sink Table Name. You can also add fields, view fields, refresh, or delete a sink table.
|
⑤ Batch operations | You can Delete sink tables in batches. |
Destination data source is Kafka (with multiple destination topics)

Block | Description |
① Search and filter area | Search by Source Table and Destination Topic Name. To quickly filter sink tables, click the |
② Refresh mapping | To refresh the sink table configuration list, click Refresh mapping. Important If the destination topic configuration already has content, reselecting the data source type and data source will reset the destination topic list and mapping. Proceed with caution. |
③ List | The list includes Serial Number, Source Table, Mapping Status, Destination Topic Creation Method, and Destination Topic Name. You can also delete a sink table.
|
④ Batch operations | You can Delete sink tables in batches. |
DDL processing policy
DDL processing policies are not supported when the source data source type is DataHub or Kafka.
DDL processing policies are not supported when the destination data source type is PostgreSQL or Hive (Hudi table format).
When the destination data source type is Hive (Hudi table format) and the data lake table format is Hudi, all DDL processing policies support only Ignore.
When the source data source type is Kafka, all DDL processing policies support only Ignore.
Data cannot be synchronized for new columns added to existing partitions of Hive or MaxCompute tables; in existing partitions, the new columns are NULL. Data for these columns is synchronized starting from the next new partition.
Normal processing: Applies to DDL statements such as Create Table and Add Column (and also deleting columns, renaming columns, and modifying column types). The DDL information is passed to the destination data source for processing; how it is processed varies by destination data source.
Ignore: Discards this DDL information and does not send it to the destination data source.
Error: Stops the real-time sync task with an error status.
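The three policies amount to a simple dispatch on each incoming DDL message. The sketch below is illustrative, not Dataphin's implementation:

```python
def apply_ddl_policy(ddl_type, policy):
    """Dispatch a DDL message according to the configured policy."""
    if policy == "Normal processing":
        # Pass the DDL through; the destination decides how to apply it.
        return "forward " + ddl_type + " to destination"
    if policy == "Ignore":
        # Drop the DDL; the destination never sees it.
        return "discard"
    if policy == "Error":
        # Stop the sync task and surface an error status.
        raise RuntimeError(ddl_type + ": real-time sync task stopped with an error")
    raise ValueError("unknown policy: " + policy)
```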
Step 3: Configure real-time integration task properties
Click Resource Configuration in the top menu bar of the current real-time integration task tab, or click Property in the right sidebar to open the Property panel.
Configure the Basic Information and Resource Configuration for the current real-time integration task.
Basic Information: Select the Development Owner and Operation Owner for the current real-time integration task, and enter a Description for the task. The description can be up to 1,000 characters long.
Resource Configuration: For more information, see Real-time integration resource configuration.
Step 4: Submit the real-time integration task
Click Submit to submit the current real-time integration task.
In the Submit dialog box, enter Submission notes and click OK and Submit.
After submission, you can view the submission details in the Submit dialog box.
If the project is in Dev-Prod mode, you must publish the real-time integration task to the production environment. For more information, see Manage publish tasks.
What to do next
You can view and manage the real-time integration task in the Operation Center to ensure that it runs as expected. For more information, see View and manage real-time tasks.