
DataWorks: ApsaraDB for OceanBase

Last Updated: Mar 27, 2026

The ApsaraDB for OceanBase data source lets you read from and write to ApsaraDB for OceanBase in DataWorks. Configure it to run batch and real-time data synchronization tasks.

Supported versions and tenant modes

The ApsaraDB for OceanBase Reader and Writer plug-ins support OceanBase 2.x, 3.x, and 4.x. The following table shows which capabilities are available for each version and tenant mode.

| Capability | OceanBase version | Tenant mode |
| --- | --- | --- |
| Batch read | 2.x, 3.x, 4.x | Oracle and MySQL |
| Batch write | 2.x, 3.x, 4.x | Oracle and MySQL |
| Real-time read | 2.x, 3.x, 4.x (V3.0 or later for entire-database synchronization) | MySQL only |

Prerequisites

Before you begin, make sure the account that you use for synchronization has the required permissions. The permissions depend on your synchronization mode:

| Sync mode | Required permissions |
| --- | --- |
| Batch read | If you use partition-based reads, the account needs system permissions. |
| Batch write | At least insert into permissions. Additional permissions may be required based on the SQL statements in preSql and postSql. |
| Real-time read | The binlog feature must be enabled. See Binlog-related operations (Alibaba Cloud instances) and Binlog-related operations (OceanBase Cloud instances). |

Add a data source

Add ApsaraDB for OceanBase as a data source in DataWorks before configuring any synchronization task. Follow the instructions in Data source management. Parameter descriptions are available in the DataWorks console when you add the data source.

Configure a data synchronization task

Choose the synchronization mode that fits your use case. Use batch synchronization if you need periodic bulk transfers or incremental loads. Use real-time synchronization if you need low-latency change capture from OceanBase to a downstream system.

For real-time synchronization of an entire database, the ApsaraDB for OceanBase instance must be version V3.0 or later. Real-time synchronization also requires MySQL tenant mode.

| Sync mode | Supported sources | Configuration guide |
| --- | --- | --- |
| Batch synchronization (single table) | All data sources supported by Data Integration | Configure a task in the codeless UI |
| Real-time synchronization (single table) | Kafka | Configure a real-time full-database synchronization task |
| Real-time synchronization (entire database) | MySQL | Configure a real-time full-database synchronization task |

Limitations

Batch read

  • ApsaraDB for OceanBase supports two tenant modes: Oracle and MySQL. The where clause and function columns in the column parameter must use syntax compatible with your instance's tenant mode.

  • Reading from views is supported.

  • Do not modify data while a batch synchronization task is running. Concurrent modifications can cause data duplication or data loss.

  • Partition-based reads require the account to have system permissions.

Batch write

  • Use the batch write method, which buffers rows in memory and submits them in a single request after the buffered count reaches the configured batchSize. This reduces round trips to the database.

  • The preSql and postSql syntax must be compatible with the tenant mode of your instance.

Real-time read

  • Only the MySQL tenant mode is supported.

  • Data sources that use a connection string are not supported for full-database real-time synchronization.

  • ApsaraDB for OceanBase is a distributed relational database that consolidates multiple physically distributed databases into a single logical database. When synchronizing in real time to AnalyticDB for MySQL, only a single physical database is supported — logical databases are not.

Appendix: Script mode parameters

Use the code editor to configure batch synchronization tasks as JSON scripts. All examples use the unified script format. For format requirements, see Use the code editor.

Reader

Script example

{
    "type": "job",
    "steps": [
        {
            "stepType": "apsaradb_for_OceanBase",
            "parameter": {
                "datasource": "",
                "column": [
                    "id",
                    "name"
                ],
                "where": "",
                "splitPk": ""
            },
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "stream",
            "parameter": {
                "print": false,
                "fieldDelimiter": ","
            },
            "name": "Writer",
            "category": "writer"
        }
    ],
    "version": "2.0",
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    },
    "setting": {
        "errorLimit": {
            "record": "0"
        },
        "speed": {
            "throttle": true,
            "concurrent": 1,
            "mbps": "12"
        }
    }
}

Reader parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| datasource | Yes | None | The ApsaraDB for OceanBase data source name. Use this parameter or the jdbcUrl + username + password combination. |
| jdbcUrl | No | None | The JDBC URL of the source database. Specify it as a JSON array to configure multiple URLs; the plug-in tries them in order until a connection succeeds, and the task fails if all connections fail. Must be nested within connection. Format: jdbc:oceanbase://127.0.0.1:3306/database. |
| username | No | None | The database account username. |
| password | No | None | The database account password. |
| table | Yes | None | The tables to read from, specified as a JSON array. All tables must share the same schema. Must be nested within connection. |
| column | Yes | None | The columns to read, specified as a JSON array. Set it to ["*"] to read all columns. Supports column subsets, any column order, constant values (for example, '123'), and function columns (for example, date('now')). Cannot be blank. |
| splitPk | No | Blank | The column used for data sharding. Enables concurrent reads, which improves performance. Use the primary key for even data distribution. Supports integer data types only; other types cause an error. Leave it blank to read with a single thread. |
| where | No | None | A filter condition appended to the SQL query. Leave it blank to run a full table synchronization. Supports incremental synchronization. For example, set it to gmt_create>$bizdate. |
| querySql | No | None | A custom SQL statement for complex filter conditions. When set, the plug-in ignores table, column, where, and splitPk. |
| fetchSize | No | 1,024 | The number of records fetched per batch. Larger values reduce network round trips. Values above 2,048 may cause out-of-memory (OOM) errors. |
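
The script example above uses a datasource name. When you connect with a JDBC URL instead, jdbcUrl and table are nested within connection, as the parameter table notes. The following fragment is a sketch of such a reader parameter section that also combines an incremental where filter with splitPk for concurrent reads; the URL, account, table, and column names are placeholders, not values from your environment.

```json
{
    "stepType": "apsaradb_for_OceanBase",
    "parameter": {
        "username": "sync_user",
        "password": "********",
        "column": ["id", "name", "gmt_create"],
        "where": "gmt_create>$bizdate",
        "splitPk": "id",
        "fetchSize": 1024,
        "connection": [
            {
                "jdbcUrl": ["jdbc:oceanbase://127.0.0.1:3306/database"],
                "table": ["orders"]
            }
        ]
    },
    "name": "Reader",
    "category": "reader"
}
```

Because splitPk is set to the integer primary key, the plug-in can shard the table and read shards concurrently; with where set, each run reads only rows created on or after the business date.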

Writer

Script example

{
    "type": "job",
    "version": "2.0",
    "steps": [
        {
            "stepType": "stream",
            "parameter": {},
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "apsaradb_for_OceanBase",
            "parameter": {
                "datasource": "The data source name",
                "table": "apsaradb_for_OceanBase_table",
                "column": [
                    "id",
                    "name"
                ],
                "obWriteMode": "insert",
                "preSql": [
                    "delete from @table where db_id = -1"
                ],
                "postSql": [
                    "update @table set db_modify_time = now() where db_id = 1"
                ]
            },
            "name": "Writer",
            "category": "writer"
        }
    ],
    "setting": {
        "errorLimit": {
            "record": "0"
        },
        "speed": {
            "throttle": true,
            "concurrent": 1,
            "mbps": "12"
        }
    },
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    }
}

Writer parameters

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| datasource | No | None | The ApsaraDB for OceanBase data source name. |
| jdbcUrl | Yes | None | The JDBC URL of the destination database. Specify only one URL; writing to multiple primary databases is not supported. Must be nested within connection. Format: jdbc:oceanbase://127.0.0.1:3306/database. |
| username | Yes | None | The database account username. |
| password | Yes | None | The database account password. |
| table | Yes | None | The destination table name. Only one table is supported. Must be nested within connection. |
| column | Yes | None | The destination columns to write to, specified as a JSON array. Example: ["id", "name", "age"]. Cannot be blank. |
| obWriteMode | No | insert | The write mode. Valid values: insert uses insert into ... (conflicting rows fail); update uses ... on duplicate key update ... (MySQL tenant mode; conflicting rows are updated); merge uses merge into ... when matched then update ... (Oracle tenant mode; conflicting rows are updated). |
| onClauseColumns | No | None | Oracle tenant mode only. Required when obWriteMode is merge; otherwise, the plug-in falls back to insert mode. Specify primary key or unique constraint columns, separated by commas. Example: ID,C1. |
| obUpdateColumns | No | All columns | The columns to update when a write conflict occurs. Applies only when obWriteMode is merge or update. Separate multiple columns with commas. Example: c2,c3. |
| preSql | No | None | An SQL statement to run before the task writes data. Use @table to reference the destination table name. |
| postSql | No | None | An SQL statement to run after the task writes data. |
| batchSize | No | 1,024 | The number of records submitted per batch. Larger values reduce network round trips. Values above 2,048 may cause out-of-memory (OOM) errors. |
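
The script example above writes to an Oracle-mode tenant with obWriteMode set to merge, which requires onClauseColumns, as the parameter table notes. The following fragment sketches a writer parameter section that uses connection nesting together with merge-mode conflict handling; the URL, account, table, and column identifiers are placeholders, not values from your environment.

```json
{
    "stepType": "apsaradb_for_OceanBase",
    "parameter": {
        "username": "sync_user",
        "password": "********",
        "column": ["ID", "C1", "C2", "C3"],
        "obWriteMode": "merge",
        "onClauseColumns": "ID,C1",
        "obUpdateColumns": "C2,C3",
        "batchSize": 1024,
        "preSql": ["delete from @table where db_id = -1"],
        "connection": [
            {
                "jdbcUrl": "jdbc:oceanbase://127.0.0.1:3306/database",
                "table": ["T_TARGET"]
            }
        ]
    },
    "name": "Writer",
    "category": "writer"
}
```

With this configuration, rows whose ID and C1 values match an existing row update only C2 and C3; all other rows are inserted. If onClauseColumns were omitted, the plug-in would fall back to insert mode and conflicting rows would fail.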