DataWorks:Oracle

Last Updated:Mar 27, 2026

The Oracle data source lets you read from and write to an Oracle database using a Data Synchronization task in DataWorks. Configure the task with the Codeless UI for a guided experience, or use the Code Editor for full script control.

Sync capabilities overview

Before you start, confirm that the Oracle data source supports your use case:

| Capability | Supported | Notes |
| --- | --- | --- |
| Batch read | Yes | All supported versions |
| Batch write | Yes | Uses INSERT INTO only; conflicts are treated as dirty data |
| Real-time read (CDC) | Yes | Requires archive logs and supplemental logging on the primary database |
| Read from views | Batch only | Real-time Synchronization does not support views |
| Read from standby database | Yes (physical only) | Archive log generation speed determines latency |
| Index-organized tables (IOTs) | No | |
| Full-database synchronization | Yes | Batch, real-time, and sharding modes |
| Tables where all columns are LOB types | No | BLOB, CLOB, or NCLOB all-LOB tables are not supported |

Supported versions

Batch read and write

| Version | Oracle Reader | Oracle Writer |
| --- | --- | --- |
| Oracle 11.2 (11gR2) | Supported | Supported |
| Oracle 12.1 (12cR1) | Supported | Supported |
| Oracle 12.2 (12cR2) | Supported (new version features not supported) | Supported (new version features not supported) |
| Oracle 18.3 | Supported (new version features not supported) | Supported (new version features not supported) |
| Oracle 19.x | Supported (new version features not supported) | Supported (new version features not supported) |
| Oracle 21.1 | Supported (new version features not supported) | Supported (new version features not supported) |
Important

The Oracle Writer plugin uses the ojdbc6-12.1.1.jar driver.

Real-time read

Supported versions: Oracle 11g R2 and later, 12c non-CDB, 18c non-CDB, 19c non-CDB, and 19c CDB.

Not supported: Oracle 12c CDB and 18c CDB.

Note

A Container Database (CDB) is an architecture introduced in Oracle Database 12c. It can host one or more Pluggable Databases (PDBs). To use Oracle as a real-time sync source, enable the Archive Log feature. See Prepare the Oracle environment.

Limitations

  • NLS_CHARACTERSET and NLS_NCHAR_CHARACTERSET must be set to one of: AL32UTF8, AL16UTF16, ZHS16GBK, or UTF8.

  • Real-time Synchronization supports up to 500 GB of incremental data per day for a single Oracle instance.

  • Real-time reads from a standby database are limited to physical standby databases. DataWorks pulls archive logs, which introduces higher latency than syncing from a primary database. Latency depends on archive log generation speed.

  • Real-time Synchronization does not support views. Batch Synchronization supports reading views.

  • DataWorks uses Oracle LogMiner for real-time reads. Supplemental logging and archive logs must be enabled on the Oracle primary database.

  • Index-organized tables (IOTs) are not supported.

  • Tables where all columns are LOB types (BLOB, CLOB, or NCLOB) are not supported.

  • Table and column names cannot contain Chinese characters.

  • Table names cannot exceed 30 bytes.

Supported data types

Support by operation

| Data type | Batch read | Batch write | Real-time read |
| --- | --- | --- | --- |
| NUMBER | Supported | Supported | Supported |
| BINARY_FLOAT | Supported | Supported | Supported |
| BINARY_DOUBLE | Supported | Supported | Supported |
| CHAR | Supported | Supported | Supported |
| NCHAR | Supported | Supported | Supported |
| VARCHAR2 | Supported | Supported | Supported |
| NVARCHAR2 | Supported | Supported | Supported |
| DATE | Supported | Supported | Supported |
| TIMESTAMP | Supported | Supported | Supported |
| TIMESTAMP WITH TIME ZONE | Supported | Supported | Not supported |
| TIMESTAMP WITH LOCAL TIME ZONE | Supported | Supported | Not supported |
| CLOB | Supported | Supported | Supported |
| BLOB | Supported | Supported | Supported |
| RAW | Supported | Supported | Supported |
| ROWID | Not supported | Not supported | Supported |
| UROWID | Not supported | Not supported | Supported |
| FLOAT | Supported | Supported | Supported |
| INTERVAL DAY TO SECOND | Not supported | Not supported | Supported |
| INTERVAL YEAR TO MONTH | Not supported | Not supported | Supported |
| NCLOB | Supported | Supported | Not supported |
| STRUCT | Supported | Supported | Not supported |
| BFILE | Not supported | Not supported | Not supported |
| LONG | Not supported | Not supported | Not supported |
| LONG RAW | Not supported | Not supported | Not supported |
| User-defined types | Not supported | Not supported | Not supported |
| AnyType | Not supported | Not supported | Not supported |
| AnyData | Not supported | Not supported | Not supported |
| AnyDataSet | Not supported | Not supported | Not supported |
| XmlType | Not supported | Not supported | Not supported |
| Spatial types | Not supported | Not supported | Not supported |
| Media types | Not supported | Not supported | Not supported |

Oracle Reader data type mappings

| Category | Oracle data types |
| --- | --- |
| Integer | NUMBER, ROWID, INTEGER, INT, SMALLINT |
| Floating-point | NUMERIC, DECIMAL, FLOAT, DOUBLE PRECISION, REAL |
| String | LONG, CHAR, NCHAR, VARCHAR, VARCHAR2, NVARCHAR2, CLOB, NCLOB, CHARACTER, CHARACTER VARYING, CHAR VARYING, NATIONAL CHARACTER, NATIONAL CHAR, NATIONAL CHARACTER VARYING, NATIONAL CHAR VARYING, NCHAR VARYING |
| Date and time | TIMESTAMP, DATE |
| Boolean | BIT, BOOL |
| Binary | BLOB, BFILE, RAW, LONG RAW |

Prepare the Oracle environment

Complete the following steps before configuring a Data Synchronization task in DataWorks.

Verify the Oracle database version

Check your Oracle database version to confirm it is supported for your sync type.

  1. Run one of the following statements to get the version:

    select * from v$version;

    or:

    select version from v$instance;
  2. If the version is 12c or 18c, run the following statement to determine whether the database is a CDB. Real-time Synchronization does not support CDB-type Oracle databases of version 12c or 18c.

    select name, cdb, open_mode, con_id from v$database;

Create an account and grant permissions

Create a dedicated database account for Data Synchronization and grant the required permissions.

  1. Create the account. See CREATE USER in the Oracle documentation.

  2. Grant the base permissions required for all sync types. Replace <sync_user> with your actual account name.

    grant create session to <sync_user>;
    grant connect to <sync_user>;
    grant select on nls_database_parameters to <sync_user>;
    grant select on all_users to <sync_user>;
    grant select on all_objects to <sync_user>;
    grant select on DBA_MVIEWS to <sync_user>;
    grant select on DBA_MVIEW_LOGS to <sync_user>;
    grant select on DBA_CONSTRAINTS to <sync_user>;
    grant select on DBA_CONS_COLUMNS to <sync_user>;
    grant select on all_tab_cols to <sync_user>;
    grant select on sys.obj$ to <sync_user>;
    grant select on SYS.COL$ to <sync_user>;
    grant select on sys.USER$ to <sync_user>;
    grant select on sys.cdef$ to <sync_user>;
    grant select on sys.con$ to <sync_user>;
    grant select on all_indexes to <sync_user>;
    grant select on v_$database to <sync_user>;
    grant select on V_$ARCHIVE_DEST to <sync_user>;
    grant select on v_$log to <sync_user>;
    grant select on v_$logfile to <sync_user>;
    grant select on v_$archived_log to <sync_user>;
    grant select on V_$LOGMNR_CONTENTS to <sync_user>;
    grant select on DUAL to <sync_user>;
    grant select on v_$parameter to <sync_user>;
    grant select any transaction to <sync_user>;
    grant execute on SYS.DBMS_LOGMNR to <sync_user>;
    grant alter session to <sync_user>;
    grant select on dba_objects to <sync_user>;
    grant select on v_$standby_log to <sync_user>;
    grant select on v_$ARCHIVE_GAP to <sync_user>;
    grant select on sys.ICOL$ to <sync_user>;
    grant select on V$THREAD to <sync_user>;
  3. For Batch Full Synchronization, also grant table read access:

    grant select any table to <sync_user>;
  4. For Oracle 12c and later, grant Log Mining permissions (not required for earlier versions, where Log Mining is built in):

    grant LOGMINING to <sync_user>;

Enable archive logs and supplemental logging

Real-time synchronization requires archive logs and supplemental logging on the primary database.

  1. Enable archive logging:

    shutdown immediate;
    startup mount;
    alter database archivelog;
    alter database open;
  2. Enable supplemental logging at the appropriate scope. Database level (applies to all tables):

    alter database add supplemental log data(primary key) columns;
    alter database add supplemental log data(unique) columns;

    Table level (applies to specific tables):

    alter table <schema>.<table> add supplemental log data(primary key) columns;
    alter table <schema>.<table> add supplemental log data(unique) columns;
  3. Switch the redo log file. Run the following command five times to flush the current log and switch to the next one:

    Note

    Each run forces Oracle to close the current redo log file and switch to the next, so that recent transaction records are written to the archive logs. Running it several times helps prevent loss of transaction records and simplifies data recovery.

    alter system switch logfile;
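As a quick sanity check after the steps above, you can confirm both settings from the v$database view (a sketch, assuming a session with privileges to query v$database):

```sql
-- Expect ARCHIVELOG after enabling archive logging
select log_mode from v$database;

-- Expect YES in both columns after enabling database-level supplemental logging
select supplemental_log_data_pk, supplemental_log_data_ui from v$database;
```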

Check the database character encoding

Run the following to check the character encoding settings:

select * from v$nls_parameters where PARAMETER IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');
  • NLS_CHARACTERSET: the database character set

  • NLS_NCHAR_CHARACTERSET: the national character set

Data Synchronization supports only UTF8, AL32UTF8, AL16UTF16, and ZHS16GBK. If your database uses a different encoding, change it before running any sync task.

Check table data types

To view the data types for a specific table, run:

select COLUMN_NAME, DATA_TYPE from all_tab_columns where TABLE_NAME='<tablename>';

Replace <tablename> with the actual table name.

  • COLUMN_NAME: the column name

  • DATA_TYPE: the column's data type

  • all_tab_columns: a system view containing column metadata for all tables, views, and clusters

Add a data source

Add the Oracle data source to DataWorks before creating a sync task. Follow the instructions in Data source management. Parameter descriptions are available in the DataWorks console when you add the data source.

Configure sync tasks

Single-table batch synchronization

Configure the task in the Codeless UI or Code Editor.

For a complete parameter reference and script template, see Appendix: Script examples and parameters.

Single-table real-time synchronization

See Configure a real-time synchronization task in DataStudio.

Full-database synchronization

See Configure a real-time full-database synchronization task for batch, real-time, and sharding configurations.

FAQ

Why does real-time sync show recurring errors?

See Recurring errors in real-time sync tasks for a list of common errors and fixes.

What happens when a primary-to-standby failover occurs during sync?

If the primary database fails and the system switches to the standby database, the standby continuously restores data from the primary using archive logs. Because of the inherent replication lag, compounded by network latency, the standby's data can differ from the primary's state at the time of failure. Data synchronized during this window may not represent a complete snapshot of the primary.

Does Oracle Reader guarantee consistency when reading concurrently?

In single-thread mode, Oracle Reader uses the database's native snapshot feature to ensure consistency: it does not fetch data written by other clients while a sync task is running. In concurrent mode (using splitPk), multiple sub-tasks run in separate transactions at different times, so the result is not a consistent snapshot. To maintain consistency in concurrent reads, either use single-thread synchronization without data sharding, or prevent writes to the source table during the sync (for example, by applying a table lock).
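For example, one way to block writes for the duration of a single-thread sync is a SHARE mode lock, which still allows queries but blocks concurrent DML (the schema and table names below are illustrative):

```sql
-- Hold a lock that allows reads but blocks inserts, updates, and deletes
-- on the source table. Release it with COMMIT or ROLLBACK once the sync finishes.
lock table AUTOTEST.table01 in share mode;
```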

How do I perform incremental synchronization?

Oracle Reader uses JDBC SELECT statements. Use a WHERE clause to filter for new or modified records:

  • If your tables have a modify timestamp column updated on every insert, update, and logical delete, filter by records with a timestamp later than the last sync.

  • If your tables use an auto-incrementing ID for append-only data, filter by records with an ID greater than the maximum ID from the previous sync.

If your data model has no field that distinguishes new or changed records, perform a full sync instead.
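Both approaches can be expressed as a WHERE filter or a querySql statement. A sketch, assuming a hypothetical orders table with a modify_time column and an auto-incrementing id (the ${bizdate} scheduling parameter and the tracked maximum ID are illustrative):

```sql
-- Timestamp-based: rows inserted or modified on or after the current sync window
select id, status, modify_time
from AUTOTEST.orders
where modify_time >= to_date('${bizdate}', 'yyyymmdd');

-- ID-based: rows appended since the previous run
-- (the maximum ID from the last sync is tracked outside the task)
select id, status
from AUTOTEST.orders
where id > 102400;
```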

Why doesn't Oracle Reader require an explicit character encoding?

Oracle Reader uses JDBC, which automatically detects and converts character encodings. No manual encoding configuration is needed.

Is the querySql parameter safe against SQL injection?

Oracle Reader does not validate the SQL passed to the querySql parameter. Treat this parameter as trusted input and do not expose it to untrusted user input.

Appendix: Script examples and parameters

Reader script example

All Reader parameters are set under "category": "reader" in the job script.

{
    "type": "job",
    "version": "2.0",
    "steps": [
        {
            "stepType": "oracle",
            "parameter": {
                "selectedDatabase": "AUTOTEST",
                "indexes": [],
                "datasource": "oracle_test",
                "envType": 0,
                "useSpecialSecret": true,
                "column": [
                    "id"
                ],
                "where": "",
                "splitPk": "id",
                "encoding": "UTF-8",
                "table": "AUTOTEST.table01"
            },
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "odps",
            "parameter": {},
            "name": "Writer",
            "category": "writer"
        },
        {
            "name": "Processor",
            "stepType": null,
            "category": "processor",
            "copies": 1,
            "parameter": {
                "nodes": [],
                "edges": [],
                "groups": [],
                "version": "2.0"
            }
        }
    ],
    "setting": {
        "executeMode": null,
        "errorLimit": {
            "record": ""
        },
        "speed": {
            "concurrent": 2,
            "throttle": false
        }
    },
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    }
}

Reader parameters

| Parameter | Description | Required | Default |
| --- | --- | --- | --- |
| datasource | The data source name. Must match the name of the data source added in DataWorks. | Yes | None |
| selectedDatabase | The schema of the database to sync. | Yes | None |
| table | The source table name in schema.tableName format. For example, if selectedDatabase is AUTOTEST and the table is table01, set this to AUTOTEST.table01. | Yes | None |
| column | The columns to sync, as a JSON array. To sync all columns, use ["*"]. Supports column reordering, constants, expressions, and null values. | Yes | None |
| splitPk | The column used to shard data for concurrent sync. Use the primary key for even distribution. Supports numeric and string values. If not set, the table is read by a single channel. When reading a view, splitPk cannot be ROWID. | No | None |
| splitMode | The sharding strategy. Set to averageInterval when splitPk is numeric; set to randomSampling when splitPk is a string. Must be used with splitPk. | No | randomSampling |
| splitFactor | The number of shards per concurrent thread. Total shards = concurrent threads x splitFactor. Keep between 1 and 100 to avoid out-of-memory (OOM) errors. | No | 5 |
| where | A SQL WHERE filter condition. Combined with column and table to form the read query. Useful for incremental synchronization. If blank, all rows are synced. | No | None |
| querySql | A custom SELECT statement for complex filters such as multi-table joins. When set, table, column, and where are ignored. Available in script mode only. | No | None |
| fetchSize | The number of rows fetched per network round trip. Larger values improve read throughput, but values above 2,048 may cause OOM errors. | No | 1024 |
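As an illustration of querySql, the reader step's parameter object might look like the following fragment (the join and the table and column names are hypothetical; remember that table, column, and where are ignored when querySql is set):

```json
"parameter": {
    "datasource": "oracle_test",
    "querySql": "select a.id, b.name from AUTOTEST.t_order a join AUTOTEST.t_user b on a.user_id = b.id",
    "fetchSize": 1024
}
```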

Writer script example

All Writer parameters are set under "category": "writer" in the job script.

{
    "type": "job",
    "version": "2.0",
    "steps": [
        {
            "stepType": "stream",
            "parameter": {},
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "oracle",
            "parameter": {
                "postSql": [],
                "datasource": "",
                "session": [],
                "column": [
                    "id",
                    "name"
                ],
                "encoding": "UTF-8",
                "batchSize": 1024,
                "table": "",
                "preSql": []
            },
            "name": "Writer",
            "category": "writer"
        }
    ],
    "setting": {
        "errorLimit": {
            "record": "0"
        },
        "speed": {
            "throttle": true,
            "concurrent": 1,
            "mbps": "12"
        }
    },
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    }
}

Writer parameters

| Parameter | Description | Required | Default |
| --- | --- | --- | --- |
| datasource | The data source name. Must match the name of the data source added in DataWorks. | Yes | None |
| table | The destination table name. If the schema differs from the configured username, use schema.table format. | Yes | None |
| column | The destination columns to write, as a JSON array. For example: ["id", "name", "age"]. To write to all columns in order, use ["*"]. | Yes | None |
| writeMode | The write mode. Only INSERT INTO is supported. On a primary key or unique index conflict, the conflicting row is treated as dirty data. | No | insert into |
| preSql | SQL statements to run before the sync task. In wizard mode, one statement is supported; in script mode, multiple are supported. | No | None |
| postSql | SQL statements to run after the sync task completes. In wizard mode, one statement is supported; in script mode, multiple are supported. | No | None |
| batchSize | The number of rows submitted per batch. Larger values reduce network round trips and improve throughput, but very large values may cause OOM errors. | No | 1024 |
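For example, preSql and postSql might be used to empty the destination table before a full load and record completion afterwards. A sketch of the writer parameter fragment (the statements, the audit table, and the table names are illustrative):

```json
"preSql": ["truncate table AUTOTEST.table01"],
"postSql": ["insert into AUTOTEST.sync_audit values ('table01', sysdate)"]
```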