The Oracle data source lets you read from and write to an Oracle database using a Data Synchronization task in DataWorks. Configure the task with the Codeless UI for a guided experience, or use the Code Editor for full script control.
Sync capabilities overview
Before you start, confirm that the Oracle data source supports your use case:
| Capability | Supported | Notes |
|---|---|---|
| Batch read | Yes | All supported versions |
| Batch write | Yes | Uses INSERT INTO only; conflicts are treated as dirty data |
| Real-time read (CDC) | Yes | Requires archive logs and supplemental logging on the primary database |
| Read from views | Batch only | Real-time Synchronization does not support views |
| Read from standby database | Yes (physical only) | Archive log generation speed determines latency |
| Index-organized tables (IOTs) | No | — |
| Full-database synchronization | Yes | Batch, real-time, and sharding modes |
| Tables where all columns are LOB types | No | BLOB, CLOB, or NCLOB all-LOB tables are not supported |
Supported versions
Batch read and write
| Version | Oracle Reader | Oracle Writer |
|---|---|---|
| Oracle 11.2 (11gR2) | Supported | Supported |
| Oracle 12.1 (12cR1) | Supported | Supported |
| Oracle 12.2 (12cR2) | Supported (new version features not supported) | Supported (new version features not supported) |
| Oracle 18.3 | Supported (new version features not supported) | Supported (new version features not supported) |
| Oracle 19.x | Supported (new version features not supported) | Supported (new version features not supported) |
| Oracle 21.1 | Supported (new version features not supported) | Supported (new version features not supported) |
The Oracle Writer plugin uses the ojdbc6-12.1.1.jar driver.
Real-time read
Supported versions: Oracle 11g R2 and later, 12c non-CDB, 18c non-CDB, 19c non-CDB, and 19c CDB.
Not supported: Oracle 12c CDB and 18c CDB.
A Container Database (CDB) is an architecture introduced in Oracle Database 12c. It can host one or more Pluggable Databases (PDBs). To use Oracle as a real-time sync source, enable the Archive Log feature. See Prepare the Oracle environment.
Limitations
- `NLS_CHARACTERSET` and `NLS_NCHAR_CHARACTERSET` must be set to one of: `AL32UTF8`, `AL16UTF16`, `ZHS16GBK`, or `UTF8`.
- Real-time Synchronization supports up to 500 GB of incremental data per day for a single Oracle instance.
- Real-time reads from a standby database are limited to physical standby databases. DataWorks pulls archive logs, which introduces higher latency than syncing from a primary database. Latency depends on archive log generation speed.
- Real-time Synchronization does not support views. Batch Synchronization supports reading views.
- DataWorks uses Oracle LogMiner for real-time reads. Supplemental logging and archive logs must be enabled on the Oracle primary database.
- Index-organized tables (IOTs) are not supported.
- Tables where all columns are LOB types (BLOB, CLOB, or NCLOB) are not supported.
- Table and column names cannot contain Chinese characters.
- Table names cannot exceed 30 bytes.
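Some of these limits can be checked before you configure a task. The following queries are a sketch and assume access to the `dba_tables` dictionary view; `lengthb` returns the byte length of an identifier, and `iot_type` is non-null for index-organized tables:

```sql
-- Find tables whose names exceed 30 bytes
select owner, table_name
from dba_tables
where lengthb(table_name) > 30;

-- Find index-organized tables (IOTs), which are not supported
select owner, table_name, iot_type
from dba_tables
where iot_type is not null;
```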
Supported data types
Support by operation
| Data type | Batch read | Batch write | Real-time read |
|---|---|---|---|
| NUMBER | Supported | Supported | Supported |
| BINARY FLOAT | Supported | Supported | Supported |
| BINARY DOUBLE | Supported | Supported | Supported |
| CHAR | Supported | Supported | Supported |
| NCHAR | Supported | Supported | Supported |
| VARCHAR2 | Supported | Supported | Supported |
| NVARCHAR2 | Supported | Supported | Supported |
| DATE | Supported | Supported | Supported |
| TIMESTAMP | Supported | Supported | Supported |
| TIMESTAMP WITH TIME ZONE | Supported | Supported | Not supported |
| TIMESTAMP WITH LOCAL TIME ZONE | Supported | Supported | Not supported |
| CLOB | Supported | Supported | Supported |
| BLOB | Supported | Supported | Supported |
| RAW | Supported | Supported | Supported |
| ROWID | Not supported | Not supported | Supported |
| UROWID | Not supported | Not supported | Supported |
| FLOAT | Supported | Supported | Supported |
| INTERVAL DAY TO SECOND | Not supported | Not supported | Supported |
| INTERVAL YEAR TO MONTH | Not supported | Not supported | Supported |
| NCLOB | Supported | Supported | Not supported |
| STRUCT | Supported | Supported | Not supported |
| BFILE | Not supported | Not supported | Not supported |
| LONG | Not supported | Not supported | Not supported |
| LONG RAW | Not supported | Not supported | Not supported |
| User-defined types | Not supported | Not supported | Not supported |
| AnyType | Not supported | Not supported | Not supported |
| AnyData | Not supported | Not supported | Not supported |
| AnyDataSet | Not supported | Not supported | Not supported |
| XmlType | Not supported | Not supported | Not supported |
| Spatial types | Not supported | Not supported | Not supported |
| Media types | Not supported | Not supported | Not supported |
Oracle Reader data type mappings
| Category | Oracle data types |
|---|---|
| Integer | NUMBER, ROWID, INTEGER, INT, SMALLINT |
| Floating-point | NUMERIC, DECIMAL, FLOAT, DOUBLE PRECISION, REAL |
| String | LONG, CHAR, NCHAR, VARCHAR, VARCHAR2, NVARCHAR2, CLOB, NCLOB, CHARACTER, CHARACTER VARYING, CHAR VARYING, NATIONAL CHARACTER, NATIONAL CHAR, NATIONAL CHARACTER VARYING, NATIONAL CHAR VARYING, NCHAR VARYING |
| Date and time | TIMESTAMP, DATE |
| Boolean | BIT, BOOL |
| Binary | BLOB, BFILE, RAW, LONG RAW |
Prepare the Oracle environment
Complete the following steps before configuring a Data Synchronization task in DataWorks.
Verify the Oracle database version
Check your Oracle database version to confirm it is supported for your sync type.
1. Run one of the following statements to get the version:

   ```sql
   select * from v$version;
   ```

   or:

   ```sql
   select version from v$instance;
   ```

2. If the version is 12c or 18c, run the following statement to determine whether the database is a CDB. Real-time Synchronization does not support CDB-type Oracle databases of version 12c or 18c.

   ```sql
   select name, cdb, open_mode, con_id from v$database;
   ```
Create an account and grant permissions
Create a dedicated database account for Data Synchronization and grant the required permissions.
1. Create the account. See CREATE USER in the Oracle documentation.

2. Grant the base permissions required for all sync types. Replace `<sync_user>` with your actual account name.

   ```sql
   grant create session to <sync_user>;
   grant connect to <sync_user>;
   grant select on nls_database_parameters to <sync_user>;
   grant select on all_users to <sync_user>;
   grant select on all_objects to <sync_user>;
   grant select on DBA_MVIEWS to <sync_user>;
   grant select on DBA_MVIEW_LOGS to <sync_user>;
   grant select on DBA_CONSTRAINTS to <sync_user>;
   grant select on DBA_CONS_COLUMNS to <sync_user>;
   grant select on all_tab_cols to <sync_user>;
   grant select on sys.obj$ to <sync_user>;
   grant select on SYS.COL$ to <sync_user>;
   grant select on sys.USER$ to <sync_user>;
   grant select on sys.cdef$ to <sync_user>;
   grant select on sys.con$ to <sync_user>;
   grant select on all_indexes to <sync_user>;
   grant select on v_$database to <sync_user>;
   grant select on V_$ARCHIVE_DEST to <sync_user>;
   grant select on v_$log to <sync_user>;
   grant select on v_$logfile to <sync_user>;
   grant select on v_$archived_log to <sync_user>;
   grant select on V_$LOGMNR_CONTENTS to <sync_user>;
   grant select on DUAL to <sync_user>;
   grant select on v_$parameter to <sync_user>;
   grant select any transaction to <sync_user>;
   grant execute on SYS.DBMS_LOGMNR to <sync_user>;
   grant alter session to <sync_user>;
   grant select on dba_objects to <sync_user>;
   grant select on v_$standby_log to <sync_user>;
   grant select on v_$ARCHIVE_GAP to <sync_user>;
   grant select on sys.ICOL$ to <sync_user>;
   grant select on V$THREAD to <sync_user>;
   ```

3. For Batch Full Synchronization, also grant table read access:

   ```sql
   grant select any table to <sync_user>;
   ```

4. For Oracle 12c and later, grant Log Mining permissions (not required for earlier versions, where Log Mining is built in):

   ```sql
   grant LOGMINING to <sync_user>;
   ```
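After granting, you can sanity-check the result from a session logged in as the sync account. This is a sketch using standard Oracle dictionary views:

```sql
-- System privileges granted directly to the current user
select privilege from user_sys_privs;

-- Object-level grants (the v_$ views, sys.obj$, and so on)
select owner, table_name, privilege from user_tab_privs;
```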
Enable archive logs and supplemental logging
Real-time synchronization requires archive logs and supplemental logging on the primary database.
1. Enable archive logging:

   ```sql
   shutdown immediate;
   startup mount;
   alter database archivelog;
   alter database open;
   ```

2. Enable supplemental logging. Choose the appropriate scope.

   Database level (applies to all tables):

   ```sql
   alter database add supplemental log data(primary key) columns;
   alter database add supplemental log data(unique) columns;
   ```

   Table level (applies to specific tables):

   ```sql
   alter table <schema>.<table> add supplemental log data(primary key) columns;
   alter table <schema>.<table> add supplemental log data(unique) columns;
   ```

3. Switch the redo log file. Run the following command five times to flush the current log and switch to the next one:

   ```sql
   alter system switch logfile;
   ```

   Note: Running this command multiple times fills the current log file, forcing a switch. This prevents transaction record loss and simplifies data recovery.
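You can confirm that both settings took effect before creating the sync task. The columns queried below are standard in `v$database`:

```sql
-- Should return ARCHIVELOG
select log_mode from v$database;

-- Should return YES for the supplemental logging scopes you enabled
select supplemental_log_data_min,
       supplemental_log_data_pk,
       supplemental_log_data_ui
from v$database;
```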
Check the database character encoding
Run the following to check the character encoding settings:
```sql
select * from v$nls_parameters where PARAMETER IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');
```

- `NLS_CHARACTERSET`: the database character set
- `NLS_NCHAR_CHARACTERSET`: the national character set
Data Synchronization supports only UTF8, AL32UTF8, AL16UTF16, and ZHS16GBK. If your database uses a different encoding, change it before running any sync task.
Check table data types
To view the data types for a specific table, run:
```sql
select COLUMN_NAME, DATA_TYPE from all_tab_columns where TABLE_NAME='<tablename>';
```

Replace `<tablename>` with the actual table name.

- `COLUMN_NAME`: the column name
- `DATA_TYPE`: the column's data type
- `all_tab_columns`: a system view containing column metadata for all tables, views, and clusters
Add a data source
Add the Oracle data source to DataWorks before creating a sync task. Follow the instructions in Data source management. Parameter descriptions are available in the DataWorks console when you add the data source.
Configure sync tasks
Single-table batch synchronization
Configure the task in the Codeless UI or Code Editor.
For a complete parameter reference and script template, see Appendix: Script examples and parameters.
Single-table real-time synchronization
See Configure a real-time synchronization task in DataStudio.
Full-database synchronization
See Configure a real-time full-database synchronization task for batch, real-time, and sharding configurations.
FAQ
Why does real-time sync show recurring errors?
See Recurring errors in real-time sync tasks for a list of common errors and fixes.
What happens when a primary-to-standby failover occurs during sync?
If the primary database fails and the system switches to the standby database, the standby continuously restores data from the primary using archive logs. The inherent replication lag, compounded by network latency, can cause the standby's data to differ from the primary's state at the time of failure. Data synchronized during this window may not represent a complete snapshot of the primary.
Does Oracle Reader guarantee consistency when reading concurrently?
In single-thread mode, Oracle Reader uses the database's native snapshot feature to ensure consistency: it does not fetch data written by other clients while a sync task is running. In concurrent mode (using splitPk), multiple sub-tasks run in separate transactions at different times, so the result is not a consistent snapshot. To maintain consistency in concurrent reads, either use single-thread synchronization without data sharding, or prevent writes to the source table during the sync (for example, by applying a table lock).
How do I perform incremental synchronization?
Oracle Reader uses JDBC SELECT statements. Use a WHERE clause to filter for new or modified records:
- If your tables have a modify-time column that is updated on every insert, update, and logical delete, filter for records with a timestamp later than the last sync.
- If your tables use an auto-incrementing ID for append-only data, filter for records with an ID greater than the maximum ID from the previous sync.

If your data model has no field that distinguishes new or changed records, perform a full sync instead.
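As a sketch, the first approach can be expressed in the reader's `where` parameter combined with the DataWorks `${bizdate}` scheduling parameter. The column name `gmt_modified` here is an assumption for illustration:

```sql
-- Example where condition for a daily incremental batch sync
gmt_modified >= to_date('${bizdate}', 'yyyymmdd')
and gmt_modified < to_date('${bizdate}', 'yyyymmdd') + 1
```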
Why doesn't Oracle Reader require an explicit character encoding?
Oracle Reader uses JDBC, which automatically detects and converts character encodings. No manual encoding configuration is needed.
Is the querySql parameter safe against SQL injection?
Oracle Reader does not validate the SQL passed to the querySql parameter. Treat this parameter as trusted input and do not expose it to untrusted user input.
Appendix: Script examples and parameters
Reader script example
All Reader parameters are set under "category": "reader" in the job script.
```json
{
    "type": "job",
    "version": "2.0",
    "steps": [
        {
            "stepType": "oracle",
            "parameter": {
                "selectedDatabase": "AUTOTEST",
                "indexes": [],
                "datasource": "oracle_test",
                "envType": 0,
                "useSpecialSecret": true,
                "column": [
                    "id"
                ],
                "where": "",
                "splitPk": "id",
                "encoding": "UTF-8",
                "table": "AUTOTEST.table01"
            },
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "odps",
            "parameter": {},
            "name": "Writer",
            "category": "writer"
        },
        {
            "name": "Processor",
            "stepType": null,
            "category": "processor",
            "copies": 1,
            "parameter": {
                "nodes": [],
                "edges": [],
                "groups": [],
                "version": "2.0"
            }
        }
    ],
    "setting": {
        "executeMode": null,
        "errorLimit": {
            "record": ""
        },
        "speed": {
            "concurrent": 2,
            "throttle": false
        }
    },
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    }
}
```
Reader parameters
| Parameter | Description | Required | Default |
|---|---|---|---|
| `datasource` | The data source name. Must match the name of the data source added in DataWorks. | Yes | None |
| `selectedDatabase` | The schema of the database to sync. | Yes | None |
| `table` | The source table name in `schema.tableName` format. For example, if `selectedDatabase` is `AUTOTEST` and the table is `table01`, set this to `AUTOTEST.table01`. | Yes | None |
| `column` | The columns to sync, as a JSON array. To sync all columns, use `["*"]`. Supports column reordering, constants, expressions, and null values. | Yes | None |
| `splitPk` | The column used to shard data for concurrent sync. Use the primary key for even distribution. Supports numeric and string values. If not set, the table is read by a single channel. Cannot be ROWID in a view. | No | None |
| `splitMode` | The sharding strategy. Set to `averageInterval` when `splitPk` is numeric; set to `randomSampling` when `splitPk` is a string. Must be used with `splitPk`. | No | `randomSampling` |
| `splitFactor` | The number of shards per concurrent thread. Total shards = concurrent threads × `splitFactor`. Keep between 1 and 100 to avoid out-of-memory (OOM) errors. | No | 5 |
| `where` | A SQL WHERE filter condition. Combined with `column` and `table` to form the read query. Useful for incremental synchronization. If blank, all rows are synced. | No | None |
| `querySql` | A custom SELECT statement for complex filters such as multi-table joins. When set, `table`, `column`, and `where` are ignored. Available in script mode only. | No | None |
| `fetchSize` | The number of rows fetched per network round trip. Larger values improve read throughput, but values above 2,048 may cause OOM errors. | No | 1024 |
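For reference, a reader step using `querySql` might look like the following sketch. The data source and table names are hypothetical, and `table`, `column`, and `where` are omitted because they would be ignored:

```json
{
    "stepType": "oracle",
    "parameter": {
        "datasource": "oracle_test",
        "querySql": "select a.id, b.name from AUTOTEST.table01 a join AUTOTEST.table02 b on a.id = b.id",
        "fetchSize": 1024
    },
    "name": "Reader",
    "category": "reader"
}
```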
Writer script example
All Writer parameters are set under "category": "writer" in the job script.
```json
{
    "type": "job",
    "version": "2.0",
    "steps": [
        {
            "stepType": "stream",
            "parameter": {},
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "oracle",
            "parameter": {
                "postSql": [],
                "datasource": "",
                "session": [],
                "column": [
                    "id",
                    "name"
                ],
                "encoding": "UTF-8",
                "batchSize": 1024,
                "table": "",
                "preSql": []
            },
            "name": "Writer",
            "category": "writer"
        }
    ],
    "setting": {
        "errorLimit": {
            "record": "0"
        },
        "speed": {
            "throttle": true,
            "concurrent": 1,
            "mbps": "12"
        }
    },
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    }
}
```
Writer parameters
| Parameter | Description | Required | Default |
|---|---|---|---|
| `datasource` | The data source name. Must match the name of the data source added in DataWorks. | Yes | None |
| `table` | The destination table name. If the schema differs from the configured username, use `schema.table` format. | Yes | None |
| `column` | The destination columns to write, as a JSON array. For example: `["id", "name", "age"]`. To write to all columns in order, use `["*"]`. | Yes | None |
| `writeMode` | The write mode. Only `INSERT INTO` is supported. On primary key or unique index conflict, the conflicting row is treated as dirty data. | No | `insert into` |
| `preSql` | SQL statements to run before the sync task. In wizard mode, one statement is supported; in script mode, multiple are supported. | No | None |
| `postSql` | SQL statements to run after the sync task completes. In wizard mode, one statement is supported; in script mode, multiple are supported. | No | None |
| `batchSize` | The number of rows submitted per batch. Larger values reduce network round trips and improve throughput, but very large values may cause OOM errors. | No | 1024 |
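As an illustration of `preSql` and `postSql` in script mode, the following writer sketch truncates the destination before writing and gathers optimizer statistics afterwards. The data source, table, and statements are assumptions for illustration, not requirements:

```json
{
    "stepType": "oracle",
    "parameter": {
        "datasource": "oracle_test",
        "table": "AUTOTEST.table01",
        "column": ["id", "name"],
        "writeMode": "insert into",
        "batchSize": 1024,
        "preSql": ["truncate table AUTOTEST.table01"],
        "postSql": ["begin dbms_stats.gather_table_stats('AUTOTEST', 'TABLE01'); end;"]
    },
    "name": "Writer",
    "category": "writer"
}
```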