DataWorks provides Oracle Reader and Oracle Writer for you to read data from and write data to Oracle data sources. You can use the codeless user interface (UI) or code editor to configure data synchronization tasks for Oracle data sources. This topic describes the capabilities of synchronizing data from or to Oracle data sources.
Supported Oracle versions
Batch data read and write
Version | Oracle Reader for batch data read | Oracle Writer for batch data write |
Oracle 11.2 or 11gR2 | Supported | Supported |
Oracle 12.1 or 12cR1 | Supported | Supported |
Oracle 12.2 or 12cR2 | Supported (features introduced in later Oracle versions are not supported) | Supported (features introduced in later Oracle versions are not supported) |
Oracle 18.3 | Supported (features introduced in later Oracle versions are not supported) | Supported (features introduced in later Oracle versions are not supported) |
Oracle 19.x | Supported (features introduced in later Oracle versions are not supported) | Supported (features introduced in later Oracle versions are not supported) |
Oracle 21.1 | Supported (features introduced in later Oracle versions are not supported) | Supported (features introduced in later Oracle versions are not supported) |
Oracle Writer uses the ojdbc6-12.1.1.jar driver.
Real-time data read
Supported Oracle database versions:
Versions later than 11gR2
12c (non-CDB)
18c (non-CDB)
19c (non-CDB)
19c (CDB)
Unsupported Oracle database versions:
12c (CDB)
18c (CDB)
An Oracle database of Oracle 12c or a later version can serve as a container database (CDB) to host multiple pluggable databases (PDBs).
Limits
The following encoding formats are supported for data synchronization: UTF-8, AL32UTF8, AL16UTF16, and ZHS16GBK.
A maximum of 500 GB incremental data can be synchronized from an Oracle instance in real time every day.
Real-time synchronization of data from an Oracle data source in Data Integration is implemented based on the Oracle LogMiner utility that is used to analyze log files.
Data of views can be read during batch synchronization.
Data types
Data type | Oracle Reader for batch data read | Oracle Writer for batch data write | Real-time data read |
NUMBER | Supported | Supported | Supported |
BINARY FLOAT | Supported | Supported | Supported |
BINARY DOUBLE | Supported | Supported | Supported |
CHAR | Supported | Supported | Supported |
NCHAR | Supported | Supported | Supported |
VARCHAR2 | Supported | Supported | Supported |
NVARCHAR2 | Supported | Supported | Supported |
DATE | Supported | Supported | Supported |
TIMESTAMP | Supported | Supported | Supported |
TIMESTAMP WITH TIME ZONE | Supported | Supported | Not supported |
TIMESTAMP WITH LOCAL TIME ZONE | Supported | Supported | Not supported |
CLOB | Supported | Supported | Supported |
BLOB | Supported | Supported | Supported |
RAW | Supported | Supported | Supported |
ROWID | Not supported | Not supported | Supported |
UROWID | Not supported | Not supported | Supported |
FLOAT | Supported | Supported | Supported |
INTERVAL DAY TO SECOND | Not supported | Not supported | Supported |
INTERVAL YEAR TO MONTH | Not supported | Not supported | Supported |
BFILE | Not supported | Not supported | Not supported |
LONG | Not supported | Not supported | Not supported |
LONG RAW | Not supported | Not supported | Not supported |
NCLOB | Supported | Supported | Not supported |
STRUCT | Supported | Supported | Not supported |
User-Defined Types | Not supported | Not supported | Not supported |
AnyType | Not supported | Not supported | Not supported |
AnyData | Not supported | Not supported | Not supported |
AnyDataSet | Not supported | Not supported | Not supported |
XmlType | Not supported | Not supported | Not supported |
Spatial Types | Not supported | Not supported | Not supported |
Media Types | Not supported | Not supported | Not supported |
The following table lists the data type mappings based on which Oracle Reader converts data types.
Category | Oracle data type |
Integer | NUMBER, ROWID, INTEGER, INT, and SMALLINT |
Floating point | NUMERIC, DECIMAL, FLOAT, DOUBLE PRECISION, and REAL |
String | LONG, CHAR, NCHAR, VARCHAR, VARCHAR2, NVARCHAR2, CLOB, NCLOB, CHARACTER, CHARACTER VARYING, CHAR VARYING, NATIONAL CHARACTER, NATIONAL CHAR, NATIONAL CHARACTER VARYING, NATIONAL CHAR VARYING, and NCHAR VARYING |
Date and time | TIMESTAMP and DATE |
Boolean | BIT and BOOLEAN |
Binary | BLOB, BFILE, RAW, and LONG RAW |
Prepare an Oracle environment before data synchronization
Before you use DataWorks to synchronize data from or to an Oracle data source, you must prepare an Oracle environment. This ensures that a data synchronization task can be configured and can synchronize data from or to the Oracle data source as expected. The following information describes how to prepare an Oracle environment for data synchronization from or to an Oracle data source.
Preparation 1: Check the version of your Oracle database
The supported Oracle database versions vary based on synchronization scenarios. Check whether the version of your Oracle database is supported before you synchronize data from or to the database.
You can execute one of the following statements to check the version of your Oracle database:
Statement 1:
select * from v$version;
Statement 2:
select version from v$instance;
If the version of the Oracle database is 12c, 18c, or 19c, you must also execute the following statement to check whether the database serves as a container database (CDB). 12c and 18c databases that serve as CDBs are not supported by real-time synchronization tasks of Data Integration in DataWorks.
select name,cdb,open_mode,con_id from v$database;
Preparation 2: Prepare an account that has the required permissions
You must prepare an account that is used to log on to the Oracle database. The account must have the required permissions on the Oracle database.
Create an account. For more information, see Create an account to access an Oracle database.
Grant the required permissions to the account.
You can execute the following statements to grant the required permissions to the account. Replace 'Account for data synchronization' with the created account when you execute a specific statement.
grant create session to 'Account for data synchronization'; -- Authorize the synchronization account to access the database.
grant connect to 'Account for data synchronization'; -- Authorize the synchronization account to connect to the database.
grant select on nls_database_parameters to 'Account for data synchronization'; -- Authorize the synchronization account to query the settings of nls_database_parameters.
grant select on all_users to 'Account for data synchronization'; -- Authorize the synchronization account to query all users in the database.
grant select on all_objects to 'Account for data synchronization'; -- Authorize the synchronization account to query all objects in the database.
grant select on DBA_MVIEWS to 'Account for data synchronization'; -- Authorize the synchronization account to view the materialized views of the database.
grant select on DBA_MVIEW_LOGS to 'Account for data synchronization'; -- Authorize the synchronization account to view the materialized view logs of the database.
grant select on DBA_CONSTRAINTS to 'Account for data synchronization'; -- Authorize the synchronization account to view the constraints on all tables in the database.
grant select on DBA_CONS_COLUMNS to 'Account for data synchronization'; -- Authorize the synchronization account to view information about the columns that meet the specified constraints on all tables in the database.
grant select on all_tab_cols to 'Account for data synchronization'; -- Authorize the synchronization account to view information about columns in tables, views, and clusters of the database.
grant select on sys.obj$ to 'Account for data synchronization'; -- Authorize the synchronization account to view objects in the database. sys.obj$ is a data dictionary table that contains all objects.
grant select on SYS.COL$ to 'Account for data synchronization'; -- Authorize the synchronization account to view definitions of columns in tables of the database. SYS.COL$ stores the column definitions.
grant select on sys.USER$ to 'Account for data synchronization'; -- Authorize the synchronization account to view the system user table of the database.
grant select on sys.cdef$ to 'Account for data synchronization'; -- Authorize the synchronization account to view the system table of the database.
grant select on sys.con$ to 'Account for data synchronization'; -- Authorize the synchronization account to view the constraints of the database. sys.con$ records the constraints.
grant select on all_indexes to 'Account for data synchronization'; -- Authorize the synchronization account to view all indexes of the database.
grant select on v_$database to 'Account for data synchronization'; -- Authorize the synchronization account to query the v_$database view of the database.
grant select on V_$ARCHIVE_DEST to 'Account for data synchronization'; -- Authorize the synchronization account to query the V_$ARCHIVE_DEST view of the database.
grant select on v_$log to 'Account for data synchronization'; -- Authorize the synchronization account to query the v_$log view of the database. v_$log displays log information from control files.
grant select on v_$logfile to 'Account for data synchronization'; -- Authorize the synchronization account to query the v_$logfile view of the database. v_$logfile contains information about redo log files.
grant select on v_$archived_log to 'Account for data synchronization'; -- Authorize the synchronization account to query the v_$archived_log view of the database. v_$archived_log contains information about archived logs.
grant select on V_$LOGMNR_CONTENTS to 'Account for data synchronization'; -- Authorize the synchronization account to query the V_$LOGMNR_CONTENTS view of the database.
grant select on DUAL to 'Account for data synchronization'; -- Authorize the synchronization account to query the DUAL table of the database. DUAL is a virtual table that contains only one record.
grant select on v_$parameter to 'Account for data synchronization'; -- Authorize the synchronization account to query the v_$parameter view of the database. v$parameter is a dynamic dictionary view that stores the values of parameters in the database.
grant select any transaction to 'Account for data synchronization'; -- Authorize the synchronization account to view transactions of the database.
grant execute on SYS.DBMS_LOGMNR to 'Account for data synchronization'; -- Authorize the synchronization account to use the LogMiner utility, which helps you analyze transactions and retrieve lost data.
grant alter session to 'Account for data synchronization'; -- Authorize the synchronization account to modify connection settings of the database.
grant select on dba_objects to 'Account for data synchronization'; -- Authorize the synchronization account to view all objects in the database.
grant select on v_$standby_log to 'Account for data synchronization'; -- Authorize the synchronization account to query the v_$standby_log view of the database. v_$standby_log contains archived logs of the standby database.
grant select on v_$ARCHIVE_GAP to 'Account for data synchronization'; -- Authorize the synchronization account to query missing archived logs.
To implement batch synchronization of full data, you must also execute the following statement to grant the query permissions on all tables to the synchronization account:
grant select any table to 'Account for data synchronization';
For an Oracle database of Oracle 12c or a later version, you must execute the following statement to grant the log mining permissions to the synchronization account. The log mining feature is a built-in feature for Oracle databases of versions earlier than 12c. You do not need to execute the following statement in such databases.
grant LOGMINING TO 'Account for data synchronization';
Preparation 3: Enable archive logging and supplemental logging, and switch to another redo log file
Execute the following SQL statements to enable archive logging:
shutdown immediate;
startup mount;
alter database archivelog;
alter database open;
Enable supplemental logging.
SQL statements:
alter database add supplemental log data(primary key) columns; -- Enable supplemental logging for primary key columns.
alter database add supplemental log data(unique) columns; -- Enable supplemental logging for unique index columns.
Switch to another redo log file.
After you enable supplemental logging, you must execute the following statement multiple times to switch to another redo log file. We recommend that you execute the statement five times:
alter system switch logfile;
Note: This ensures that data can be written to the next log file after the current log file is full, so records of performed operations are not lost. This facilitates data recovery.
Preparation 4: Check character encoding formats of your Oracle database
Execute the following statement to check character encoding formats of the database:
select * from v$nls_parameters where PARAMETER IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');
v$nls_parameters stores values of parameters in the database.
NLS_CHARACTERSET indicates a database character set. NLS_NCHAR_CHARACTERSET indicates a national character set. These two sets are used to store data of the respective character type.
The following encoding formats are supported for data synchronization: UTF-8, AL32UTF8, AL16UTF16, and ZHS16GBK. If the database contains the character encoding formats that are not supported for data synchronization, change the formats before you synchronize data.
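The check above can be automated when many databases must be validated. The following is a minimal sketch, not DataWorks code: it assumes you have already fetched the two rows returned by the v$nls_parameters query into a dictionary, and the helper name is hypothetical.

```python
# Character sets supported for data synchronization, per the list above.
SUPPORTED_CHARSETS = {"UTF-8", "AL32UTF8", "AL16UTF16", "ZHS16GBK"}

def unsupported_charsets(nls_parameters):
    """Return the (parameter, value) pairs whose character set is not supported.

    nls_parameters maps PARAMETER names (e.g. NLS_CHARACTERSET) to their values,
    as fetched from the v$nls_parameters query shown above.
    """
    return [(param, value) for param, value in nls_parameters.items()
            if value not in SUPPORTED_CHARSETS]

# Illustrative values only; your database may return different character sets.
params = {"NLS_CHARACTERSET": "AL32UTF8", "NLS_NCHAR_CHARACTERSET": "AL16UTF16"}
print(unsupported_charsets(params))  # -> [] when everything is supported
```

If the returned list is not empty, change those character encoding formats before you synchronize data, as described above.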
Preparation 5: Check the data types of tables in your Oracle database
You can execute the SELECT statement to query the data types of tables in the database. Sample statement that is executed to query the data types of the 'tablename' table:
select COLUMN_NAME,DATA_TYPE from all_tab_columns where TABLE_NAME='tablename';
COLUMN_NAME: the name of the column.
DATA_TYPE: the data type of the column.
all_tab_columns: the view that stores information about all columns in tables of the database.
TABLE_NAME: the name of the table that you want to query. When you execute the preceding statement, replace 'tablename' with the name of the table that you want to query.
You can also execute the select * from tablename; statement to query information about the desired table and obtain the data types of its columns.
Develop a data synchronization task
For information about the entry point for and the procedure of configuring a data synchronization task, see the following sections. For information about the parameter settings, view the infotip of each parameter on the configuration tab of the task.
Add a data source
Before you configure a data synchronization task to synchronize data from or to a specific data source, you must add the data source to DataWorks. For more information, see Add and manage data sources.
Configure a batch synchronization task to synchronize data of a single table
For more information about the configuration procedure, see Configure a batch synchronization task by using the codeless UI and Configure a batch synchronization task by using the code editor.
For information about all parameters that are configured and the code that is run when you use the code editor to configure a batch synchronization task, see Appendix: Code and parameters.
Configure a real-time synchronization task to synchronize data of a single table
For more information about the configuration procedure, see Configure a real-time synchronization task in DataStudio.
Configure synchronization settings to implement batch synchronization of all data in a database, real-time synchronization of full data or incremental data in a database, and real-time synchronization of data from sharded tables in a sharded database
For more information about the configuration procedure, see Configure a synchronization task in Data Integration.
Additional information
Data synchronization between primary and secondary databases
A secondary Oracle database can be deployed for disaster recovery. The secondary database continuously synchronizes data from the primary database based on redo logs. Data latency between the primary and secondary databases cannot be prevented. This may result in data inconsistency.
Data consistency control
Oracle is a relational database management system (RDBMS) that supports strong consistency for data queries. A database snapshot is created before a synchronization task starts. Oracle Reader reads data from the database snapshot. Therefore, if new data is written to the database during data synchronization, Oracle Reader cannot obtain the new data.
Data consistency cannot be ensured if you enable Oracle Reader to use parallel threads to read data in a synchronization task.
Oracle Reader shards the source table based on the value of the splitPk parameter and uses parallel threads to read data. These parallel threads belong to different transactions and read data at different points in time. Therefore, the parallel threads observe different snapshots.
Data inconsistencies cannot be prevented if parallel threads are used for a synchronization task. The following workarounds can be used:
Enable Oracle Reader to use a single thread to read data in a synchronization task. To do so, do not specify a shard key (splitPk) for Oracle Reader. This way, data consistency is ensured, but data is synchronized less efficiently.
Make sure that no data is written to the source table during data synchronization. This ensures that the data in the source table remains unchanged during data synchronization. For example, you can lock the source table or disable data synchronization between primary and secondary databases. This way, data is efficiently synchronized, but your ongoing services may be interrupted.
Character encoding
Oracle Reader uses Java Database Connectivity (JDBC) to read data. This enables Oracle Reader to automatically convert the encoding format of characters. Therefore, you do not need to specify the encoding format.
Incremental data synchronization
Oracle Reader connects to a database by using JDBC and uses a SELECT statement with a WHERE clause to read incremental data.
For batch data, incremental insert, update, and delete operations (including logical delete operations) are distinguished by timestamps. Specify the WHERE clause based on a timestamp that is later than the latest timestamp involved in the previous synchronization.
For streaming data, specify the WHERE clause based on the ID of a specific record. The ID must be greater than the maximum ID involved in the previous synchronization.
If the data that is added or modified cannot be distinguished, Oracle Reader can read only full data.
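As a rough illustration of the timestamp-based approach described above, the WHERE clause can be derived from the watermark of the previous run. The column name gmt_modified and the helper below are hypothetical examples, not part of the Oracle Reader API:

```python
def build_incremental_where(ts_column, last_sync_ts):
    """Build a WHERE clause that selects only rows changed after the last sync.

    ts_column and last_sync_ts are caller-supplied; the timestamp format string
    must match how timestamps are stored in the source table.
    """
    return f"{ts_column} > TO_TIMESTAMP('{last_sync_ts}', 'YYYY-MM-DD HH24:MI:SS')"

# The resulting string would be used as the "where" parameter of Oracle Reader.
print(build_incremental_where("gmt_modified", "2024-01-01 00:00:00"))
```

For streaming data, the same idea applies with a monotonically increasing ID column instead of a timestamp.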
Syntax validation
Oracle Reader allows you to specify custom SELECT statements by using the querySql parameter but does not verify the syntax of these statements.
Appendix: Code and parameters
Appendix: Configure a batch synchronization task by using the code editor
If you use the code editor to configure a batch synchronization task, you must configure parameters for the reader and writer of the related data source based on the format requirements in the code editor. For more information about the format requirements, see Configure a batch synchronization task by using the code editor. The following information describes the configuration details of parameters for the reader and writer in the code editor.
Code for Oracle Reader
{
"type": "job",
"version": "2.0",
"steps": [
{
"stepType": "oracle",
"parameter": {
"selectedDatabase": "AUTOTEST",
"indexes": [],
"datasource": "oracle_test",
"envType": 0,
"useSpecialSecret": true,
"column": [
"id"
],
"where": "",
"splitPk": "id",
"encoding": "UTF-8",
"table": "AUTOTEST.table01"
},
"name": "Reader",
"category": "reader"
},
{
"stepType": "odps",
"parameter": {
},
"name": "Writer",
"category": "writer"
},
{
"name": "Processor",
"stepType": null,
"category": "processor",
"copies": 1,
"parameter": {
"nodes": [],
"edges": [],
"groups": [],
"version": "2.0"
}
}
],
"setting": {
"executeMode": null,
"errorLimit": {
"record": ""
},
"speed": {
"concurrent": 2,
"throttle": false
}
},
"order": {
"hops": [
{
"from": "Reader",
"to": "Writer"
}
]
}
}
Parameters in code for Oracle Reader
Parameter | Description | Required | Default value |
datasource | The name of the data source. It must be the same as the name of the added data source. You can add data sources by using the code editor. | Yes | No default value |
selectedDatabase | The schema of the database from which you want to read data. | Yes | No default value |
table | The name of the table from which you want to read data. The name must be in the Schema name.Table name format. Note: For example, if the selectedDatabase parameter is set to AUTOTEST and the table name is table01, set this parameter to AUTOTEST.table01. | Yes | No default value |
column | The names of the columns from which you want to read data. Specify the names in a JSON array. The default value is ["*"], which indicates all the columns in the source table. | Yes | No default value |
splitFactor | The sharding factor, which determines the number of parts into which data to be synchronized is sharded. If you configure parallelism for your batch synchronization task, the number of parts is calculated based on the following formula: Number of parallel threads × Sharding factor. For example, if both the number of parallel threads and the sharding factor are 5, data is sharded into 25 parts. Note: We recommend that you specify a sharding factor between 1 and 100. A sharding factor greater than 100 may cause an out of memory (OOM) error. | No | 5 |
splitMode | The shard mode that Oracle Reader uses when it shards data based on the shard key. Note: The splitMode parameter must be used together with the splitPk parameter. | No | randomSampling |
splitPk | The field that is used for data sharding when Oracle Reader reads data. If you configure this parameter, the source table is sharded based on the value of this parameter, and Data Integration runs parallel threads to read data. This improves data synchronization efficiency. Note: If you use Oracle Reader to read data from a view, you cannot set the splitPk parameter to a field of the ROWID data type. | No | No default value |
where | The WHERE clause. Oracle Reader generates an SQL statement based on the settings of the table, column, and where parameters and uses the statement to read data. For example, you can set this parameter to row_number() in a test. | No | No default value |
querySql (available only in the code editor) | The SQL statement that is used for refined data filtering. If you configure this parameter, Data Integration filters data based on the value of this parameter. For example, if you want to join multiple tables for data synchronization, you can set this parameter to the required join statement. | No | No default value |
fetchSize | The number of data records to read at a time. This parameter determines the number of interactions between Data Integration and the database and affects read efficiency. Note If you set this parameter to a value greater than 2048, an OOM error may occur during data synchronization. | No | 1,024 |
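The relationship between parallelism and the splitFactor parameter described in the table above is simple arithmetic. This is a sketch for illustration only, not DataWorks code:

```python
def shard_count(concurrent_threads, split_factor=5):
    """Number of parts data is sharded into: parallel threads x sharding factor.

    The default split_factor of 5 matches the default value in the table above.
    """
    return concurrent_threads * split_factor

print(shard_count(5, 5))  # -> 25, matching the example in the table above
```

Keeping the product of threads and sharding factor moderate avoids the OOM risk noted for large sharding factors.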
Code for Oracle Writer
{
"type":"job",
"version":"2.0",// The version number.
"steps":[
{
"stepType":"stream",
"parameter":{},
"name":"Reader",
"category":"reader"
},
{
"stepType":"oracle",// The plug-in name.
"parameter":{
"postSql":[],// The SQL statement that you want to execute after the synchronization task is run.
"datasource":"",
"session":[],// The settings of the session to the database.
"column":[// The names of the columns.
"id",
"name"
],
"encoding":"UTF-8",// The encoding format.
"batchSize":1024,// The number of data records to write at a time.
"table":"",// The name of the table.
"preSql":[]// The SQL statement that you want to execute before the synchronization task is run.
},
"name":"Writer",
"category":"writer"
}
],
"setting":{
"errorLimit":{
"record":"0"// The maximum number of dirty data records allowed.
},
"speed":{
"throttle":true,// Specifies whether to enable throttling. The value false indicates that throttling is disabled, and the value true indicates that throttling is enabled. The mbps parameter takes effect only when the throttle parameter is set to true.
"concurrent":1, // The maximum number of parallel threads.
"mbps":"12"// The maximum transmission rate. Unit: MB/s.
}
},
"order":{
"hops":[
{
"from":"Reader",
"to":"Writer"
}
]
}
}
Parameters in code for Oracle Writer
Parameter | Description | Required | Default value |
datasource | The name of the data source. It must be the same as the name of the added data source. You can add data sources by using the code editor. | Yes | No default value |
table | The name of the table to which you want to write data. If the table uses the default schema for the destination database, set this parameter to the table name. If the table uses a custom schema, configure this parameter in the Schema name.Table name format. | Yes | No default value |
writeMode | The write mode. Valid value: insert into. If a primary key conflict or unique index conflict occurs, data cannot be written to the conflicting rows, and the data that is not written to these rows is regarded as dirty data. | No | insert into |
column | The names of the columns to which you want to write data. Specify the names in a JSON array, such as ["id","name"]. | Yes | No default value |
preSql | The SQL statement that you want to execute before the synchronization task is run. For example, you can set this parameter to the SQL statement that is used to delete outdated data. You can execute only one SQL statement on the codeless UI and multiple SQL statements in the code editor. | No | No default value |
postSql | The SQL statement that you want to execute after the synchronization task is run. For example, you can set this parameter to the SQL statement that is used to add a timestamp. You can execute only one SQL statement on the codeless UI and multiple SQL statements in the code editor. | No | No default value |
batchSize | The number of data records to write at a time. Set this parameter to an appropriate value based on your business requirements. This greatly reduces the interactions between Data Integration and Oracle and increases throughput. If you set this parameter to an excessively large value, an OOM error may occur during data synchronization. | No | 1,024 |
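To illustrate the batchSize trade-off described above: Oracle Writer groups records so that one round trip to the database writes many rows at once. The sketch below shows only the grouping logic under that assumption, not the actual JDBC calls that Oracle Writer performs:

```python
def into_batches(records, batch_size=1024):
    """Group records into lists of at most batch_size records each.

    Each batch would be written to Oracle in a single round trip, which is why
    a larger batch_size reduces the number of interactions with the database.
    """
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]

batches = into_batches(list(range(2500)), batch_size=1024)
print([len(b) for b in batches])  # -> [1024, 1024, 452]
```

A larger batch also holds more rows in memory at once, which is why an excessively large batchSize can trigger the OOM error mentioned above.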