ApsaraDB for OceanBase is a financial-grade distributed relational database that is developed by Alibaba Cloud and Ant Financial. This topic describes the parameters that are supported by ApsaraDB for OceanBase Reader and how to configure ApsaraDB for OceanBase Reader by using the codeless user interface (UI) and code editor.

Notice ApsaraDB for OceanBase Reader supports only exclusive resource groups for Data Integration, but not shared resource groups or custom resource groups. For more information, see Create and use an exclusive resource group for Data Integration and Create a custom resource group for Data Integration.

Background information

ApsaraDB for OceanBase implements automated and non-disruptive disaster recovery across cities based on the Five Data Centers Across Three Regions solution, and provides high availability for financial services on conventional hardware. It has undergone strict verification in terms of functionality, stability, scalability, and performance.

ApsaraDB for OceanBase Reader reads data from tables stored in ApsaraDB for OceanBase databases.

ApsaraDB for OceanBase Reader connects to a remote ApsaraDB for OceanBase database by using a Java client, generates an SQL statement based on your configurations, and then sends the statement to the database. The system executes the statement on the database and returns data. Then, ApsaraDB for OceanBase Reader assembles the returned data into abstract datasets of the data types supported by Data Integration and sends the datasets to a writer.
  • ApsaraDB for OceanBase Reader generates an SQL statement based on the table, column, and where parameters that you have configured and sends the generated statement to the ApsaraDB for OceanBase database.
  • If you set the querySql parameter, ApsaraDB for OceanBase Reader directly sends the value of this parameter to the ApsaraDB for OceanBase database.
Note ApsaraDB for OceanBase supports the Oracle and MySQL tenant modes. Make sure that the WHERE clause and the columns that you specify in the column parameter comply with the SQL syntax of the tenant mode in use (Oracle or MySQL). Otherwise, the SQL statement may fail to be executed.
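The statement selection described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the reader's actual implementation: the function name build_select and the exact string layout of the generated statement are assumptions made for this example.

```python
def build_select(parameter: dict) -> str:
    """Return the SQL text a reader configured with `parameter` would send.

    Hypothetical helper: mirrors the documented rule that querySql, when set,
    is sent as-is, and that otherwise the statement is assembled from the
    table, column, and where parameters.
    """
    query_sql = (parameter.get("querySql") or "").strip()
    if query_sql:
        # querySql takes precedence; table, column, and where are ignored.
        return query_sql

    columns = parameter.get("column") or []
    if not columns:
        raise ValueError("column must list at least one column")

    sql = f"SELECT {', '.join(columns)} FROM {parameter['table']}"
    where = (parameter.get("where") or "").strip()
    if where:
        sql += f" WHERE {where}"
    return sql


if __name__ == "__main__":
    print(build_select({"table": "orders", "column": ["id", "name"], "where": "id > 100"}))
    print(build_select({"querySql": "SELECT id FROM orders WHERE id < 10"}))
```

The sketch performs no quoting or validation of column names and conditions, so the values must already be valid for the tenant mode (Oracle or MySQL) of the target database.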
ApsaraDB for OceanBase Reader accesses an ApsaraDB for OceanBase database by using the OceanBase driver. Confirm the compatibility between the driver version and your ApsaraDB for OceanBase database. ApsaraDB for OceanBase Reader uses the following version of the OceanBase database driver:
<dependency>
    <groupId>com.alipay.oceanbase</groupId>
    <artifactId>oceanbase-connector-java</artifactId>
    <version>3.1.0</version>
</dependency>

Parameters

Parameter Description Required Default value
datasource The name of the data source. It must be the same as the name of the ApsaraDB for OceanBase data source that you added in DataWorks.

You can connect to the ApsaraDB for OceanBase database based on the settings of the jdbcUrl or username parameter.

Yes No default value
jdbcUrl The JDBC URL of the source database. You can specify multiple JDBC URLs in a JSON array for a database.

If you specify multiple JDBC URLs, ApsaraDB for OceanBase Reader verifies the connectivity of the URLs in sequence to find a valid URL.

If no URL is valid, ApsaraDB for OceanBase Reader returns an error.
Note The jdbcUrl parameter must be included in the connection parameter.

The value of the jdbcUrl parameter must comply with the standard format that ApsaraDB for OceanBase supports. You can also append additional connection properties to the URL. An example JDBC URL is jdbc:oceanbase://127.0.0.1:3306/database. You must specify either jdbcUrl or username.

Yes No default value
username The username that is used to connect to the ApsaraDB for OceanBase database. Yes No default value
password The password that is used to connect to the ApsaraDB for OceanBase database. Yes No default value
table The name of the table from which you want to read data. ApsaraDB for OceanBase Reader can read data from multiple tables. The tables are specified in a JSON array.
If you specify multiple tables, make sure that the tables have the same schema. ApsaraDB for OceanBase Reader does not check whether the tables have the same schema.
Note The table parameter must be included in the connection parameter.
Yes No default value
column The names of the columns from which you want to read data. Specify the names in a JSON array. To read all columns, specify [ * ].
  • You can select specific columns to read.
  • The column order can be changed. This indicates that you can specify columns in an order different from the order specified by the schema of the source table.
  • Constants are supported. Example: '123'.
  • Functions are supported. Example: date('now').
  • The column parameter must explicitly specify all the columns from which you want to read data. The parameter cannot be left empty.
Yes No default value
splitPk The field that is used for data sharding when ApsaraDB for OceanBase Reader reads data. If you specify this parameter, the source table is sharded based on the value of this parameter. Data Integration then runs parallel threads to read data. This way, data can be synchronized more efficiently.
  • We recommend that you set the splitPk parameter to the name of the primary key column of the table. Data can be evenly distributed to different shards based on the primary key column, instead of being intensively distributed only to specific shards.
  • The splitPk parameter supports sharding for data only of integer data types. If you set this parameter to a field of an unsupported data type, such as a string, floating point, or date data type, ApsaraDB for OceanBase Reader returns an error.
  • If you leave the splitPk parameter empty, ApsaraDB for OceanBase Reader uses a single thread to read data.
No Left empty
where The WHERE clause. ApsaraDB for OceanBase Reader generates an SQL statement based on the column, table, and where parameters that you have configured and uses the generated statement to read data.
For example, when you perform a test, you can set the where parameter to limit 10. You can set this parameter to gmt_create > $bizdate to read data on the current day.
  • You can use the WHERE clause to read incremental data.
  • If the where parameter is not provided or is left empty, ApsaraDB for OceanBase Reader reads all data.
No No default value
querySql The SQL statement that is used for refined data filtering. If you specify this parameter, Data Integration filters data based on the value of this parameter.

If you specify this parameter, ApsaraDB for OceanBase Reader ignores the settings of the table, column, where, and splitPk parameters.

No No default value
fetchSize The number of data records to read at a time. This parameter determines the number of interactions between Data Integration and the database and affects read efficiency.
Note If you set this parameter to a value greater than 2048, an out of memory (OOM) error may occur during data synchronization.
No 1,024
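To make the splitPk behavior more concrete, the following Python sketch divides an integer key range into contiguous shards and renders one WHERE condition per shard, so that each parallel thread could read one shard. It rests on assumptions: the source does not state how the reader computes shard boundaries, so the even range split, the function names split_ranges and shard_conditions, and the idea that the minimum and maximum key values are known in advance are all hypothetical.

```python
from typing import List, Tuple


def split_ranges(min_pk: int, max_pk: int, shards: int) -> List[Tuple[int, int]]:
    """Split the closed interval [min_pk, max_pk] into at most `shards`
    contiguous closed ranges whose sizes differ by at most one."""
    if shards < 1:
        raise ValueError("shards must be at least 1")
    if min_pk > max_pk:
        raise ValueError("min_pk must not exceed max_pk")

    total = max_pk - min_pk + 1
    shards = min(shards, total)          # never create empty shards
    base, extra = divmod(total, shards)  # the first `extra` shards get one more key

    ranges = []
    start = min_pk
    for i in range(shards):
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def shard_conditions(split_pk: str, min_pk: int, max_pk: int, shards: int) -> List[str]:
    """Render one WHERE condition per shard for the integer column `split_pk`."""
    return [
        f"{split_pk} >= {lo} AND {split_pk} <= {hi}"
        for lo, hi in split_ranges(min_pk, max_pk, shards)
    ]


if __name__ == "__main__":
    for condition in shard_conditions("id", 1, 10, 3):
        print(condition)
```

An even range split only balances the work when key values are spread fairly uniformly, which is one reason a primary key column is the recommended choice. The sketch also does not cover rows whose key is NULL; those would need a separate condition.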

Configure ApsaraDB for OceanBase Reader by using the codeless UI

The codeless UI is not supported for ApsaraDB for OceanBase Reader. Use the code editor instead.

Configure ApsaraDB for OceanBase Reader by using the code editor

In the following code, a synchronization node is configured to read data from ApsaraDB for OceanBase. For more information, see Create a sync node by using the code editor.
{
    "type": "job",
    "steps": [
        {
            "stepType": "apsaradb_for_OceanBase", // The reader type.
            "parameter": {
                "datasource": "", // The name of the data source.
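                "where": "", // The WHERE clause. Leave it empty to read all data.
                "column": [ // The names of the columns from which you want to read data.
                    "id",
                    "name"
                ],
                "splitPk": "" // The field used for data sharding. Leave it empty to read data in a single thread.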
            },
            "name": "Reader",
            "category": "reader"
        },
        {
            "stepType": "stream",
            "parameter": {
                "print": false,
                "fieldDelimiter": ","
            },
            "name": "Writer",
            "category": "writer"
        }
    ],
    "version": "2.0",
    "order": {
        "hops": [
            {
                "from": "Reader",
                "to": "Writer"
            }
        ]
    },
    "setting": {
        "errorLimit": {
            "record": "0" // The maximum number of dirty data records allowed.
        },
        "speed": {
            "throttle": true, // Specifies whether to enable bandwidth throttling. The value false indicates that bandwidth throttling is disabled, and the value true indicates that bandwidth throttling is enabled. The mbps parameter takes effect only when the throttle parameter is set to true.
            "concurrent": 1, // The maximum number of parallel threads.
            "mbps": "12" // The maximum transmission rate.
        }
    }
}
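The // comments in the preceding sample are explanatory, and standard JSON parsers reject them. If you prefer to generate the configuration instead of editing it by hand, the following Python sketch builds the same structure as a dictionary and serializes it to comment-free JSON. The helper name make_job and its arguments are hypothetical; the keys and values mirror the sample above, and the data source name my_oceanbase_source is a placeholder.

```python
import json


def make_job(datasource, columns, where="", split_pk="", concurrent=1, mbps=12):
    """Build a job dictionary with the same layout as the sample above.

    Hypothetical helper for illustration; it only checks that at least one
    column is given and does not validate values against DataWorks.
    """
    if not columns:
        raise ValueError("column must list at least one column")

    reader = {
        "stepType": "apsaradb_for_OceanBase",  # reader type, copied from the sample
        "parameter": {
            "datasource": datasource,
            "where": where,
            "column": list(columns),
            "splitPk": split_pk,
        },
        "name": "Reader",
        "category": "reader",
    }
    writer = {
        "stepType": "stream",
        "parameter": {"print": False, "fieldDelimiter": ","},
        "name": "Writer",
        "category": "writer",
    }
    return {
        "type": "job",
        "steps": [reader, writer],
        "version": "2.0",
        "order": {"hops": [{"from": "Reader", "to": "Writer"}]},
        "setting": {
            "errorLimit": {"record": "0"},
            "speed": {"throttle": True, "concurrent": concurrent, "mbps": str(mbps)},
        },
    }


if __name__ == "__main__":
    job = make_job("my_oceanbase_source", ["id", "name"], split_pk="id")
    print(json.dumps(job, indent=4))
```

The printed output can be pasted into the code editor after you replace the placeholder data source name with the name of a data source that you added in DataWorks.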