Relational database management system (RDBMS) Reader is a common plug-in for reading data from relational databases. You can add or register a database driver to enable RDBMS Reader to read data from various relational databases. This topic describes how to add a relational database driver for RDBMS Reader.

Prerequisites

Before you add a relational database driver, an Elastic Compute Service (ECS) instance is purchased as your custom resource group. We recommend that you purchase an ECS instance that meets the following requirements:
  • The ECS instance runs CentOS V6, CentOS V7, or AliOS.
  • If you want to run MaxCompute nodes or sync nodes on the ECS instance, the ECS instance runs Python V2.6 or V2.7. CentOS V5 uses Python V2.4. Other operating systems use a Python version later than V2.6.
  • The ECS instance is accessible from the Internet. To check the Internet access, send PING messages to www.aliyun.com on the ECS instance.
  • We recommend that you configure the ECS instance with an 8-core CPU and 16 GB memory.

Background information

RDBMS Reader connects to a remote RDBMS database by using Java Database Connectivity (JDBC), generates a SELECT statement based on your configurations, and then sends the statement to the database. The RDBMS database executes the statement and returns the results. Then, RDBMS Reader assembles the returned data to abstract datasets of custom data types that are supported by Data Integration, and passes the datasets to a writer. For more information, see RDBMS Reader.

Create a custom resource group

  1. Log on to the DataWorks console.
  2. In the left-side navigation pane, click Workspaces.
  3. Find the required workspace and click Data Integration.
    If you are using another service of DataWorks, click the More icon in the upper-left corner and choose All Products > Data Integration to go to the Data Integration page.
  4. In the left-side navigation pane, click Custom Resource Group. The Custom Resource Groups page appears.
  5. Click Add Resource Group in the upper-right corner.
  6. Install and initialize the agent.
    If the server status is Available, a custom resource group is created.
    If the server status is still Stopped after you refresh the dialog box, switch to the admin account and run the following command to restart alisa:
    /home/admin/alisatasknode/target/alisatasknode/bin/serverct1 restart

Add a MySQL driver

  1. Log on to the ECS instance on which the custom resource group is created.
  2. Run the following commands to go to the directory of RDBMS Reader. ${DATAX_HOME} indicates the home directory of Data Integration. RDBMS Reader resides in the /home/admin/datax3/plugin/reader/rdbmsreader directory.
    [root@izbp1czjk**** rdbmsreader]# pwd
    /home/admin/datax3/plugin/reader/rdbmsreader
    [root@izbp1czjkv**** rdbmsreader]# ls
    libs plugin.json rdbmsreader-0.0.1-SNAPSHOT.jar
  3. Find the plugin.json file in the directory of RDBMS Reader. Add a specific database driver, such as com.mysql.jdbc.Driver in the following code, to the drivers array in the file.
    RDBMS Reader dynamically selects an appropriate database driver for connecting to a database when nodes are run.
    [root@izbp1cz**** rdbmsreader]# vim plugin.json
    {
        "name": "rdbmsreader",
        "class": "com.alibaba.datax.plugin.reader.rdbmsreader.RdbmsReader",
        "description": "useScene: prod. mechanism: Jdbc connection using the database, execute select sql, retrieve data from the ResultSet. warn: The more you know about the database, the less problems you encounter.",
        "developer": "alibaba",
        "drivers":["dm.jdbc.driver.DmDriver", "com.sybase.jdbc3.jdbc.SybDriver", "com.edb.Driver","com.mysql.jdbc.Driver"]
    }
  4. Add the MySQL JAR package that you downloaded to the libs folder in the rdbmsreader folder.
    For example, you can add the mysql-connector-java-5.1.47.jar package, as shown in the following figure. Download the JAR package

Configure a sync node

You can use RDBMS Reader to configure a sync node only in the code editor. The following sample code shows how to configure a sync node:
{
"job": {
        "setting": {
            "speed": {
                "byte": 1048576
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.02
            }
        },
        "content": [
            {
                "reader": {
                    "name": "rdbmsreader",
                    "parameter": {
                        "username": "xxxxx",
                        "password": "yyyyyy",
                        "column": [
                            "*",   
                        ],
                        "splitPk": "id",
                        "connection": [
                            {
                                "table": [
                                    "a2"
                                ],
                                "jdbcUrl": [
                                    "jdbc:mysql://xxx.mysql.yy.aliyuncs.com:3306/xxx" // Enter the JDBC URL of your MySQL database.
                               ]
                            }
                        ],

                        "where": ""
                    }
                },
                "writer": { // Configure the writer as required. 
                   "name": "streamwriter",
                    "parameter": {
                        "print": true
                    }
                }
            }
        ]
    }
}