Relational database management system (RDBMS) Reader is a common plug-in for reading data from relational databases. You can add or register the driver of a relational database type to enable RDBMS Reader to read data from this type of database. This topic describes how to add a relational database driver for RDBMS Reader.

Prerequisites

An Elastic Compute Service (ECS) instance is purchased as a resource in the custom resource group. We recommend that you purchase an ECS instance that meets the following requirements:
  • The ECS instance runs CentOS V6, CentOS V7, or AliOS.
  • The ECS instance runs Python V2.6 or V2.7 if you want to run MaxCompute nodes or sync nodes on the ECS instance. CentOS V5 uses Python V2.4. Other operating systems use a Python version later than V2.6.
  • The ECS instance can access the Internet. To determine whether the ECS instance can access the Internet, ping www.aliyun.com on the ECS instance. If the website can be pinged, the ECS instance can access the Internet.
  • We recommend that you configure the ECS instance with an 8-core CPU and 16 GB memory.
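The Python version and Internet access requirements above can be checked with a short script. The following is a minimal sketch, not part of DataWorks: the function names `python_version_supported` and `can_reach` are hypothetical, and the connectivity check opens a TCP connection on port 80 rather than sending an ICMP ping, so a result that differs from `ping` is possible on networks that filter one but not the other.

```python
import socket


def python_version_supported(version_info):
    """Return True for Python 2.6 or 2.7, the versions named in the prerequisites."""
    major, minor = version_info[0], version_info[1]
    return major == 2 and minor in (6, 7)


def can_reach(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Covers DNS failures, refused connections, and timeouts.
        return False
    sock.close()
    return True
```

On the ECS instance, you could call `python_version_supported(sys.version_info)` after `import sys`, and `can_reach("www.aliyun.com")` to approximate the ping test.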

Background information

RDBMS Reader connects to a remote RDBMS database through Java Database Connectivity (JDBC), generates a SELECT statement based on your configurations, and then sends the statement to the database. The RDBMS database executes the statement and returns the result. RDBMS Reader then assembles the returned data into abstract datasets in the custom data types supported by Data Integration, and passes the datasets to a writer. For more information, see Configure RDBMS Reader.
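To make the "generates a SELECT statement based on your configurations" step concrete, the following sketch shows how such a statement could be assembled from the column, table, and where settings of a sync node. It is an illustration only, not the plug-in's actual implementation: the function name `build_select` is hypothetical, and handling of other settings such as splitPk is omitted.

```python
def build_select(columns, table, where=""):
    """Assemble a SELECT statement from column names, a table name, and an optional filter."""
    statement = "SELECT %s FROM %s" % (", ".join(columns), table)
    condition = where.strip()
    if condition:
        # An empty or whitespace-only filter adds no WHERE clause.
        statement += " WHERE %s" % condition
    return statement
```

For example, `build_select(["*"], "a2")` returns `SELECT * FROM a2`. Plain string concatenation like this is suitable only for trusted configuration values.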

Add a custom resource group

  1. Log on to the DataWorks console. In the left-side navigation pane, click Workspaces. On the Workspaces page, find the target workspace and click Data Integration in the Actions column.
    If you have logged on to a module of DataWorks, click the DataWorks icon in the upper-left corner and choose All Products > Data Integration to go to the Data Integration page.
  2. In the left-side navigation pane, click Custom Resource Group. The Custom Resource Group page appears.
  3. Click Add Resource Group in the upper-right corner.
  4. Set parameters in the dialog box that appears and install and initialize the agent. For more information, see Add a custom resource group.
    If the server status is Running, a custom resource group is added.
    Note: If the server status is still Stopped after you refresh the dialog box, switch to the admin account and run the following command to restart alisa:
    /home/admin/alisatasknode/target/alisatasknode/bin/serverctl restart

Add a MySQL driver

  1. Go to the directory of RDBMS Reader, ${DATAX_HOME}/plugin/reader/rdbmsreader. ${DATAX_HOME} indicates the home directory of Data Integration. In this example, RDBMS Reader resides in the /home/admin/datax3/plugin/reader/rdbmsreader directory.
    [root@izbp1czjkv9fpzmsbv0qcdz rdbmsreader]# pwd
    /home/admin/datax3/plugin/reader/rdbmsreader
    [root@izbp1czjkv9fpzmsbv0qcdz rdbmsreader]# ls
    libs plugin.json rdbmsreader-0.0.1-SNAPSHOT.jar
  2. Find the plugin.json file in the directory of RDBMS Reader. Add the driver of your database, for example, com.mysql.jdbc.Driver in the following code, to the drivers array in the plugin.json file.
    RDBMS Reader automatically selects an appropriate driver for connecting to a database.
    [root@izbp1czjkv9fpzmsbv0qcdz rdbmsreader]# vim plugin.json
    {
        "name": "rdbmsreader",
        "class": "com.alibaba.datax.plugin.reader.rdbmsreader.RdbmsReader",
        "description": "useScene: prod. mechanism: Jdbc connection using the database, execute select sql, retrieve data from the ResultSet. warn: The more you know about the database, the less problems you encounter.",
        "developer": "alibaba",
        "drivers":["dm.jdbc.driver.DmDriver", "com.sybase.jdbc3.jdbc.SybDriver", "com.edb.Driver","com.mysql.jdbc.Driver"]
    }
  3. Add the MySQL JAR package you downloaded to the libs directory in the rdbmsreader directory.
    For example, you can add the mysql-connector-java-5.1.47.jar package shown in the following figure.
    [Figure: MySQL JAR package]
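If you prefer to script the plugin.json change in step 2 instead of editing the file in vim, the following sketch appends a driver class name to the drivers array only when it is not already listed. The function name `register_driver` is hypothetical, and the sketch assumes that plugin.json is strict JSON, as in the listing above. Back up the file before you run anything like this against a real installation.

```python
import json


def register_driver(plugin_json_path, driver_class):
    """Append driver_class to the "drivers" array of plugin.json if it is missing.

    Returns True if the file was changed, False if the driver was already listed.
    """
    with open(plugin_json_path) as handle:
        plugin = json.load(handle)

    drivers = plugin.setdefault("drivers", [])
    if driver_class in drivers:
        return False

    drivers.append(driver_class)
    with open(plugin_json_path, "w") as handle:
        json.dump(plugin, handle, indent=4)
        handle.write("\n")
    return True
```

For the MySQL example in this topic, the call would be `register_driver("/home/admin/datax3/plugin/reader/rdbmsreader/plugin.json", "com.mysql.jdbc.Driver")`. Copying the JAR package to the libs directory in step 3 is still required.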

Configure a sync node

Currently, you can configure a sync node that uses RDBMS Reader only in the code editor. The following sample code shows how to configure a sync node:

{
    "job": {
        "setting": {
            "speed": {
                "byte": 1048576
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.02
            }
        },
        "content": [
            {
                "reader": {
                    "name": "rdbmsreader",
                    "parameter": {
                        "username": "xxxxx",
                        "password": "yyyyyy",
                        "column": [
                            "*"
                        ],
                        "splitPk": "id",
                        "connection": [
                            {
                                "table": [
                                    "a2"
                                ],
                                "jdbcUrl": [
                                    "jdbc:mysql://xxx.mysql.yy.aliyuncs.com:3306/xxx"  // The JDBC URL for connecting to your MySQL database.
                                ]
                            }
                        ],
                        "where": ""
                    }
                },
                "writer": {  // Configure the writer as required.
                    "name": "streamwriter",
                    "parameter": {
                        "print": true
                    }
                }
            }
        ]
    }
}
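The sample above contains explanatory // comments, which strict JSON parsers reject. If you generate sync node configurations with a script, one option is to build the structure as a dictionary and serialize it. The following sketch mirrors the sample's fields. The function names `build_job` and `render_job` are hypothetical, and the default values are copied from the sample rather than recommended settings.

```python
import json


def build_job(jdbc_url, table, username, password, split_pk="id"):
    """Return a sync node configuration dict shaped like the sample in this topic."""
    reader = {
        "name": "rdbmsreader",
        "parameter": {
            "username": username,
            "password": password,
            "column": ["*"],
            "splitPk": split_pk,
            "connection": [{"table": [table], "jdbcUrl": [jdbc_url]}],
            "where": "",
        },
    }
    # The stream writer only prints records. Replace it with the writer you need.
    writer = {"name": "streamwriter", "parameter": {"print": True}}
    return {
        "job": {
            "setting": {
                "speed": {"byte": 1048576},
                "errorLimit": {"record": 0, "percentage": 0.02},
            },
            "content": [{"reader": reader, "writer": writer}],
        }
    }


def render_job(job):
    """Serialize the configuration as strict, comment-free JSON text."""
    return json.dumps(job, indent=4)
```

Calling `render_job(build_job("jdbc:mysql://xxx.mysql.yy.aliyuncs.com:3306/xxx", "a2", "xxxxx", "yyyyyy"))` produces text with the same structure as the sample, without the comments.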