All Products
Search
Document Center

Migrate data from a MySQL database to a TSDB database

Last Updated: Aug 25, 2021

This topic describes how to use DataX to migrate data from a MySQL database to a Time Series Database (TSDB) database. DataX is an open source tool developed by Alibaba Group.

Background

This topic describes how to use DataX to migrate data from a MySQL database to a TSDB database. DataX is an open source tool developed by Alibaba Group.

For more information about how to use DataX, see README.

This section describes DataX, MySQL Reader, and TSDB Writer. MySQL Reader and TSDB Writer are the plug-ins used to migrate data and are powered by DataX.

DataX

 DataX is an offline data synchronization tool that is widely used within Alibaba Group. You can use DataX to efficiently synchronize data between various disparate data sources, including MySQL, Oracle, SQL Server, PostgreSQL, Hadoop Distributed File System (HDFS), Hive, AnalyticDB for MySQL, HBase, Tablestore (OTS), MaxCompute, and Distributed Relational Database Service (DRDS). MaxCompute is previously known as Open Data Processing Service (ODPS).

MySQL Reader

 MySQL Reader is a plug-in powered by DataX. You can use MySQL Reader to query data from a MySQL database.

TSDB Writer

 TSDB Writer is a plug-in powered by DataX. You can use TSDB Writer to write data points to a TSDB database that is developed by Alibaba Cloud.

Quick Start

Step 1: Configure an environment

  • Linux

  • Use Java Development Kit (JDK) 1.8 or later. We recommend that you use JDK 1.8.

  • We recommend that you use Python 2.6.x.

  • DataX is compatible only with MySQL 5.x. If a version of MySQL other than 5.x is used, compatibility issues can occur.

  • DataX is compatible only with TSDB 2.4.x or later. If an earlier version of TSDB is used, compatibility issues can occur.

Step 2: Download DataX and the plug-ins

 Download DataX and the plug-ins.

Step 3: Use the built-in script provided by DataX to test whether the data can be migrated as expected

 The easy-to-use plug-ins Stream Reader and Stream Writer are used in the test. Stream Reader and Stream Writer do not require external dependencies. Therefore, Stream Reader and Stream Writer are suitable for testing whether data can be migrated as expected. Stream Reader and Stream Writer are used to simulate a simple data migration process. Stream Reader generates random character strings. Stream Writer receives the strings and prints them to your CLI.

Install DataX

 Extract the DataX installation package to a specified directory. For example, you can specify a directory named DATAX_HOME. Then, you can use DataX to migrate data.

$ cd ${DATAX_HOME}
$ python bin/datax.py job/job.json

Check whether data is migrated as expected

 The following sample shows the summary information returned if the data is migrated as expected:

Task start time                    : 2019-04-26 11:18:07
Task end time                    : 2019-04-26 11:18:17
Execution time                    :                 10s
Average traffic                    :          253.91KB/s
Write rate                    :          10000rec/s
Records obtained                    :              100000
Read and write failures                    :                   0

For more information about the sample code, see Quick start to migrate data.

Step 4: Configure and start a task to migrate data from a MySQL database to a TSDB database

 In Step 3, data is migrated as expected when Stream Reader and Stream Writer are used. The migration result indicates that you can use DataX to migrate data. After you verify that the plug-ins can be used, you can use MySQL Reader and TSDB Writer to migrate data from a MySQL database to a TSDB database.

Configure a task to migrate data

 Configure a task named mysql2tsdb.json to migrate data from a MySQL database to a TSDB database. You can use the following sample code to configure the task. For more information about the parameters, see the "Parameters" section.

{
  "job": {
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "column": [
              "name",
              "type",
              "create_time",
              "price"
            ],
            "connection": [
              {
                "jdbcUrl": [
                  "jdbc:mysql://127.0.0.1:3306/datax"
                ],
                "table": [
                  "book"
                ]
              }
            ],
            "password": "yuzhouwan",
            "splitPk": "id",
            "username": "root"
          }
        },
        "writer": {
          "name": "tsdbwriter",
          "parameter": {
            "column": [
              "name",
              "type",
              "create_time",
              "price"
            ],
            "columnType": [
              "tag",
              "tag",
              "timestamp",
              "metric_num"
            ],
            "sourceDbType": "RDB",
            "tsdbAddress": "localhost",
            "tsdbPort": 8242
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 1
      }
    }
  }
}

Start the migration task

$ cd ${DATAX_HOME}/..
$ ls
  datax/  datax.tar.gz  mysql2tsdb.json
$ python datax/bin/datax.py mysql2tsdb.json

Check whether data is migrated as expected

 The following sample shows the summary information returned if the data is migrated as expected:

Task start time                    : 2019-05-21 18:25:16
Task end time                    : 2019-05-21 18:25:27
Execution time                    :                 11s
Average traffic                    :                3B/s
Write rate                    :               0rec/s
Records obtained                    :                   3
Read and write failures                    :                   0

For more information about the sample code, see Migrate data from a MySQL database to a TSDB database.

Parameters

 The following tables describe the parameters.

Parameters for MySQL Reader

Parameter

Type

Required

Description

Default value

Example

jdbcUrl

String

Yes

The URL of the database used by Java Database Connectivity (JDBC) to connect to the MySQL database.

None

jdbc:mysql://127.0.0.1:3306/datax

username

String

Yes

The username that is used to connect to the MySQL database.

None

root

password

String

Yes

The password that is used to connect to the MySQL database.

None

root

table

String

Yes

The table that contains the data you want to migrate.

None

book

column

Array

Yes

The names of the columns that you want to migrate.

[]

["m"]

splitPk

String

No

The shard key used for table sharding. If you configure a splitPk value, the table is sharded based on the specified value.

Empty string

id

Parameters for TSDB Writer

Parameter

Type

Required

Description

Default value

Example

sourceDbType

String

No

Valid values: TSDB and RDB. The value TSDB specifies a time series engine, such as OpenTSDB, InfluxDB, Prometheus, or TimescaleDB. The value RDB specifies a relational database engine, such as MySQL, Oracle, PostgreSQL, or DRDS.

TSDB

RDB

column

Array

Yes

The names of the columns that you want to migrate.

[]

["name","type","create_time","price"]

columnType

Array

Yes

The data type of the column in the TSDB database after the column is migrated. Valid values: timestamp, tag, metric_num, and metric_string. The value timestamp specifies that the column contains timestamps. The value tag specifies that the column contains tags. The value metric_num specifies that the column contains data of a numeric data type. The value metric_string specifies that the column contains data of a string data type.

[]

["tag","tag","timestamp","metric_num"]

tsdbAddress

String

Yes

The IP address of the TSDB database.

None

127.0.0.1

tsdbPort

int

Yes

The port used by the TSDB database.

None

8242

batchSize

int

No

The number of data records that you want to migrate at a time. The value must be an integer greater than 0.

100

100

Note

Make sure that the TSDB database is accessible to each process of the migration task.

 TSDB Writer calls the HTTP endpoint /api/put to write data. If you need to migrate data, make sure that each process of the migration task can access the HTTP endpoint provided by the TSDB database. Otherwise, a connection exception is thrown.

Make sure that the MySQL database is accessible to each process of the migration task.

 MySQL Reader uses JDBC to read data. If you need to migrate data, make sure that each process of the migration task can access the JDBC API provided by the MySQL database. Otherwise, a connection exception is thrown.

The sequence of column names must be consistent.

 The sequence of the column names configured in TSDB Writer must be the same as that of the column names configured in MySQL Reader. Otherwise, data cannot be correctly processed.

FAQ

Can I change the Java Virtual Machine (JVM) memory size for a migration process?

Yes, you can change the JVM memory size for a migration process. If you want to change the JVM memory size for a task that migrates data from a MySQL database to a TSDB database, run the following command:

python datax/bin/datax.py mysql2tsdb.json -j "-Xms4096m -Xmx4096m"

How do I add an IP address to the whitelist of a TSDB database?

For more information about how to configure a whitelist, see TSDB documentation. You can choose Quick Start > Set the IP address whitelist to view the description.

If my migration task runs on an Elastic Compute Service (ECS) instance, how do I configure a virtual private cloud (VPC) and what are the common issues?

For more information, see Use cases of ECS security groups and FAQ about VPCs.