All Products
Search
Document Center

Migrate data from a MySQL database to a TSDB database

Last Updated: May 19, 2022

This topic describes how to use DataX to migrate data from a MySQL database to a Lindorm Time Series Database (TSDB) database. DataX is an open source tool that is developed by Alibaba Group.

Background information

This topic describes how to use DataX to migrate data from a MySQL database to a TSDB database. DataX is an open source tool that is developed by Alibaba Group.

For more information about how to use DataX, see README.

This section describes DataX, MySQL Reader, and TSDB Writer. MySQL Reader and TSDB Writer are the plug-ins that are used to migrate data.

DataX

DataX is a tool that is widely used within Alibaba Group to synchronize offline data. You can use DataX to synchronize data in an efficient manner between disparate data sources of various types. DataX supports MySQL, Oracle, SQL Server, PostgreSQL, Hadoop Distributed File System (HDFS), Hive, AnalyticDB for MySQL, HBase, Tablestore (OTS), MaxCompute, and PolarDB-X data sources. The previous name of MaxCompute is Open Data Processing Service (ODPS).

MySQL Reader

MySQL Reader is a plug-in that is powered by DataX. You can use MySQL Reader to query data from a MySQL database.

TSDB Writer

TSDB Writer is a plug-in that is powered by DataX. You can use TSDB Writer to write data points to a TSDB database that is developed by Alibaba Cloud.

Quick Start

Step 1: Configure an environment

  • Linux

  • Use Java Development Kit (JDK) 1.8 or later. We recommend that you use JDK 1.8.

  • We recommend that you use Python 2.6.x.

  • DataX is compatible only with MySQL 5.x. If the version of MySQL is a version other than 5.x, compatibility issues can occur.

  • DataX is compatible only with TSDB 2.4.x or later. If TSDB of an earlier version is used, compatibility issues can occur.

Step 2: Download DataX and the related plug-ins

Step 3: Use the built-in script that is provided by DataX to test whether data can be migrated as expected

The plug-ins Stream Reader and Stream Writer are used in the test. Stream Reader and Stream Writer are easy to use and do not require external dependencies. This way, Stream Reader and Stream Writer can be used to test whether data can be migrated as expected. Stream Reader and Stream Writer can be used to simulate a process for simple data migration. Stream Reader generates random character strings. Stream Writer receives the strings and prints them out into your CLI.

Install DataX

Extract the installation package to DATAX_HOME. Then, run the datax.py script to start the migration task.

cd ${DATAX_HOME}
python bin/datax.py job/job.json

Check whether data is migrated as expected

The following sample shows the summary information that is returned if the data is migrated as expected:

Task start time                    : 2019-04-26 11:18:07
Task end time                    : 2019-04-26 11:18:17
Execution time                    :                 10s
Average traffic                    :          253.91KB/s
Write rate                    :          10000rec/s
Records obtained                    :              100000
Read and write failures                    :                   0

For more information about the sample code, see Quick start to migrate data.

Step 4: Configure and start a task to migrate data from a MySQL database to a TSDB database

In Step 3, data can be migrated as expected when Stream Reader and Stream Writer are used. The migration result indicates that you can use DataX to migrate data. In this case, you can use MySQL Reader and TSDB Writer to migrate data from a MySQL database to a TSDB database.

Configure a task to migrate data

Configure a task named mysql2tsdb.json to migrate data from a MySQL database to a TSDB database. You can use the following sample code to configure the task. For more information about the parameters, see the "Parameters" section.

{
  "job": {
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "column": [
              "name",
              "type",
              "create_time",
              "price"
            ],
            "connection": [
              {
                "jdbcUrl": [
                  "jdbc:mysql://127.0.0.1:3306/datax"
                ],
                "table": [
                  "book"
                ]
              }
            ],
            "password": "yuzhouwan",
            "splitPk": "id",
            "username": "root"
          }
        },
        "writer": {
          "name": "tsdbwriter",
          "parameter": {
            "column": [
              "name",
              "type",
              "create_time",
              "price"
            ],
            "columnType": [
              "tag",
              "tag",
              "timestamp",
              "metric_num"
            ],
            "sourceDbType": "RDB",
            "tsdbAddress": "localhost",
            "tsdbPort": 8242
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 1
      }
    }
  }
}

Start the task to migrate data from a MySQL database to a TSDB database

cd ${DATAX_HOME}/..
 ls
  datax/  datax.tar.gz  mysql2tsdb.json
python datax/bin/datax.py mysql2tsdb.json

Check whether data is migrated as expected

The following sample shows the summary information that is returned if the data is migrated as expected:

Task start time                    : 2019-05-21 18:25:16
Task end time                    : 2019-05-21 18:25:27
Execution time                    :                 11s
Average traffic                    :                3B/s
Write rate                    :               0rec/s
Records obtained                    :                   3
Read and write failures                    :                   0

For more information about the sample code, see Migrate data from a MySQL database to a TSDB database.

Parameters

The following tables describe the parameters.

Parameters for MySQL Reader

Parameter

Type

Required

Description

Default value

Example

jdbcUrl

String

Yes

The database URL that is used by Java Database Connectivity (JDBC) to connect to the MySQL database.

None

jdbc:mysql://127.0.0.1:3306/datax

username

String

Yes

The username that is used to connect to the MySQL database.

None

root

password

String

Yes

The password that is used to connect to the MySQL database.

None

root

table

String

Yes

The table that contains the data to be migrated.

None

book

column

Array

Yes

The names of the columns that you want to migrate.

[]

["m"]

splitPk

String

No

The shard key that is used for table sharding. If you configure a splitPk value, the table is sharded based on the specified value.

None

id

Parameters for TSDB Writer

Parameter

Type

Required

Description

Default value

Example

sourceDbType

String

No

Valid values are TSDB and RDB. The value TSDB specifies a system that processes time series data, such as OpenTSDB, InfluxDB, Prometheus, or TimescaleDB. The value RDB specifies a system that processes data points that are related to one another, such as MySQL, Oracle, PostgreSQL, or PolarDB-X.

TSDB

RDB

column

Array

Yes

The names of the columns that you want to migrate.

[]

["name","type","create_time","price"]

columnType

Array

Yes

The data type of the column in the TSDB database after the column is migrated. Valid values: timestamp, tag, metric_num, and metric_string. The value timestamp specifies that the column contains timestamps. The value tag specifies that the column contains tags. The value metric_num specifies that the column contains data of a numeric data type. The value metric_string specifies that the column contains data of a string data type.

[]

["tag","tag","timestamp","metric_num"]

tsdbAddress

String

Yes

The IP address of the TSDB database.

None

127.0.0.1

tsdbPort

int

Yes

The port that is used by the TSDB database.

None

8242

batchSize

int

No

The number of data records that you want to migrate at the same time. The value must be an integer that is greater than 0.

100

100

Note

Make sure that the TSDB database can be accessed by each process of the migration task.

TSDB Writer calls the HTTP endpoint /api/put to write data. If you need to migrate data, make sure that each process of the migration task can access the HTTP endpoint that is provided by the TSDB database. Otherwise, a connection exception is reported.

Make sure that the MySQL database can be accessed by each process of the migration task.

MySQL Reader uses JDBC to read data. If you need to migrate data, make sure that each process of the migration task can access the JDBC API that is provided by the MySQL database. Otherwise, a connection exception is reported.

The sequence of column names that are configured in TSDB Writer and MySQL Reader must be consistent.

The sequence of the column names configured in TSDB Writer must be the same as that of the column names configured in MySQL Reader. Otherwise, data cannot be processed as expected.

FAQ

Can I change the Java Virtual Machine (JVM) memory size for a migration process?

Yes, you can change the JVM memory size for a migration process. If you want to change the JVM memory size for a task that migrates data from a MySQL database to a TSDB database, run the following command:

python datax/bin/datax.py mysql2tsdb.json -j "-Xms4096m -Xmx4096m"

How do I add an IP address to the whitelist of a TSDB database?

You can add an IP address to the whitelist of a TSDB database based on the description provided in TSDB documentation. You can choose Quick Start > Set the IP address whitelist to view the description.

If my migration task runs on an Elastic Compute Service (ECS) instance, how do I configure a virtual private cloud (VPC) and what are the common issues that can occur?

You can configure a VPC and learn about the common issues that can occur from the "Use cases of ECS security groups" and "FAQ about VPCs" topics in TSDB documentation.