This topic describes how to use DataX to migrate data from a relational database to LindormTSDB. DataX is an open source tool provided by Alibaba Group.
Background information
DataX is an offline data synchronization tool that is widely used within Alibaba Group. You can use DataX to efficiently synchronize data between various disparate data sources, including MySQL, Oracle, SQL Server, PostgreSQL, Hadoop Distributed File System (HDFS), Hive, AnalyticDB for MySQL, HBase, Tablestore (OTS), MaxCompute, and PolarDB-X. MaxCompute is previously known as Open Data Processing Service (ODPS). For more information about how to use DataX, see README.
The following sections describe DataX and provide an example on how to migrate data from a relational database to a database in LindormTSDB. A MySQL database is used in the example. MySQL Reader and TSDB Writer are used to perform data migration.
MySQL Reader
MySQL Reader is a plug-in that is powered by DataX. You can use MySQL Reader to query data from a MySQL database.
TSDB Writer
DataX provides the TSDB Writer plug-in that is used to write data to a database in LindormTSDB.
Precautions
Make sure that DataX can access LindormTSDB.
TSDB Writer calls the HTTP endpoint /api/mput and the HTTP endpoint /api/put to write data. If you want to migrate data, make sure that each process of the migration task can access the HTTP endpoint provided by LindormTSDB. Otherwise, a connection exception is thrown.
Make sure that your MySQL database can be accessed by each process of the migration task.
MySQL Reader uses Java Database Connectivity (JDBC) to read data. If you need to migrate data, make sure that each process of the migration task can access the JDBC API that is provided by your MySQL database. Otherwise, a connection exception is thrown.
Preparations
Configure an environment.
Linux
Java Development Kit (JDK): Use JDK 1.8 or later.
Python: We recommend that you use Python 2.6.x.
MySQL: DataX is compatible only with MySQL 5.x.
LindormTSDB: Activate LindormTSDB.
Download DataX and the plug-ins.
Click here to download the corresponding TAR file.
Use the built-in script provided by DataX to test whether data can be migrated as expected.
The plug-ins used in the test are Stream Reader and Stream Writer. Stream Reader and Stream Writer do not require external dependencies. Therefore, Stream Reader and Stream Writer are suitable for tests. Stream Reader and Stream Writer are used to simulate a simple data migration process. Stream Reader generates random character strings. Stream Writer receives the strings and displays them in your CLI.
Install DataX
Decompress the installation package to DATAX_HOME. Then, run the following command to start a migration testing task:
$ cd ${DATAX_HOME}
$ python bin/datax.py job/job.jsonCheck the migration result
The following example shows the summary information that is returned if the data is migrated as expected:
Task start time : 2019-04-26 11:18:07
Task end time : 2019-04-26 11:18:17
Execution time : 10s
Average traffic : 253.91KB/s
Write rate : 10000rec/s
Records obtained : 100000
Number of write and read failures : 0
Configure and run a task that migrates data from a MySQL database to a database in LindormTSDB
In the preceding step, data can be migrated as expected when Stream Reader and Stream Writer are used. The migration result indicates that you can use DataX to migrate data. In this case, you can use MySQL Reader and TSDB Writer to migrate data from your MySQL database to a database in LindormTSDB.
Parameters
The following tables describe the parameters that you can configure.
MySQL Reader parameters
Parameter | Type | Required | Description | Default value | Example |
jdbcUrl | String | Yes | Specifies the database URL that is used by JDBC to connect to the MySQL database. | None |
|
username | String | Yes | Specifies the username that is used to connect to the MySQL database. | None |
|
password | String | Yes | Specifies the password that is used to connect to the MySQL database. | None |
|
table | String | Yes | Specifies the table that contains the data that you want to migrate. | None |
|
column | Array | Yes | Specifies the names of the columns that you want to migrate. |
|
|
splitPk | String | No | Specifies the shard key that is used for table sharding. If you specify a value for splitPk, the table is sharded based on the specified value. | None |
|
TSDB Writer parameters
Parameter | Type | Required | Description | Default value | Example |
sourceDbType | String | No | Valid values: TSDB and RDB. The value of TSDB specifies a time series engine, such as OpenTSDB, InfluxDB, Prometheus, or TimeScale. The value of RDB specifies a relational database engine, such as MySQL, Oracle, PostgreSQL, or PolarDB-X. |
|
|
column | Array | Yes | Specifies the names of the columns that you want to migrate. |
|
|
columnType | Array | Yes | Specifies the data types of the columns in the database in LindormTSDB after the columns are migrated. Valid values: timestamp, tag, metric_num, and metric_string. The value timestamp specifies that the column contains timestamps. The value tag specifies that the column contains tags. The value metric_num specifies that the column contains data of a numeric data type. The value metric_string specifies that the column contains data of a string data type. |
|
|
endpoint | String | Yes | Specifies the endpoint of LindormTSDB. Specify the value in the IP:PORT format. | None |
|
multiField | Bool | Yes | Specifies whether to perform your data write operation by calling the /api/mput HTTP endpoint. We recommend that you set the value to true. | false | true |
table | String | Yes | Specifies the table that you want to migrate to the database in LindormTSDB. | None | table |
username | String | No | Specifies the username that is used to connect to the database in LindormTSDB. | None | tsdbuser |
password | String | No | Specifies the password that is used to connect to the database in LindormTSDB. | None | tsdbpsw |
ignoreWriteError | Bool | Yes | Specifies whether to ignore write failures. | false | true |
database | String | No | Specifies the name of the database to which you want to write the data. | default | default |
Configure a data migration task
Configure a task to migrate data from the MySQL database to the database in LindormTSDB. In this example, the task name is mysql2tsdb.json. The following code provides an example on how to configure a data migration task. For more information about the parameters, see Parameters.
{
"job": {
"content": [
{
"reader": {
"name": "mysqlreader",
"parameter": {
"column": [
"name",
"type",
"create_time",
"price",
"description",
"available"
],
"connection": [
{
"jdbcUrl": [
"jdbc:mysql://127.0.0.1:3306/datax"
],
"table": [
"book"
]
}
],
"password": "xxx",
"splitPk": "id",
"username": "root"
}
},
"writer": {
"name": "tsdbwriter",
"parameter": {
"column": [
"name",
"type",
"create_time",
"price",
"description",
"available"
],
"columnType": [
"tag",
"tag",
"timestamp",
"field_double",
"field_string",
"field_bool"
],
"sourceDbType": "RDB",
"endpoint": "http://localhost:8242",
"table": "my_book",
"multiField": "true",
"username":"xxx",
"password":"xxx",
"ignoreWriteError":"false",
"database":"default"
}
}
}
],
"setting": {
"speed": {
"channel": 1
}
}
}
} Consistent column name sequences
Make sure that the sequence of the column names configured in TSDB Writer is the same as that of the column names configured in MySQL Reader. Otherwise, data cannot be processed as expected.
Start the task
$ cd ${DATAX_HOME}/..
$ ls
datax/ datax.tar.gz mysql2tsdb.json
$ python datax/bin/datax.py mysql2tsdb.json
Check the migration result
The following example shows the summary information that is returned if the data is migrated as expected:
Task start time : 2019-05-21 18:25:16
Task end time : 2019-05-21 18:25:27
Execution time : 11s
Average traffic : 3B/s
Write rate : 0rec/s
Records obtained : 3
Number of write and read failures : 0
FAQ
Can I change the Java Virtual Machine (JVM) memory size for a migration process?
Yes, you can change the JVM memory size for a migration process. If you want to adjust the JVM memory size for a task that migrates data from a MySQL database to a database in LindormTSDB, you can run the following command:
python datax/bin/datax.py mysql2tsdb.json -j "-Xms4096m -Xmx4096m" How do I configure an IP address whitelist for LindormTSDB?
For more information, see Configure whitelists.
If my migration task runs on an Elastic Compute Service (ECS) instance, how do I configure a virtual private cloud (VPC) and what are the common issues?
For more information, see Use cases of ECS security groups and FAQ about VPCs.