This topic describes how to use DataX, an open source tool from Alibaba, to migrate data from a relational database to LindormTSDB.
Background information
DataX is an offline data synchronization tool that is widely used within Alibaba Group. It provides efficient data synchronization between disparate data sources, such as MySQL, Oracle, SQL Server, PostgreSQL, HDFS, Hive, ADS, HBase, TableStore (OTS), MaxCompute (ODPS), and DRDS. For more information about how to use DataX, see the DataX README document.
This topic introduces the DataX tool and uses MySQL as an example to demonstrate how to migrate data to LindormTSDB. This migration task involves two plugins: MySQL Reader and TSDB Writer.
MySQL Reader
MySQL Reader is a DataX plugin that reads data from MySQL.
TSDB Writer
TSDB Writer is a DataX plugin that writes data points to LindormTSDB.
Precautions
Ensure network connectivity to LindormTSDB
TSDB Writer writes data by calling the /api/mput and /api/put HTTP APIs. Ensure that the migration task can access the HTTP APIs exposed by LindormTSDB. Otherwise, a Connect Exception error occurs.
Ensure network connectivity to MySQL
MySQL Reader reads data using Java Database Connectivity (JDBC). Ensure that the migration task can access the JDBC interface exposed by MySQL. Otherwise, a Connect Exception error occurs.
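Both precautions come down to network reachability. Before launching a long migration, a quick pre-flight check can be sketched as follows (the hosts and ports are placeholders; substitute your own MySQL server address and LindormTSDB endpoint):

```python
import socket
from urllib.parse import urlparse

def is_reachable(host, port, timeout=3):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_endpoints(mysql_host, mysql_port, tsdb_endpoint):
    """Check reachability of both data sources.

    tsdb_endpoint is an HTTP URL such as http://localhost:8242
    (the port used in the example configuration in this topic).
    """
    tsdb = urlparse(tsdb_endpoint)
    return {
        "mysql": is_reachable(mysql_host, mysql_port),
        "tsdb": is_reachable(tsdb.hostname, tsdb.port or 8242),
    }
```

If either value is `False`, fix connectivity (security groups, whitelists, VPC routes) before starting DataX; otherwise the task fails with a Connect Exception.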
Preparations
Prepare the environment.
Download DataX and its plugins.
You can download the package from the following link.
Use the test migration task included with DataX to verify that the migration process is working correctly.
This example uses the simple Stream Reader to Stream Writer task. These two plugins do not depend on any external environment and are ideal for verifying that the process is working. Stream Reader randomly generates strings. Stream Writer outputs the received strings to the console. This simulates a basic data migration process.
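For reference, the bundled `job/job.json` follows the standard DataX job layout; a minimal Stream Reader to Stream Writer job looks roughly like the following sketch (the column values and record count are illustrative, not the exact contents of the bundled file):

```json
{
  "job": {
    "setting": { "speed": { "channel": 1 } },
    "content": [
      {
        "reader": {
          "name": "streamreader",
          "parameter": {
            "sliceRecordCount": 100000,
            "column": [
              { "type": "string", "value": "DataX" },
              { "type": "long", "value": "1024" }
            ]
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": { "print": false, "encoding": "UTF-8" }
        }
      }
    ]
  }
}
```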
Deploy the tool
Unzip the downloaded installation package to a directory. This directory is used as DATAX_HOME. Then, run the test migration task:
$ cd ${DATAX_HOME}
$ python bin/datax.py job/job.json
Check if the task is successful
The following summary information is displayed after the task is complete. This output indicates that the migration task was successful:
Task start time : 2019-04-26 11:18:07
Task end time : 2019-04-26 11:18:17
Total time : 10s
Average traffic : 253.91KB/s
Record write speed : 10000rec/s
Total records read : 100000
Total read/write failures : 0
Configure and start the migration task from MySQL to LindormTSDB
After the Stream Reader to Stream Writer task confirms that DataX is working correctly, you can configure and start the migration task that uses MySQL Reader and TSDB Writer.
Parameters
The configuration items are described as follows:
MySQL Reader parameters
| Name | Type | Required | Description | Default value | Example |
| --- | --- | --- | --- | --- | --- |
| jdbcUrl | String | Yes | The JDBC connection string for the database. | None | jdbc:mysql://127.0.0.1:3306/datax |
| username | String | Yes | The username for the data source. | None | root |
| password | String | Yes | The password for the specified username. | None | xxx |
| table | String | Yes | The table to synchronize. | None | book |
| column | Array | Yes | The columns to synchronize from the table. | None | ["name", "type", "price"] |
| splitPk | String | No | If you specify `splitPk`, the system uses the specified field for data partitioning. | Empty | id |
TSDB Writer parameters
| Name | Type | Required | Description | Default value | Example |
| --- | --- | --- | --- | --- | --- |
| sourceDbType | String | No | Valid values are `TSDB` and `RDB`. `TSDB` refers to time series engines such as OpenTSDB, InfluxDB, Prometheus, and TimeScale. `RDB` refers to relational databases such as MySQL, Oracle, PostgreSQL, and DRDS. | | RDB |
| column | Array | Yes | The columns to synchronize. | None | ["name", "create_time", "price"] |
| columnType | Array | Yes | The LindormTSDB types to which the fields from the relational database table are mapped. Supported types include: `timestamp` (the field is a timestamp), `tag` (the field is a tag), `metric_num` (the metric value is numeric), and `metric_string` (the metric value is a string). In multi-field mode, the example in this topic maps values to `field_double`, `field_string`, and `field_bool`. | None | ["tag", "timestamp", "field_double"] |
| endpoint | String | Yes | The IP address and port of LindormTSDB. | None | http://localhost:8242 |
| multiField | Bool | Yes | Specifies whether to write data in multi-field mode. Set this parameter to `true`. | false | true |
| table | String | Yes | The name of the table to import into LindormTSDB. | None | table |
| username | String | No | The database username. | None | tsdbuser |
| password | String | No | The database password. | None | tsdbpsw |
| ignoreWriteError | Bool | Yes | Specifies whether to ignore write errors. | false | true |
| database | String | No | The name of the database to import into LindormTSDB. | default | default |
Configure the migration task
Configure a task to synchronize data from a MySQL database to LindormTSDB. Name the configuration file mysql2tsdb.json. The complete configuration is as follows. For a detailed description of each parameter, see "Parameters":
{
"job": {
"content": [
{
"reader": {
"name": "mysqlreader",
"parameter": {
"column": [
"name",
"type",
"create_time",
"price",
"description",
"available"
],
"connection": [
{
"jdbcUrl": [
"jdbc:mysql://127.0.0.1:3306/datax"
],
"table": [
"book"
]
}
],
"password": "xxx",
"splitPk": "id",
"username": "root"
}
},
"writer": {
"name": "tsdbwriter",
"parameter": {
"column": [
"name",
"type",
"create_time",
"price",
"description",
"available"
],
"columnType": [
"tag",
"tag",
"timestamp",
"field_double",
"field_string",
"field_bool"
],
"sourceDbType": "RDB",
"endpoint": "http://localhost:8242",
"table": "my_book",
"multiField": "true",
"username":"xxx",
"password":"xxx",
"ignoreWriteError":"false",
"database":"default"
}
}
}
],
"setting": {
"speed": {
"channel": 1
}
}
}
}
Keep the column order consistent
Ensure that the order of fields in the `column` parameter for the TSDB Writer plugin is the same as the order in the `column` parameter for the MySQL Reader plugin. Otherwise, the data will be disordered.
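As a safeguard, a small script can verify that the two lists match before you launch the job. This is a sketch that assumes the single-content job layout of the `mysql2tsdb.json` example above:

```python
import json

def check_column_order(job_path):
    """Raise ValueError if the reader and writer column lists differ.

    Assumes a DataX job file with one entry under job.content,
    as in the mysql2tsdb.json example in this topic.
    """
    with open(job_path) as f:
        job = json.load(f)
    content = job["job"]["content"][0]
    reader_cols = content["reader"]["parameter"]["column"]
    writer_cols = content["writer"]["parameter"]["column"]
    if reader_cols != writer_cols:
        raise ValueError(
            "column order mismatch:\n"
            f"  reader: {reader_cols}\n"
            f"  writer: {writer_cols}")
    return reader_cols
```

Run it against your configuration file (for example, `check_column_order("mysql2tsdb.json")`) before starting DataX.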
Start the migration task from MySQL to LindormTSDB
$ cd ${DATAX_HOME}/..
$ ls
datax/ datax.tar.gz mysql2tsdb.json
$ python datax/bin/datax.py mysql2tsdb.json
Check if the task is successful
The following summary information is displayed after the task is complete. This output indicates that the migration task was successful:
Task start time : 2019-05-21 18:25:16
Task end time : 2019-05-21 18:25:27
Total time : 11s
Average traffic : 3B/s
Record write speed : 0rec/s
Total records read : 3
Total read/write failures : 0
FAQ
Q: Can I adjust the JVM memory size for the migration process?
A: Yes, you can. For example, to migrate data from MySQL to LindormTSDB, use the following start command:
python datax/bin/datax.py mysql2tsdb.json -j "-Xms4096m -Xmx4096m"
Q: How do I configure a network whitelist for LindormTSDB?
A: For more information, see Configure a whitelist.
Q: If I run the migration task on an ECS instance, how do I configure the VPC, and what are the common issues?
A: For more information, see ECS security group configuration use cases and VPC FAQ.