Use DataX to migrate data from a MySQL database to Lindorm Time Series Database (TSDB). DataX is an open source offline data synchronization tool developed by Alibaba Group that supports a wide range of data sources, including MySQL, Oracle, SQL Server, PostgreSQL, Hadoop Distributed File System (HDFS), Hive, AnalyticDB for MySQL, HBase, Tablestore (OTS), MaxCompute, and PolarDB-X. The previous name of MaxCompute is Open Data Processing Service (ODPS).
How it works
DataX uses two plug-ins to move data between MySQL and TSDB:
MySQL Reader: Reads data from a MySQL database using Java Database Connectivity (JDBC).
TSDB Writer: Writes data points to TSDB by calling the HTTP endpoint
/api/put.
Each process in the migration task must be able to reach both the MySQL JDBC endpoint and the TSDB HTTP endpoint. If either is unreachable, a connection exception is reported.
Prerequisites
Before you begin, make sure you have:
Linux operating system
Java Development Kit (JDK) 1.8 or later (JDK 1.8 recommended)
Python 2.6.x
MySQL 5.x (DataX is compatible only with MySQL 5.x)
TSDB 2.4.x or later (DataX is compatible only with TSDB 2.4.x and later)
Network connectivity from the migration machine to both the MySQL JDBC endpoint and the TSDB HTTP endpoint
Add the migration machine's IP address to the TSDB whitelist before running any migration task. See Add the TSDB IP whitelist for instructions.
Migrate data from MySQL to TSDB
The migration consists of these steps:
Download and install DataX.
Verify that DataX runs correctly with built-in test plug-ins.
Configure and run the MySQL-to-TSDB migration task.
Run the migration task.
Step 1: Download and install DataX
Download DataX from GitHub and extract the package to DATAX_HOME.
Step 2: Verify the DataX installation
Before migrating real data, verify that DataX runs correctly using the built-in Stream Reader and Stream Writer plug-ins. These plug-ins have no external dependencies: Stream Reader generates random strings and Stream Writer prints them to your CLI.
Run the built-in test job:
cd ${DATAX_HOME}
python bin/datax.py job/job.jsonA successful run shows a summary similar to:
Task start time : 2019-04-26 11:18:07
Task end time : 2019-04-26 11:18:17
Execution time : 10s
Average traffic : 253.91KB/s
Write rate : 10000rec/s
Records obtained : 100000
Read and write failures: 0If Read and write failures is 0, DataX is installed correctly. See Quick start to migrate data for a full walkthrough.
Step 3: Configure the migration task
Create a file named mysql2tsdb.json with the following content. Replace the placeholder values with your actual MySQL connection details and TSDB address.
{
"job": {
"content": [
{
"reader": {
"name": "mysqlreader",
"parameter": {
"column": [
"name",
"type",
"create_time",
"price"
],
"connection": [
{
"jdbcUrl": [
"jdbc:mysql://127.0.0.1:3306/datax"
],
"table": [
"book"
]
}
],
"password": "yuzhouwan",
"splitPk": "id",
"username": "root"
}
},
"writer": {
"name": "tsdbwriter",
"parameter": {
"column": [
"name",
"type",
"create_time",
"price"
],
"columnType": [
"tag",
"tag",
"timestamp",
"metric_num"
],
"sourceDbType": "RDB",
"tsdbAddress": "localhost",
"tsdbPort": 8242
}
}
}
],
"setting": {
"speed": {
"channel": 1
}
}
}
}The column array in MySQL Reader and TSDB Writer must list columns in the same order. The columnType array maps each column to its TSDB data type in that same order.
A mismatch between the column order in MySQL Reader and TSDB Writer causes data to be written to the wrong fields.
For parameter details, see Parameters.
Step 4: Run the migration task
Place mysql2tsdb.json in the parent directory of datax/, then run:
cd ${DATAX_HOME}/..
python datax/bin/datax.py mysql2tsdb.jsonA successful run shows output similar to:
Task start time : 2019-05-21 18:25:16
Task end time : 2019-05-21 18:25:27
Execution time : 11s
Average traffic : 3B/s
Write rate : 0rec/s
Records obtained : 3
Read and write failures: 0See Migrate data from a MySQL database to a TSDB database for a complete example.
Parameters
MySQL Reader parameters
| Parameter | Type | Required | Description | Default | Example |
|---|---|---|---|---|---|
jdbcUrl | String | Yes | The JDBC connection URL for the MySQL database. | None | jdbc:mysql://127.0.0.1:3306/datax |
username | String | Yes | The username for the MySQL database. | None | root |
password | String | Yes | The password for the MySQL database. | None | root |
table | String | Yes | The source table containing the data to migrate. | None | book |
column | Array | Yes | The column names to migrate. | [] | ["name", "type", "create_time", "price"] |
splitPk | String | No | The shard key for table sharding. When set, the table is split based on this column to enable parallel reads. | None | id |
TSDB Writer parameters
| Parameter | Type | Required | Description | Default | Example |
|---|---|---|---|---|---|
sourceDbType | String | No | The type of source database. Use RDB when migrating from a relational database (MySQL, Oracle, PostgreSQL, PolarDB-X). Use TSDB when migrating from a time series system (OpenTSDB, InfluxDB, Prometheus, TimescaleDB). | TSDB | RDB |
column | Array | Yes | The column names to write. Must be in the same order as MySQL Reader's column array. | [] | ["name", "type", "create_time", "price"] |
columnType | Array | Yes | The TSDB data type for each column, in the same order as column. Valid values: timestamp, tag, metric_num, metric_string. | [] | ["tag", "tag", "timestamp", "metric_num"] |
tsdbAddress | String | Yes | The IP address of the TSDB instance. | None | 127.0.0.1 |
tsdbPort | int | Yes | The port of the TSDB instance. | None | 8242 |
batchSize | int | No | The number of records to write per batch. Must be greater than 0. | 100 | 100 |
columnType values:
| Value | Description |
|---|---|
timestamp | The column contains timestamps. |
tag | The column contains tag values. |
metric_num | The column contains numeric metric data. |
metric_string | The column contains string metric data. |
FAQ
Can I adjust the Java Virtual Machine (JVM) memory for a migration task?
Pass the -j flag when starting the task:
python datax/bin/datax.py mysql2tsdb.json -j "-Xms4096m -Xmx4096m"How do I add an IP address to the TSDB whitelist?
See the TSDB documentation. Navigate to Quick Start > Set the IP address whitelist for instructions.
My migration task runs on an ECS instance. How do I configure VPC access?
See the "Use cases of ECS security groups" and "FAQ about VPCs" topics in the TSDB documentation.
What's next
Learn more about DataX configuration in the DataX README.