This topic describes how to use DataX to migrate data from Prometheus Service (Prometheus) to a Time Series Database (TSDB) database. DataX is an open source tool that is provided by Alibaba Group.
Background information
You can use DataX to migrate data from Prometheus to a TSDB database.
For more information about how to use DataX, see README.
The following sections describe DataX and the Prometheus Reader and TSDB Writer plug-ins that DataX provides to migrate data.
DataX
DataX is an offline data synchronization tool that is widely used within Alibaba Group. You can use DataX to efficiently synchronize data between heterogeneous data sources, including MySQL, Oracle, SQL Server, PostgreSQL, Hadoop Distributed File System (HDFS), Hive, AnalyticDB for MySQL, HBase, Tablestore (OTS), MaxCompute, and Distributed Relational Database Service (DRDS). MaxCompute was previously known as Open Data Processing Service (ODPS).
Prometheus Reader
DataX provides the Prometheus Reader plug-in to read data points from Prometheus.
TSDB Writer
DataX provides the TSDB Writer plug-in to write data points to a TSDB database. TSDB is a database service that is developed by Alibaba Cloud.
Quick Start
Step 1: Configure an environment
Linux
Java Development Kit (JDK): Use JDK version 1.8 or later. We recommend that you use JDK 1.8.
Python: We recommend that you use Python 2.6.x.
Prometheus: Only Prometheus versions 2.9.x are supported. Earlier versions are not fully compatible with DataX.
TSDB: Only TSDB versions 2.4.x or later are supported. Earlier versions are not fully compatible with DataX.
Step 2: Download DataX and the plug-ins
Click DataX to download DataX and the TSDB Writer plug-in.
Click Prometheus Reader to download the Prometheus Reader plug-in.
Step 3: Use the built-in script provided by DataX to test whether DataX can migrate data as expected
The Stream Reader and Stream Writer plug-ins are used in this test because they do not require external dependencies, which makes them suitable for a quick check. Together, they simulate a simple data migration: Stream Reader generates random character strings, and Stream Writer receives the strings and prints them on your CLI.
Install DataX
Decompress the installation package to a specified directory and then run the datax.py script to start the migration task.
$ cd ${DATAX_HOME}
$ python bin/datax.py job/job.json
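The bundled job/job.json pairs Stream Reader with Stream Writer. It resembles the following sketch; the file shipped with your DataX release may differ in details such as the record count, column values, and channel settings:

```json
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "streamreader",
          "parameter": {
            "sliceRecordCount": 100000,
            "column": [
              { "type": "string", "value": "DataX" }
            ]
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": {
            "print": false
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 1
      }
    }
  }
}
```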
Check the migration result
The following sample shows the information that is returned if the data is migrated:
Task start time: 2019-04-26 11:18:07
Task end time: 2019-04-26 11:18:17
Time consumed: 10s
Average traffic: 253.91KB/s
Write rate: 10000rec/s
Number of records obtained: 100000
Number of write and read failures: 0
For more information about how to test DataX, see Quick start to migrate data.
Step 4: Configure and start a task to migrate data from Prometheus to a TSDB database
In Step 3, the Stream Reader and Stream Writer plug-ins are used to test the migration capability of DataX. The migration result indicates that DataX can migrate data as expected. The following parts describe how to use the Prometheus Reader and TSDB Writer plug-ins to migrate data from Prometheus to a TSDB database.
Configure a migration task
Configure a task to migrate data from Prometheus to a TSDB database. In this example, the task configuration file is named prometheus2tsdb.json. The following sample code provides an example of the task configuration. For more information about the parameters, see the Parameter description section.
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "prometheusreader",
          "parameter": {
            "endpoint": "http://localhost:9090",
            "column": [
              "up"
            ],
            "beginDateTime": "2019-05-20T16:00:00Z",
            "endDateTime": "2019-05-20T16:00:10Z"
          }
        },
        "writer": {
          "name": "tsdbwriter",
          "parameter": {
            "endpoint": "http://localhost:8242"
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 1
      }
    }
  }
}
Start the migration task
$ cd ${DATAX_HOME}/..
$ ls
datax/ datax.tar.gz prometheus2tsdb.json
$ python datax/bin/datax.py prometheus2tsdb.json
Check the migration result
The following sample shows the information that is returned if the data is migrated:
Task start time: 2019-05-20 20:22:39
Task end time: 2019-05-20 20:22:50
Time consumed: 10s
Average traffic: 122.07KB/s
Write rate: 1000rec/s
Number of records obtained: 10000
Number of write and read failures: 0
For more information about how to migrate data from Prometheus to a TSDB database, see Migrate data from Prometheus to a TSDB database.
Parameter description
The following tables describe the parameters.
Prometheus Reader
Parameter | Type | Required | Description | Default value | Example
---|---|---|---|---|---
endpoint | String | Yes | The HTTP endpoint of Prometheus. | N/A | http://localhost:9090
column | Array | Yes | The columns of data that you want to migrate. | N/A | ["up"]
beginDateTime | String | Yes | This parameter is used together with the endDateTime parameter to specify a time range. The data that is generated within this time range is migrated. | N/A | 2019-05-20T16:00:00Z
endDateTime | String | Yes | This parameter is used together with the beginDateTime parameter to specify a time range. The data that is generated within this time range is migrated. | N/A | 2019-05-20T16:00:10Z
TSDB Writer
Parameter | Type | Required | Description | Default value | Example
---|---|---|---|---|---
endpoint | String | Yes | The HTTP endpoint of the destination TSDB database. | N/A | http://localhost:8242
batchSize | Integer | No | The number of data records that you want to migrate at a time. The value must be an integer greater than 0. | 100 | 100
maxRetryTime | Integer | No | The maximum number of retries that are allowed after a migration failure occurs. The value must be an integer greater than 1. | 3 | 3
ignoreWriteError | Boolean | No | Specifies whether to ignore write errors. If this parameter is set to true, write errors are ignored and the task continues. If this parameter is set to false, the task is terminated after the maximum number of retries specified by maxRetryTime is reached. | false | false
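Putting these together, a tsdbwriter section that overrides the optional parameters might look like the following sketch; the endpoint and values are illustrative:

```json
"writer": {
  "name": "tsdbwriter",
  "parameter": {
    "endpoint": "http://localhost:8242",
    "batchSize": 500,
    "maxRetryTime": 5,
    "ignoreWriteError": false
  }
}
```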
Note
Make sure that DataX can access the TSDB database.
TSDB Writer calls the /api/put HTTP API operation to write data. To ensure successful data migration, make sure that each process of the migration task can access the HTTP API that is provided by TSDB. Otherwise, a connection exception is thrown.
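You can check reachability of the /api/put operation before you start the task. The following Python sketch builds an OpenTSDB-style payload of the kind that is written to /api/put; the metric name, timestamp, value, and tags are illustrative assumptions, not values taken from your environment:

```python
import json

# Illustrative data point in the OpenTSDB-style JSON format accepted by /api/put.
endpoint = "http://localhost:8242"  # your TSDB instance
points = [
    {
        "metric": "up",
        "timestamp": 1558368000,  # Unix time in seconds
        "value": 1.0,
        "tags": {"instance": "localhost:9090", "job": "prometheus"},
    }
]
body = json.dumps(points)

# To verify connectivity, POST `body` to endpoint + "/api/put" (for example
# with urllib.request) and check for an HTTP 2xx response.
print(body)
```

If the POST fails with a connection error, fix network access (for example, security groups or the IP address whitelist) before running the migration.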
Make sure that DataX can access Prometheus.
Prometheus Reader reads data by calling the /api/v1/query_range operation. To ensure successful data migration, make sure that each process of the migration task can access the HTTP API that is provided by Prometheus. Otherwise, a connection exception is thrown.
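The following Python sketch shows the shape of the /api/v1/query_range request that is issued for the sample task. The endpoint and time range mirror the sample configuration; the step value is an assumption, because query_range requires a resolution parameter:

```python
from urllib.parse import urlencode

# Build the query_range URL for the sample task's metric and time range.
endpoint = "http://localhost:9090"
params = {
    "query": "up",
    "start": "2019-05-20T16:00:00Z",
    "end": "2019-05-20T16:00:10Z",
    "step": "15s",  # assumed resolution
}
url = endpoint + "/api/v1/query_range?" + urlencode(params)

# Fetching this URL should return an HTTP 200 response whose JSON body has
# "status": "success" if Prometheus is reachable from the migration host.
print(url)
```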
FAQ
Can I change the Java Virtual Machine (JVM) memory size for a migration process?
Yes. For example, run the following command to change the JVM memory size for the task that migrates data from Prometheus to a TSDB database:
python datax/bin/datax.py prometheus2tsdb.json -j "-Xms4096m -Xmx4096m"
How do I add an IP address to an IP address whitelist of a TSDB database?
For information about how to configure an IP address whitelist, see Set the IP address whitelist.
If my migration task runs on an Elastic Compute Service (ECS) instance, how do I configure a virtual private cloud (VPC) and what are the problems I may encounter?