
Migrate data from Prometheus to TSDB

Last Updated: May 19, 2022

This topic describes how to use DataX to migrate data from Prometheus Service (Prometheus) to a Time Series Database (TSDB) database. DataX is an open source tool that is provided by Alibaba Group.

Background information

You can use DataX to migrate data from Prometheus to a TSDB database.

For more information about how to use DataX, see README.

The following sections describe DataX, the Prometheus Reader plug-in, and the TSDB Writer plug-in. Prometheus Reader and TSDB Writer are provided by DataX to migrate data.

DataX

DataX is an offline data synchronization tool that is widely used within Alibaba Group. You can use DataX to efficiently synchronize data between heterogeneous data sources, including MySQL, Oracle, SQL Server, PostgreSQL, Hadoop Distributed File System (HDFS), Hive, AnalyticDB for MySQL, HBase, Tablestore (OTS), MaxCompute, and Distributed Relational Database Service (DRDS). MaxCompute was previously known as Open Data Processing Service (ODPS).

Prometheus Reader

 DataX provides the Prometheus Reader plug-in to read data points from Prometheus.

TSDB Writer

 DataX provides the TSDB Writer plug-in to write data points to a TSDB database. TSDB is a database service that is developed by Alibaba Cloud.

Quick Start

Step 1: Configure an environment

  • Linux

  • Java Development Kit (JDK): Use JDK version 1.8 or later. We recommend that you use JDK 1.8.

  • Python: We recommend that you use Python 2.6.x.

  • Prometheus: Only Prometheus versions 2.9.x are supported. Earlier versions are not fully compatible with DataX.

  • TSDB: Only TSDB versions 2.4.x or later are supported. Earlier versions are not fully compatible with DataX.

Step 2: Download DataX and the plug-ins

  • Click DataX to download DataX and the TSDB Writer plug-in.

  • Click Prometheus Reader to download the Prometheus Reader plug-in.

Step 3: Use the built-in script provided by DataX to test whether DataX can migrate data as expected

The test uses the Stream Reader and Stream Writer plug-ins. These plug-ins require no external dependencies, which makes them suitable for tests. Together they simulate a simple data migration process: Stream Reader generates random character strings, and Stream Writer receives the strings and prints them on your CLI.

Install DataX

Decompress the installation package to a directory of your choice, and then run the datax.py script to start the built-in test task.

$ cd ${DATAX_HOME}
$ python bin/datax.py job/job.json
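The built-in test reads its task definition from job/job.json, which follows the standard DataX reader/writer layout. The following Stream Reader to Stream Writer configuration is a sketch of that layout; the parameter values (the column contents, sliceRecordCount, and print) are illustrative and may differ from the file that is shipped with your DataX build:

```json
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "streamreader",
          "parameter": {
            "column": [
              { "type": "string", "value": "DataX" }
            ],
            "sliceRecordCount": 100000
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": {
            "print": true
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 1
      }
    }
  }
}
```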

Check the migration result

The following sample output is returned if the data is migrated as expected:

Task start time: 2019-04-26 11:18:07
Task end time: 2019-04-26 11:18:17
Time consumed: 10s
Average traffic: 253.91KB/s
Write rate: 10000rec/s
Number of records obtained: 100000
Number of write and read failures: 0

For more information about how to test DataX, see Quick start to migrate data.

Step 4: Configure and start a task to migrate data from Prometheus to a TSDB database

 In Step 3, the Stream Reader and Stream Writer plug-ins are used to test the migration capability of DataX. The migration result indicates that DataX can migrate data as expected. The following parts describe how to use the Prometheus Reader and TSDB Writer plug-ins to migrate data from Prometheus to a TSDB database.

Configure a migration task

 Configure a task to migrate data from Prometheus to a TSDB database. In this example, the task name is prometheus2tsdb.json. The following sample code provides an example of task configuration. For more information about the parameters, see the Parameter description section.

{
  "job": {
    "content": [
      {
        "reader": {
          "name": "prometheusreader",
          "parameter": {
            "endpoint": "http://localhost:9090",
            "column": [
              "up"
            ],
            "beginDateTime": "2019-05-20T16:00:00Z",
            "endDateTime": "2019-05-20T16:00:10Z"
          }
        },
        "writer": {
          "name": "tsdbwriter",
          "parameter": {
            "endpoint": "http://localhost:8242"
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 1
      }
    }
  }
}

Start the migration task

$ cd ${DATAX_HOME}/..
$ ls
  datax/  datax.tar.gz  prometheus2tsdb.json
$ python datax/bin/datax.py prometheus2tsdb.json

Check the migration result

The following sample output is returned if the data is migrated as expected:

Task start time: 2019-05-20 20:22:39
Task end time: 2019-05-20 20:22:50
Time consumed: 10s
Average traffic: 122.07KB/s
Write rate: 1000rec/s
Number of records obtained: 10000
Number of write and read failures: 0

For more information about how to migrate data from Prometheus to a TSDB database, see Migrate data from Prometheus to a TSDB database.

Parameter description

 The following tables describe the parameters.

Prometheus Reader

endpoint (String, required)
  The HTTP endpoint of Prometheus.
  Default value: N/A. Example: http://127.0.0.1:9090

column (Array, required)
  The columns of data that you want to migrate.
  Default value: []. Example: ["m"]

beginDateTime (String, required)
  This parameter is used together with the endDateTime parameter to specify a time range. The data that is generated within this time range is migrated.
  Default value: N/A. Example: 2019-05-13 15:00:00

endDateTime (String, required)
  This parameter is used together with the beginDateTime parameter to specify a time range. The data that is generated within this time range is migrated.
  Default value: N/A. Example: 2019-05-13 17:00:00

TSDB Writer

endpoint (String, required)
  The HTTP endpoint of the destination TSDB database.
  Default value: N/A. Example: http://127.0.0.1:8242

batchSize (Integer, optional)
  The number of data records that you want to migrate at a time. The value must be an integer greater than 0.
  Default value: 100. Example: 100

maxRetryTime (Integer, optional)
  The maximum number of retries that are allowed after a migration failure occurs. The value must be an integer greater than 1.
  Default value: 3. Example: 3

ignoreWriteError (Boolean, optional)
  This parameter specifies whether to ignore the maxRetryTime parameter. If the ignoreWriteError parameter is set to true, the system ignores write errors and attempts to write data again. If the ignoreWriteError parameter is set to false, the task for writing data is terminated after the maximum number of retries is reached.
  Default value: false. Example: false

Note

Make sure that DataX can access the TSDB database.

 TSDB Writer calls the /api/put HTTP API operation to write data. To ensure successful data migration, make sure that each process of the migration task can access the HTTP API that is provided by TSDB. Otherwise, a connection exception is thrown.
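One way to confirm that a migration host can reach the /api/put operation is to write a single test point by hand. The following sketch assumes the OpenTSDB-style JSON data point format that /api/put conventionally accepts; the metric name and tags are placeholders, not values from this topic:

```python
import json
from urllib.request import Request, urlopen

# One data point in the OpenTSDB-style JSON format used by /api/put.
point = {
    "metric": "datax.connectivity.test",  # placeholder metric name
    "timestamp": 1558368000,              # Unix time, in seconds
    "value": 1.0,
    "tags": {"source": "datax"},          # placeholder tag
}
payload = json.dumps([point]).encode("utf-8")

req = Request(
    "http://localhost:8242/api/put",      # TSDB endpoint from the sample task
    data=payload,
    headers={"Content-Type": "application/json"},
)
# Uncomment on a host that can reach TSDB; a refused connection here
# reproduces the connection exception described above.
# urlopen(req, timeout=5)
```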

Make sure that DataX can access Prometheus.

 Prometheus Reader reads data by calling the /api/v1/query_range operation. To ensure successful data migration, make sure that each process of the migration task can access the HTTP API that is provided by Prometheus. Otherwise, a connection exception is thrown.
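You can check connectivity to Prometheus by issuing a /api/v1/query_range request yourself. The following sketch builds such a request URL from the time range in the sample task; the step value is a placeholder resolution, not a value from this topic:

```python
from urllib.parse import urlencode

# Build a query_range request that matches the sample task's time range.
params = {
    "query": "up",
    "start": "2019-05-20T16:00:00Z",
    "end": "2019-05-20T16:00:10Z",
    "step": "15s",  # query resolution; placeholder value
}
url = "http://localhost:9090/api/v1/query_range?" + urlencode(params)
# On a migration host, requesting this URL (for example with curl) should
# return HTTP 200 with a {"status":"success", ...} body; anything else
# indicates a connectivity or configuration issue.
```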

FAQ

Can I change the Java Virtual Machine (JVM) memory size for a migration process?

Yes. You can run the following command to change the JVM memory size for a task that migrates data from Prometheus to a TSDB database:

python datax/bin/datax.py prometheus2tsdb.json -j "-Xms4096m -Xmx4096m"

How do I add an IP address to an IP address whitelist of a TSDB database?

For information about how to configure an IP address whitelist, see Set the IP address whitelist.

If my migration task runs on an Elastic Compute Service (ECS) instance, how do I configure a virtual private cloud (VPC) and what are the problems I may encounter?

See Use cases of ECS security groups.