
Time Series Database: Migrate data from Prometheus to TSDB

Last Updated: Mar 28, 2026

Use DataX to migrate historical time series data from Prometheus to a Time Series Database (TSDB) instance. DataX is an open-source offline data synchronization tool from Alibaba Group. This workflow uses two of its plug-ins: Prometheus Reader reads data points from Prometheus, and TSDB Writer writes them to TSDB.

How it works

  1. Prometheus Reader calls the /api/v1/query_range API to read data points from Prometheus within a specified time range.

  2. DataX transfers the data through one or more parallel channels.

  3. TSDB Writer calls the /api/put API to write the data points to TSDB.
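
The two HTTP calls above can be sketched as follows. This is an illustration of the request shapes only, not DataX plug-in code; the helper names are hypothetical, and it assumes Prometheus's standard `/api/v1/query_range` parameters and an OpenTSDB-style `/api/put` JSON body.

```python
import json
from urllib.parse import urlencode

# Hypothetical helpers illustrating the two API calls; not part of DataX.

def build_query_range_url(endpoint, metric, start, end, step="15s"):
    """URL that Prometheus Reader would query for one metric over a time range."""
    params = urlencode({"query": metric, "start": start, "end": end, "step": step})
    return "%s/api/v1/query_range?%s" % (endpoint, params)

def build_put_body(metric, timestamp, value, tags=None):
    """JSON body that TSDB Writer would POST to /api/put for one data point."""
    return json.dumps({
        "metric": metric,
        "timestamp": timestamp,
        "value": value,
        "tags": tags or {},
    })

url = build_query_range_url("http://localhost:9090", "up",
                            "2019-05-20T16:00:00Z", "2019-05-20T16:00:10Z")
body = build_put_body("up", 1558368000, 1.0, {"instance": "localhost:9090"})
```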

Prerequisites

Before you begin, make sure you have:

  • A Linux environment

  • Java Development Kit (JDK) 1.8 or later installed. JDK 1.8 is recommended. Download from the Oracle website.

  • Python 2.6.x installed. Download from python.org.

  • Prometheus 2.9.x. Earlier versions are not fully compatible with DataX.

  • TSDB 2.4.x or later. Earlier versions are not fully compatible with DataX.

  • Network connectivity from the machine running DataX to both the Prometheus endpoint and the TSDB endpoint. DataX calls their HTTP APIs directly — a connection exception is thrown if either endpoint is unreachable.
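
A quick way to check the last prerequisite before launching a job is a TCP probe of both endpoints. This is a minimal sketch using the Python standard library; the endpoint addresses are the example values used elsewhere in this topic.

```python
import socket
from urllib.parse import urlparse

def reachable(endpoint, timeout=3):
    """Return True if a TCP connection to the endpoint's host:port succeeds."""
    parsed = urlparse(endpoint)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False

# Probe both endpoints (example addresses) before running the DataX job.
for ep in ("http://localhost:9090", "http://localhost:8242"):
    print(ep, "reachable" if reachable(ep) else "UNREACHABLE")
```

If either probe fails, fix connectivity (security groups, whitelists, DNS) before running DataX, because the job will otherwise stop with a connection exception.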

Download DataX and the plug-ins

  1. Download DataX with the TSDB Writer plug-in.

  2. Download the Prometheus Reader plug-in.

For general DataX documentation, see the DataX README.

Test the DataX installation

Before migrating your Prometheus data, run the built-in test job to confirm DataX is working correctly. The test uses Stream Reader and Stream Writer — two plug-ins that require no external dependencies. Stream Reader generates random strings; Stream Writer prints them to your terminal.

Decompress the DataX package and run the built-in job:

cd ${DATAX_HOME}
python bin/datax.py job/job.json

A successful run produces output similar to the following:

Task start time: 2019-04-26 11:18:07
Task end time: 2019-04-26 11:18:17
Time consumed: 10s
Average traffic: 253.91KB/s
Write rate: 10000rec/s
Number of records obtained: 100000
Number of write and read failures: 0

If Number of write and read failures is 0, DataX is installed correctly. For a full walkthrough, see the quick start demo.
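
If you script this check, the failure count can be pulled out of the DataX summary text. The helper below is a hypothetical convenience, not part of DataX; it only assumes the summary line format shown above.

```python
import re

def failure_count(summary):
    """Extract the failure count from a DataX job summary string."""
    m = re.search(r"Number of write and read failures:\s*(\d+)", summary)
    if m is None:
        raise ValueError("no failure count found in DataX output")
    return int(m.group(1))

summary = """\
Number of records obtained: 100000
Number of write and read failures: 0
"""
print(failure_count(summary))  # -> 0, so the installation test passed
```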

Migrate data from Prometheus to TSDB

Create the job configuration

Create a JSON file for the migration job. This example uses the filename prometheus2tsdb.json.

All jobs use the same top-level structure: a reader block for Prometheus Reader and a writer block for TSDB Writer.

{
  "job": {
    "content": [
      {
        "reader": {
          "name": "prometheusreader",
          "parameter": {
            "endpoint": "http://localhost:9090",
            "column": [
              "up"
            ],
            "beginDateTime": "2019-05-20T16:00:00Z",
            "endDateTime": "2019-05-20T16:00:10Z"
          }
        },
        "writer": {
          "name": "tsdbwriter",
          "parameter": {
            "endpoint": "http://localhost:8242"
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 1
      }
    }
  }
}

Replace the placeholder values with your actual endpoints, metric names, and time range. See the Parameters section for a full description of each field.
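
If you run many migrations, you can generate the job file from parameters instead of editing it by hand. This sketch mirrors the field names in the example configuration above; the `make_job` helper is illustrative, not a DataX utility.

```python
import json

def make_job(prom_endpoint, tsdb_endpoint, metrics, begin, end, channels=1):
    """Build a DataX job dict with a prometheusreader/tsdbwriter pair."""
    return {
        "job": {
            "content": [{
                "reader": {
                    "name": "prometheusreader",
                    "parameter": {
                        "endpoint": prom_endpoint,
                        "column": metrics,
                        "beginDateTime": begin,
                        "endDateTime": end,
                    },
                },
                "writer": {
                    "name": "tsdbwriter",
                    "parameter": {"endpoint": tsdb_endpoint},
                },
            }],
            "setting": {"speed": {"channel": channels}},
        }
    }

job = make_job("http://localhost:9090", "http://localhost:8242",
               ["up"], "2019-05-20T16:00:00Z", "2019-05-20T16:00:10Z")
print(json.dumps(job, indent=2))  # save this as prometheus2tsdb.json
```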

Run the migration

Place prometheus2tsdb.json in the parent directory of the extracted DataX package, then run:

cd ${DATAX_HOME}/..
ls
# datax/  datax.tar.gz  prometheus2tsdb.json

python datax/bin/datax.py prometheus2tsdb.json

Performance tuning: For large migrations, increase the Java Virtual Machine (JVM) heap size with the -j flag:

python datax/bin/datax.py prometheus2tsdb.json -j "-Xms4096m -Xmx4096m"

Verify the migration

A successful migration produces output similar to the following:

Task start time: 2019-05-20 20:22:39
Task end time: 2019-05-20 20:22:50
Time consumed: 10s
Average traffic: 122.07KB/s
Write rate: 1000rec/s
Number of records obtained: 10000
Number of write and read failures: 0

Check these two fields to diagnose problems:

  • Number of records obtained: Total data points read from Prometheus. A value of 0 usually means the time range or metric name is incorrect.

  • Number of write and read failures: Failed write attempts after retries. A non-zero value indicates a network issue or a TSDB connectivity problem. Check that the TSDB endpoint is accessible and that the IP address of the machine running DataX is on the TSDB IP address whitelist.

For a full walkthrough of this migration, see the Prometheus to TSDB migration demo.

Parameters

Prometheus Reader

  • endpoint (String, required): HTTP endpoint of the Prometheus instance. Example: http://127.0.0.1:9090

  • column (Array, required; default []): List of metric names to migrate. Example: ["m"]

  • beginDateTime (String, required): Start of the time range to migrate. Used together with endDateTime. Example: 2019-05-13 15:00:00

  • endDateTime (String, required): End of the time range to migrate. Used together with beginDateTime. Example: 2019-05-13 17:00:00
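
For reference, Prometheus's /api/v1/query_range API takes Unix timestamps (or RFC 3339 strings), so a "yyyy-MM-dd HH:mm:ss" time range like the examples above has to be converted at some point. The sketch below shows one such conversion; it assumes the timestamps are in UTC, which may not match your deployment.

```python
from datetime import datetime, timezone

def to_epoch_seconds(s):
    """Convert 'yyyy-MM-dd HH:mm:ss' (assumed UTC) to Unix seconds."""
    dt = datetime.strptime(s, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return int(dt.timestamp())

print(to_epoch_seconds("2019-05-13 15:00:00"))  # -> 1557759600
```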

TSDB Writer

  • endpoint (String, required): HTTP endpoint of the destination TSDB instance. Example: http://127.0.0.1:8242

  • batchSize (Integer, optional; default 100): Number of data points written per batch. Must be greater than 0. Example: 100

  • maxRetryTime (Integer, optional; default 3): Maximum number of retries after a write failure. Must be greater than 1. Example: 3

  • ignoreWriteError (Boolean, optional; default false): If true, write errors are ignored and the job continues. If false, the job stops after maxRetryTime retries are exhausted. Example: false
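
The sketch below illustrates how batchSize, maxRetryTime, and ignoreWriteError interact. It is a simplified model with a stand-in write function, not the actual TSDB Writer code.

```python
def chunks(points, batch_size):
    """Split a list of data points into batches of at most batch_size."""
    for i in range(0, len(points), batch_size):
        yield points[i:i + batch_size]

def write_all(points, write_batch, batch_size=100, max_retry_time=3,
              ignore_write_error=False):
    """Write batches, retrying each up to max_retry_time times on IOError."""
    for batch in chunks(points, batch_size):
        for attempt in range(max_retry_time + 1):
            try:
                write_batch(batch)
                break
            except IOError:
                if attempt == max_retry_time and not ignore_write_error:
                    raise  # retries exhausted: stop the job
```

With the defaults, 250 data points would be written as three batches (100, 100, 50), and a batch that keeps failing stops the job after three retries unless ignoreWriteError is true.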

FAQ

If my migration job runs on an Elastic Compute Service (ECS) instance, how do I configure the VPC network?

See Use cases of ECS security groups for guidance on configuring virtual private cloud (VPC) access and security group rules.

What's next

  • Set the IP address whitelist — add the IP address of the machine running DataX to the TSDB whitelist before running a migration.

  • DataX README — learn more about DataX configuration, plug-ins, and advanced options.