DataWorks: TSDB data source

Last Updated: Nov 16, 2023

DataWorks Data Integration provides TSDB Writer, which writes data points to Lindorm Time Series Database (TSDB), a service of Alibaba Cloud ApsaraDB for Lindorm. This topic describes the data synchronization capabilities for TSDB data sources.

Supported TSDB versions

TSDB Writer supports all versions of ApsaraDB for Lindorm and HiTSDB V2.4.X or later.

Limits

How it works

TSDB Writer connects to a TSDB instance by using the TSDB client hitsdb-client and writes data points by using the HTTP API endpoint. For more information, see TSDB SDK documentation.

Data type mappings

If the sourceDbType parameter is set to TSDB, source data is read by using TSDB Reader or OpenTSDB Reader. In this case, TSDB Writer writes the source data to Lindorm TSDB in the format of JSON strings. If the sourceDbType parameter is set to RDB, the source is a relational database. In this case, TSDB Writer parses the source data based on the records of the relational database. The following table lists the valid values of the columnType parameter and the data types that match the column types when the sourceDbType parameter is set to RDB.

| Data model | Valid value of columnType | Data type |
| --- | --- | --- |
| Tag | tag | A string data type. A tag describes the features of the data source. In most cases, a tag does not change over time. |
| Timestamp | timestamp | The TIMESTAMP data type. A timestamp specifies the point in time at which data is generated. The timestamp can be manually specified when data is written or automatically generated by the system. |
| Field | field_string | A string data type. A field describes the measurement metrics of the data source. In most cases, a field changes over time. |
| Field | field_double | A numeric data type. A field describes the measurement metrics of the data source. In most cases, a field changes over time. |
| Field | field_boolean | A Boolean data type. A field describes the measurement metrics of the data source. In most cases, a field changes over time. |
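
The mapping above can be illustrated with a minimal Python sketch. It is not the implementation of TSDB Writer: the function name `split_record` and the layout of the returned dictionary are assumptions made for illustration. The sketch pairs each column name with its columnType value and converts the value to the matching data type.

```python
from typing import Any, Dict, List

# The columnType values listed in this topic.
VALID_COLUMN_TYPES = {"tag", "timestamp", "field_string", "field_double", "field_boolean"}


def split_record(columns: List[str], column_types: List[str], row: List[Any]) -> Dict[str, Any]:
    """Group one relational row into tags, a timestamp, and typed fields.

    Hypothetical helper for illustration only; it is not part of TSDB Writer.
    """
    if not (len(columns) == len(column_types) == len(row)):
        raise ValueError("column, columnType, and the row must have the same length")

    point: Dict[str, Any] = {"tags": {}, "timestamp": None, "fields": {}}
    for name, col_type, value in zip(columns, column_types, row):
        if col_type not in VALID_COLUMN_TYPES:
            raise ValueError(f"unsupported columnType: {col_type}")
        if col_type == "tag":
            point["tags"][name] = str(value)          # tags are strings
        elif col_type == "timestamp":
            point["timestamp"] = value                # kept as provided by the source
        elif col_type == "field_string":
            point["fields"][name] = str(value)
        elif col_type == "field_double":
            point["fields"][name] = float(value)      # numeric field
        else:  # field_boolean; assumes the source value is already a bool or 0/1
            point["fields"][name] = bool(value)
    return point


if __name__ == "__main__":
    print(split_record(
        ["tag1", "tag2", "field1", "field2", "timestamp", "field3"],
        ["tag", "tag", "field_string", "field_double", "timestamp", "field_boolean"],
        ["hz", "web", "ok", "1.5", 1546272000, True],
    ))
```

The column names in the usage example are taken from the sample configuration in this topic; the row values are made up.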

Develop a data synchronization task

Appendix: Code and parameters

Appendix: Configure a batch synchronization task by using the code editor

If you use the code editor to configure a batch synchronization task, you must configure parameters for the reader and writer of the related data source based on the format requirements in the code editor. For more information about the format requirements, see Configure a batch synchronization task by using the code editor. The following information describes the configuration details of parameters for the reader and writer in the code editor.

Code for TSDB Writer

  • Write data from RDB to TSDB by using the following default configurations (recommended)

    {
        "type": "job",
        "version": "2.0",
        "steps": [
            {
                "stepType": "stream",// You can replace the stream plug-in with the specific RDB plug-in. RDB databases include MySQL, Oracle, PostgreSQL, and DRDS databases. 
                "parameter": {},
                "name": "Reader",
                "category": "reader"
            },
            {
                "stepType": "tsdb",
                "parameter": {
                    "endpoint": "http://localhost:8242",
                    "username": "xxx",
                    "password": "xxx",
                    "sourceDbType": "RDB",
                    "batchSize": 256,
                    "columnType": [
                        "tag",
                        "tag",
                        "field_string",
                        "field_double",
                        "timestamp",
                        "field_boolean"
                    ],
                    "column": [
                        "tag1",
                        "tag2",
                        "field1",
                        "field2",
                        "timestamp",
                        "field3"
                    ],
                    "multiField": "true",
                    "table": "testmetric",
                    "ignoreWriteError": "false",
                    "database": "default"
                },
                "name": "Writer",
                "category": "writer"
            }
        ],
        "setting": {
            "errorLimit": {
                "record": "0"
            },
            "speed": {
                "throttle":true,// Specifies whether to enable throttling. The value false indicates that throttling is disabled, and the value true indicates that throttling is enabled. The mbps parameter takes effect only when the throttle parameter is set to true. 
                "concurrent":1, // The maximum number of parallel threads. 
                "mbps":"12"// The maximum transmission rate. Unit: MB/s. 
            }
        },
        "order": {
            "hops": [
                {
                    "from": "Reader",
                    "to": "Writer"
                }
            ]
        }
    }
  • Write data from a database that supports the OpenTSDB protocol to TSDB

    {
        "type": "job",
        "version": "2.0",
        "steps": [
            {
                "stepType": "opentsdb",
                "parameter": {
                    "endpoint": "http://localhost:4242",
                    "column": [
                        "m1",
                        "m2",
                        "m3",
                        "m4",
                        "m5",
                        "m6"
                    ],
                    "startTime": "2019-01-01 00:00:00",
                    "endTime": "2019-01-01 03:00:00"
                },
                "name": "Reader",
                "category": "reader"
            },
            {
                "stepType": "tsdb",
                "parameter": {
                    "endpoint": "http://localhost:8242"
                },
                "name": "Writer",
                "category": "writer"
            }
        ],
        "setting": {
            "errorLimit": {
                "record": "0"
            },
            "speed": {
                "throttle":true,// Specifies whether to enable throttling. The value false indicates that throttling is disabled, and the value true indicates that throttling is enabled. The mbps parameter takes effect only when the throttle parameter is set to true. 
                "concurrent":1, // The maximum number of parallel threads. 
                "mbps":"12"// The maximum transmission rate. Unit: MB/s. 
            }
        },
        "order": {
            "hops": [
                {
                    "from": "Reader",
                    "to": "Writer"
                }
            ]
        }
    }
  • Use the OpenTSDB protocol to write a univariate data point to TSDB (not recommended)

    
    {
        "type": "job",
        "version": "2.0",
        "steps": [
            {
                "stepType": "stream",// You can replace the stream plug-in with the specific RDB plug-in. RDB databases include MySQL, Oracle, PostgreSQL, and DRDS databases. 
                "parameter": {},
                "name": "Reader",
                "category": "reader"
            },
            {
                "stepType": "tsdb",
                "parameter": {
                    "endpoint": "http://localhost:8242",
                    "username": "xxx",
                    "password": "xxx",
                    "sourceDbType": "RDB",
                    "batchSize": 256,
                    "columnType": [
                        "tag",
                        "tag",
                        "field_string",
                        "field_double",
                        "timestamp",
                        "field_boolean"
                    ],
                    "column": [
                        "tag1",
                        "tag2",
                        "field_metric_1",
                        "field_metric_2",
                        "timestamp",
                        "field_metric_3"
                    ],
                    "ignoreWriteError": "false"
                },
                "name": "Writer",
                "category": "writer"
            }
        ],
        "setting": {
            "errorLimit": {
                "record": "0"
            },
            "speed": {
                "throttle":true,// Specifies whether to enable throttling. The value false indicates that throttling is disabled, and the value true indicates that throttling is enabled. The mbps parameter takes effect only when the throttle parameter is set to true. 
                "concurrent":1, // The maximum number of parallel threads. 
                "mbps":"12"// The maximum transmission rate. Unit: MB/s. 
            }
        },
        "order": {
            "hops": [
                {
                    "from": "Reader",
                    "to": "Writer"
                }
            ]
        }
    }
    Note

    The names of the TSDB metrics are taken from the names of the field columns that are specified for the column parameter. In the preceding code, each row of data in the relational database is written to three metrics: field_metric_1, field_metric_2, and field_metric_3.
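
To make the difference between the multivariate and univariate configurations more concrete, the following Python sketch shows how one relational row could be turned into data points in each case. The function names and the dictionary shapes are assumptions for illustration and do not reflect the actual request format that TSDB Writer sends.

```python
from typing import Any, Dict, List


def to_univariate_points(tags: Dict[str, str], timestamp: Any,
                         fields: Dict[str, Any]) -> List[Dict[str, Any]]:
    """One data point per field column; each metric is named after its column.

    Hypothetical helper that mirrors the univariate example (multiField not set).
    """
    return [
        {"metric": name, "timestamp": timestamp, "tags": dict(tags), "value": value}
        for name, value in fields.items()
    ]


def to_multivariate_point(table: str, tags: Dict[str, str], timestamp: Any,
                          fields: Dict[str, Any]) -> Dict[str, Any]:
    """One data point that carries all fields; the metric is named by the table parameter.

    Hypothetical helper that mirrors the multivariate example (multiField set to true).
    """
    return {"metric": table, "timestamp": timestamp, "tags": dict(tags), "fields": dict(fields)}


if __name__ == "__main__":
    tags = {"tag1": "hz", "tag2": "web"}
    fields = {"field_metric_1": "ok", "field_metric_2": 1.5, "field_metric_3": True}
    for point in to_univariate_points(tags, 1546272000, fields):
        print(point["metric"])
    print(to_multivariate_point("testmetric", tags, 1546272000, fields)["metric"])
```

In this sketch, the univariate path produces three data points for the row and the multivariate path produces one, which is the behavior that the preceding note describes.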

Parameters in code for TSDB Writer

| Parameter type | Parameter | Description | Required | Default value |
| --- | --- | --- | --- | --- |
| Common parameters | sourceDbType | The type of the source database. Valid values: TSDB and RDB. The value TSDB indicates that the source database is an OpenTSDB, Prometheus, or Timescale database. The value RDB indicates that the source database is a relational database, such as a MySQL, Oracle, PostgreSQL, or DRDS database. | No | TSDB |
| Common parameters | endpoint | The HTTP URL of the destination TSDB database. Specify the endpoint in the format of http://IP address:Port number. You can obtain the HTTP endpoint in the ApsaraDB for Lindorm console. | Yes | No default value |
| Common parameters | database | The name of the TSDB database to which data is written. Note: You must create the database first. | No | default |
| Common parameters | username | The username of the TSDB database. You must specify a value for this parameter if you configure authentication for the TSDB database. | No | No default value |
| Common parameters | batchSize | The number of data records to write at a time. The value of this parameter is of the INT type and must be greater than 0. If you want to configure a large value for the batchSize parameter, you must reserve more memory space. | No | 100 |
| Parameters for TSDB | maxRetryTime | The maximum number of retries allowed after a failure. The value of this parameter is of the INT type and must be greater than 1. | No | 3 |
| Parameters for TSDB | ignoreWriteError | Specifies whether to ignore write errors. The value of this parameter is of the BOOLEAN type. If you set this parameter to true, TSDB Writer ignores a write error and continues to write data. If you set this parameter to false, the synchronization task is terminated when a write operation still fails after the specified number of retries. | No | false |
| Parameters for RDB | table | The name of the metric that you want to import to TSDB. If the multiField parameter is set to true, you must configure this parameter. If the multiField parameter is set to false, you can leave this parameter empty. In this case, the names of the metrics are specified by the column parameter. | No | No default value |
| Parameters for RDB | multiField | Specifies whether to write a multivariate data point to TSDB by using the HTTP API endpoint. To write a multivariate data point to TSDB, you must set the value to true. Note: If you want to use the native SQL capabilities of Lindorm TSDB to access data that is written by using the HTTP API endpoint, you must create a table in TSDB. Otherwise, you can query a multivariate data point only by using the TSDB HTTP API endpoint. For more information, see Query a multivariate data point. | Yes | false |
| Parameters for RDB | column | The names of the columns whose data you want to write to the TSDB database. Note: You must specify the columns in the same order as the columns specified for the reader. | Yes | No default value |
| Parameters for RDB | columnType | The types of the columns in the relational database. Valid values: timestamp (a timestamp column), tag (a tag column), field_string (a metric column whose value is of a string data type), field_double (a metric column whose value is of a numeric data type), and field_boolean (a metric column whose value is of a Boolean data type). Note: You must specify the column types in the same order as the columns specified for the reader. | Yes | No default value |
| Parameters for RDB | batchSize | The number of data records to write at a time. The value of this parameter is of the INT type and must be greater than 0. | No | 100 |
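
The constraints in the preceding parameter descriptions can be checked before a task is submitted. The following Python sketch is one possible pre-check and is not part of DataWorks. The function name `validate_writer_parameter` is hypothetical, and the rule that column and columnType must have the same number of entries is an assumption derived from the requirement that both follow the column order of the reader.

```python
from typing import Any, Dict, List

VALID_COLUMN_TYPES = {"tag", "timestamp", "field_string", "field_double", "field_boolean"}


def _is_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_writer_parameter(parameter: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a TSDB Writer "parameter" object.

    Hypothetical helper for illustration; an empty list means no problem was found.
    """
    errors: List[str] = []

    if not parameter.get("endpoint"):
        errors.append("endpoint is required")

    source_db_type = parameter.get("sourceDbType", "TSDB")  # documented default
    if source_db_type not in ("TSDB", "RDB"):
        errors.append("sourceDbType must be TSDB or RDB")

    batch_size = parameter.get("batchSize", 100)
    if not _is_int(batch_size) or batch_size <= 0:
        errors.append("batchSize must be an integer greater than 0")

    max_retry = parameter.get("maxRetryTime", 3)
    if not _is_int(max_retry) or max_retry <= 1:
        errors.append("maxRetryTime must be an integer greater than 1")

    if source_db_type == "RDB":
        columns = parameter.get("column") or []
        column_types = parameter.get("columnType") or []
        if not columns:
            errors.append("column is required when sourceDbType is RDB")
        if not column_types:
            errors.append("columnType is required when sourceDbType is RDB")
        # Assumption: one columnType entry per column, in the same order.
        if columns and column_types and len(columns) != len(column_types):
            errors.append("column and columnType must have the same number of entries")
        for col_type in column_types:
            if col_type not in VALID_COLUMN_TYPES:
                errors.append(f"unsupported columnType: {col_type}")
        multi_field = str(parameter.get("multiField", "false")).lower() == "true"
        if multi_field and not parameter.get("table"):
            errors.append("table is required when multiField is true")

    return errors
```

For example, a parameter object that sets multiField to true without table yields the message about the missing table, and a columnType entry such as field_bool is reported as unsupported.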

Performance test report

  • Characteristics of test data

    • metric: a single metric, which is named m.

    • tag_k and tag_v: the key and value of a tag. The combinations of the values of the first four tags form 2,000,000 time series. The number of time series is calculated by using the following formula: 10 (zones) × 20 (clusters) × 100 (groups) × 100 (applications). The ip tag is the index of each of the 2,000,000 time series and starts from 1.

      | tag_k | tag_v |
      | --- | --- |
      | zone | z1 to z10 |
      | cluster | c1 to c20 |
      | group | g1 to g100 |
      | app | a1 to a100 |
      | ip | ip1 to ip2,000,000 |

    • value: a random value from 1 to 100.

    • interval: a collection interval of 10 seconds. The total duration of data collection is 3 hours, so each time series contains 3 × 60 × 60/10 = 1,080 data points. A total of 2,160,000,000 data points are collected. The number of data points is calculated by using the following formula: 3 × 60 × 60/10 × 2,000,000.

  • Performance test results

    | Number of channels | Data integration speed (record/s) | Data integration bandwidth (Mbit/s) |
    | --- | --- | --- |
    | 1 | 129,753 | 15.45 |
    | 2 | 284,953 | 33.70 |
    | 3 | 385,868 | 45.71 |
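
The figures in this report can be re-derived with a few lines of Python. The following sketch only repeats the arithmetic stated above. The variable names are illustrative, and the speedup ratios are computed from the reported speeds rather than taken from the report.

```python
# Number of time series: every combination of the first four tags.
zones, clusters, groups, apps = 10, 20, 100, 100
series_count = zones * clusters * groups * apps

# Data points per time series: 3 hours sampled every 10 seconds.
duration_seconds = 3 * 60 * 60
interval_seconds = 10
points_per_series = duration_seconds // interval_seconds

total_points = points_per_series * series_count

# Reported data integration speed (record/s) by number of channels.
speed_by_channels = {1: 129_753, 2: 284_953, 3: 385_868}

# Speedup relative to a single channel, derived from the reported speeds.
speedup = {
    channels: speed / speed_by_channels[1]
    for channels, speed in speed_by_channels.items()
}

if __name__ == "__main__":
    print(series_count)   # 2000000
    print(total_points)   # 2160000000
    for channels, ratio in speedup.items():
        print(channels, round(ratio, 2))
```

The derived ratios suggest that, in this test, the write speed grows roughly in proportion to the number of channels for one to three channels.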