DataWorks Data Integration uses TSDB Writer to write data points to a Lindorm TSDB instance. This topic describes how TSDB Writer works, the supported field types, configuration parameters, and performance benchmarks.
Supported versions
TSDB Writer supports all versions of Lindorm TSDB and HiTSDB 2.4.x or later. Compatibility with other versions is not guaranteed.
Limits
You can run TSDB Writer tasks on Serverless resource groups (recommended) or on exclusive resource groups for Data Integration.
TSDB Writer supports task configuration only in the code editor.
How it works
TSDB Writer connects to a TSDB instance using the TSDB client (hitsdb-client) and writes data points over the HTTP API. For details on the write API, see SDK reference.
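The write path can be sketched as follows. This is an illustrative example, not the plugin's actual code: it assumes an OpenTSDB-compatible `/api/put` endpoint that accepts a JSON array of data points, which is the protocol family TSDB exposes. The endpoint URL and payload values are placeholders.

```python
import json

def build_put_payload(metric, timestamp, tags, value):
    """Build one data point in the OpenTSDB-style /api/put JSON format."""
    return {
        "metric": metric,
        "timestamp": timestamp,   # seconds (or milliseconds) since epoch
        "tags": tags,             # e.g. {"zone": "z1", "cluster": "c3"}
        "value": value,
    }

# A batch is a JSON array of such points.
points = [
    build_put_payload("testmetric", 1546272000, {"tag1": "a", "tag2": "b"}, 42.0),
]
body = json.dumps(points)

# An actual write would POST `body` to http://<endpoint>/api/put with
# Content-Type: application/json, for example via urllib.request.
```

The TSDB client batches points the same way; `batchSize` controls how many points go into each such array.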
Supported field types
The behavior of TSDB Writer depends on the value of sourceDbType:
`TSDB` — the source is TSDB Reader or OpenTSDB Reader. TSDB Writer passes the source data through directly as a JSON string.
`RDB` — the source is a relational database. TSDB Writer parses the data as relational database records and maps each column to a TSDB type using columnType.
The following table shows the supported columnType values and the corresponding TSDB data types when sourceDbType is set to RDB.
| Data model | columnType value | Data type |
|---|---|---|
| Data tag | tag | String. A tag describes a feature of the data source and typically does not change over time. |
| Data generation time | timestamp | Timestamp. Represents when the data was generated. Specify it during the write operation, or let the system generate it automatically. |
| Data content | field_string | String. A field describes a measured metric of the data source and typically changes over time. |
| Data content | field_double | Numeric. A field describes a measured metric of the data source and typically changes over time. |
| Data content | field_boolean | Boolean. A field describes a measured metric of the data source and typically changes over time. |
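The mapping in the table can be sketched in code. This is a hypothetical illustration of how one relational row is turned into a multi-value data point from the paired column/columnType lists; the function name and intermediate structure are not part of the plugin.

```python
def row_to_datapoint(row, columns, column_types, table):
    """Map one RDB row to a multi-value TSDB point using column/columnType."""
    point = {"metric": table, "tags": {}, "fields": {}}
    for value, name, ctype in zip(row, columns, column_types):
        if ctype == "tag":
            point["tags"][name] = str(value)
        elif ctype == "timestamp":
            point["timestamp"] = int(value)
        elif ctype == "field_string":
            point["fields"][name] = str(value)
        elif ctype == "field_double":
            point["fields"][name] = float(value)
        elif ctype == "field_boolean":
            point["fields"][name] = bool(value)
        else:
            raise ValueError(f"unsupported columnType: {ctype}")
    return point

# Example row, in the same order as the column/columnType configuration.
dp = row_to_datapoint(
    ("host1", "cn-hangzhou", "ok", 0.75, 1546272000, True),
    ["tag1", "tag2", "field1", "field2", "timestamp", "field3"],
    ["tag", "tag", "field_string", "field_double", "timestamp", "field_boolean"],
    "testmetric",
)
```

Note that `column` and `columnType` are positional: the i-th entry of each list describes the i-th column emitted by the Reader.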
Configure a batch synchronization task
Configure TSDB Writer tasks using the code editor only. For the general setup procedure, see Configure a task in the code editor.
The appendix below provides ready-to-use script templates and full parameter descriptions.
Appendix: Script templates and parameter descriptions
Script templates
All three templates below use the unified script format required by the code editor. Replace the placeholder values before running. For the general configuration procedure, see Configure a task in the code editor.
RDB to TSDB (recommended)
Use this template when the source is a relational database such as MySQL, Oracle, PostgreSQL, or DRDS.
{
"type": "job",
"version": "2.0",
"steps": [
{
"stepType": "stream",
// Replace "stream" with the plugin name of your RDB source
// (e.g., mysql, oracle, postgresql, drds).
"parameter": {},
"name": "Reader",
"category": "reader"
},
{
"stepType": "tsdb",
"parameter": {
"endpoint": "http://localhost:8242",
"username": "xxx",
"password": "xxx",
"sourceDbType": "RDB",
"batchSize": 256,
"columnType": [
"tag",
"tag",
"field_string",
"field_double",
"timestamp",
"field_boolean"
],
"column": [
"tag1",
"tag2",
"field1",
"field2",
"timestamp",
"field3"
],
"multiField": "true",
"table": "testmetric",
"ignoreWriteError": "false",
"database": "default"
},
"name": "Writer",
"category": "writer"
}
],
"setting": {
"errorLimit": {
"record": "0"
},
"speed": {
"throttle": true,
// Set to false to disable throttling (mbps is ignored when false).
"concurrent": 1,
// Number of concurrent channels. See performance benchmarks
// to choose a value based on your throughput requirements.
"mbps": "12"
// Maximum transfer rate in MB/s. Only applies when throttle is true.
}
},
"order": {
"hops": [
{
"from": "Reader",
"to": "Writer"
}
]
}
}
OpenTSDB to TSDB
Use this template when the source supports the OpenTSDB protocol (e.g., OpenTSDB Reader).
{
"type": "job",
"version": "2.0",
"steps": [
{
"stepType": "opentsdb",
"parameter": {
"endpoint": "http://localhost:4242",
"column": [
"m1",
"m2",
"m3",
"m4",
"m5",
"m6"
],
"startTime": "2019-01-01 00:00:00",
"endTime": "2019-01-01 03:00:00"
},
"name": "Reader",
"category": "reader"
},
{
"stepType": "tsdb",
"parameter": {
"endpoint": "http://localhost:8242"
// Only the destination endpoint is required for TSDB-to-TSDB writes.
},
"name": "Writer",
"category": "writer"
}
],
"setting": {
"errorLimit": {
"record": "0"
},
"speed": {
"throttle": true,
"concurrent": 1,
"mbps": "12"
}
},
"order": {
"hops": [
{
"from": "Reader",
"to": "Writer"
}
]
}
}
RDB to TSDB using the OpenTSDB single-value protocol (not recommended)
Use this template only when you need to write data using the OpenTSDB single-value protocol. For new workloads, use the RDB to TSDB template above instead.
{
"type": "job",
"version": "2.0",
"steps": [
{
"stepType": "stream",
// Replace "stream" with your RDB source plugin name.
"parameter": {},
"name": "Reader",
"category": "reader"
},
{
"stepType": "tsdb",
"parameter": {
"endpoint": "http://localhost:8242",
"username": "xxx",
"password": "xxx",
"sourceDbType": "RDB",
"batchSize": 256,
"columnType": [
"tag",
"tag",
"field_string",
"field_double",
"timestamp",
"field_boolean"
],
"column": [
"tag1",
"tag2",
"field_metric_1",
"field_metric_2",
"timestamp",
"field_metric_3"
],
"ignoreWriteError": "false"
// multiField is omitted (defaults to false) for single-value mode.
// Each field column maps to a separate metric in TSDB.
},
"name": "Writer",
"category": "writer"
}
],
"setting": {
"errorLimit": {
"record": "0"
},
"speed": {
"throttle": true,
"concurrent": 1,
"mbps": "12"
}
},
"order": {
"hops": [
{
"from": "Reader",
"to": "Writer"
}
]
}
}
In single-value mode, the destination metric name is derived from the column name that maps to a field. Based on the configuration above, one row of relational data is written to three metrics: field_metric_1, field_metric_2, and field_metric_3.
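The single-value split can be sketched as follows. This is an illustrative example, not the plugin's code: one row fans out into one OpenTSDB-style point per field column, with the field column's name used as the metric name and the tag/timestamp columns shared by every point.

```python
def row_to_single_value_points(row, columns, column_types):
    """Split one RDB row into one point per field column (single-value mode)."""
    tags, ts = {}, None
    for value, name, ctype in zip(row, columns, column_types):
        if ctype == "tag":
            tags[name] = str(value)
        elif ctype == "timestamp":
            ts = int(value)
    points = []
    for value, name, ctype in zip(row, columns, column_types):
        if ctype.startswith("field_"):
            # The column name becomes the destination metric name.
            points.append({"metric": name, "timestamp": ts,
                           "tags": dict(tags), "value": value})
    return points

pts = row_to_single_value_points(
    ("a", "b", "s", 1.5, 1546272000, True),
    ["tag1", "tag2", "field_metric_1", "field_metric_2",
     "timestamp", "field_metric_3"],
    ["tag", "tag", "field_string", "field_double",
     "timestamp", "field_boolean"],
)
# One row yields three points, one per field_metric_* column.
```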
Writer parameters
The parameters are grouped by the source type they apply to.
Common parameters
These parameters apply to all sourceDbType values.
| Parameter | Description | Required | Default | Example |
|---|---|---|---|---|
| sourceDbType | The source type. TSDB includes OpenTSDB, Prometheus, and TimeScale. RDB includes MySQL, Oracle, PostgreSQL, and DRDS. | No | TSDB | RDB |
| endpoint | The HTTP endpoint of the TSDB instance. Get this value from the product console. Format: http://IP:Port. | Yes | — | http://192.168.1.1:8242 |
| database | The target TSDB database. Create the database in TSDB before running the task. | No | default | my_database |
| username | The TSDB database username. Required only if authentication is enabled. | No | — | admin |
| batchSize | The number of data entries to write per batch. Larger values increase throughput but require more memory. Must be greater than 0. | No | 100 | 256 |
Parameters for TSDB sources (sourceDbType: TSDB)
| Parameter | Description | Required | Default | Example |
|---|---|---|---|---|
| maxRetryTime | The number of retries after a write failure. Must be greater than 1. | No | 3 | 5 |
| ignoreWriteError | If true, write errors are skipped and the task continues. If the write fails after all retries, the task stops regardless of this setting. | No | false | false |
Parameters for RDB sources (sourceDbType: RDB)
| Parameter | Description | Required | Default | Example |
|---|---|---|---|---|
| table | The target metric name in TSDB. Required when multiField is true. When multiField is false, specify the metric name in the column field instead. | Required if multiField is true | — | testmetric |
| multiField | Set to true to write multiple fields to TSDB in a single HTTP API call. The current TSDB version requires true for multi-value writes. To query the written data using Lindorm TSDB SQL, pre-create the table in TSDB first; otherwise, use the TSDB HTTP API for queries. See Query multi-value data. | Yes | false | true |
| column | The field names from the source relational database table. The order must match the column parameter of the Reader plugin. | Yes | — | ["tag1", "tag2", "field1", "timestamp"] |
| columnType | The TSDB types that the source columns map to. Supported values: tag, timestamp, field_string, field_double, field_boolean. The order must match the column field order. | Yes | — | ["tag", "tag", "field_double", "timestamp"] |
| batchSize | The number of data entries to write per batch. Must be greater than 0. | No | 100 | 256 |
Write errors and retry behavior
| Scenario | Behavior |
|---|---|
| Write succeeds | Task continues normally. |
| Write fails, retries remaining | TSDB Writer retries up to maxRetryTime times (default: 3). Applies only when sourceDbType is TSDB. |
| Write fails after all retries | Task stops, regardless of the ignoreWriteError setting. |
| ignoreWriteError is true | Individual write errors are skipped and the task continues, unless the failure persists through all retries. |
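The rules in the table can be sketched as a small control loop. This is a hypothetical model, not the plugin's implementation: it assumes `write_batch` raises `IOError` on a transport-level failure and otherwise returns the number of rejected points, so that ignoreWriteError governs per-point rejections while exhausted retries always stop the task.

```python
def write_with_retry(write_batch, batch, max_retry_time=3,
                     ignore_write_error=False):
    """Model of the retry policy: initial attempt plus max_retry_time retries."""
    last_error = None
    for _ in range(max_retry_time + 1):
        try:
            rejected = write_batch(batch)
        except IOError as e:          # transport failure: retry
            last_error = e
            continue
        if rejected and not ignore_write_error:
            raise RuntimeError(f"{rejected} points rejected")
        return True                   # success, or rejections ignored
    # All retries exhausted: the task stops regardless of ignore_write_error.
    raise last_error
```

A call that fails transiently and then succeeds within `max_retry_time` attempts returns normally; a call that fails every attempt propagates the last error.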
Performance benchmarks
The following test results show how throughput scales with the number of concurrent channels.
Test dataset:
Metric: m
Tag combinations: 10 zones x 20 clusters x 100 groups x 100 apps = 2,000,000 time series, plus an ip tag auto-incremented across all 2,000,000 time series
Value: random integer between 1 and 100
Collection interval: 10 seconds over 3 hours
Total data points: 3 x 3,600 / 10 x 2,000,000 = 2,160,000,000
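The total above follows directly from the collection interval and series count:

```python
hours = 3
interval_s = 10
series = 10 * 20 * 100 * 100          # zone x cluster x group x app
samples_per_series = hours * 3600 // interval_s   # 1,080 samples over 3 hours
total = samples_per_series * series
# total == 2,160,000,000 data points
```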
Test results:
| Channels | Speed (records/s) | Traffic (MB/s) |
|---|---|---|
| 1 | 129,753 | 15.45 |
| 2 | 284,953 | 33.70 |
| 3 | 385,868 | 45.71 |
Tag key-value breakdown used in the test:
| Tag key | Tag values |
|---|---|
| zone | z1-z10 |
| cluster | c1-c20 |
| group | g1-g100 |
| app | a1-a100 |
| ip | ip1-ip2,000,000 |
Use these results to calibrate the concurrent and batchSize settings for your workload. Start with batchSize set to 256 for higher throughput, and increase concurrent channels if a single channel cannot saturate your target transfer rate. Increasing batchSize above the default of 100 improves throughput but increases memory usage per task.