ClickHouse is an open source column-oriented database management system (DBMS) for online analytical processing (OLAP) of queries. This topic describes how ClickHouse Writer works and the parameters that you can use to configure ClickHouse Writer. This topic also provides an example for configuring ClickHouse Writer.

Limits

  • ClickHouse Writer supports only exclusive resource groups for Data Integration, but not the default resource group or custom resource groups. For more information, see Exclusive resource groups for Data Integration, Use the default resource group, and Add a custom resource group.
  • ClickHouse Writer connects to a ClickHouse database by using Java Database Connectivity (JDBC), and can write data to a destination table only by using JDBC Statement.
  • ClickHouse Writer allows you to filter columns and change the order of columns. You can specify columns as needed.
  • Considering the load on ClickHouse, we recommend that you throttle the throughput per second (TPS) to 1,000 when the INSERT INTO write mode is used.
  • After all tasks of a node are run, ClickHouse Writer performs a single-process POST Flush operation to update the data records in the ClickHouse database.
  • You must confirm the compatibility between the driver version and your ClickHouse database. ClickHouse Writer uses the following version of the ClickHouse database driver:
    <dependency>
        <groupId>ru.yandex.clickhouse</groupId>
        <artifactId>clickhouse-jdbc</artifactId>
        <version>0.2.4.ali2-SNAPSHOT</version>
    </dependency>

Background information

ClickHouse Writer allows you to write data to ClickHouse databases. Specifically, ClickHouse Writer connects to a remote ClickHouse database by using JDBC, and executes an INSERT INTO statement to write data to the ClickHouse database.

ClickHouse Writer is designed for extract, transform, load (ETL) developers to import data from data warehouses to ClickHouse databases. Users such as database administrators can use ClickHouse Writer as a data migration tool.

ClickHouse Writer obtains data from a Data Integration reader, and writes the data to the ClickHouse database by executing the INSERT INTO statement that is generated based on your configurations.

Parameters

Parameter Description Required Default Value
jdbcUrl The JDBC URL of the destination database. The jdbcUrl parameter must be included in the connection parameter.
  • You can configure only one JDBC URL for a database.
  • The format of jdbcUrl must be in accordance with ClickHouse official specifications. You can also specify the information of the attachment facility. Example: jdbc:clickhouse://127.0.0.1:3306/database.
Yes N/A
username The username that you can use to connect to the database. Yes N/A
password The password that you can use to connect to the database. Yes N/A
table The name of the destination table, which is described in a JSON array.
Note The table parameter must be included in the connection parameter.
Yes N/A
column The columns in the destination table to which data is written. Separate the columns with a comma (,). Example: "column":["id","name","age"].
Note The column parameter cannot be left empty.
Yes N/A
preSql The SQL statement to execute before the sync node is run. Use @table to specify the name of the destination table in the SQL statement. When you execute this SQL statement, @table is replaced by the name of the destination table. No N/A
postSql The SQL statement to execute after the data is written to the destination table. No N/A
batchSize The number of data records to write at a time. Setting this parameter can greatly reduce the interactions between Data Integration and ClickHouse on the network, and increase the throughput. However, an excessively large value may lead to the out of memory (OOM) error during the data synchronization process. No 1,024

Codeless UI mode

The codeless user interface (UI) mode is not supported.

Code editor mode

For more information about how to use the code editor to create a sync node, see Create a sync node by using the code editor.
Note Delete the comments from the following code before you run the code.
The following example shows how to write data to a ClickHouse database in JSON format:
{
    "type":"job",
    "version":"2.0", // The version number.
    "steps":[
        {
            "stepType":"stream",
            "parameter":{},
            "name":"Reader",
            "category":"reader"
        },
        {
            "stepType":"clickhouse",// The writer type.
            "parameter":{
                "username": "",
                "password": "",
                "column": [// The columns to which data is written.
                    "id",
                    "name"
                ],
                "connection": [
                    {
                        "table":[// The name of the table.
                            "ClickHouse_table"
                        ],
                        "jdbcUrl": "jdbc:clickhouse://ip:port/database"
                    }
                ],
                "preSql": [ // The SQL statement to execute before the sync node is run.
                    "delete from @table where db_id = -1"
                ],
                "postSql": [ // The SQL statement to execute after the sync node is run.
                    "update @table set db_modify_time = now() where db_id = 1"
                ],
                "batchSize": "1024",
                "batchByteSize": "67108864",
                "writeMode": "insert"
            },
            "name":"Writer",
            "category":"writer"
        }
    ],
    "setting":{
        "errorLimit":{
            "record":"0"// The number of dirty data records.
        },
        "speed":{
            "throttle":false,// A value of false indicates that the bandwidth is not throttled. A value of true indicates that the bandwidth is throttled. The maximum transmission rate takes effect only if you set this parameter to true.
            "concurrent":1, // The number of concurrent threads.
        }
    },
    "order":{
        "hops":[
            {
                "from":"Reader",
                "to":"Writer"
            }
        ]
    }
}