Dataphin: Configure the ClickHouse Input Component

Last Updated: Mar 09, 2026

The ClickHouse input component reads data from a ClickHouse data source. When you sync data from a ClickHouse data source to another data source, first configure the ClickHouse input component with the source data source information. Then configure the destination data source for the sync task. This topic describes how to configure the ClickHouse input component.

Prerequisites

  • You have created a ClickHouse data source. For more information, see Create a ClickHouse Data Source.

  • The account used to configure the ClickHouse input component must have sync-read permission on the data source. If the account does not have this permission, request it. For more information, see Request Data Source Permissions.

Procedure

  1. In the top menu bar of the Dataphin homepage, choose Develop > Data Integration.

  2. In the top menu bar of the Integration page, select Project. In Dev-Prod mode, also select an environment.

  3. In the left navigation pane, click Offline Integration. In the Offline Integration list, click the Offline Pipeline that you want to develop to open its configuration page.

  4. In the upper-right corner of the page, click Component Library to open the Component Library panel.

  5. In the navigation pane on the left of the Component Library panel, click Input. In the input component list on the right, find the ClickHouse component and drag it onto the canvas.

  6. Click the icon in the ClickHouse input component card to open the ClickHouse Input Configuration dialog box.

  7. In the ClickHouse Input Configuration dialog box, configure the parameters.

    Parameter

    Description

    Step Name

    The name of the ClickHouse input component. Dataphin generates a step name automatically. You can change it based on your business scenario. Naming rules:

    • Use only Chinese characters, letters, underscores (_), and digits.

    • Do not exceed 64 characters.
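The naming rules above can be checked before you type a name into the dialog box. The following is a minimal sketch, not Dataphin's own validation: the function name `is_valid_step_name` is hypothetical, and "Chinese characters" is approximated by the CJK Unified Ideographs block, which is an assumption.

```python
import re

# Assumption: "Chinese characters" is approximated by U+4E00 to U+9FFF.
# The exact character range that Dataphin accepts is not stated in this topic.
STEP_NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_]{1,64}")


def is_valid_step_name(name: str) -> bool:
    """Return True if name uses only allowed characters and has 1 to 64 characters."""
    return STEP_NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_step_name("clickhouse_input_1")` passes, while a name that contains a hyphen or has more than 64 characters does not.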

    Datasource

    The drop-down list shows all ClickHouse data sources in Dataphin, whether or not you have sync-read permission on them. Click the copy icon to copy the current data source name.

    • If you do not have sync-read permission for a data source, click Request next to the data source to request permission. For more information, see Request Data Source Permissions.

    • If you do not have a ClickHouse data source, click Create Data Source to create one. For more information, see Create a ClickHouse Data Source.

    Source Table Count

    Select the number of source tables. Options are Single Table and Multiple Tables:

    • Single Table: Use this option when syncing business data from one source table to one destination table.

    • Multiple Tables: Use this option when syncing business data from multiple source tables to one destination table. When writing data from multiple tables into one destination table, the system combines the rows by using a UNION operation.

    Table Matching Method

    Select Generic Rule or Database Regex.

    Note

    This parameter is available only when you select Multiple Tables for Source Table Count.

    Table

    Select the source table:

    • If you select Single Table for Source Table Count, enter a keyword to search for a table name or enter the exact table name and click Exact Search. After you select a table, the system automatically checks the table status. Click the copy icon to copy the selected table name.

    • If you select Multiple Tables for Source Table Count, enter an expression based on the table matching method.

      • If you select Generic Rule for Table Matching Method, enter an expression in the field to filter tables with the same structure. The system supports enumeration, regex-like patterns, and mixed formats. For example, table_[001-100];table_102;.

      • If you select Database Regex for Table Matching Method, enter a regex supported by the database. The system matches tables in the source database using this regex. At task runtime, the system dynamically matches new tables based on the regex.

      After entering the expression, click Exact Search. In the Confirm Match Details dialog box, view the list of matched tables.
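To preview which table names a Generic Rule expression such as `table_[001-100];table_102;` might cover, the expansion can be sketched as follows. This is an illustration only: the function `expand_generic_rule` is hypothetical, it handles at most one `[start-end]` range per item, and it assumes that the zero-padding width follows the range's start value. Dataphin's actual matching logic may differ.

```python
import re

# Matches a numeric range such as [001-100] inside one item of the expression.
RANGE_PATTERN = re.compile(r"\[(\d+)-(\d+)\]")


def expand_generic_rule(expression: str) -> list[str]:
    """Expand a semicolon-separated rule into a list of table names."""
    names = []
    for item in expression.split(";"):
        item = item.strip()
        if not item:
            continue  # Skip the empty piece after a trailing semicolon.
        match = RANGE_PATTERN.search(item)
        if match is None:
            names.append(item)  # Plain enumeration entry, for example table_102.
            continue
        start, end = match.group(1), match.group(2)
        width = len(start)  # Assumption: keep the zero padding of the start value.
        prefix, suffix = item[:match.start()], item[match.end():]
        for number in range(int(start), int(end) + 1):
            names.append(f"{prefix}{str(number).zfill(width)}{suffix}")
    return names
```

Under these assumptions, the example expression expands to `table_001` through `table_100` followed by `table_102`.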

    Split Key (Optional)

    The system splits data based on the configured split key field. Use this with concurrency settings to enable concurrent reads. You can use any column from the source table as the split key. For best performance, use a primary key or an indexed column as the split key.

    Important

    If you select a date-time type, the system performs brute-force splitting across the full time range based on the maximum and minimum values and the concurrency setting. This does not guarantee even distribution.
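The idea of splitting by minimum value, maximum value, and concurrency can be sketched with integer keys. This is a simplified illustration under stated assumptions, not Dataphin's actual splitting algorithm; the function `split_ranges` is hypothetical.

```python
def split_ranges(min_value: int, max_value: int, concurrency: int) -> list[tuple[int, int]]:
    """Split [min_value, max_value] into contiguous inclusive ranges, one per reader."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    if max_value < min_value:
        raise ValueError("max_value must not be less than min_value")
    total = max_value - min_value + 1
    parts = min(concurrency, total)  # Never create more ranges than key values.
    base, extra = divmod(total, parts)
    ranges = []
    start = min_value
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # Spread the remainder over the first ranges.
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges
```

With a split key that runs from 1 to 100 and a concurrency of 4, each reader gets a range of 25 key values. The ranges are equal in key width, not in row count, so skewed keys (or a sparse time range) can still leave some readers with far more rows than others.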

    Batch Read Size (Optional)

    The number of records read at a time. Configure a batch read size (for example, 1024 records) instead of reading records one by one. This reduces interactions with the data source, improves I/O efficiency, and lowers network latency.
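The effect of a batch read size can be illustrated with a generic database cursor. The sketch below uses Python's built-in `sqlite3` module as a stand-in for ClickHouse, which is an assumption made only so that the example is self-contained; the helper `read_in_batches` and the `demo` table are hypothetical.

```python
import sqlite3


def read_in_batches(cursor, batch_size):
    """Yield lists of rows, fetching at most batch_size rows per call."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# Stand-in data source: an in-memory SQLite table with 2,500 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER)")
conn.executemany("INSERT INTO demo VALUES (?)", [(i,) for i in range(2500)])

cursor = conn.execute("SELECT id FROM demo ORDER BY id")
batch_sizes = [len(batch) for batch in read_in_batches(cursor, 1024)]
```

Here 2,500 rows are retrieved in three fetches instead of 2,500 single-row reads.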

    Input Filter (Optional)

    Configure the filtering conditions for extracting data as follows:

    • Set a static value to extract matching data. For example, ds=20210101.

    • Set a variable parameter to extract part of the data. For example, ds=${bizdate}.
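The difference between the two filter styles is that a variable such as `${bizdate}` is replaced with a concrete value before the filter is applied. The sketch below imitates that substitution with Python's `string.Template`, which happens to use the same `${name}` syntax. It only illustrates the concept; Dataphin resolves its own parameters at run time, and the function `render_filter` is hypothetical.

```python
from string import Template


def render_filter(filter_template: str, params: dict[str, str]) -> str:
    """Replace ${name} placeholders in a filter expression with concrete values."""
    # substitute() raises KeyError if a placeholder has no value in params.
    return Template(filter_template).substitute(params)
```

For example, `render_filter("ds=${bizdate}", {"bizdate": "20210101"})` produces the same expression as the static filter `ds=20210101`.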

    Output Fields

    This section lists all fields from the selected table, as filtered by the input filter. You can perform the following actions:

    • Manage fields: Remove fields that you do not need to pass to downstream components.

      • Remove individual fields: Click the remove icon in the Actions column to remove fields that you do not need.

      • Remove fields in bulk: Click Field Management. In the Field Management dialog box, select the fields to remove, click the left-arrow icon to move them from the selected input fields to the unselected input fields, and then click OK.

    • Add fields in bulk: Click Add in Bulk to configure fields in JSON, TEXT, or DDL format.

      Note

      After you add fields in bulk and click OK, the system overwrites the existing field configurations.

      • JSON format example:

        [
          {
            "index": 1,
            "name": "id",
            "type": "int(10)",
            "mapType": "Long",
            "comment": "comment1"
          },
          {
            "index": 2,
            "name": "user_name",
            "type": "varchar(255)",
            "mapType": "String",
            "comment": "comment2"
          }
        ]

        Note

        index specifies the column index, name specifies the name of the imported field, and type specifies the type of the imported field. For example, "index":3,"name":"user_id","type":"String" indicates that the fourth column of the file is imported because the column index is 0-based. The field name is user_id and the field type is String.

      • TEXT format example:

        1,id,int(10),Long,comment1
        2,user_name,varchar(255),String,comment2
        • The row delimiter separates field information. The default is a line feed (\n). You can also use a semicolon (;) or a period (.).

        • The column delimiter separates the values within each field definition, such as the field name and type. The default is a comma (,). The field type is optional.

      • DDL format example:

        CREATE TABLE tablename (
            user_id serial,
            username VARCHAR(50),
            password VARCHAR(50),
            email VARCHAR(255),
            created_on TIMESTAMP
        );
    • Add a new output field: Click + Add Output Field. Enter the Column, Type, and Comment, and select a Mapping Type. Click the save icon to save the configuration for the current row.
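The TEXT and JSON bulk formats described for Add in Bulk carry the same field information, so one can be converted to the other. The following is a minimal sketch of that conversion. The column order (index, name, type, mapping type, comment) is inferred from the examples in this topic, and the function `parse_text_fields` is hypothetical rather than part of Dataphin. A type that itself contains the column delimiter, such as `decimal(10,2)`, is not handled.

```python
# Assumption: TEXT rows list values in this order, matching the JSON keys.
FIELD_KEYS = ("index", "name", "type", "mapType", "comment")


def parse_text_fields(text, row_delimiter="\n", column_delimiter=","):
    """Convert TEXT-format field definitions into a list of JSON-style dicts."""
    fields = []
    for row in text.split(row_delimiter):
        row = row.strip()
        if not row:
            continue  # Ignore blank rows, for example after a trailing delimiter.
        values = [value.strip() for value in row.split(column_delimiter)]
        field = dict(zip(FIELD_KEYS, values))
        field["index"] = int(field["index"])  # The JSON format uses a numeric index.
        fields.append(field)
    return fields
```

Passing the result to `json.dumps` from the standard library yields text in the same shape as the JSON format example.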

  8. Click OK to finish configuring the ClickHouse input component.