Configure ClickHouse input component

The ClickHouse input component reads data from a ClickHouse data source. To sync this data to another destination, configure the input component first, then the destination.

Prerequisites

A ClickHouse data source has been created. For details, see Create a ClickHouse Data Source.
The account that configures the component has sync-read permission on the data source. If it does not, request the permission as described in Request Data Source Permissions.

Procedure

On the Dataphin homepage, choose Develop > Data Integration.
On the Integration page, select a Project. In Dev-Prod mode, also select an environment.
In the left navigation pane, click Offline Integration, then click the target Offline Pipeline in the Offline Integration list.
In the upper-right corner, click Component Library to open the Component Library panel.
In the Component Library panel, click Input, find ClickHouse, and drag it onto the canvas.
Click the icon on the ClickHouse component to open the ClickHouse Input Configuration dialog box.

In the ClickHouse Input Configuration dialog box, configure the parameters.

Parameter	Description
Step Name	Auto-generated. Rename as needed. Rules: Use only Chinese characters, letters, underscores (_), and digits. Do not exceed 64 characters.
Datasource	Lists all ClickHouse data sources in Dataphin, including those on which you lack sync-read permission. Click the icon to copy the data source name. If you lack sync-read permission, click Request next to the data source. For details, see Request Data Source Permissions. If no ClickHouse data source exists, click Create Data Source. For details, see Create a ClickHouse Data Source.
Source Table Count	Options: Single Table or Multiple Tables. Single Table: Syncs data from one source table to one destination table. Multiple Tables: Syncs data from multiple source tables to one destination table using the UNION algorithm.
Table Matching Method	Select Generic Rule or Database Regex. Note: Available only when Source Table Count is Multiple Tables.
Table	Select the source table. Single Table: search by keyword, or enter the exact table name and click Exact Search. The system checks the table status automatically. Click the icon to copy the table name. Multiple Tables: enter an expression that follows the selected table matching method. Generic Rule: enter an expression that selects tables with the same structure. Enumeration, regex-like patterns, and a mix of both are supported. Example: `table_[001-100];table_102;`. Database Regex: enter a regular expression supported by the database. The system uses it to match tables and also picks up newly created matching tables at runtime. After you enter the expression, click Exact Search to view the matched tables in the Confirm Match Details dialog box.
Split Key (Optional)	Splits the data into ranges so that it can be read concurrently, together with the concurrency settings. Any column can be used; primary keys and indexed columns perform best. Important: Date-time columns are split by brute force. The full range between the minimum and maximum values is divided according to the concurrency, which does not guarantee that rows are evenly distributed across the splits.
Batch Read Size (Optional)	Records per batch (for example, 1024). Reduces data source interactions and network latency.
Input Filter (Optional)	Filter conditions for data extraction: Static value: extracts matching data. Example: `ds=20210101`. Variable parameter: extracts data dynamically. Example: `ds=${bizdate}`.
Output Fields	All fields from the selected table after applying the input filter. You can manage the fields as follows. Remove individual fields: click the icon in the Actions column. Remove fields in batches: click Field Management, select the fields in the Field Management dialog box, click the icon to move them to the unselected list, then click OK. Add fields in bulk: click Add in Bulk and define the fields in JSON, TEXT, or DDL format. Note: Clicking OK overwrites the existing fields. JSON format example: `[{ "index": 1, "name": "id", "type": "int(10)", "mapType": "Long", "comment": "comment1" }, { "index": 2, "name": "user_name", "type": "varchar(255)", "mapType": "String", "comment": "comment2" }]` Note: `index` is the column index (0-based), `name` is the field name, and `type` is the field type. For example, `"index":3,"name":"user_id","type":"String"` imports the fourth column as `user_id` of type `String`. TEXT format example, one field per row: `1,id,int(10),Long,comment1` followed by `2,user_name,varchar(255),String,comment2`. Row delimiter: line feed (\n) by default; semicolon (;) and period (.) are also supported. Column delimiter: default comma (,). Alternative: `','`. Field type is optional, defaults to `','`. DDL format example: `CREATE TABLE tablename ( user_id serial, username VARCHAR(50), password VARCHAR(50), email VARCHAR(255), created_on TIMESTAMP );` Add a single field: click + Add Output Field, enter the Column, Type, Comment, and Mapping Type, then click the icon to save.
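
The TEXT and JSON formats accepted by Add in Bulk carry the same five pieces of information per field. The following Python sketch converts TEXT rows into the JSON shape shown in the Output Fields row, which can help when preparing a long field list outside Dataphin. The function name `text_to_fields` and its parameters are hypothetical and not part of Dataphin; the sketch assumes the column order index, name, type, mapping type, comment, and it does not handle types that contain the column delimiter, such as `decimal(10,2)`.

```python
import json


def text_to_fields(text, row_delim="\n", col_delim=","):
    """Parse TEXT-format rows into dicts shaped like the JSON format.

    Assumed column order: index, name, type, mapType, comment.
    Limitation: a type that itself contains col_delim is not supported.
    """
    fields = []
    for row in text.split(row_delim):
        row = row.strip()
        if not row:
            continue  # skip empty rows, e.g. after a trailing delimiter
        # Split into at most five parts so delimiters inside the comment survive.
        parts = [p.strip() for p in row.split(col_delim, 4)]
        if len(parts) < 4:
            raise ValueError(f"expected at least 4 columns, got: {row!r}")
        index, name, field_type, map_type = parts[:4]
        comment = parts[4] if len(parts) > 4 else ""
        fields.append({
            "index": int(index),
            "name": name,
            "type": field_type,
            "mapType": map_type,
            "comment": comment,
        })
    return fields


sample = "1,id,int(10),Long,comment1\n2,user_name,varchar(255),String,comment2"
print(json.dumps(text_to_fields(sample), indent=2))
```

Pass `row_delim=";"` when the rows are separated by semicolons instead of line feeds.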

Click OK.
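
The warning on the Split Key parameter is easier to see with numbers. The following Python sketch is only an illustration of equal-width splitting between a minimum and a maximum value; it is an assumption about what "brute-force splitting" means and not Dataphin's actual implementation. The helper names `split_ranges` and `count_per_range` are hypothetical.

```python
from datetime import datetime


def split_ranges(lo, hi, concurrency):
    """Cut [lo, hi] into `concurrency` equal-width ranges."""
    step = (hi - lo) / concurrency
    bounds = [lo + step * i for i in range(concurrency)] + [hi]
    return list(zip(bounds[:-1], bounds[1:]))


def count_per_range(values, ranges):
    """Count how many values fall into each range."""
    counts = []
    last = len(ranges) - 1
    for i, (start, end) in enumerate(ranges):
        # Ranges are half-open, except the last one, which includes the maximum.
        counts.append(sum(
            1 for v in values
            if start <= v < end or (i == last and v == end)
        ))
    return counts


# One old row plus thirty rows clustered in December 2021.
rows = [datetime(2021, 1, 1)] + [datetime(2021, 12, d) for d in range(1, 31)]
ranges = split_ranges(min(rows), max(rows), 4)
print(count_per_range(rows, ranges))  # [1, 0, 0, 30]
```

With a concurrency of 4, one reader receives 30 of the 31 rows while two readers receive none. A column whose values are spread evenly, such as an auto-increment primary key, avoids this kind of skew.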