Dataphin: Configure the TiDB Input Component

Last Updated: Mar 05, 2026

The TiDB input component retrieves data from a TiDB data source. When you sync data from a TiDB data source to another data source, first configure the TiDB input component to read from the source. Then configure the target data source for the sync. This topic explains how to configure the TiDB input component.

Prerequisites

Procedure

  1. On the Dataphin homepage, in the top menu bar, click Develop, and then click Data Integration.

  2. On the Integration page, select a Project. In Dev-Prod mode, also select an environment.

  3. In the left navigation pane, click Batch Pipeline. In the Batch Pipeline list, click the offline pipeline that you want to develop. The pipeline configuration page opens.

  4. In the upper-right corner of the page, click Component Library to open the Component Library panel.

  5. In the left navigation pane of the Component Library panel, click Input. In the list on the right, locate the TiDB component and drag it onto the canvas.

  6. Click the icon on the TiDB Input component card to open the TiDB Input Configuration dialog box.

  7. In the TiDB Input Configuration dialog box, configure the parameters.

    Parameter

    Description

    Step Name

    The name of the TiDB input component. Dataphin generates a name automatically. You can change it based on your business needs. Use these naming rules:

    • Use only Chinese characters, letters, underscores (_), and digits.

    • Keep the name under 64 characters.

    Datasource

    The drop-down list shows all TiDB data sources in Dataphin, including those for which you do not have sync-read permission. Click the copy icon to copy the current data source name.

    • If you do not have sync-read permission for a data source, click Request next to the data source to request permission. For steps, see Request Data Source Permission.

    • If you do not have any TiDB data sources, click Create Data Source to create one. For steps, see Create a TiDB Data Source.

    Source Table Count

    Select the number of source tables. Options are Single Table and Multiple Tables:

    • Single Table: Use this option when syncing data from one source table to one target table.

    • Multiple Tables: Use this option when syncing data from multiple source tables to one target table. When writing data from multiple tables into one table, the system uses the union algorithm.

    Table Matching Method

    You can only select Generic Rule.

    Note

    This setting is available only when you select Multiple Tables for Source Table Count.

    Table

    Select the source table:

    • If you selected Single Table for Source Table Count, search by keyword or enter the full table name and click Exact Search. After you select a table, the system checks its status automatically. Click the copy icon to copy the selected table name.

    • If you selected Multiple Tables for Source Table Count, add tables as follows:

      1. In the input box, enter a table expression to match tables with the same structure.

        The system supports enumerated, regex-like, and mixed formats. For example: table_[001-100];table_102.

      2. Click Exact Search. In the Confirm Match Details dialog box, review the list of matched tables.

      3. Click Confirm.
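The table expression format can be illustrated with a short Python sketch. This is not Dataphin's matching logic. It assumes that a semicolon separates alternatives and that `[start-end]` is an inclusive numeric range whose zero padding follows the width of `start`. The function name `expand_table_expression` is hypothetical.

```python
import re


def expand_table_expression(expression: str) -> list[str]:
    """Expand an expression such as 'table_[001-100];table_102' into table names.

    Assumed semantics (not confirmed by the product documentation):
    ';' separates alternatives, and '[start-end]' is an inclusive numeric
    range padded to the width of 'start'.
    """
    names = []
    for part in expression.split(";"):
        part = part.strip()
        if not part:
            continue
        match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", part)
        if match is None:
            names.append(part)  # plain enumerated table name
            continue
        prefix, start, end, suffix = match.groups()
        width = len(start)
        for number in range(int(start), int(end) + 1):
            names.append(f"{prefix}{number:0{width}d}{suffix}")
    return names


tables = expand_table_expression("table_[001-100];table_102")
print(len(tables), tables[0], tables[99], tables[-1])
```

Under these assumptions the example expression covers table_001 through table_100 plus table_102, which is the kind of list you would expect to review in the Confirm Match Details dialog box.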

    Shard Key (Optional)

    The system splits data using the configured shard key field. Use this with concurrency settings to enable concurrent reads. You can use any column from the source table as the shard key. For best performance, use a primary key or indexed column.

    Important

    If you select a date-time column, the system splits the full time range into segments based on the concurrency setting, without regard to how records are distributed over time. The resulting splits are not guaranteed to be evenly sized.
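To show how a shard key and a concurrency setting can work together, here is a minimal Python sketch that divides a key range into contiguous sub-ranges, one per concurrent reader. It assumes an integer key and near-equal range widths. The names `split_range` and `range_predicate` are hypothetical, and this is not Dataphin's actual splitting logic.

```python
def split_range(min_value: int, max_value: int, concurrency: int) -> list[tuple[int, int]]:
    """Split [min_value, max_value] into at most `concurrency` inclusive ranges."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    total = max_value - min_value + 1
    if total <= 0:
        return []
    parts = min(concurrency, total)
    base, extra = divmod(total, parts)
    ranges = []
    start = min_value
    for i in range(parts):
        # The first `extra` ranges take one additional key each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def range_predicate(column: str, bounds: tuple[int, int]) -> str:
    """Build the filter one reader would apply for its range."""
    low, high = bounds
    return f"{column} BETWEEN {low} AND {high}"


ranges = split_range(1, 10, 3)
predicates = [range_predicate("id", r) for r in ranges]
print(ranges)
print(predicates)
```

Equal-width ranges contain equal numbers of rows only when the key values are spread uniformly. That is why a dense primary key or indexed column tends to balance better than a date-time column with uneven activity.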

    Batch Read Size (Optional)

    The number of records to read at once. Configure a batch size, such as 1024 records, to reduce the number of round trips to the source database. Fewer round trips improve I/O efficiency and reduce the impact of network latency.
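The effect of a batch size can be sketched with Python's standard DB-API `fetchmany` call. The example uses an in-memory SQLite database as a stand-in for the source, so it shows only the batching pattern, not real network behavior. The table `t` and the helper `read_in_batches` are hypothetical.

```python
import sqlite3


def read_in_batches(cursor, batch_size):
    """Yield lists of rows, fetching at most `batch_size` rows per call."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# In-memory SQLite stands in for the source database (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t (id) VALUES (?)", [(i,) for i in range(2500)])

cursor = conn.execute("SELECT id FROM t ORDER BY id")
batch_sizes = [len(batch) for batch in read_in_batches(cursor, 1024)]
print(batch_sizes)
conn.close()
```

With 2,500 rows and a batch size of 1024, the reader makes three fetches instead of 2,500 single-row fetches.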

    Input Filter (Optional)

    Configure the filtering conditions for data extraction as follows:

    • Use a static value to extract matching data. Example: ds=20210101.

    • Use a variable parameter to extract part of the data. Example: ds=${bizdate}.
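As an illustration of how a variable parameter differs from a static value, the following Python sketch resolves `${bizdate}` in a filter expression before it is used. It relies on `string.Template` from the standard library, which happens to use the same `${name}` placeholder syntax. The helper `render_filter` and the sample value are hypothetical, and Dataphin performs its own substitution at run time.

```python
from string import Template


def render_filter(filter_expression: str, parameters: dict) -> str:
    """Replace ${name} placeholders with values; raise KeyError if one is missing."""
    return Template(filter_expression).substitute(parameters)


static_filter = render_filter("ds=20210101", {})
dynamic_filter = render_filter("ds=${bizdate}", {"bizdate": "20210101"})
print(static_filter)
print(dynamic_filter)
```

A static filter always extracts the same partition, whereas the parameterized form extracts whichever partition matches the value supplied for each run.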

    Output Fields

    This section lists all fields from the selected table and filtered results. You can perform these actions:

    • Manage Fields: Remove fields you do not need downstream:

      • Remove One Field: Click the delete icon in the Actions column to delete a single field.

      • Remove Multiple Fields: Click Field Management. In the Field Management dialog box, select the fields to remove, click the left-arrow icon to move them from the selected input fields to the unselected input fields, and then click OK.

    • Batch Add: Click Batch Add to add fields in JSON, TEXT, or DDL format.

      Note

      After you click OK, the new configuration overwrites existing field settings.

      • JSON format example:

        // Example:
        [
          {
            "index": 1,
            "name": "id",
            "type": "int(10)",
            "mapType": "Long",
            "comment": "comment1"
          },
          {
            "index": 2,
            "name": "user_name",
            "type": "varchar(255)",
            "mapType": "String",
            "comment": "comment2"
          }
        ]

        Note

        index, name, and type specify the column number of the object, the field name after import, and the field type after import, respectively. For example, "index": 3, "name": "user_id", "type": "String" indicates that the fourth column in the file is imported as a field named user_id with the type String.
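Before pasting a JSON field list into Batch Add, it can help to check it locally. The sketch below is a minimal Python validator. It assumes that `index`, `name`, and `type` are required and that field names must be unique. These rules and the function name `load_field_config` are assumptions, not documented Dataphin behavior.

```python
import json

# Keys assumed to be required for every field entry (assumption).
REQUIRED_KEYS = ("index", "name", "type")


def load_field_config(text: str) -> list[dict]:
    """Parse a JSON field list, check required keys, and return it sorted by index."""
    fields = json.loads(text)
    if not isinstance(fields, list):
        raise ValueError("the field configuration must be a JSON array")
    seen_names = set()
    for field in fields:
        missing = [key for key in REQUIRED_KEYS if key not in field]
        if missing:
            raise ValueError(f"field entry is missing keys: {missing}")
        if field["name"] in seen_names:
            raise ValueError(f"duplicate field name: {field['name']}")
        seen_names.add(field["name"])
    return sorted(fields, key=lambda field: field["index"])


sample = """
[
  {"index": 2, "name": "user_name", "type": "varchar(255)", "mapType": "String", "comment": "comment2"},
  {"index": 1, "name": "id", "type": "int(10)", "mapType": "Long", "comment": "comment1"}
]
"""
fields = load_field_config(sample)
print([field["name"] for field in fields])
```

Note that the leading `// Example:` line shown in the JSON sample is not valid JSON, so strip it before parsing with a strict parser such as `json.loads`.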

      • TEXT format example:

        // Example:
        1,id,int(10),Long,comment1
        2,user_name,varchar(255),String,comment2
        • The row delimiter separates field entries. Default is line feed (\n). Supported delimiters: \n, semicolon (;), and period (.).

        • The column delimiter separates the values within a field entry, such as the field name and the field type. The default and only supported delimiter is the comma (,). The field type is optional.
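The relationship between the TEXT and JSON formats can be sketched in Python. The parser below converts TEXT rows into dictionaries shaped like the JSON entries. It assumes the column order shown in the example (index, name, type, mapping type, comment). The function name `parse_text_fields` is hypothetical, and a type that itself contains a comma, such as decimal(10,2), would need extra handling.

```python
# Column order assumed from the TEXT example (assumption).
COLUMNS = ("index", "name", "type", "mapType", "comment")


def parse_text_fields(text: str, row_delimiter: str = "\n", column_delimiter: str = ",") -> list[dict]:
    """Convert delimited TEXT rows into JSON-style field dictionaries."""
    fields = []
    for row in text.split(row_delimiter):
        row = row.strip()
        if not row:
            continue  # skip blank rows, e.g. after a trailing delimiter
        values = [value.strip() for value in row.split(column_delimiter)]
        field = dict(zip(COLUMNS, values))
        field["index"] = int(field["index"])
        fields.append(field)
    return fields


text_config = "1,id,int(10),Long,comment1\n2,user_name,varchar(255),String,comment2"
parsed = parse_text_fields(text_config)
print(parsed)

# The same rows separated by semicolons instead of line feeds.
parsed_semicolon = parse_text_fields(text_config.replace("\n", ";"), row_delimiter=";")
```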

      • DDL format example:

        CREATE TABLE tablename (
            user_id serial,
            username VARCHAR(50),
            password VARCHAR(50),
            email VARCHAR(255),
            created_on TIMESTAMP
        );
    • Add Output Field: Click + Add Output Field. Enter values for Column, Type, and Comment, and select a Mapping Type. Then click the save icon to save the row.

  8. Click OK to finish configuring the TiDB Input Component.