
Dataphin: Configure the PolarDB-X 2.0 Input Component

Last Updated: Mar 05, 2026

The PolarDB-X 2.0 input component retrieves data from a PolarDB-X 2.0 data source. When you sync data from a PolarDB-X 2.0 data source to another data source, first configure the PolarDB-X 2.0 input component to read from the source data source. Then configure the destination data source for the sync. This topic explains how to configure the PolarDB-X 2.0 input component.

Prerequisites

Procedure

  1. On the Dataphin homepage, in the top menu bar, click Develop, and then click Data Integration.

  2. On the Integration page, in the top menu bar, select a project. In Dev-Prod mode, you must also select an environment.

  3. In the left navigation pane, click Batch Pipeline. In the Batch Pipeline list, click the offline pipeline that you want to develop. The configuration page for the offline pipeline opens.

  4. In the upper-right corner of the page, click Component Library to open the Component Library panel.

  5. In the left navigation pane of the Component Library panel, click Input. In the input component list on the right, locate the PolarDB-X 2.0 component and drag it onto the canvas.

  6. On the PolarDB-X 2.0 input component card, click the configuration icon to open the PolarDB-X 2.0 Input Configuration dialog box.

  7. In the PolarDB-X 2.0 Input Configuration dialog box, configure the following parameters.

    Parameter

    Description

    Step Name

    The name of the PolarDB-X 2.0 input component. Dataphin generates a step name automatically. You can change it based on your business scenario. Use the following naming rules:

    • Use only Chinese characters, letters, underscores (_), and digits.

    • Keep the name no longer than 64 characters.
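
The naming rules above can be expressed as a simple validation check. This is a hypothetical helper, not part of Dataphin, shown only to make the rules concrete:

```python
import re

# Hypothetical validator for the step-name rules above:
# only Chinese characters, letters, underscores, and digits, at most 64 characters.
STEP_NAME_PATTERN = re.compile(r"^[\u4e00-\u9fffA-Za-z0-9_]{1,64}$")

def is_valid_step_name(name: str) -> bool:
    return bool(STEP_NAME_PATTERN.match(name))

print(is_valid_step_name("polardbx_input_01"))  # True
print(is_valid_step_name("bad name!"))          # False (space and ! not allowed)
```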

    Source Table Count

    Select the number of source tables. Options are Single Table and Multiple Tables:

    • Single Table: Use this option when syncing business data from one source table to one destination table.

    • Multiple Tables: Use this option when syncing business data from multiple source tables to one destination table. When writing data from multiple tables into one destination table, the system merges the data using the UNION algorithm.

      For more information about union, see INTERSECT, UNION, and EXCEPT.
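
As a sketch of this merge behavior, reading several identically structured source tables into one destination is roughly equivalent to a SQL UNION ALL. The example below uses an in-memory SQLite database with made-up table names, purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Two hypothetical source tables with the same structure.
for t in ("orders_001", "orders_002"):
    cur.execute(f"CREATE TABLE {t} (id INTEGER, amount INTEGER)")
cur.execute("INSERT INTO orders_001 VALUES (1, 100)")
cur.execute("INSERT INTO orders_002 VALUES (2, 200)")

# Merging multiple source tables into one destination behaves like UNION ALL.
rows = cur.execute(
    "SELECT id, amount FROM orders_001 "
    "UNION ALL "
    "SELECT id, amount FROM orders_002"
).fetchall()
print(sorted(rows))  # [(1, 100), (2, 200)]
```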

    Datasource

    The drop-down list displays all PolarDB-X 2.0 data sources, including data sources for which you have synchronization read permission and those for which you do not. Click the copy icon to copy the name of the current data source.

    Database (Optional)

    Select the database where the table resides. If you leave this field blank, the system uses the database specified when registering the data source.

    If you select Multiple Tables for Source Table Count, you can select multiple databases. Click the icon next to the field to open the Database List dialog box and view all selected databases.

    Table Matching Method

    Select Generic Rule or Database Regex.

    Note

    This parameter is available only when you select Multiple Tables for Source Table Count.

    Table

    Select the source table:

    • If you select Single Table for Source Table Count, search for the table by entering a keyword of the table name, or enter the exact table name and click Exact Match. After you select a table, the system automatically checks the table status. Click the copy icon to copy the name of the selected table.

    • If you select Multiple Tables for Source Table Count, enter an expression based on the table matching method.

      • If you select Generic Rule for table matching: Enter a table expression in the input box to filter tables with the same structure. The system supports enumeration, regex-like syntax, and mixed formats. For example: table_[001-100];table_102;.

      • If you select Database Regex for table matching: Enter a regex supported by the current database. The system matches tables in the destination database using this regex. During task runtime, the system dynamically matches new tables based on the regex.

      After entering the expression, click Exact Match to open the Confirm Match Details dialog box and view the list of matched tables.
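
The enumeration-plus-range syntax above (for example, table_[001-100];table_102;) could be expanded roughly as follows. This is a sketch of the idea only; the exact grammar Dataphin accepts may differ:

```python
import re

def expand_table_expression(expr: str) -> list:
    """Expand an expression like 'table_[001-100];table_102;' into table names (sketch)."""
    names = []
    for part in filter(None, expr.split(";")):  # ';' separates entries; ignore trailing ''
        m = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", part)
        if m:
            prefix, lo, hi, suffix = m.groups()
            width = len(lo)  # preserve zero padding, e.g. 001, 002, ...
            names.extend(f"{prefix}{str(i).zfill(width)}{suffix}"
                         for i in range(int(lo), int(hi) + 1))
        else:
            names.append(part)  # plain enumeration entry
    return names

print(expand_table_expression("table_[001-003];table_102;"))
# ['table_001', 'table_002', 'table_003', 'table_102']
```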

    Shard Key (Optional)

    The system partitions data based on the configured shard key column. Use this with concurrency settings to enable concurrent reads. You can use any column from the source data table as the shard key. For best performance, use a primary key or indexed column as the shard key.

    Important

    If you select a date or time type column, the system splits the full time range into equal intervals based on the minimum and maximum values and the concurrency setting. This does not guarantee an even data distribution and may cause data skew.
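
The min/max splitting described in the note can be sketched as follows: the value range of the shard key is cut into equal intervals, one per concurrent reader. This illustrative helper also shows why a skewed column yields uneven shards, since equal value ranges do not imply equal row counts:

```python
def split_ranges(min_val: int, max_val: int, concurrency: int):
    """Cut [min_val, max_val] into `concurrency` equal half-open intervals (a sketch)."""
    step = (max_val - min_val + 1) / concurrency
    bounds = [min_val + round(i * step) for i in range(concurrency)] + [max_val + 1]
    return list(zip(bounds[:-1], bounds[1:]))

# Each (lo, hi) pair becomes roughly: WHERE shard_key >= lo AND shard_key < hi
print(split_ranges(1, 100, 4))  # [(1, 26), (26, 51), (51, 76), (76, 101)]
```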

    Input Filter (Optional)

    Enter filter conditions for the input fields. For example: ds=${bizdate}. Use Input Filter in the following scenarios:

    • You need to sync only a fixed subset of the data.

    • You need parameter-based filtering, such as filtering by a date partition that changes with each run.
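
At run time, a placeholder such as ${bizdate} is replaced with the actual parameter value, and the filter effectively becomes a WHERE clause on the source query. A sketch of that substitution, with a made-up table name and date value:

```python
from string import Template

def render_filter(filter_expr: str, params: dict) -> str:
    """Substitute ${...} parameters, then use the result as a WHERE clause (a sketch)."""
    where = Template(filter_expr).substitute(params)
    return f"SELECT * FROM source_table WHERE {where}"

print(render_filter("ds=${bizdate}", {"bizdate": "20260305"}))
# SELECT * FROM source_table WHERE ds=20260305
```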

    Output Fields

    The Output Fields section lists all fields from the selected table and filtered results. You can perform the following actions:

    • Field Management: To exclude fields from downstream components, delete them:

      • Delete individual fields: To delete a few fields, click the delete icon in the Actions column.

      • Batch field deletion: To delete many fields, click Field Management. In the Field Management dialog box, select the fields, click the left arrow icon to move them to the list of unselected input fields, and then click OK.


    • Batch Add: Click Batch Add to add fields in JSON, TEXT, or DDL format.

      Note

      After batch adding, clicking OK overwrites existing field configurations.

      • You can perform batch configuration in JSON format. For example:

        [
          {
            "index": 0,
            "name": "id",
            "type": "int(10)",
            "mapType": "Long",
            "comment": "comment1"
          },
          {
            "index": 1,
            "name": "user_name",
            "type": "varchar(255)",
            "mapType": "String",
            "comment": "comment2"
          }
        ]
        Note

        The index field specifies the column number. The name field specifies the field name after import. The type field specifies the field type after import.

        For example, "index":3,"name":"user_id","type":"String" indicates that the fourth column from the file is imported with the field name user_id and the field type String.
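
A sketch of how such a JSON field list could be checked before import. The required keys are taken from the example above; treating mapType and comment as optional is an assumption of this sketch:

```python
import json

# Assumption: index, name, and type are required; mapType and comment are optional.
REQUIRED_KEYS = {"index", "name", "type"}

def parse_field_config(text: str) -> list:
    """Parse and validate a JSON field list like the example above (a sketch)."""
    fields = json.loads(text)
    for f in fields:
        missing = REQUIRED_KEYS - f.keys()
        if missing:
            raise ValueError(f"field {f!r} is missing keys: {missing}")
    return fields

cfg = '[{"index": 0, "name": "id", "type": "int(10)", "mapType": "Long"}]'
print(parse_field_config(cfg)[0]["name"])  # id
```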

      • You can perform batch configuration in TEXT format. For example:

        0,id,int(10),Long,comment1
        1,user_name,varchar(255),String,comment2

        • The row delimiter separates field entries. The default is a line feed (\n). You can also use a semicolon (;) or period (.).

        • The column delimiter separates the attributes of a field, such as the name and type. The default is a comma (,). The field type is optional; if it is not specified, the default type STRING is used.
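
Parsing the TEXT format above (column number, name, type, mapping type, comment, separated by the column delimiter) can be sketched as:

```python
def parse_text_fields(text: str, row_sep: str = "\n", col_sep: str = ","):
    """Parse 'index,name,type,mapType,comment' rows into dicts (a sketch)."""
    keys = ("index", "name", "type", "mapType", "comment")
    fields = []
    for row in filter(None, text.split(row_sep)):  # skip empty rows
        fields.append(dict(zip(keys, row.split(col_sep))))
    return fields

sample = "0,id,int(10),Long,comment1\n1,user_name,varchar(255),String,comment2"
for f in parse_text_fields(sample):
    print(f["name"], f["type"])
# id int(10)
# user_name varchar(255)
```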

      • You can perform batch configuration in DDL format. For example:

        CREATE TABLE tablename (
        	user_id serial,
        	username VARCHAR(50),
        	password VARCHAR(50),
        	email VARCHAR(255),
        	created_on TIMESTAMP
        );
    • Add an output field: Click + Add Output Field, enter values for Column, Type, and Comment, select a Mapping Type, and then click the save icon to save the row.

  8. Click OK to finish configuring the PolarDB-X 2.0 input component.