
Dataphin: Configure the PolarDB-X 2.0 input component

Last Updated: Nov 20, 2025

The PolarDB-X 2.0 input component reads data from a PolarDB-X 2.0 data source. To synchronize data from a PolarDB-X 2.0 data source to another data source, you must first configure the PolarDB-X 2.0 input component and then configure the component for the destination data source. This topic describes how to configure the PolarDB-X 2.0 input component.

Prerequisites

Procedure

  1. On the Dataphin home page, in the top menu bar, choose Development > Data Integration.

  2. In the top menu bar of the integration page, select a Project. In Dev-Prod mode, you must also select an environment.

  3. In the navigation pane on the left, click Batch Pipeline. In the Batch Pipeline list, click the offline pipeline that you want to develop to open its configuration page.

  4. In the upper-right corner of the page, click Component Library to open the Component Library panel.

  5. In the Component Library panel, in the navigation pane on the left, select Input. In the list of input components on the right, find the PolarDB-X 2.0 component and drag it to the canvas.

  6. On the PolarDB-X 2.0 input component card, click the configuration icon to open the PolarDB-X 2.0 Input Configuration dialog box.

  7. In the PolarDB-X 2.0 Input Configuration dialog box, configure the following parameters.


    Step Name

    The name of the PolarDB-X 2.0 input component. Dataphin automatically generates a step name. You can also change it as needed. The naming convention is as follows:

    • Can contain only Chinese characters, letters, underscores (_), and digits.

    • Cannot exceed 64 characters in length.
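As a rough illustration, the naming convention above can be expressed as a single regular expression. The following is a hypothetical validator, not part of Dataphin:

```python
import re

# Only Chinese (CJK) characters, letters, underscores, and digits;
# at most 64 characters. The CJK range \u4e00-\u9fff is an assumption
# that covers common Chinese characters.
STEP_NAME_RE = re.compile(r'^[\u4e00-\u9fffA-Za-z0-9_]{1,64}$')

def is_valid_step_name(name: str) -> bool:
    """Return True if `name` satisfies the step-name convention."""
    return bool(STEP_NAME_RE.fullmatch(name))
```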

    Source Table Quantity

    Select the number of source tables. The options are Single Table and Multiple Tables:

    • Single Table: Use this option to synchronize data from one table to one destination table.

    • Multiple Tables: Use this option to synchronize data from multiple tables to the same destination table. When data from multiple tables is written to the same destination table, the union algorithm is used.

      For more information about union, see INTERSECT, UNION, and EXCEPT.

    Datasource

    The drop-down list displays all PolarDB-X 2.0 data sources, including both data sources for which you have the sync read permission and those for which you do not. Click the copy icon to copy the name of the current data source.

    Database (Optional)

    Select the database where the table is located. If you leave this blank, the database specified during data source registration is used.

    If you set Source Table Quantity to Multiple Tables, you can select multiple databases. Click the view icon to see all selected databases in the Database List dialog box.

    Table

    Select the source table:

    • If you set Source Table Quantity to Single Table, you can enter a keyword to search for the table name, or enter the exact table name and click Exact Match. After you select a table, the system automatically checks its status. Click the copy icon to copy the name of the currently selected table.

    • If you set Source Table Quantity to Multiple Tables, perform the following steps to add tables.

      1. In the input box, enter an expression to filter for tables with the same structure.

        The system supports enumeration, regular expression-like patterns, and a mix of both. For example, table_[001-100];table_102.

      2. Click Exact Match to view a list of matching tables in the Confirm Match Details dialog box.

      3. Click Confirm.
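To make the pattern syntax concrete, here is a minimal sketch of how an expression such as table_[001-100];table_102 could be expanded. The [start-end] range semantics (inclusive bounds, zero padding preserved) are an assumption inferred from the example above, not Dataphin's actual matcher:

```python
import re

def expand_table_pattern(pattern: str) -> list[str]:
    """Expand a semicolon-separated pattern into concrete table names.
    A [start-end] segment is treated as an inclusive numeric range that
    keeps the zero padding of its bounds. Illustrative sketch only."""
    names = []
    for part in pattern.split(';'):
        m = re.fullmatch(r'(.*)\[(\d+)-(\d+)\](.*)', part)
        if m:
            prefix, start, end, suffix = m.groups()
            width = len(start)  # preserve zero padding, e.g. 001
            for i in range(int(start), int(end) + 1):
                names.append(f"{prefix}{i:0{width}d}{suffix}")
        else:
            names.append(part)  # plain enumeration entry
    return names
```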

    Shard Key (Optional)

    The system shards data based on the configured shard key field. You can use this with the concurrency setting to enable concurrent reads. You can use a column from the source table as the shard key. For better transfer performance, use a primary key or an indexed column as the shard key.

    Important

    If the shard key is a date or time type, the system identifies its minimum and maximum values and performs a rough split based on the total time range and the concurrency setting. The splits are not guaranteed to be even.
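The rough split described above can be sketched as follows. This is a simplified illustration of splitting a time range evenly by duration; the real splitter works on the shard key's actual minimum and maximum values, and row counts per slice can still be skewed:

```python
from datetime import datetime

def split_time_range(lo: datetime, hi: datetime, concurrency: int):
    """Split [lo, hi] into `concurrency` slices of equal duration.
    Equal duration does not imply equal row counts, which is why the
    documentation warns that splits are not guaranteed to be even."""
    step = (hi - lo) / concurrency
    bounds = [lo + step * i for i in range(concurrency)] + [hi]
    return list(zip(bounds[:-1], bounds[1:]))
```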

    Input Filter (Optional)

    Enter the filter information for the input fields. For example, ds=${bizdate}. The Input Filter is suitable for the following two scenarios:

    • Filtering a fixed portion of data.

    • Filtering by parameters.
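The parameter style in the example above (ds=${bizdate}) can be illustrated with a small substitution sketch. Dataphin resolves scheduling parameters itself at run time; this hypothetical helper only shows the placeholder mechanics:

```python
import re

def render_filter(expr: str, params: dict) -> str:
    """Substitute ${name} placeholders in a filter expression such as
    'ds=${bizdate}' with concrete parameter values. Illustrative only;
    raises KeyError if a referenced parameter is missing."""
    return re.sub(r'\$\{(\w+)\}', lambda m: str(params[m.group(1)]), expr)
```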

    Output Fields

    The Output Fields section displays all fields from the selected tables that match the filter criteria. The following operations are supported:

    • Manage Fields: If you do not need to output certain fields to downstream components, you can delete them:

      • Delete a single field: Click the delete icon in the Actions column of each field that you want to remove.

      • Delete fields in batch: Click Manage Fields. In the Manage Fields dialog box, select multiple fields, and then click the left arrow icon to move the selected input fields to the unselected input fields list. Click OK to complete the batch deletion.


    • Batch Add: Click Batch Add to configure fields in batch using JSON, TEXT, or DDL format.

      Note

      After you add fields in batch and click OK, the existing field configuration will be overwritten.

      • To configure in batch using JSON format, for example:

        // Example:
        [{
            "index": 0,
            "name": "id",
            "type": "int(10)",
            "mapType": "Long",
            "comment": "comment1"
        },
        {
            "index": 1,
            "name": "user_name",
            "type": "varchar(255)",
            "mapType": "String",
            "comment": "comment2"
        }]

        Note

        `index` specifies the column number of the object. `name` specifies the field name after import. `type` specifies the field type after import.

        For example, "index":3,"name":"user_id","type":"String" means that the fourth column of the file is imported with the field name `user_id` and the field type `String`.

      • To configure in batch using TEXT format, for example:

        // Example:
        0,id,int(10),Long,comment1
        1,user_name,varchar(255),String,comment2

        • The row delimiter separates one field definition from the next. The default is a line feed (\n). Semicolons (;) and periods (.) are also supported.

        • The column delimiter separates the attributes of a field (index, name, type, mapping type, and comment). Only the comma (,) is supported. The field type can be omitted.

      • To configure in batch using DDL format, for example:

        CREATE TABLE tablename (
            user_id serial,
            username VARCHAR(50),
            password VARCHAR(50),
            email VARCHAR(255),
            created_on TIMESTAMP
        );
    • Create Output Field: Click + Create Output Field. Follow the prompts to enter the Column, Type, and Remarks, and select a Mapping Type. After you finish configuring the current row, click the save icon to save it.
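The TEXT batch-add format described above maps directly onto the JSON field specification. The following is an illustrative sketch of that mapping, assuming the five-column order index,name,type,mapType,comment; it is not Dataphin's own parser:

```python
def parse_text_fields(text: str, row_delim: str = '\n',
                      col_delim: str = ',') -> list[dict]:
    """Parse the TEXT batch format (index,name,type,mapType,comment)
    into dicts shaped like the JSON batch format. Trailing columns
    may be omitted; missing keys are simply absent from the dict."""
    keys = ('index', 'name', 'type', 'mapType', 'comment')
    fields = []
    for row in filter(None, (r.strip() for r in text.split(row_delim))):
        rec = dict(zip(keys, row.split(col_delim)))
        rec['index'] = int(rec['index'])  # column number as integer
        fields.append(rec)
    return fields
```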

  8. Click Confirm to complete the configuration of the PolarDB-X 2.0 input component.