
Dataphin: Configure the PostgreSQL input component

Last Updated: Mar 05, 2026

The PostgreSQL input component reads data from a PostgreSQL data source. To sync data from a PostgreSQL source to another destination, you must first configure the PostgreSQL input component to read from the source. Then, you must configure the destination component. This topic describes how to configure the PostgreSQL input component.

Prerequisites

Procedure

  1. On the Dataphin homepage, select Developer > Data Integration from the top menu bar.

  2. On the Integration page, in the top menu bar, select a Project. In Dev-Prod mode, you must also select an environment.

  3. In the navigation pane on the left, click Offline Integration. In the Offline Integration list, click the offline pipeline that you want to develop to open its configuration page.

  4. Click Component Library in the upper-right corner of the page. This opens the Component Library panel.

  5. In the left navigation pane of the Component Library panel, select Input. Then, locate the PostgreSQL component in the input component list on the right and drag it to the canvas.

  6. Click the image icon on the PostgreSQL input component card. This opens the PostgreSQL Input Configuration dialog box.

  7. In the PostgreSQL Input Configuration dialog box, configure the following parameters.

    Parameter

    Description

    Step Name

    The name of the PostgreSQL input component. Dataphin automatically generates the step name. You can modify it as needed. Naming rules are as follows:

    • Can contain only Chinese characters, letters, underscores (_), and numbers.

    • Length cannot exceed 64 characters.

    Datasource

    The data source drop-down list displays all PostgreSQL data sources, including those for which you have read permission and those for which you do not. Click the image icon to copy the current data source name.

    • If you do not have read permission on a data source, click Request next to the data source to request read permission. For the detailed steps, see Request, renew, and return data source permissions.

    • If you do not have a PostgreSQL data source yet, click New Data Source to create one. For instructions, see Creating a PostgreSQL Data Source.

    Time Zone

    Time-formatted data is processed based on the current time zone. By default, this is the time zone configured in the selected data source and cannot be modified.

    Note

    For tasks created before V5.1.2, select Data Source Default Configuration or Channel Configuration Time Zone. The default selection is Channel Configuration Time Zone.

    • Data Source Default Configuration: The default time zone of the selected data source.

    • Channel Configuration Time Zone: The time zone configured for the current integration task in Properties > Channel Configuration.

    Schema (Optional)

    Supports cross-schema table selection. Select the schema where the table resides. If not specified, the default is the schema configured in the data source.

    Source Table Quantity

    Select the source table quantity. Valid values are Single Table and Multiple Tables:

    • Single Table: Use this for scenarios where you sync business data from one table to one destination table.

    • Multiple Tables: Use this for scenarios where you sync business data from multiple tables to the same destination table. When data from multiple tables is written to the same data table, the union algorithm is used.

    Table Matching Method

    You can select General Rule or Database Regex.

    Note

    Configure this option only when Source Table Quantity is set to Multiple Tables.

    Table

    Select the source table:

    • If Source Table Quantity is set to Single Table, enter a table name keyword to search. Alternatively, enter the exact table name and click Precise Search. After selecting a table, the system automatically detects its status. Click the image icon to copy the name of the selected table.

    • If Source Table Quantity is set to Multiple Tables, add tables by entering different expressions based on the table matching method.

      • If General Rule is selected for the table matching method: In the input box, enter a table expression that filters for tables with the same structure. Enumeration, regex-like, and mixed forms are supported. For example, table_[001-100];table_102.

      • If Database Regex is selected for the table matching method: In the input box, enter a regular expression that the current database supports. The system matches tables in the source database against this regular expression. At run time, the task re-evaluates the expression, so newly created tables that match are included in the synchronization.

      After entering the expression, click Precise Search. This displays a list of matched tables in the Confirm Match Details dialog box.
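As an illustration only (not Dataphin's actual parser), the following sketch shows how an enumeration-plus-range expression such as table_[001-100];table_102 from the example above could expand into concrete table names. The expression syntax comes from the example; the parsing logic and function name are assumptions.

```python
import re

def expand_table_expression(expr: str) -> list[str]:
    """Expand a semicolon-separated table expression.

    Supports plain names (table_102) and zero-padded numeric
    ranges (table_[001-100]), mirroring the example syntax.
    This is an illustrative sketch, not Dataphin's implementation.
    """
    tables = []
    for part in filter(None, (p.strip() for p in expr.split(";"))):
        m = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", part)
        if m:
            prefix, lo, hi, suffix = m.groups()
            width = len(lo)  # preserve zero-padding, e.g. 001
            for n in range(int(lo), int(hi) + 1):
                tables.append(f"{prefix}{n:0{width}d}{suffix}")
        else:
            tables.append(part)
    return tables

names = expand_table_expression("table_[001-003];table_102")
print(names)  # ['table_001', 'table_002', 'table_003', 'table_102']
```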

    Shard Key (Optional)

    The system partitions data based on the configured shard key field. Use this parameter together with the concurrency configuration to achieve concurrent reads. You can use a column in the source table as the shard key. We recommend a primary key or an indexed column as the shard key to ensure transfer performance.

    Important

    When you select a datetime-type shard key, the system identifies the minimum and maximum values and splits the total time range evenly by the configured concurrency. This does not guarantee an even distribution of the data.
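A minimal sketch of the kind of split described above, assuming the system divides the min-to-max range into equal time spans by concurrency (function and variable names are illustrative, not Dataphin's API). Because each shard covers an equal span of time, not an equal number of rows, the row distribution can be uneven:

```python
from datetime import datetime

def split_time_range(lo: datetime, hi: datetime, concurrency: int):
    """Split [lo, hi] into `concurrency` contiguous sub-ranges.

    Each sub-range covers an equal span of *time*, which is why
    this does not guarantee an even distribution of *rows*.
    """
    step = (hi - lo) / concurrency
    return [(lo + i * step, lo + (i + 1) * step) for i in range(concurrency)]

shards = split_time_range(datetime(2026, 1, 1), datetime(2026, 1, 5), 4)
for start, end in shards:
    print(start, "->", end)  # four one-day shards
```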

    Batch Read Count (Optional)

    The number of data records read at one time. When reading data from the source database, configure a specific batch read count (such as 1024 records) instead of reading one record at a time. This reduces interactions with the data source, improves I/O efficiency, and lowers network latency.
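To illustrate why batching reduces round trips, here is a hedged sketch in the style of a database cursor's fetchmany interface; the `fake_fetchmany` stand-in simulates a source table and is purely illustrative:

```python
def read_in_batches(fetchmany, batch_size=1024):
    """Yield rows in batches rather than one round trip per row.

    `fetchmany` stands in for a database cursor's fetchmany():
    one call returns up to `batch_size` rows, so a large table
    costs far fewer network interactions than row-at-a-time reads.
    """
    while True:
        batch = fetchmany(batch_size)
        if not batch:
            return
        yield batch

# Simulate a source table of 2,500 rows.
rows = iter(range(2500))
fake_fetchmany = lambda n: [r for _, r in zip(range(n), rows)]

batches = list(read_in_batches(fake_fetchmany, batch_size=1024))
print([len(b) for b in batches])  # [1024, 1024, 452]
```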

    Input Filter (Optional)

    Enter a filter condition for the input fields, for example, ds=${bizdate}. Input filtering applies to the following two scenarios:

    • Filtering for a fixed subset of data.

    • Filtering with scheduling parameters, such as ${bizdate}.
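As a sketch of the parameter-filtering scenario, the following shows how a filter such as ds=${bizdate} could resolve to a concrete condition at run time. The ${bizdate} placeholder is taken from the example above; the substitution logic here is an assumption, not Dataphin's implementation:

```python
import re

def resolve_filter(filter_expr: str, params: dict) -> str:
    """Replace ${name} placeholders with concrete parameter values."""
    return re.sub(r"\$\{(\w+)\}", lambda m: str(params[m.group(1)]), filter_expr)

where = resolve_filter("ds=${bizdate}", {"bizdate": "20260305"})
print(where)  # ds=20260305
```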

    Output Fields

    The Output Fields area displays all fields in the selected table and those matching the filter conditions. It supports the following operations:

    • Field Management: If you do not need to output certain fields to downstream components, delete the corresponding fields:

      • Single field deletion: To delete a small number of fields, click the delete icon in the Operation column of each unwanted field.

      • Batch field deletion: To delete a large number of fields, click Field Management. In the Field Management dialog box, select the fields to remove, click the left-moving icon to move them to the unselected fields list, and then click OK to complete the batch deletion.


    • Batch Add: Click Batch Add. This supports batch configuration in JSON, TEXT, and DDL formats.

      Note

      After batch addition, clicking OK will overwrite the configured field information.

      • Batch configure in JSON format. For example:

        [
          {
            "index": 0,
            "name": "id",
            "type": "int(10)",
            "mapType": "Long",
            "comment": "comment1"
          },
          {
            "index": 1,
            "name": "user_name",
            "type": "varchar(255)",
            "mapType": "String",
            "comment": "comment2"
          }
        ]
        Note

        index indicates the column number of the specified object, name indicates the field name after import, and type indicates the field type after import. For example, "index":3,"name":"user_id","type":"String" means that the fourth column in the file is imported with the field name user_id and the field type String.

      • Batch configure in TEXT format. For example:

        0,id,int(10),Long,comment1
        1,user_name,varchar(255),String,comment2
        • The row delimiter separates information for each field. The default is a line feed (\n). It supports line feed (\n), semicolon (;), and period (.).

        • The column delimiter separates the attributes of a field, such as the field name and the field type. Only the comma (,) is supported. The field type is optional.

      • Batch configure in DDL format. For example:

        CREATE TABLE tablename (
            user_id serial,
            username VARCHAR(50),
            password VARCHAR(50),
            email VARCHAR(255),
            created_on TIMESTAMP
        );
    • Create Output Field: Click + Create Output Field, and as prompted, specify Column, Type, Description, and select a Mapping Type. After you configure this row, click the image icon to save.
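The JSON batch-configuration format shown above can be sanity-checked with standard JSON tooling. The sketch below is illustrative (the required-key set is inferred from the example; the function name is hypothetical) and simply parses the field list and orders it by index:

```python
import json

def parse_field_config(text: str) -> list[dict]:
    """Parse and sanity-check a JSON field list like the batch-add example."""
    fields = json.loads(text)
    for f in fields:
        missing = {"index", "name", "type"} - f.keys()
        if missing:
            raise ValueError(f"field {f!r} missing keys: {missing}")
    return sorted(fields, key=lambda f: f["index"])

cfg = '''[
  {"index": 1, "name": "user_name", "type": "varchar(255)", "mapType": "String"},
  {"index": 0, "name": "id", "type": "int(10)", "mapType": "Long"}
]'''
fields = parse_field_config(cfg)
print([f["name"] for f in fields])  # ['id', 'user_name']
```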

  8. Click OK to complete the property configuration for the PostgreSQL input component.