Dataphin: Configure the Doris input component

Last Updated: Mar 09, 2026

After you configure the Doris input component, you can read data from a Doris data source into Dataphin for data integration and data development. This topic describes how to configure the Doris input component.

Prerequisites

  • You have created a Doris data source. For more information, see Create a Doris data source.

  • The account that you use to configure the Doris input component must have sync read permission for the data source. If the permission is not granted, request it. For more information, see Request data source permissions.

Procedure

  1. On the Dataphin home page, in the top menu bar, choose Develop > Data Integration.

  2. On the integration page, in the top menu bar, select a Project. In Dev-Prod mode, also select an environment.

  3. In the navigation pane on the left, click Offline Integration. Then, in the Offline Integration list, click the offline pipeline you want to develop to open its configuration page.

  4. In the upper-right corner of the page, click Component Library to open the Component Library panel.

  5. In the left navigation pane of the Component Library panel, select Input. In the list of input components on the right, locate the Doris component and drag it to the canvas.

  6. Click the icon on the Doris input component card to open the Doris Input Configuration dialog box.

  7. In the Doris Input Configuration dialog box, configure the parameters.

    Parameter

    Description

    Step Name

    The name of the Doris input component. Dataphin automatically generates a step name. You can also change the name as needed. The naming convention is as follows:

    • The name can contain only Chinese characters, letters, underscores (_), and digits.

    • The name cannot exceed 64 characters in length.
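The naming rules above can be checked before a name is entered. The following is a minimal sketch, not Dataphin's own validation: the function name `is_valid_step_name` is hypothetical, "Chinese characters" is approximated by the basic CJK Unified Ideographs range, and a minimum length of one character is assumed.

```python
import re

# Assumption: "Chinese characters" is approximated by the basic CJK Unified
# Ideographs range (U+4E00 to U+9FFF); Dataphin's exact check may differ.
# Assumption: an empty name is rejected (minimum length of 1).
_STEP_NAME_RE = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_]{1,64}")


def is_valid_step_name(name: str) -> bool:
    """Return True if name uses only allowed characters and has 1 to 64 characters."""
    return _STEP_NAME_RE.fullmatch(name) is not None
```

For example, `is_valid_step_name("doris_input_1")` passes, while a name containing a hyphen or a name longer than 64 characters is rejected.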

    Datasource

    The drop-down list displays all Doris data sources in the current Dataphin project, including those for which you do not have sync read permission. Click the copy icon to copy the current data source name.

    • For a data source for which you do not have sync read permission, click Request next to the data source to request the permission. For more information, see Request data source permissions.

    • If you do not have a Doris data source, click Create Data Source to create one. For more information, see Create a Doris data source.

    Number of source tables

    Specify whether to read from one table or from multiple tables that have the same schema. The options are Single Table and Multiple Tables:

    • Single Table: Use this option to sync data from one table to a single destination table.

    • Multiple Tables: Use this option to sync data from multiple tables to a single destination table. The data from all matched tables is combined by using a union before it is written to the destination table.

    Table matching mode

    You can select General Rule or Database Regex.

    Note

    This parameter is available only when Number of source tables is set to Multiple Tables.

    Table

    Select the source table or tables:

    • If you set Number of source tables to Single Table, enter a keyword to search for the table, or enter the exact table name and click Exact Match. After you select a table, the system automatically checks its status. Click the copy icon to copy the name of the selected table.

    • If you set Number of source tables to Multiple Tables, enter an expression to add tables based on the selected table matching mode.

      • If you set Table matching mode to General Rule, enter an expression in the input box to filter for tables that have the same schema. The system supports enumerations, regular expression-like patterns, and a mix of both. For example, table_[001-100];table_102;.

      • If you set Table matching mode to Database Regex, enter a regular expression that the source database supports. The system matches tables in that database against the regular expression. While the sync task runs, the system also dynamically picks up and synchronizes any new tables that match the expression.

      After you enter the expression, click Exact Match to view the list of matched tables in the Confirm Match Details dialog box.
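To illustrate how a General Rule expression such as table_[001-100];table_102; might resolve to table names, here is a minimal sketch. The function `expand_general_rule` is hypothetical, and its reading of the syntax is an assumption: `[start-end]` is treated as an inclusive numeric range zero-padded to the width of `start`, and `;` separates items. The actual parser in Dataphin may behave differently, so confirm the result in the Confirm Match Details dialog box.

```python
import re
from typing import List


def expand_general_rule(expression: str) -> List[str]:
    """Expand an expression such as 'table_[001-100];table_102;' into table names.

    Assumption: '[start-end]' is an inclusive numeric range, zero-padded to the
    width of 'start', and ';' separates items. The real parser may differ.
    """
    names: List[str] = []
    for item in expression.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate the trailing ';'
        match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", item)
        if match is None:
            names.append(item)  # plain enumeration entry
            continue
        prefix, start, end, suffix = match.groups()
        width = len(start)
        for number in range(int(start), int(end) + 1):
            names.append(f"{prefix}{number:0{width}d}{suffix}")
    return names
```

Under these assumptions, the example expression yields `table_001` through `table_100` plus `table_102`.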

    Split key

    The column used to split the data for reading. The split key must be a column of an integer type from the source table. We recommend that you use the primary key or an indexed column. When reading data, the system partitions the data based on the split key, which enables concurrent reads and improves synchronization efficiency.
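The idea behind a split key can be sketched as dividing the key's value range into contiguous slices that are read concurrently. This is an illustration only: the helper names `split_ranges` and `split_predicates` are hypothetical, and an even split of the range between the minimum and maximum key is an assumed strategy, not necessarily the one Dataphin uses.

```python
from typing import List, Tuple


def split_ranges(min_key: int, max_key: int, parallelism: int) -> List[Tuple[int, int]]:
    """Divide [min_key, max_key] into at most `parallelism` contiguous inclusive ranges."""
    if parallelism < 1 or min_key > max_key:
        raise ValueError("parallelism must be >= 1 and min_key <= max_key")
    total = max_key - min_key + 1
    parts = min(parallelism, total)  # never create empty ranges
    base, extra = divmod(total, parts)
    ranges: List[Tuple[int, int]] = []
    lower = min_key
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first ranges
        upper = lower + size - 1
        ranges.append((lower, upper))
        lower = upper + 1
    return ranges


def split_predicates(column: str, ranges: List[Tuple[int, int]]) -> List[str]:
    """Render one filter condition per range; each could drive one concurrent read."""
    return [f"{column} >= {lo} AND {column} <= {hi}" for lo, hi in ranges]
```

A skewed key distribution produces uneven slices under this scheme, which is one reason an evenly distributed column such as an auto-increment primary key is a good candidate.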

    Batch read size

    The number of records to read at a time. When reading from the source database, you can configure a specific batch size, such as 1024 records. Reading in batches instead of one record at a time reduces the number of round trips to the data source, which improves I/O efficiency and reduces the impact of network latency.
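The effect of a batch read size can be sketched with Python's standard DB-API, where `cursor.fetchmany(size)` returns up to `size` rows per call. This is a generic illustration, not Dataphin's reader: an in-memory SQLite database stands in for the Doris source, and the `orders` table and the `read_in_batches` helper are hypothetical.

```python
import sqlite3
from typing import Iterator, List, Tuple


def read_in_batches(cursor: sqlite3.Cursor, batch_size: int = 1024) -> Iterator[List[Tuple]]:
    """Yield lists of up to batch_size rows until the result set is exhausted."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# Stand-in source: an in-memory SQLite table with 2,500 rows (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders (id, amount) VALUES (?, ?)",
    [(i, i * 10) for i in range(1, 2501)],
)

cursor = conn.execute("SELECT id, amount FROM orders ORDER BY id")
batch_sizes = [len(batch) for batch in read_in_batches(cursor, 1024)]
conn.close()
```

With 2,500 rows and a batch size of 1024, the reader makes three fetches instead of 2,500 single-row fetches.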

    Input filter

    A filter condition to extract specific data. The configuration is as follows:

    • Use a static value to extract corresponding data. For example, ds=20210101.

    • Use a variable to extract a subset of data. For example, ds=${bizdate}.
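The difference between a static filter and a variable filter can be sketched as a simple placeholder substitution. The helper `render_filter` below is hypothetical and only mimics the idea of replacing `${name}` placeholders with values at run time; how Dataphin resolves variables such as `${bizdate}` is not shown here.

```python
import re
from typing import Dict


def render_filter(template: str, variables: Dict[str, str]) -> str:
    """Replace each ${name} placeholder in template with its value from variables."""

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value supplied for variable {name!r}")
        return variables[name]

    return re.sub(r"\$\{(\w+)\}", replace, template)
```

For example, `render_filter("ds=${bizdate}", {"bizdate": "20210101"})` produces the same condition as the static filter `ds=20210101`, but the value can change with each run.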

    Output Fields

    The Output Fields section displays all fields from the selected tables that match the filter conditions. You can perform the following operations:

    • Manage fields: If you do not need to output certain fields to downstream components, you can delete them:

      • To delete a single field: Click the delete icon in the Actions column of the field that you want to remove.

      • To delete fields in a batch: Click Manage Fields. In the Manage Fields dialog box, select multiple fields, click the shift-left icon to move the selected input fields to the unselected list, and then click OK.

    • Batch Add: Click Batch Add to configure fields in a batch using JSON, TEXT, or DDL format.

      Note

      After you add fields in a batch and click OK, the existing field configuration is overwritten.

      • To configure in JSON format, for example:

        [
          {
            "index": 1,
            "name": "id",
            "type": "int(10)",
            "mapType": "Long",
            "comment": "comment1"
          },
          {
            "index": 2,
            "name": "user_name",
            "type": "varchar(255)",
            "mapType": "String",
            "comment": "comment2"
          }
        ]

        Note

        `index` specifies the column number of the object. `name` specifies the field name after import. `type` specifies the field type after import. For example, "index":3,"name":"user_id","type":"String" means that the fourth column of the file is imported with the field name `user_id` and the field type `String`.
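Before pasting a JSON field list into the Batch Add dialog box, it can help to check that the text parses and that each entry carries the expected keys. The following is a minimal sketch under stated assumptions: the function `parse_json_fields` is hypothetical, and treating `index`, `name`, and `type` as required while `mapType` and `comment` are optional is an assumption drawn from the examples in this topic, not a documented rule.

```python
import json
from typing import Any, Dict, List

# Assumption: these keys are required; mapType and comment are treated as optional.
REQUIRED_KEYS = {"index", "name", "type"}


def parse_json_fields(text: str) -> List[Dict[str, Any]]:
    """Parse a JSON array of field definitions and return it sorted by index."""
    fields = json.loads(text)  # standard JSON: comment lines are not allowed
    if not isinstance(fields, list):
        raise ValueError("expected a JSON array of field objects")
    for field in fields:
        missing = REQUIRED_KEYS - set(field)
        if missing:
            raise ValueError(f"field {field!r} is missing keys: {sorted(missing)}")
    return sorted(fields, key=lambda f: f["index"])


example = """
[
  {"index": 1, "name": "id", "type": "int(10)", "mapType": "Long", "comment": "comment1"},
  {"index": 2, "name": "user_name", "type": "varchar(255)", "mapType": "String", "comment": "comment2"}
]
"""
parsed = parse_json_fields(example)
```

Here `parsed` is a list of two dictionaries, ordered by `index`.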

      • To configure in TEXT format, for example:

        1,id,int(10),Long,comment1
        2,user_name,varchar(255),String,comment2

        • The row delimiter separates the information for each field. The default delimiter is a line feed (\n). Semicolons (;) and periods (.) are also supported.

        • The column delimiter separates the field name from the field type. The default delimiter is a half-width comma (,). The field type can be omitted.
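The TEXT format can be read as delimiter-separated rows. The sketch below shows one way to parse it; the function `parse_text_fields` is hypothetical, and the column order (index, name, type, mapping type, comment) is inferred from the example rows rather than taken from a specification. A type that itself contains the column delimiter, such as decimal(10,2), would not survive this naive split.

```python
from typing import Any, Dict, List

# Assumption: column order inferred from the example rows in this topic.
COLUMNS = ("index", "name", "type", "mapType", "comment")


def parse_text_fields(
    text: str, row_delimiter: str = "\n", column_delimiter: str = ","
) -> List[Dict[str, Any]]:
    """Split TEXT-format field definitions into one dictionary per row."""
    fields: List[Dict[str, Any]] = []
    for row in text.split(row_delimiter):
        row = row.strip()
        if not row:
            continue  # skip blank rows, such as a trailing line feed
        parts = [part.strip() for part in row.split(column_delimiter)]
        field: Dict[str, Any] = dict(zip(COLUMNS, parts))  # trailing columns may be absent
        field["index"] = int(field["index"])
        fields.append(field)
    return fields


rows = parse_text_fields("1,id,int(10),Long,comment1\n2,user_name,varchar(255),String,comment2\n")
```

Passing `row_delimiter=";"` handles input in which fields are separated by semicolons instead of line feeds.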

      • To configure in DDL format, for example:

        CREATE TABLE tablename (
            user_id serial,
            username VARCHAR(50),
            password VARCHAR(50),
            email VARCHAR(255),
            created_on TIMESTAMP
        );

    • Add Output Field: Click + Add Output Field, and follow the prompts to enter the Column, Type, and Comment, and select the Mapping Type. After you configure the current row, click the save icon to save it.

  8. Click Confirm to save the configuration for the Doris input component.