Dataphin:Configure local file input component

Last Updated:Mar 05, 2026

The local file input component uploads local text, Excel (xls, xlsx), and CSV files to Dataphin so that their data can be synchronized to other data sources. This topic describes how to configure the local file input component.

Limits

The local file input component can be configured only for one-time tasks.

Procedure

  1. In the top menu bar of the Dataphin home page, choose Development > Data Integration.

  2. In the top menu bar of the integration page, select a project. In Dev-Prod mode, you must also select the environment.

  3. In the left-side navigation pane, click Batch Pipeline. Then, in the Batch Pipeline list, click the desired offline pipeline to open its configuration page.

  4. Click the Component Library in the upper right corner to open the Component Library panel.

  5. In the Component Library panel's left-side navigation pane, select Input. Locate the Local File component in the list on the right and drag it onto the canvas.

  6. Click the icon on the component card to open the Local File Input Configuration dialog box.

  7. In the Local File Input Configuration dialog box, select a file type (text, csv, xls, or xlsx) and configure the parameters for that type as follows:

    Text file type

    Parameter

    Description

    Step Name

    The local file input component's step name is auto-generated by Dataphin, but you can modify it to suit your business scenario. The naming convention is as follows:

    • Must only contain Chinese characters, uppercase and lowercase English letters, underscores (_), and numbers.

    • Should not exceed 64 characters in length.

    File Type

    Choose the text file type.

    File Path

    Click Select File or drag the file into the file path area.

    Note

    Only .txt files are supported, and the file size cannot exceed 500 MB.

    First Row Content Type

    Choose between Data Content and Column Name for the first row content type.

    Data Content Start Row

    • If the first row content type is Column Name, the data content start row must be 2 or greater.

    • If the first row content type is Data Content, the data content start row must be 1 or greater.

    Row Delimiter, Column Delimiter (optional)

    Row Delimiter: The delimiter separating rows in the file. If not specified, the default is \n. For other characters, enter them and click Parse.

    Column Delimiter: The delimiter separating columns in the file. If not specified, the default is a comma (,).

    File Encoding

    Select the file encoding method. The system supports UTF-8 and GBK encoding.

    Advanced Configuration

    Enter the read control settings in JSON format. For example:

    {
      "textReaderConfig": {
        "caseSensitive": true,
        "useTextQualifier": false,
        "textQualifier": "\"",
        "trimWhitespace": false
      }
    }

    Create Output Fields

    The output fields will be displayed for configuration.

    • Batch Add Fields.

      1. Click Batch Add.

        • Enter batch configuration in JSON format. Example:

          [{
            "index": 0,
            "name": "cf1a",
            "type": "String"
          },
          {
            "index": 1,
            "name": "cf1b",
            "type": "String"
          }]

          Note

          Here, 'index' refers to the column number, 'name' to the field name, and 'type' to the field type. For instance, "name":"user_id","type":"String" means the field named user_id is introduced as a String type.

        • Enter batch configuration in TEXT format. Example:

          0,cf1a,String
          1,cf1b,String

          • The row delimiter separates one field's information from the next. The default is a line feed (\n); semicolon (;) and period (.) are also supported.

          • The column delimiter separates the column number, field name, and field type within a row. The default is a comma (,).

      2. Click Confirm to save the configuration.

    • Create Output Fields.

      Click Create Output Fields, enter the Source Ordinal Number and Column, and select the Type as prompted. For the text file type, the source ordinal number is the numeric position of the field's column in the file, starting from 0.

    • Manage Output Fields.

      You can perform the following actions on the added fields:

      • Click and drag the Column icon to reorder the fields.

      • Click the edit icon in the operation column to modify an existing field.

      • Click the delete icon in the operation column to remove the field.
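The two batch formats above carry the same information, so one can be derived from the other. The following is a minimal Python sketch, not part of Dataphin, that converts a TEXT-format definition into the JSON format. The function name `text_to_json_fields` is hypothetical, and the sketch assumes every row holds exactly three values (column number, field name, field type).

```python
import json


def text_to_json_fields(text, row_delimiter="\n", column_delimiter=","):
    """Convert TEXT-format batch field definitions into the JSON structure.

    Assumption: each non-empty row is "<column number>,<field name>,<field type>".
    A row with a different number of values raises ValueError.
    """
    fields = []
    for row in text.split(row_delimiter):
        row = row.strip()
        if not row:
            continue  # skip blank rows, for example after a trailing line feed
        index, name, field_type = (part.strip() for part in row.split(column_delimiter))
        fields.append({"index": int(index), "name": name, "type": field_type})
    return fields


text_config = "0,cf1a,String\n1,cf1b,String"
fields = text_to_json_fields(text_config)
print(json.dumps(fields, indent=2))
```

To handle a definition that uses a semicolon as the row delimiter, pass `row_delimiter=";"`.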

    CSV file type

    Parameter

    Description

    Step Name

    Enter the name for the local file input step. While Dataphin generates a default step name, you may customize it to suit your business scenario. Adhere to the following naming convention:

    • Include only Chinese characters, uppercase and lowercase English letters, underscores (_), and numbers.

    • Limit the name to a maximum of 64 characters.

    File Type

    Choose the CSV file type.

    File Path

    Click Select File or drag the file into the file path area.

    Note

    Only .csv files are supported, and the file size cannot exceed 500 MB.

    Character Delimiter

    Specify the column delimiter for the file. If left blank, the default is a comma (,).

    File Encoding

    Select the encoding for the file. Supported encodings include UTF-8 and GBK.

    First Row Content Type

    Choose between Data Content and Column Name for the first row content type.

    Data Content Start Row

    • If Column Name is selected for the first row, the data content must start from row 2 or higher.

    • If Data Content is selected for the first row, the data content must start from row 1 or higher.

    Create Output Fields

    The output fields will be displayed here.

    • Batch add fields

      1. Click Batch Add.

        • Enter batch configuration in JSON format. For example:

          [{
            "index": 0,
            "name": "cf1a",
            "type": "String"
          },
          {
            "index": 1,
            "name": "cf1b",
            "type": "String"
          }]

          Note

          Here, 'index' specifies the column number, 'name' the field name, and 'type' the field data type. For instance, "name":"user_id","type":"String" adds a field named user_id with a String data type.

        • Enter batch configuration in TEXT format. For example:

          0,cf1a,String
          1,cf1b,String

          • The row delimiter separates one field's information from the next. The default is a line feed (\n); semicolon (;) and period (.) are also supported.

          • The column delimiter separates the column number, field name, and field type within a row. The default is a comma (,).

      2. Click Confirm.

    • Create Output Fields.

      Click Create Output Fields, enter the Source Ordinal Number and Column, and select the Type as prompted. For the CSV file type, the source ordinal number is the numeric position of the field's column in the file, starting from 0.

    • Manage Output Fields.

      The following actions can be taken on the fields you've added:

      • Click and drag the Column icon to reorder the fields.

      • Click the edit icon in the operation column to modify an existing field.

      • Click the delete icon in the operation column to remove the field.
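When the first row of a CSV file holds column names, the JSON for Batch Add can be drafted from that row instead of typed by hand. The following Python sketch is one way to do this outside Dataphin. The function name `fields_from_csv_header` is hypothetical, and assigning `String` to every field is an assumption that you would adjust per column.

```python
import csv
import io
import json


def fields_from_csv_header(stream, delimiter=","):
    """Build batch-add field definitions from the first row of a CSV stream.

    Assumptions: the first row contains column names, and every field is
    given the type "String" as a placeholder.
    """
    reader = csv.reader(stream, delimiter=delimiter)
    header = next(reader)  # the first row, read as column names
    return [
        {"index": i, "name": name.strip(), "type": "String"}
        for i, name in enumerate(header)  # index starts at 0, like the source ordinal number
    ]


# In-memory sample so that the sketch runs without a file on disk.
sample = io.StringIO("user_id,city\n1001,Hangzhou\n")
fields = fields_from_csv_header(sample)
print(json.dumps(fields))
```

For a real file, open it with `open(path, newline="", encoding="utf-8")` (or `encoding="gbk"`) so that the encoding matches the File Encoding setting, and pass the same delimiter that you enter for the component.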

    XLS or XLSX file type

    Parameter

    Description

    Step Name

    The name of the step in the local file input process. Dataphin automatically generates a step name, but you can modify it to suit your business scenario. The naming convention is as follows:

    • Can contain only Chinese characters, uppercase and lowercase English letters, underscores (_), and numbers.

    • Cannot exceed 64 characters in length.

    File Type

    Choose either the xls or xlsx file type.

    File Path

    Click Select File or drag the file into the file path area.

    Note

    • For the xls file type, only .xls files are supported; for the xlsx file type, only .xlsx files are supported. The file size cannot exceed 500 MB.

    • Automatic parsing of output fields is available only for files up to 50 MB. For larger files, create the output fields manually.

    Sheet Selection

    Choose sheets by name or index.

    • By Name: Specify the sheet name to read.

    • By Index: Specify the sheet index to read, starting from 0.

    First Row Content Type

    Choose between Data Content and Column Name for the first row content type.

    Data Content Start Row

    If the first row is column names, the data content must start from row 2 or higher; if it's data content, it must start from row 1 or higher.

    Data Content End Row

    The data content end row must be equal to or greater than the start row. If unspecified, the system reads to the last row containing data by default.

    Export Sheet Name

    Optionally export the name of the source sheet. When selected, Output Fields will include a Source Sheet field, with the content formatted as {file name}-{sheet name}.

    File Encoding

    Select the encoding for the file. The system supports UTF-8 and GBK encoding methods.

    Output Fields

    Review the fields that will be output.

    • Batch Add Fields.

      1. Click Batch Add.

        • Enter batch configurations in JSON format. For example:

          [{
            "index": 0,
            "name": "cf1a",
            "type": "String"
          },
          {
            "index": 1,
            "name": "cf1b",
            "type": "String"
          }]

          Note

          Here, 'index' refers to the column number, 'name' to the field name, and 'type' to the field type. For instance, "name":"user_id","type":"String" means the field named user_id is introduced as a String type.

        • Enter batch configurations in TEXT format. For example:

          0,cf1a,String
          1,cf1b,String

          • The row delimiter separates one field's information from the next. The default is a line feed (\n); semicolon (;) and period (.) are also supported.

          • The column delimiter separates the column number, field name, and field type within a row. The default is a comma (,).

      2. Click Confirm.

    • Create Output Fields.

      Click Create Output Fields and follow the prompts to fill in the Source Ordinal Number, Column, and select the Type.

    • Manage Output Fields.

      Perform the following operations on the added fields:

      • Click and drag the Column icon to reorder the fields.

      • Click the edit icon in the operation column to modify an existing field.

      • Click the delete icon in the operation column to remove the field.
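The start row and end row rules for the xls and xlsx file types can be checked before you fill in the dialog box. The following Python sketch restates those rules and the Source Sheet value format. Both function names are hypothetical, and whether the file name part of the Source Sheet value includes the file extension is not stated in this topic, so the sketch passes the file name through unchanged.

```python
def validate_row_range(first_row_is_column_name, start_row, end_row=None):
    """Return a list of problems with the row settings (an empty list means valid).

    Rules taken from the parameter descriptions:
    - start row >= 2 when the first row holds column names, otherwise >= 1
    - end row, if given, must be >= start row
    """
    problems = []
    min_start = 2 if first_row_is_column_name else 1
    if start_row < min_start:
        problems.append(f"start row must be {min_start} or greater")
    if end_row is not None and end_row < start_row:
        problems.append("end row must be equal to or greater than the start row")
    return problems


def source_sheet_value(file_name, sheet_name):
    """Format the Source Sheet field as {file name}-{sheet name}."""
    return f"{file_name}-{sheet_name}"


print(validate_row_range(True, 1))          # start row too small for a header row
print(validate_row_range(False, 1, 500))    # valid settings
print(source_sheet_value("orders", "Sheet1"))
```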

  8. To finalize the configuration of the local file input component, click Confirm.
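The step name, file format, and file size limits described in this topic can be checked locally before you upload a file. The following Python sketch is illustrative only and is not a Dataphin API. The function name `check_upload` is hypothetical, and the sketch makes three assumptions: Chinese characters are matched with the Unicode range U+4E00 to U+9FFF, 1 MB is treated as 1024 × 1024 bytes, and file extensions are compared case-insensitively.

```python
import re

# Assumption: 1 MB = 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 500 * 1024 * 1024

# Assumption: "Chinese characters" means the CJK range U+4E00..U+9FFF.
STEP_NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_]{1,64}")

# File type selected in the dialog box -> extension the file must have.
ALLOWED_EXTENSIONS = {"text": ".txt", "csv": ".csv", "xls": ".xls", "xlsx": ".xlsx"}


def check_upload(step_name, file_type, file_name, size_bytes):
    """Return a list of problems found (an empty list means all checks passed)."""
    problems = []
    if not STEP_NAME_PATTERN.fullmatch(step_name):
        problems.append("step name has invalid characters or exceeds 64 characters")
    expected = ALLOWED_EXTENSIONS.get(file_type)
    if expected is None:
        problems.append(f"unknown file type: {file_type}")
    elif not file_name.lower().endswith(expected):
        problems.append(f"file type {file_type} requires a {expected} file")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file exceeds 500 MB")
    return problems


print(check_upload("load_orders_1", "csv", "orders.csv", 2 * 1024 * 1024))
print(check_upload("load-orders", "xlsx", "orders.xls", 600 * 1024 * 1024))
```

For xls and xlsx files, a similar comparison against 50 MB would indicate whether output fields can be parsed automatically or must be created manually.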