Dataphin: Configure the Impala output component

Last Updated: Feb 12, 2026

The Impala output component writes data to an Impala data source. When you synchronize data from another data source to an Impala data source, configure the Impala output component after you configure the source data source. This topic describes how to configure it.

Prerequisites

Procedure

  1. On the Dataphin home page, select Development > Data Integration from the top menu bar.

  2. In the top menu bar of the Data Integration page, select Project. In Dev-Prod mode, you must also select Environment.

  3. In the navigation pane on the left, click Batch Pipeline, and in the Batch Pipeline list, click the offline pipeline you want to develop to open its configuration page.

  4. Click Component Library in the upper right corner of the page to open the Component Library panel.

  5. In the Component Library panel's left-side navigation pane, select Output, find the Impala component in the list on the right, and drag it to the canvas.

  6. Click and drag the icon on the target input, transform, or flow component to connect it to the Impala output component.

  7. Click the icon on the Impala output component card to open the Impala Output Configuration dialog box.

  8. In the Impala Output Configuration dialog box, configure the parameters.

    Parameter

    Description

    Basic settings

    Step name

    The name of the Impala output component. Dataphin generates a step name automatically, and you can change it to suit your business scenario. The name must meet the following requirements:

    • Can only contain Chinese characters, letters, underscores (_), and numbers.

    • Cannot exceed 64 characters.
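The two naming rules above can be checked before you type a name into the dialog box. The following is a minimal sketch, not Dataphin's own validation: the function name `is_valid_step_name` is hypothetical, "Chinese characters" is assumed to mean the CJK Unified Ideographs range U+4E00 to U+9FFF, and "letters" and "numbers" are assumed to mean ASCII letters and digits.

```python
import re

# Assumption: Chinese characters = U+4E00..U+9FFF; letters and numbers = ASCII.
# The {1,64} quantifier enforces the 64-character limit and rejects empty names.
STEP_NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_]{1,64}")


def is_valid_step_name(name: str) -> bool:
    """Return True if the name uses only allowed characters and is 1 to 64 characters long."""
    return STEP_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_step_name("impala_output_1"))  # True
print(is_valid_step_name("bad-name"))         # False: hyphen is not allowed
print(is_valid_step_name("a" * 65))           # False: longer than 64 characters
```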

    Data source

    The drop-down list shows all Impala data sources, including those for which you have write-through permission and those for which you do not.

    • For a data source without write-through permission, click Request next to the data source to request the permission. For more information, see Request data source permissions.

    • If you do not have an Impala data source, click Create Data Source to create one. For more information, see Create an Impala data source.

    Table

    Select the target table for the output data. Click the copy icon to copy the name of the currently selected table.

    Loading policy

    Impala supports only the append policy. The overwrite policy is not supported. Under the append policy, a primary key or constraint violation is reported as a dirty data error.

    Batch write data volume

    The maximum amount of data written in one batch. You can also set Batch write count. The system writes a batch as soon as either limit is reached, whichever comes first. The default is 32 MB.

    Batch write count

    The maximum number of records written in one batch. The default is 2048 records. Data synchronization writes data in batches, controlled by two parameters: Batch write count and Batch write data volume.

    • When the accumulated data reaches either limit (the batch write data volume or the batch write count), the system treats the batch as full and immediately writes it to the target in a single operation.

    • We recommend setting the batch write data volume to 32 MB. Adjust the batch write count to the actual size of a single record. It is usually set to a large value to take full advantage of batch writing. For example, if a single record is about 1 KB and you set the batch write data volume to 16 MB, set the batch write count to a value greater than 16 MB divided by 1 KB, that is, greater than 16384 records. Assume it is set to 20000. With this configuration, the data volume limit triggers each write: a batch is written whenever the accumulated data reaches 16 MB.
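The "whichever limit is reached first" rule and the 16 MB example can be illustrated with a short simulation. This is a sketch under stated assumptions, not Dataphin's implementation: the function `batch_records` and its parameter names are hypothetical, sizes are counted in bytes with 1 KB = 1024 bytes, and the limits are assumed to be checked after each record is added to the buffer.

```python
from typing import Iterable, Iterator, List

MB = 1024 * 1024


def batch_records(
    records: Iterable[bytes],
    max_bytes: int = 32 * MB,   # mirrors the Batch write data volume default
    max_count: int = 2048,      # mirrors the Batch write count default
) -> Iterator[List[bytes]]:
    """Yield batches, closing a batch when either limit is reached."""
    batch: List[bytes] = []
    size = 0
    for record in records:
        batch.append(record)
        size += len(record)
        # Assumption: limits are evaluated after appending each record.
        if len(batch) >= max_count or size >= max_bytes:
            yield batch
            batch, size = [], 0
    if batch:  # write the final, partially filled batch
        yield batch


record = b"x" * 1024  # one record of about 1 KB
batches = batch_records([record] * 40000, max_bytes=16 * MB, max_count=20000)
print([len(b) for b in batches])  # [16384, 16384, 7232]
```

Because 16384 records of 1 KB fill 16 MB before the count reaches 20000, the data volume limit closes each full batch. With the defaults (32 MB and 2048 records), the same 1 KB records would instead be cut by the count limit at about 2 MB per batch.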

    Field mapping

    Input field

    Displays the input fields based on the output of the upstream component.

    Output field

    Displays the output fields. Click Field Management to select output fields.

    • Click the corresponding icon to move fields from Selected Input Fields to Unselected Input Fields.

    • Click the corresponding icon to move fields from Unselected Input Fields to Selected Input Fields.

    Mapping relationship

    Manually map the upstream input fields to the fields of the target table. Quick Mapping offers two modes: Row Mapping and Name Mapping.

    • Name Mapping: Maps fields with the same field name.

    • Row Mapping: Maps fields by position. Use it when the field names of the source table and target table differ but the fields in corresponding rows need to be mapped. Only fields in the same row are mapped.
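The difference between the two quick mapping modes can be sketched in a few lines. The functions `name_mapping` and `row_mapping` and the sample field names are hypothetical and only illustrate the idea. Exact, case-sensitive name matching is an assumption, not documented Dataphin behavior.

```python
from typing import List, Tuple


def name_mapping(input_fields: List[str], output_fields: List[str]) -> List[Tuple[str, str]]:
    """Map each input field to the output field with the same name."""
    targets = set(output_fields)
    # Assumption: names must match exactly, including case.
    return [(field, field) for field in input_fields if field in targets]


def row_mapping(input_fields: List[str], output_fields: List[str]) -> List[Tuple[str, str]]:
    """Map fields by position; fields without a counterpart in the same row stay unmapped."""
    return list(zip(input_fields, output_fields))


upstream = ["id", "user_name", "amount"]
target = ["id", "name", "amount", "dt"]

print(name_mapping(upstream, target))
# [('id', 'id'), ('amount', 'amount')]
print(row_mapping(upstream, target))
# [('id', 'id'), ('user_name', 'name'), ('amount', 'amount')]
```

In this sample, Name Mapping leaves `user_name` unmapped because no target field has that name, while Row Mapping pairs it with `name` because both sit in the second row.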

  9. Click OK to complete the configuration of the Impala output component.