Dataphin: Configure the StarRocks output component

Last Updated: Mar 05, 2026

The StarRocks output component writes data to a StarRocks data source. When you synchronize data from another data source to StarRocks, first configure the source data source, then configure the StarRocks output component to write data to the target data source. This topic describes how to configure the StarRocks output component.

Prerequisites

  • You have added a StarRocks data source. For more information, see Create a StarRocks data source.

  • The account that you use to configure the properties of the StarRocks output component must have write-through permission for the data source. If you do not have the permission, you must request permission for the data source. For more information, see Request, renew, and revoke data source permissions.

Stream load data synchronization latency

When you import data into a StarRocks database using Stream Load, the returned status can vary. If the status is Publish Timeout, the job has succeeded, but the imported data might not be queryable immediately. Check the status in the operational log:

  • Success: The import succeeded, and data is visible.

  • Publish Timeout: The import job committed successfully, but the data is not yet visible to queries. Treat the import as successful. Do not re-import the data.

  • Label Already Exists: Another job already uses this label. That import might have succeeded or might still be in progress.

  • Fail: This import failed. You can retry the job by specifying a label.
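The status handling above can be sketched as a small decision function. This is only an illustration: the `Action` names and the `next_action` helper are hypothetical, and the status strings are assumed to appear in the operational log exactly as written in the list above.

```python
from enum import Enum


class Action(Enum):
    DONE = "done"                # import counts as successful, nothing to do
    CHECK_LABEL = "check_label"  # outcome unknown, inspect the job that owns the label
    RETRY = "retry"              # import failed, the job can be retried


def next_action(status: str) -> Action:
    """Map a Stream Load status (as listed above) to a follow-up action."""
    normalized = status.strip().lower()
    if normalized in ("success", "publish timeout"):
        # Publish Timeout is committed data that is not yet visible: no re-import.
        return Action.DONE
    if normalized == "label already exists":
        return Action.CHECK_LABEL
    if normalized == "fail":
        return Action.RETRY
    raise ValueError(f"unrecognized stream load status: {status!r}")
```

Treating Publish Timeout the same as Success avoids writing the same batch twice.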

Procedure

  1. On the Dataphin home page, select Development > Data Integration from the top menu bar.

  2. On the Integration page, in the top menu bar, select Project (in Dev-Prod mode, also select an environment).

  3. In the navigation pane on the left, click Offline Integration. Then, in the Offline Integration list, click the offline pipeline you want to develop to open its configuration page.

  4. Click Component Library in the upper-right corner of the page to open the Component Library panel.

  5. In the navigation pane on the left of the Component Library panel, select Output. Find the StarRocks component in the list on the right, and then drag it to the canvas.

  6. Drag the connection point on the target input component and connect it to the current StarRocks output component.

  7. Click the icon in the StarRocks output component card to open the StarRocks output configuration dialog box.

  8. In the StarRocks output configuration dialog box, configure the following parameters.

    Parameter

    Description

    Basic Settings

    Step Name

    The name of the StarRocks output component. Dataphin automatically generates the step name. You can modify it as needed. Naming conventions are as follows:

    • Can contain only Chinese characters, letters, underscores (_), and digits.

    • Length cannot exceed 64 characters.
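The naming conventions above can be checked with a regular expression. The sketch below is an assumption-laden illustration, not Dataphin's own validation: `is_valid_step_name` is a hypothetical helper, "Chinese characters" is approximated by the CJK Unified Ideographs range U+4E00 to U+9FFF, letters are assumed to mean ASCII letters, and an empty name is assumed to be invalid.

```python
import re

# Assumed character classes: CJK Unified Ideographs, ASCII letters, digits, underscore.
STEP_NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_]{1,64}")


def is_valid_step_name(name: str) -> bool:
    """Return True if the name uses only allowed characters and is at most 64 long."""
    return STEP_NAME_PATTERN.fullmatch(name) is not None
```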

    Datasource

    The data source drop-down list displays all StarRocks data sources, including those for which you have write-through permissions and those for which you do not. Click the copy icon to copy the current data source name.

    • For a data source for which you do not have write-through permissions, you can click Request to request the permissions. For more information, see Request data source permissions.

    • If you do not have a StarRocks data source, click New Data Source to create one. For more information, see Create a StarRocks data source.

    Table

    Select the target table for output data. You can enter a table name keyword to search, or enter the exact table name and click Exact Search. After you select a table, the system automatically checks the table status. Click the copy icon to copy the name of the currently selected table.

    If the StarRocks data source does not have a target table for data synchronization, use the one-click table creation feature to quickly generate a target table. The detailed procedure is as follows:

    1. Click One-Click Table Creation. Dataphin automatically generates the code required to create the target table, including the target table name (default: source table name) and field types (preliminary conversion based on Dataphin fields).

    2. You can modify the SQL script for creating the target table as needed, then click New. After the target table is successfully created, Dataphin automatically sets the newly created target table as the target table for output data.

      Note
      • If a table with the same name exists in the development environment, Dataphin reports an error after you click Create.

      • If no matching item is found, you can also integrate based on a manually entered table name.

      • View selection is not supported in copy mode.

    Production Table Missing Policy

    The policy for handling cases where the production table does not exist. You can select No Action or Automatic Creation. The default is Automatic Creation. If you select No Action, the production table is not created when the task is published. If you select Automatic Creation, a table with the same name is created in the target environment when the task is published.

    • No Action: If the target table does not exist, a message saying so is displayed when you submit, but you can still proceed with publishing. In this case, you must manually create the target table in the production environment before you can run the task.

    • Automatic Creation: You must Edit The Table Creation Statement. The statement is pre-filled by default based on the selected table, and you can make adjustments. In the statement, use the placeholder ${table_name} for the table name. Only this placeholder is supported, and it is replaced with the actual table name during execution.

      If the target table does not exist, the system first creates the table according to the table creation statement. If table creation fails, the publishing check result is 'failed'. Modify the table creation statement based on the error message, then publish again. If the target table already exists, table creation is not performed.
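The placeholder substitution described above amounts to a plain string replacement. The following sketch only illustrates that idea: `render_create_statement` is a hypothetical helper, and the DDL in the usage example is made up rather than generated by Dataphin.

```python
PLACEHOLDER = "${table_name}"  # the only placeholder the statement supports


def render_create_statement(template: str, table_name: str) -> str:
    """Replace the ${table_name} placeholder with the actual table name."""
    if PLACEHOLDER not in template:
        raise ValueError("the statement must refer to the table as ${table_name}")
    return template.replace(PLACEHOLDER, table_name)


# Illustrative statement with hypothetical columns.
template = "CREATE TABLE IF NOT EXISTS ${table_name} (id INT, name VARCHAR(50))"
rendered = render_create_statement(template, "orders")
```

Keeping the table name out of the statement lets the same text be reused when the production table has a different name from the development table.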

    Note

    This option is supported only in Dev-Prod mode projects.

    Data Format

    You can select CSV or JSON.

    If you select CSV, also configure CSV import column delimiter and CSV import row delimiter.

    CSV import column delimiter (Optional)

    The column delimiter used for Stream Load CSV import. The default is _@dp@_. To use the default, leave this parameter empty. If your data contains _@dp@_, specify a different delimiter.

    CSV import row delimiter (Optional)

    The row delimiter used for Stream Load CSV import. The default is _#dp#_. To use the default, leave this parameter empty. If your data contains _#dp#_, specify a different delimiter.
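To see why a delimiter that also occurs in the data is a problem, the sketch below joins rows with the default delimiters and reports any delimiter found inside a field value. It shows the general idea only: how Dataphin actually serializes rows is not described in this topic, and `encode_rows` and `find_conflicts` are hypothetical helpers.

```python
COLUMN_DELIMITER = "_@dp@_"  # default column delimiter
ROW_DELIMITER = "_#dp#_"     # default row delimiter


def find_conflicts(rows, column_delimiter=COLUMN_DELIMITER, row_delimiter=ROW_DELIMITER):
    """Return the set of delimiters that occur inside any field value."""
    conflicts = set()
    for row in rows:
        for value in row:
            text = str(value)
            if column_delimiter in text:
                conflicts.add(column_delimiter)
            if row_delimiter in text:
                conflicts.add(row_delimiter)
    return conflicts


def encode_rows(rows, column_delimiter=COLUMN_DELIMITER, row_delimiter=ROW_DELIMITER):
    """Join fields and rows with the given delimiters into one payload string."""
    return row_delimiter.join(
        column_delimiter.join(str(value) for value in row) for row in rows
    )
```

If `find_conflicts` returns a non-empty set for a sample of your data, choose delimiters that do not appear in it.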

    Batch write data volume (Optional)

    The maximum data volume per write operation. The default is 32 MB. You can also set the number of batch write records. During writing, the system writes a batch as soon as either limit (data volume or record count) is reached.

    Number of batch write records (Optional)

    The default value is 2,048 records. Data synchronization uses a batch writing policy that is controlled by two parameters: the number of batch write records and the batch write data volume.

    • When the accumulated data reaches either limit (batch write data volume or number of batch write records), the system considers the batch full and immediately writes it to the target.

    • The default batch write data volume is 32 MB. Set the number of batch write records based on the actual size of a single record. A larger value generally makes better use of batch writing. For example, if a single record is approximately 1 KB and the batch write data volume is set to 16 MB, set the number of batch write records to a value greater than 16 MB divided by 1 KB (that is, greater than 16,384 records), such as 20,000. With this configuration, the data volume limit is reached first, so a write is performed each time the accumulated data reaches 16 MB.
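The arithmetic in the example above can be reproduced with a short calculation. The sketch assumes uniformly sized records and binary units (1 KB = 1,024 bytes, 1 MB = 1,024 KB). `flush_trigger` is a hypothetical helper, not part of Dataphin.

```python
KB = 1024
MB = 1024 * KB


def flush_trigger(record_bytes: int, max_batch_bytes: int, max_batch_records: int):
    """Return (limit reached first, records per batch) for uniformly sized records."""
    # Records needed before the accumulated size reaches the volume limit (ceiling division).
    records_by_volume = -(-max_batch_bytes // record_bytes)
    if records_by_volume <= max_batch_records:
        # On a tie both limits are reached together; report the volume limit.
        return "volume", records_by_volume
    return "records", max_batch_records


# Example from the text: 1 KB records, 16 MB volume limit, 20,000-record limit.
tuned = flush_trigger(1 * KB, 16 * MB, 20_000)
# Defaults (32 MB, 2,048 records) with the same 1 KB records.
defaults = flush_trigger(1 * KB, 32 * MB, 2_048)
```

With 1 KB records, the default limits flush on the record count after about 2 MB, well below the 32 MB volume limit. This is why the text recommends raising the record count for small records.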

    Prepare Statement (Optional)

    The SQL script executed on the database before data import.

    For example, to keep a service continuously available, use the prepare statement to create the target table Target_A before this step writes data, and write the data to Target_A. After this step finishes writing data, rename Service_B (the table that continuously serves queries) to Temp_C, rename Target_A to Service_B, and then delete Temp_C.

    End Statement (Optional)

    The SQL script executed on the database after data import.
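The table swap described for the prepare and end statements can be written out as two short SQL scripts. The sketch below builds them as strings. `swap_statements` is a hypothetical helper. Creating Target_A with CREATE TABLE ... LIKE is an assumption, because the text only says that the table is created. The statements use StarRocks's ALTER TABLE ... RENAME and DROP TABLE syntax and should be verified against your StarRocks version.

```python
def swap_statements(target="Target_A", service="Service_B", temp="Temp_C"):
    """Return (prepare, end) SQL statement lists for the table swap pattern."""
    prepare = [
        # Assumption: the new table copies the schema of the serving table.
        f"CREATE TABLE {target} LIKE {service};",
    ]
    end = [
        f"ALTER TABLE {service} RENAME {temp};",    # move the serving table aside
        f"ALTER TABLE {target} RENAME {service};",  # promote the freshly loaded table
        f"DROP TABLE {temp};",                      # remove the old data
    ]
    return prepare, end


prepare_sql, end_sql = swap_statements()
```

The first list would go into Prepare Statement and the second into End Statement, so the serving table is replaced only after the write has finished.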

    Field Mapping

    Input Fields

    Displays input fields based on the output of upstream components.

    Output Fields

    Displays output fields. It supports the following operations:

    • Field Management: Click Field Management to select output fields.

      • To deselect fields, click the icon that moves Selected input fields to Unselected input fields.

      • To select fields, click the icon that moves Unselected input fields to Selected input fields.

    • Batch Add: Click Batch Add. Batch configuration is supported in JSON, TEXT, and DDL formats.

      • Batch configure in JSON format, for example:

        // Example:
        [{
          "name": "user_id",
          "type": "String"
         },
         {
          "name": "user_name",
          "type": "String"
         }]

        Note

        name indicates the imported field name, and type indicates the field type after import. For example, "name":"user_id","type":"String" imports the field named user_id and sets its field type to String.

      • Batch configure in TEXT format, for example:

        // Example:
        user_id,String
        user_name,String

        • The row delimiter separates the information for each field. The default is a line feed (\n). Supported row delimiters are line feed (\n), semicolon (;), and period (.).

        • The column delimiter separates the field name and field type. The default is a comma (,).

      • Batch configure in DDL format, for example:

        CREATE TABLE tablename (
            id INT PRIMARY KEY,
            name VARCHAR(50),
            age INT
        );
    • Create Output Field: Click +Create Output Field. Fill in Column and select Type as prompted on the page. After you configure the row, click the save icon to save it.
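To make the JSON and TEXT batch formats concrete, the sketch below parses each into a list of (name, type) pairs. It illustrates the formats as described in this topic and is not Dataphin's parser. `parse_json_fields` and `parse_text_fields` are hypothetical helpers, and the `// Example:` line shown in the samples is assumed to be a caption that is not part of the input.

```python
import json


def parse_json_fields(text: str):
    """Parse the JSON batch format: a list of objects with name and type keys."""
    return [(item["name"], item["type"]) for item in json.loads(text)]


def parse_text_fields(text: str, row_delimiter: str = "\n", column_delimiter: str = ","):
    """Parse the TEXT batch format: one 'name<column delimiter>type' entry per row."""
    fields = []
    for row in text.split(row_delimiter):
        row = row.strip()
        if not row:
            continue  # ignore blank rows such as a trailing line feed
        name, field_type = (part.strip() for part in row.split(column_delimiter, 1))
        fields.append((name, field_type))
    return fields


json_fields = parse_json_fields(
    '[{"name": "user_id", "type": "String"}, {"name": "user_name", "type": "String"}]'
)
text_fields = parse_text_fields("user_id,String\nuser_name,String")
```

Both calls yield the same two fields, which shows that the formats are interchangeable ways of describing the output columns.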

    Mapping

    Based on the upstream input and the fields in the target table, you can manually select field mappings. Two mapping modes are available: Name Mapping and Row Mapping.

    • Name Mapping: Maps fields that have the same name.

    • Row Mapping: Maps fields by position. Use this mode when the source and target tables have different field names but the fields in corresponding rows should be mapped to each other. Only fields in the same row are mapped.
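The difference between the two mapping modes can be sketched as follows. `map_by_name` and `map_by_row` are hypothetical helpers. Exact, case-sensitive name matching is an assumption, because this topic does not say how names are compared.

```python
def map_by_name(input_fields, output_fields):
    """Pair input and output fields that share the same name (case-sensitive)."""
    outputs = set(output_fields)
    return [(name, name) for name in input_fields if name in outputs]


def map_by_row(input_fields, output_fields):
    """Pair fields by position; fields without a counterpart in the same row stay unmapped."""
    return list(zip(input_fields, output_fields))


inputs = ["user_id", "user_name", "created_at"]
outputs = ["id", "user_name"]

by_name = map_by_name(inputs, outputs)
by_row = map_by_row(inputs, outputs)
```

Here name mapping pairs only `user_name`, while row mapping pairs `user_id` with `id` and `user_name` with `user_name`, and leaves `created_at` unmapped.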

  9. Click Confirm to complete the property configuration of the StarRocks output component.