Dataphin: Configure HBase input component

Last Updated: Mar 05, 2026

The HBase input component reads data from an HBase data source. When you need to synchronize data from an HBase data source to other data sources, you must first configure the HBase input component to read the data source, and then configure the target data source for data synchronization. This topic describes how to configure an HBase input component.

Prerequisites

  • You have purchased and enabled the high availability (HA) feature of the DataService Studio or Tag Service module to configure primary/secondary links for data sources.

  • You have created an HBase data source. For more information, see Create an HBase data source.

  • The account used to configure the HBase input component properties must have read-through permission on the data source. If you do not have the permission, you need to request it. For more information, see Request, renew, and return permissions on a data source.

Procedure

  1. In the top navigation bar of the Dataphin homepage, choose Develop > Data Integration.

  2. In the top navigation bar of the integration page, select a project. In Dev-Prod mode, you must also select an environment.

  3. In the left-side navigation pane, click Batch Pipeline. In the Batch Pipeline list, click the offline pipeline that you want to develop to open its configuration page.

  4. Click Component Library in the upper-right corner of the page to open the Component Library panel.

  5. In the left-side navigation pane of the Component Library panel, select Inputs. Find the HBase component in the input component list on the right and drag it to the canvas.

  6. Click the image icon in the HBase input component card to open the HBase Input Configuration dialog box.

  7. In the HBase Input Configuration dialog box, configure the parameters.

    Parameter

    Description

    Step Name

    The name of the HBase input component. Dataphin automatically generates a step name, which you can modify based on your business scenario. The name must meet the following requirements:

    • It can contain only Chinese characters, letters, underscores (_), and digits.

    • It cannot exceed 64 characters in length.
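The two naming requirements above can be checked before a name is entered. The following is a minimal sketch, not Dataphin's own validation: the function name `is_valid_step_name` is hypothetical, "Chinese characters" is assumed to mean the CJK Unified Ideographs block (U+4E00 to U+9FFF), and an empty name is assumed to be invalid.

```python
import re

# Assumption: "Chinese characters" is approximated by the CJK Unified
# Ideographs block (U+4E00 to U+9FFF). Dataphin may accept a wider range.
# Assumption: an empty name is rejected (the source does not say).
STEP_NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_]{1,64}")


def is_valid_step_name(name: str) -> bool:
    """Return True if the name uses only allowed characters and has 1 to 64 of them."""
    return STEP_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_step_name("hbase_input_1"))  # letters, digits, underscore
print(is_valid_step_name("hbase-input"))    # hyphen is not an allowed character
```

Length is counted in characters here, so a Chinese character counts as one character regardless of its encoded byte length.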

    Datasource

    The drop-down list displays all HBase data sources in the current Dataphin instance, including those for which you do not have read-through permission. Click the copy icon to copy the name of the current data source.

    • For data sources for which you do not have read-through permission, you can click Request next to the data source to request read-through permission. For more information, see Request permission on a data source.

    • If you do not have an HBase data source, click Create to create one. For more information, see Create an HBase data source.

    Select Link

    If you have enabled the high availability feature of Tag Service and the selected HBase data source has active/standby links configured, you can select either the Active Link or the Standby Link for integration. This setting affects only the production data source.

    Table

    Enter a keyword to search for tables, or enter the exact table name and click Exact Match. Click the copy icon to copy the name of the selected table.

    Output Mode

    Select an output mode. The options are Normal Mode and Multi-version Mode (Vertical Table).

    maxversion

    If you select Multi-version Mode (Vertical Table) as the output mode, you need to specify maxversion.

    maxversion specifies the number of versions to read. A value of -1 indicates that all versions are read.
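The effect of maxversion can be illustrated with a small model of one cell's version history. This is a sketch under stated assumptions, not the component's implementation: `select_versions` is a hypothetical helper, versions are assumed to be returned newest first (the usual HBase behavior), and values other than -1 or a positive integer are assumed to be invalid.

```python
from typing import List, Tuple

# One stored version of a cell: (timestamp in milliseconds, value).
CellVersion = Tuple[int, str]


def select_versions(versions: List[CellVersion], maxversion: int) -> List[CellVersion]:
    """Return the versions to read, newest first.

    maxversion == -1 keeps every version. A positive value keeps at most
    that many of the newest versions.
    """
    # Assumption: 0 and negative values other than -1 are rejected.
    if maxversion != -1 and maxversion < 1:
        raise ValueError("maxversion must be -1 or a positive integer")
    newest_first = sorted(versions, key=lambda v: v[0], reverse=True)
    return newest_first if maxversion == -1 else newest_first[:maxversion]


history = [(1700000000000, "v1"), (1700000100000, "v2"), (1700000200000, "v3")]
print(select_versions(history, 2))   # the two newest versions
print(select_versions(history, -1))  # all three versions
```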

    File Encoding

    Select the file encoding. Supported encodings include UTF-8 and GBK.

    Start Rowkey

    Specifies the rowkey at which the scan starts. All rows whose rowkeys are lexicographically greater than or equal to the start rowkey are included in the scan results. For example, aaa (string) or 10110 (binary).

    End Rowkey

    Specifies the rowkey at which the scan ends. If you specify an end rowkey, the scan returns all rows whose rowkeys are lexicographically less than the end rowkey. The row with the end rowkey itself is excluded, so the scan range is a left-closed, right-open interval. For example, if you set the start rowkey to user0001 and the end rowkey to user9999, the scan returns the rows whose rowkeys range from user0001 up to, but not including, user9999. Rowkeys are compared lexicographically (byte by byte), not numerically, so user10000 sorts before user9999 and does not work as an end rowkey for this range.
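Because rowkeys are compared lexicographically rather than numerically, it can help to check a candidate range offline before you enter it. The following sketch models the left-closed, right-open interval over a handful of sample rowkeys. The helper `scan_range` and the sample keys are hypothetical, and string rowkeys are assumed to be compared as UTF-8 bytes.

```python
from typing import Iterable, List


def scan_range(rowkeys: Iterable[str], start: str, end: str) -> List[str]:
    """Return the rowkeys in the half-open interval [start, end), in byte order."""
    start_b = start.encode("utf-8")
    end_b = end.encode("utf-8")
    # HBase stores rows sorted by rowkey bytes, so sort the encoded keys.
    encoded = sorted(key.encode("utf-8") for key in rowkeys)
    return [key.decode("utf-8") for key in encoded if start_b <= key < end_b]


keys = ["user0001", "user0500", "user1000", "user10000", "user5000", "user9999"]

print(scan_range(keys, "user0001", "user9999"))
print(scan_range(keys, "user0001", "user10000"))
```

The first call returns every sample key except user9999, which is excluded as the end rowkey. The second call returns only user0001, user0500, and user1000, because under byte-wise comparison user10000 sorts before user5000 and user9999.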

    Start Rowkey Type

    Select the type of the start rowkey. The options are String and Binary.

    Output Fields

    Displays the output fields.

    • Batch Add Fields.

      1. Click Batch Add.

        • Configure in JSON format. For example:

          [
            {
              "name": "cf1:q1",
              "type": "string"
            },
            {
              "name": "cf1:q2",
              "type": "string"
            },
            {
              "name": "cf1:q3",
              "type": "string"
            }
          ]
          Note

          name specifies the column family and column to import, in the format column family:column, and type specifies the field type. For example, "name":"cf1:a","type":"String" imports column a in column family cf1 with the field type String.
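When a table has many columns, the JSON array can be generated instead of typed by hand. The following is a minimal sketch: `build_field_config` is a hypothetical helper, and the only validation it performs (non-empty parts, no colon in the column family) is an assumption about what makes a usable name, not a rule stated by Dataphin.

```python
import json
from typing import Dict, List, Tuple


def build_field_config(fields: List[Tuple[str, str, str]]) -> str:
    """Build the JSON array from (column_family, column, type) tuples."""
    entries: List[Dict[str, str]] = []
    for family, column, field_type in fields:
        # The colon separates family from column, so the family cannot contain one.
        if not family or not column or ":" in family:
            raise ValueError(f"invalid column reference: {family!r}:{column!r}")
        entries.append({"name": f"{family}:{column}", "type": field_type})
    return json.dumps(entries, indent=2)


config = build_field_config([("cf1", "q1", "string"), ("cf1", "q2", "string")])
print(config)
```

The printed text can be pasted into the Batch Add dialog box when the JSON format is selected.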

        • Configure in TEXT format. For example:

          cf1:q1,string
          cf1:q2,string
          cf1:q3,string
          • The row delimiter is used to separate the information of each field. The default is a line feed (\n). Supported delimiters include line feed (\n), semicolon (;), and period (.).

          • The column delimiter is used to separate the field name and field type. The default is a comma (,).

      2. Click OK.
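The TEXT format described in step 1 can be checked offline by splitting the input on the row delimiter and then on the column delimiter. This is an illustrative sketch rather than Dataphin's parser: `parse_text_fields` is a hypothetical function, and skipping blank rows and trimming whitespace are assumptions.

```python
from typing import Dict, List


def parse_text_fields(text: str, row_delimiter: str = "\n",
                      column_delimiter: str = ",") -> List[Dict[str, str]]:
    """Parse 'family:column<column_delimiter>type' rows into field definitions."""
    fields: List[Dict[str, str]] = []
    for row in text.split(row_delimiter):
        row = row.strip()
        if not row:
            continue  # Assumption: blank rows, such as after a trailing delimiter, are ignored.
        name, sep, field_type = row.partition(column_delimiter)
        if not sep or ":" not in name:
            raise ValueError(f"expected 'family:column{column_delimiter}type', got {row!r}")
        fields.append({"name": name.strip(), "type": field_type.strip()})
    return fields


print(parse_text_fields("cf1:q1,string\ncf1:q2,string\ncf1:q3,string"))
print(parse_text_fields("cf1:q1,string;cf1:q2,string", row_delimiter=";"))
```

If you choose the period (.) as the row delimiter, make sure that no column family or column name contains a period, because the sketch, like any simple splitter, cannot tell the two apart.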

    • Create an output field.

      Click Create Output Field, enter the Column Family and Column, and select the Type as prompted.

    • Manage output fields.

      You can perform the following operations on added fields:

      • Click and drag the icon next to a field to change the position of the field.

      • Click the edit icon in the Operation column to edit an existing field.

      • Click the delete icon in the Operation column to delete an existing field.

  8. Click OK to complete the property configuration of the HBase input component.