This topic describes how to import incremental data from an ApsaraDB RDS instance to an HBase cluster.

Prerequisites

  • Lindorm Tunnel Service (LTS) is activated. The username and the password for logging on to the LTS web UI are set.
  • LTS is connected to the network in which the destination HBase cluster is deployed.
  • An HBase data source is added.
  • Data Transmission Service (DTS) is activated. A DTS data source is added.

Applicable versions

  • Self-managed HBase V1.x and V2.x
  • EMR HBase
  • ApsaraDB for HBase Standard Edition and ApsaraDB for HBase Performance-enhanced Edition
  • ApsaraDB for HBase to which a Phoenix data source is added

Create a task

  1. Log on to the LTS web UI. In the top navigation bar, choose Tasks > RDS Real-time Change Tracking.
  2. Select the DTS subscription channel and the HBase cluster to which a Phoenix data source is added, and create mappings for the tables that you want to synchronize.

Map an ApsaraDB RDS table to an HBase table

{
  "mapping": [
    {
      "columns": [
        {
          "name": "cf1:hhh",
          "value": "{{ concat(title, id) }}"
        },
        {
          "name": "cf1:title",
          "value": "title"
        },
        {
          "name": "cf1:*"
        }
      ],
      "config": {
        "skipDelete": true
      },
      "rowkey": {
        "value": "{{ concat('idg', id) }}"
      },
      "srcTableName": "hhh_test.test",
      "targetTableName": "default:_test"
    }
  ]
}
Parameter | Description | Required
mapping[y].srcTableName | The name of the source ApsaraDB RDS table. | Yes
mapping[y].targetTableName | The name of the destination HBase table. | Yes
mapping[y].columns | The mapping of the columns between the ApsaraDB RDS table and the HBase table. | Yes
mapping[y].columns[x].name | The column names in the HBase table. | Yes
mapping[y].columns[x].value | The expression that is used to calculate column values in the HBase table. The expression uses the Jtwig syntax. You can use this expression to perform simple calculations on the column values of the source table to obtain rowkeys. | Yes
mapping[y].config | The policy that is used to synchronize data between tables. | No
mapping[y].rowkey | The rule that is used to generate a rowkey for the HBase table. | Yes
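The required parameters listed above can be checked locally before you paste a mapping into the LTS web UI. The following Python sketch is only an illustration: the function name validate_hbase_mapping is hypothetical and is not part of LTS, and the rule that a dynamic column such as cf1:* needs no value is an assumption drawn from the sample mapping, not from a documented schema.

```python
# Hypothetical pre-submission check; LTS itself does not provide this function.
REQUIRED_MAPPING_KEYS = ("srcTableName", "targetTableName", "columns", "rowkey")


def validate_hbase_mapping(config):
    """Return a list of problems found; an empty list means none were found."""
    mappings = config.get("mapping")
    if not isinstance(mappings, list) or not mappings:
        return ["'mapping' must be a non-empty list"]

    problems = []
    for y, entry in enumerate(mappings):
        # Parameters marked as required in the table ("config" is optional).
        for key in REQUIRED_MAPPING_KEYS:
            if key not in entry:
                problems.append(f"mapping[{y}].{key} is missing")

        for x, column in enumerate(entry.get("columns", [])):
            name = column.get("name")
            if not name:
                problems.append(f"mapping[{y}].columns[{x}].name is missing")
            # Assumption: dynamic columns ("family:*") carry no value,
            # as in the sample mapping.
            elif not name.endswith(":*") and "value" not in column:
                problems.append(f"mapping[{y}].columns[{x}].value is missing")

        if "rowkey" in entry and "value" not in entry["rowkey"]:
            problems.append(f"mapping[{y}].rowkey.value is missing")
    return problems


sample = {
    "mapping": [
        {
            "srcTableName": "hhh_test.test",
            "targetTableName": "default:_test",
            "columns": [
                {"name": "cf1:hhh", "value": "{{ concat(title, id) }}"},
                {"name": "cf1:*"},
            ],
            "config": {"skipDelete": True},
            "rowkey": {"value": "{{ concat('idg', id) }}"},
        }
    ]
}
print(validate_hbase_mapping(sample))
```

The check only covers the presence of keys. It does not evaluate Jtwig expressions or verify that the tables exist.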
  • You can use the following simple expression:
    {
      "name": "cf1:hhh",
      "value": "{{ concat(title, id) }}"
    }
  • You can configure a dynamic column. This way, you can insert a column without the need to predefine the column.
    {
      "name": "cf1:*"
    }
  • You can specify the start time from which to track data changes. Only data whose timestamp is later than the value of startOffset is synchronized over the DTS subscription channel. The value of startOffset is a UNIX timestamp in seconds.
    {
      "config": {
        "startOffset": 1569463200
      },
      "mapping": [
        {
          "srcTableName": "hhh_test.test",
          "targetTableName": "default:test",
          "columns": [
            {
              "name": "cf1:*"
            }
          ],
          "config": {
            "skipDelete": true
          },
          "rowkey": {
            "value": "{{ concat('idg', id) }}"
          }
        }
      ]
    }
  • You can execute the following DML statements.
    Statement | Supported | Description
    INSERT | Yes | This operation is similar to the PUT operation in HBase.
    UPDATE | Yes | This operation is similar to the PUT operation in HBase.
    DELETE | Yes | You can specify whether to synchronize the DELETE operation from the source table to the destination table. By default, the DELETE operation is not synchronized.
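Because startOffset is expressed in seconds since the UNIX epoch, it is easy to get the value wrong by a time zone offset. The following Python sketch shows one way to derive the value from a timezone-aware datetime and place it in a task configuration. The helper names to_start_offset and build_task_config are hypothetical, and the UTC+8 time zone is only an example.

```python
import json
from datetime import datetime, timedelta, timezone


def to_start_offset(moment):
    """Convert a timezone-aware datetime to UNIX epoch seconds."""
    if moment.tzinfo is None:
        # A naive datetime would be interpreted in the local time zone.
        raise ValueError("use a timezone-aware datetime")
    return int(moment.timestamp())


def build_task_config(start, mappings):
    """Assemble a configuration with a top-level startOffset (hypothetical helper)."""
    return {"config": {"startOffset": to_start_offset(start)}, "mapping": mappings}


# 10:00:00 on September 26, 2019 in UTC+8 (02:00:00 UTC on the same day).
utc_plus_8 = timezone(timedelta(hours=8))
start = datetime(2019, 9, 26, 10, 0, 0, tzinfo=utc_plus_8)

table_mapping = {
    "srcTableName": "hhh_test.test",
    "targetTableName": "default:test",
    "columns": [{"name": "cf1:*"}],
    "config": {"skipDelete": True},
    "rowkey": {"value": "{{ concat('idg', id) }}"},
}

task = build_task_config(start, [table_mapping])
print(json.dumps(task, indent=2))  # startOffset is 1569463200
```

Serializing with json.dumps also guarantees that the result is valid JSON, which avoids problems such as trailing commas or unbalanced braces in hand-edited configurations.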

Map an ApsaraDB RDS table to a Phoenix table

{
  "mapping": [
    {
      "srcTableName": "hhh_test.phoenix_test",
      "targetTableName": "phoenix_test",
      "config": {
        "skipDelete": true
      },
      "columns": [
        {
          "name": "id",
          "isPk": true
        },
        {
          "name": "title",
          "value": "title"
        },
        {
          "name": "ts",
          "value": "ts"
        },
        {
          "name": "datetime",
          "value": "datetime"
        }
      ]
    }
  ]
}
Parameter | Description | Required
mapping[y].srcTableName | The name of the source ApsaraDB RDS table. | Yes
mapping[y].targetTableName | The name of the destination Phoenix table. | Yes
mapping[y].columns | The mapping of the columns between the ApsaraDB RDS table and the Phoenix table. | Yes
mapping[y].columns[x].name | The name of the column in the Phoenix table. | Yes
mapping[y].columns[x].value | The name of the column in the ApsaraDB RDS table. | Yes
mapping[y].columns[x].isPk | Specifies whether the column is a primary key column. | Yes
mapping[y].config | The policy that is used to synchronize data between tables. | No
mapping[y].rowkey | The rule that is used to generate a rowkey for the Phoenix table. | Yes
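For tables with many columns, a Phoenix mapping can be generated instead of written by hand. The following Python sketch is a minimal illustration that reproduces the sample mapping above. The function build_phoenix_mapping is hypothetical, and it assumes that each Phoenix column has the same name as its source column and that primary key columns carry only isPk, as in the sample.

```python
import json


def build_phoenix_mapping(src_table, target_table, columns, primary_keys,
                          skip_delete=True):
    """Build a mapping in the shape of the sample above (hypothetical helper)."""
    pk = set(primary_keys)
    unknown = pk - set(columns)
    if unknown:
        raise ValueError(f"primary keys not in columns: {sorted(unknown)}")

    column_entries = []
    for name in columns:
        if name in pk:
            # Assumption: primary key columns are marked with isPk only.
            column_entries.append({"name": name, "isPk": True})
        else:
            # Assumption: source and destination column names are identical.
            column_entries.append({"name": name, "value": name})

    return {
        "mapping": [
            {
                "srcTableName": src_table,
                "targetTableName": target_table,
                "config": {"skipDelete": skip_delete},
                "columns": column_entries,
            }
        ]
    }


phoenix_config = build_phoenix_mapping(
    "hhh_test.phoenix_test",
    "phoenix_test",
    columns=["id", "title", "ts", "datetime"],
    primary_keys=["id"],
)
print(json.dumps(phoenix_config, indent=2))
```

If the source and destination column names differ, the value of each non-key entry would need to be set to the source column name instead.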