This topic describes how to import incremental data from ApsaraDB RDS.
Prerequisites
- Lindorm Tunnel Service (LTS) is activated.
- The network is connected. For more information, see Network connection.
- An HBase data source is added. For more information, see HBase data source.
- Data Transmission Service (DTS) is activated. A DTS data source is added.
Supported versions
- Self-managed HBase V1.x and V2.x
- EMR HBase
- ApsaraDB for HBase Standard Edition and ApsaraDB for Lindorm (the standalone edition and cluster edition)
- ApsaraDB for HBase Phoenix
Create a task
- Log on to the BDS web UI and choose Tasks > RDS Real-time Change Tracking.
- Select the Data Transmission Service (DTS) change tracking channel and the destination HBase or Phoenix cluster, and then create mappings for the tables to be synchronized.
Parameter description
- HBase table mapping
{
  "mapping": [
    {
      "columns": [
        { "name": "cf1:hhh", "value": "{{ concat(title, id) }}" },
        { "name": "cf1:title", "value": "title" },
        { "name": "cf1:*" }
      ],
      "config": { "skipDelete": true },
      "rowkey": { "value": "{{ concat('idg', id) }}" },
      "srcTableName": "hhh_test.test",
      "targetTableName": "default:_test"
    }
  ]
}
| Parameter | Description | Required |
| mapping[y].srcTableName | The name of the source ApsaraDB RDS table. | Yes |
| mapping[y].targetTableName | The name of the destination HBase table. | Yes |
| mapping[y].columns | The mapping of columns between the ApsaraDB RDS table and the HBase table. | Yes |
| mapping[y].columns[x].name | The name of the corresponding column in the HBase table. | Yes |
| mapping[y].columns[x].value | An expression that yields the value of the corresponding HBase column. The expression uses the Jtwig syntax and can perform simple calculations on the columns of the source table. The same kind of expression is used to obtain the rowkey. | Yes |
| mapping[y].config | The synchronization policy. | No |
| mapping[y].rowkey | The rule for generating the rowkey of the HBase table. | Yes |
- The following simple expressions are supported:
{ "name": "cf1:hhh", "value": "{{ concat(title, id) }}" }
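To illustrate what such expressions yield, the following Python sketch mimics the two concat expressions from the sample mapping on a made-up source row. It assumes that concat joins the string forms of its arguments in order. The concat helper and the sample row are hypothetical stand-ins and are not part of LTS or Jtwig.

```python
def concat(*parts):
    """Join the string form of each argument in order (assumed concat behavior)."""
    return "".join(str(part) for part in parts)

# Hypothetical row read from the source table hhh_test.test.
row = {"id": 42, "title": "hello"}

# Mirrors the rowkey rule: {{ concat('idg', id) }}
rowkey = concat("idg", row["id"])

# Mirrors the cf1:hhh column rule: {{ concat(title, id) }}
cf1_hhh = concat(row["title"], row["id"])

print(rowkey)   # idg42
print(cf1_hhh)  # hello42
```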
- Dynamic columns are supported. Columns that are not matched follow the default matching settings.
{ "name": "cf1:*" }
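One way to picture the wildcard rule is the Python sketch below. It rests on an assumption that this topic does not confirm: a source column that no explicit rule consumes is written to the wildcard column family under its own name. The map_row function and the sample row are hypothetical, and only plain column names (not Jtwig expressions) are handled.

```python
def map_row(row, columns):
    """Apply explicit column rules first, then a 'family:*' wildcard rule."""
    cells = {}
    used = set()
    wildcard_family = None
    for rule in columns:
        family, qualifier = rule["name"].split(":", 1)
        if qualifier == "*":
            wildcard_family = family
            continue
        source = rule["value"]  # plain source column name in this sketch
        cells[rule["name"]] = row[source]
        used.add(source)
    if wildcard_family is not None:
        # Assumption: unmatched source columns keep their own names.
        for source, value in row.items():
            if source not in used:
                cells[f"{wildcard_family}:{source}"] = value
    return cells

columns = [{"name": "cf1:title", "value": "title"}, {"name": "cf1:*"}]
row = {"id": 1, "title": "hello", "author": "bob"}
cells = map_row(row, columns)
print(cells)
```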
- You can specify the start time for change tracking. Data whose timestamp is later than the specified time is synchronized over the DTS change tracking channel. The startOffset value is specified in seconds.
{
  "config": {
    "startOffset": 1569463200
  },
  "mapping": [
    {
      "srcTableName": "hhh_test.test",
      "targetTableName": "default:test",
      "columns": [
        { "name": "cf1:*" }
      ],
      "config": { "skipDelete": true },
      "rowkey": { "value": "{{ concat('idg', id) }}" }
    }
  ]
}
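A startOffset value can be derived from a calendar time as sketched below in Python. The sketch assumes that startOffset is a Unix timestamp (seconds since 1970-01-01 00:00:00 UTC), which matches the sample value but is not stated explicitly in this topic. The to_start_offset helper is hypothetical.

```python
from datetime import datetime, timedelta, timezone

def to_start_offset(moment):
    """Return the Unix timestamp in whole seconds for a timezone-aware datetime."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid ambiguity")
    return int(moment.timestamp())

# 10:00 on September 26, 2019 in UTC+8.
utc_plus_8 = timezone(timedelta(hours=8))
start_offset = to_start_offset(datetime(2019, 9, 26, 10, 0, tzinfo=utc_plus_8))
print(start_offset)  # 1569463200, the value used in the sample above
```

Requiring a timezone-aware datetime avoids silently interpreting the time in the local time zone of the machine that runs the script.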
- DML support
| Operation | Supported | Remarks |
| INSERT | Yes | This operation corresponds to PUT in HBase. |
| UPDATE | Yes | This operation corresponds to PUT in HBase. |
| DELETE | Yes | You can specify whether to synchronize DELETE operations from source tables to destination tables. By default, DELETE operations are not synchronized. |
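The DML rules above can be summarized in a small Python sketch. The to_hbase_operation function is hypothetical, and the behavior when skipDelete is disabled (the delete is propagated as an HBase DELETE) is an assumption based on the remarks for DELETE.

```python
def to_hbase_operation(dml, skip_delete=True):
    """Map a source DML type to the HBase operation it would trigger, or None."""
    dml = dml.upper()
    if dml in ("INSERT", "UPDATE"):
        return "PUT"
    if dml == "DELETE":
        # Assumption: with skipDelete disabled, the delete is propagated.
        return None if skip_delete else "DELETE"
    raise ValueError(f"unsupported DML type: {dml}")

for dml in ("INSERT", "UPDATE", "DELETE"):
    print(dml, "->", to_hbase_operation(dml))
```

The default of skip_delete=True reflects the statement that DELETE operations are not synchronized by default.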
- Phoenix table mapping
{
  "mapping": [
    {
      "srcTableName": "hhh_test.phoenix_test",
      "targetTableName": "phoenix_test",
      "config": { "skipDelete": true },
      "columns": [
        { "name": "id", "isPk": true },
        { "name": "title", "value": "title" },
        { "name": "ts", "value": "ts" },
        { "name": "datetime", "value": "datetime" }
      ]
    }
  ]
}
| Parameter | Description | Required |
| mapping[y].srcTableName | The name of the source ApsaraDB RDS table. | Yes |
| mapping[y].targetTableName | The name of the destination Phoenix table. | Yes |
| mapping[y].columns | The mapping of columns between the ApsaraDB RDS table and the Phoenix table. | Yes |
| mapping[y].columns[x].name | The names of the columns in the Phoenix table. | Yes |
| mapping[y].columns[x].value | The names of the columns in the ApsaraDB RDS table. | Yes |
| mapping[y].columns[x].isPk | Specifies the primary key column. | Yes |
| mapping[y].config | The synchronization policy. | No |
| mapping[y].rowkey | The rule for generating a rowkey of an HBase table. | Yes |
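Before you submit a Phoenix mapping, a quick local check can catch missing fields. The Python sketch below is a hypothetical helper, not an LTS tool. It checks only the fields that appear in the sample mapping (srcTableName, targetTableName, columns, and each column name) and assumes that at least one column must set isPk to true.

```python
import json

def check_phoenix_mapping(config):
    """Return a list of problems found in a Phoenix mapping; empty means none."""
    problems = []
    for i, table in enumerate(config.get("mapping", [])):
        for key in ("srcTableName", "targetTableName", "columns"):
            if key not in table:
                problems.append(f"mapping[{i}] is missing {key}")
        columns = table.get("columns", [])
        for j, column in enumerate(columns):
            if "name" not in column:
                problems.append(f"mapping[{i}].columns[{j}] is missing name")
        # Assumption: a Phoenix table needs at least one primary key column.
        if not any(column.get("isPk") for column in columns):
            problems.append(f"mapping[{i}] has no column with isPk set to true")
    return problems

sample = json.loads("""
{
  "mapping": [
    {
      "srcTableName": "hhh_test.phoenix_test",
      "targetTableName": "phoenix_test",
      "config": { "skipDelete": true },
      "columns": [
        { "name": "id", "isPk": true },
        { "name": "title", "value": "title" }
      ]
    }
  ]
}
""")
print(check_phoenix_mapping(sample))  # []
```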