This topic describes how to import incremental data from an ApsaraDB RDS cluster to
an HBase cluster.
Prerequisites
- Lindorm Tunnel Service (LTS) is activated. The username and the password for logging
on to the LTS web UI are set.
- LTS is connected to the network in which your HBase cluster for migration is deployed.
- An HBase data source is added.
- Data Transmission Service (DTS) is activated. A DTS data source is added.
Applicable versions
- Self-managed HBase V1.x and V2.x
- EMR HBase
- ApsaraDB for HBase Standard Edition and ApsaraDB for HBase Performance-enhanced Edition
- ApsaraDB for HBase to which a Phoenix data source is added
Create a task
- Log on to the LTS web UI. In the top navigation bar, choose Tasks > RDS Real-time Change Tracking.
- Select the DTS subscription channel and the destination HBase cluster (or the HBase cluster to which a Phoenix data source is added), and create mappings for the tables that you want to synchronize.
Map an ApsaraDB RDS table to an HBase table
{
"mapping": [
{
"columns": [
{
"name": "cf1:hhh",
"value": "{{ concat(title, id) }}"
},
{
"name": "cf1:title",
"value": "title"
},
{
"name": "cf1:*"
}
],
"config": {
"skipDelete": true
},
"rowkey": {
"value": "{{ concat('idg', id) }}"
},
"srcTableName": "hhh_test.test",
"targetTableName": "default:_test"
}
]
}
| Parameter | Description | Required |
| --- | --- | --- |
| mapping[y].srcTableName | The name of the source ApsaraDB RDS table. | Yes |
| mapping[y].targetTableName | The name of the destination HBase table. | Yes |
| mapping[y].columns | The mapping of the columns between the ApsaraDB RDS table and the HBase table. | Yes |
| mapping[y].columns[x].name | The name of the column in the HBase table. | Yes |
| mapping[y].columns[x].value | The expression that is used to calculate the value of the column in the HBase table. The expression uses the Jtwig syntax. You can use the expression to perform simple calculations on the column values of the source table to obtain the destination values. | Yes |
| mapping[y].config | The policy that is used to synchronize data between the tables. | No |
| mapping[y].rowkey | The rule that is used to generate a rowkey for the HBase table. | Yes |
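To make the mapping rules above concrete, the following sketch shows how one changed RDS row could be transformed into an HBase rowkey and cells. The `concat` and `apply_mapping` helpers are hypothetical illustrations of the Jtwig expressions in the example, not actual LTS internals.

```python
# Hypothetical sketch of the mapping above: one RDS row -> rowkey + HBase cells.

def concat(*parts):
    """Mimics the Jtwig concat() function used in mapping expressions."""
    return "".join(str(p) for p in parts)

def apply_mapping(row):
    """row: a dict of column values from the source table hhh_test.test."""
    rowkey = concat("idg", row["id"])                # {{ concat('idg', id) }}
    cells = {
        "cf1:hhh": concat(row["title"], row["id"]),  # {{ concat(title, id) }}
        "cf1:title": row["title"],                   # direct column copy
    }
    # "cf1:*" writes every remaining source column as a dynamic column.
    for name, value in row.items():
        cells.setdefault("cf1:" + name, value)
    return rowkey, cells

rowkey, cells = apply_mapping({"id": 7, "title": "hello", "ts": 123})
# rowkey == "idg7"; cells["cf1:hhh"] == "hello7"
```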
- You can use the following simple expression:
{
"name": "cf1:hhh",
"value": "{{ concat(title, id) }}"
}
- You can configure a dynamic column. This way, you can insert a column without the
need to predefine the column.
{
"name": "cf1:*"
}
- You can specify the start time from which data changes are tracked. Only data with a timestamp later than the specified timestamp is synchronized over the DTS subscription channel.
{
"config": {
"startOffset": 1569463200 // Unit: seconds.
},
"mapping": [
{
"srcTableName": "hhh_test.test",
"targetTableName": "default:test",
"columns": [
{
"name": "cf1:*"
}
],
"config": {
"skipDelete": true
},
"rowkey": {
"value": "{{ concat('idg', id) }}"
}
}
]
}
- You can execute the following DML statements.

| Statement | Supported | Description |
| --- | --- | --- |
| INSERT | Yes | This operation is similar to the PUT operation in HBase. |
| UPDATE | Yes | This operation is similar to the PUT operation in HBase. |
| DELETE | Yes | You can specify whether to synchronize the DELETE operation from the source table to the destination table. By default, the DELETE operation is not synchronized. |
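The points above can be sketched in code: deriving a `startOffset` value (a Unix timestamp in seconds) from a wall-clock time, and deciding whether to apply a DML event based on `skipDelete`. The `should_apply` helper is hypothetical, and the time-zone reading of the sample value 1569463200 (2019-09-26 10:00 in UTC+8) is an assumption for illustration.

```python
from datetime import datetime, timedelta, timezone

# startOffset is a Unix timestamp in seconds. Assuming the sample value
# 1569463200 denotes 2019-09-26 10:00 in UTC+8:
utc8 = timezone(timedelta(hours=8))
start_offset = int(datetime(2019, 9, 26, 10, 0, tzinfo=utc8).timestamp())

# Hypothetical decision helper mirroring the DML table: INSERT and UPDATE
# become HBase PUTs; DELETE is skipped unless skipDelete is set to false.
def should_apply(event_type, config):
    if event_type in ("INSERT", "UPDATE"):
        return True
    if event_type == "DELETE":
        # By default the DELETE operation is not synchronized.
        return not config.get("skipDelete", True)
    return False
```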
Map an ApsaraDB RDS table to a Phoenix table
{
"mapping": [
{
"srcTableName": "hhh_test.phoenix_test",
"targetTableName": "phoenix_test",
"config": {
"skipDelete": true
},
"columns": [
{
"name": "id",
"isPk": true
},
{
"name": "title",
"value": "title"
},
{
"name": "ts",
"value": "ts"
},
{
"name": "datetime",
"value": "datetime"
}
]
}
]
}
| Parameter | Description | Required |
| --- | --- | --- |
| mapping[y].srcTableName | The name of the source ApsaraDB RDS table. | Yes |
| mapping[y].targetTableName | The name of the destination Phoenix table. | Yes |
| mapping[y].columns | The mapping of the columns between the ApsaraDB RDS table and the Phoenix table. | Yes |
| mapping[y].columns[x].name | The name of the column in the Phoenix table. | Yes |
| mapping[y].columns[x].value | The name of the corresponding column in the ApsaraDB RDS table. | Yes |
| mapping[y].columns[x].isPk | Specifies whether the column is a primary key column. | Yes |
| mapping[y].config | The policy that is used to synchronize data between the tables. | No |
| mapping[y].rowkey | The rule that is used to generate a rowkey for the Phoenix table. | Yes |
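As a hedged illustration of the Phoenix mapping above, the sketch below builds a Phoenix UPSERT statement for one changed RDS row. The `build_upsert` helper is hypothetical, and the assumption that a column without a `value` field (such as the primary key `id`) reads the source column of the same name is inferred from the example, not documented behavior.

```python
# Hypothetical sketch: turn the Phoenix mapping example into an UPSERT
# statement with bind parameters for one changed RDS row.
def build_upsert(table, columns, row):
    names = [c["name"] for c in columns]
    # Assumption: a column without "value" (e.g. the primary key "id")
    # reads the source column of the same name.
    values = [row[c.get("value", c["name"])] for c in columns]
    placeholders = ", ".join("?" for _ in names)
    sql = f"UPSERT INTO {table} ({', '.join(names)}) VALUES ({placeholders})"
    return sql, values

sql, values = build_upsert(
    "phoenix_test",
    [
        {"name": "id", "isPk": True},
        {"name": "title", "value": "title"},
        {"name": "ts", "value": "ts"},
        {"name": "datetime", "value": "datetime"},
    ],
    {"id": 1, "title": "hello", "ts": 2, "datetime": "2019-09-26"},
)
# sql == "UPSERT INTO phoenix_test (id, title, ts, datetime) VALUES (?, ?, ?, ?)"
```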