
ApsaraDB for HBase:Archive incremental data to MaxCompute

Last Updated: Mar 28, 2026
Important

This feature is unavailable for Lindorm Tunnel Service (LTS) instances purchased after June 16, 2023. If your LTS instance was purchased before that date, you can still use this feature.

Use Lindorm Tunnel Service (LTS) to continuously archive HBase incremental data to MaxCompute for offline analysis and long-term storage. LTS reads the HBase write-ahead log (WAL) and synchronizes new data to partitioned MaxCompute tables at a configurable interval.

Prerequisites

Before you begin, ensure that you have:

  • LTS activated

  • An HBase data source added to LTS

  • A MaxCompute data source added to LTS

Supported versions

  • Self-managed HBase V1.x and HBase V2.x

  • Elastic MapReduce (EMR) HBase

  • ApsaraDB for HBase Standard Edition

  • ApsaraDB for HBase Performance-enhanced Edition (cluster mode only)

  • Lindorm

Limitations

  • LTS archives data based on HBase WAL. Data imported through bulk loading is not captured and cannot be exported.

Log data lifecycle

  • After you enable the archiving feature, archived log data that is not consumed is retained for 48 hours by default. If the data is still unconsumed after that period, LTS automatically cancels the log subscription and deletes the retained data.

  • If you release an LTS instance without first stopping its synchronization tasks, those tasks are suspended and data stops being consumed.

Create an archiving job

  1. Log on to the LTS web UI. In the left-side navigation pane, choose Data Export > Incremental Archive to MaxCompute.


  2. Click Create New Job. Select a source HBase cluster and a destination MaxCompute cluster, then specify the HBase table to export. The following example configures an archiving job for the wal-test table. Set mergeStartAt to a past timestamp to archive historical data from that point forward.

    • Columns to archive: cf1:a, cf1:b, cf1:c, cf1:d

    • mergeInterval: 86400000 (default; archives data once per day)

    • mergeStartAt: 20190930000000 (September 30, 2019 at 00:00)


  3. Monitor progress in the job detail view:

    • Real-time Synchronization Channel: shows latency and start offset of log synchronization tasks

    • Table Merge: shows table merging tasks; after merging completes, the partitioned tables are available in MaxCompute

  4. Log on to the MaxCompute console to query the archived data.

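For step 4, a query against a merged partition can be sketched as below. The table name wal_test follows the documented name conversion (hyphens become underscores), but the ds partition column and the partition value format are hypothetical placeholders; adjust them to the actual schema that your merge job produced.

```python
# Build a MaxCompute SQL statement for one day's archived partition.
# 'wal_test' and the 'ds' partition column are illustrative placeholders,
# not guaranteed names from the LTS merge job.
def archived_partition_query(table, part_col, part_value, columns=("*",)):
    return "SELECT {cols} FROM {table} WHERE {part} = '{value}';".format(
        cols=", ".join(columns), table=table, part=part_col, value=part_value
    )

print(archived_partition_query("wal_test", "ds", "20190930", ("a", "b")))
# SELECT a, b FROM wal_test WHERE ds = '20190930';
```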

Job configuration reference

Each exported table entry follows this format:

hbaseTable/odpsTable {tbConf}

Part        Required  Description
hbaseTable  Yes       Source HBase table name.
odpsTable   No        Destination MaxCompute table name. Defaults to the HBase table name. Periods (.) and hyphens (-) are automatically converted to underscores (_).
tbConf      No        JSON object specifying archiving behavior. See tbConf parameters.
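The entry format above can be sketched in a few lines of Python. The parser is illustrative only (the real LTS parser is not public); it applies the documented default for odpsTable, including the period/hyphen-to-underscore conversion.

```python
import json
import re

def parse_export_entry(entry):
    """Split an 'hbaseTable/odpsTable {tbConf}' entry into its three parts.

    Illustrative helper, not part of the LTS API. When odpsTable is omitted,
    it defaults to the HBase table name with '.' and '-' converted to '_',
    as the format reference documents.
    """
    match = re.match(r"^(\S+?)(?:/(\S+))?\s*(\{.*\})?\s*$", entry)
    if not match:
        raise ValueError("malformed export entry: %r" % entry)
    hbase_table, odps_table, conf = match.groups()
    if odps_table is None:
        odps_table = hbase_table.replace(".", "_").replace("-", "_")
    tb_conf = json.loads(conf) if conf else {}
    return hbase_table, odps_table, tb_conf

# A namespace-style table name, no explicit odpsTable:
print(parse_export_entry('ns.wal-test {"mergeEnabled": false}'))
# ('ns.wal-test', 'ns_wal_test', {'mergeEnabled': False})
```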

tbConf parameters

Parameter      Type          Default   Description
cols           Array         -         Columns to export and their target data types. Format: "columnFamily:qualifier|type". If no type is specified, the value is exported as HexString. Example: ["cf1:a|string", "cf1:b|int"]
mergeEnabled   Boolean       true      Specifies whether to convert key-value (KV) tables into wide tables.
mergeStartAt   String        -         Start time for table merging, in yyyyMMddHHmmss format. Accepts a past timestamp to archive historical data from that point. Example: "20191008000000"
mergeInterval  Integer (ms)  86400000  Interval between table merging runs. The default value of 86,400,000 ms archives data once per day.
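A quick sanity check of these parameters can be sketched in Python. The helper below is illustrative, not part of LTS; it only enforces the documented formats (a yyyyMMddHHmmss timestamp and a positive millisecond interval) and fills in the documented defaults.

```python
from datetime import datetime

def validate_tb_conf(conf):
    """Validate a tbConf dict against the documented parameter formats.

    Illustrative helper: checks that mergeStartAt parses as yyyyMMddHHmmss
    and that mergeInterval is a positive millisecond count, then returns
    the config with the documented defaults filled in.
    """
    out = {"mergeEnabled": True, "mergeInterval": 86400000}  # documented defaults
    out.update(conf)
    if "mergeStartAt" in out:
        # Raises ValueError if not in yyyyMMddHHmmss format
        datetime.strptime(out["mergeStartAt"], "%Y%m%d%H%M%S")
    if not (isinstance(out["mergeInterval"], int) and out["mergeInterval"] > 0):
        raise ValueError("mergeInterval must be a positive number of milliseconds")
    return out

conf = validate_tb_conf({"cols": ["cf1:a|string"], "mergeStartAt": "20191008000000"})
print(conf["mergeInterval"])  # 86400000, i.e. one merge per day
```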

Data type mapping for cols

Specify the target type with the columnFamily:qualifier|type format. If no type is specified, the value is exported as HexString.

HBase value      MaxCompute type  cols annotation example
String           string           cf1:a|string
Integer          int              cf1:b|int
Long integer     long             cf1:c|long
Short integer    short            cf1:d|short
Decimal          decimal          cf1:e|decimal
Double           double           cf1:f|double
Float            float            cf1:g|float
Boolean          boolean          cf1:h|boolean
(no annotation)  HexString        cf1:i
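The mapping can be illustrated with a small decoder. The big-endian byte layouts below are assumptions about how HBase values are typically serialized (as org.apache.hadoop.hbase.util.Bytes does); real tables may encode values differently. The unannotated branch shows the documented HexString fallback, though the exact hex casing LTS emits is not specified here.

```python
import struct

def decode_hbase_value(raw, type_annotation=None):
    """Decode raw HBase cell bytes per a cols type annotation.

    Assumes the common Bytes.toBytes big-endian encodings (an assumption,
    not an LTS guarantee). Without an annotation, falls back to the
    documented HexString export. Decimal is omitted: Java BigDecimal
    serialization varies by writer.
    """
    decoders = {
        "string": lambda b: b.decode("utf-8"),
        "int": lambda b: struct.unpack(">i", b)[0],    # 4-byte big-endian
        "long": lambda b: struct.unpack(">q", b)[0],   # 8-byte big-endian
        "short": lambda b: struct.unpack(">h", b)[0],  # 2-byte big-endian
        "double": lambda b: struct.unpack(">d", b)[0],
        "float": lambda b: struct.unpack(">f", b)[0],
        "boolean": lambda b: b != b"\x00",
    }
    if type_annotation is None:
        return raw.hex()  # HexString fallback for unannotated columns
    return decoders[type_annotation](raw)

print(decode_hbase_value(b"\x00\x00\x00\x2a", "int"))  # 42
print(decode_hbase_value(b"\xca\xfe"))                 # cafe
```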

Example configurations

hbaseTable/odpsTable {"cols": ["cf1:a|string", "cf1:b|int", "cf1:c|long", "cf1:d|short", "cf1:e|decimal", "cf1:f|double", "cf1:g|float", "cf1:h|boolean", "cf1:i"], "mergeInterval": 86400000, "mergeStartAt": "20191008100547"}
hbaseTable/odpsTable {"cols": ["cf1:a", "cf1:b", "cf1:c"], "mergeStartAt": "20191008000000"}
hbaseTable {"mergeEnabled": false} // No merge operation is performed.