ApsaraDB for HBase: Introduction to LTS (formerly known as BDS)

Last Updated: May 04, 2023

Lindorm Tunnel Service (LTS) is a data ecosystem service that is tailored to the business scenarios in which ApsaraDB for HBase is used. The service was formerly known as Big DataHub Service (BDS). LTS provides easy-to-use capabilities for data exchange, processing, and subscription. You can use these capabilities to migrate data, subscribe to real-time data updates, dump data to data lakes, and configure backflow to data warehouses. You can also use them to implement unit-based multi-active redundancy and to back up and restore data. In this way, LTS provides an all-in-one data ecosystem service for ApsaraDB for HBase.

Core capabilities

  • Cloud native distributed system: LTS is a distributed system that is deployed based on Elastic Compute Service (ECS). It features excellent horizontal scalability and allows you to configure resources based on your business requirements.

  • Ease of use: LTS allows you to configure data migration, import, subscription, and archiving tasks with a few clicks. For example, to create a data migration task, you only need to specify the migration source, the migration destination, and the columns that you want to synchronize. LTS then automatically replicates schemas, full data, and incremental data based on these settings.

  • High security and reliability: LTS minimizes the impact on online source and destination systems and minimizes the impact of potential failures due to incompatibility. Before a task is started, LTS prechecks network connectivity and security. While the task is running, LTS monitors the synchronization latency and the storage usage of the destination cluster in real time. LTS also implements traffic throttling and reports alerts based on the monitoring data. After the task is complete, LTS verifies the synchronized data.

  • Cost-efficiency: LTS is an optimized service based on open source systems, such as Apache HBase, Apache Phoenix, and Apache Cassandra. LTS allows you to replicate data based on physical files, which is 10 times more efficient than traditional data replication. LTS also provides optimized CPU, cache, memory, and network I/O capabilities. This way, LTS provides cost-effective tunnels and helps reduce your costs of data transfer and processing.

Features

| Feature | Scenario | References |
| --- | --- | --- |
| Data migration or synchronization between HBase clusters | Seamless data migration between existing clusters and new clusters, cluster upgrades, online and offline workload decoupling, active/standby disaster recovery, and active geo-redundancy | Synchronize full and incremental data |
| Data export from HBase to MaxCompute (previously known as Open Data Processing Service (ODPS)) | Export of historical data and incremental data | Export full data to MaxCompute; Archive incremental data to MaxCompute |
| Subscription to Log Service to synchronize data to HBase | Subscription to real-time data in Log Service to synchronize the data to HBase | Import incremental data from Log Service |
| Subscription to HBase incremental data | Subscription to real-time data of ApsaraDB for HBase Performance-enhanced Edition | Instructions |

Log lifecycle management

  • If log data is not consumed after you enable the log subscription feature, the log data is retained for 48 hours by default. After the period expires, the subscription is automatically canceled, and the retained data is automatically deleted.

  • Log data may fail to be consumed if the LTS cluster is released while a task is still running, or if the synchronization task is suspended.

  • You can enable the log subscription feature for the following types of tasks in LTS: HBase or Lindorm incremental synchronization, data archiving, data backup, and data subscription.
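The default retention rule above can be illustrated with a small calculation. This is only a sketch, not part of LTS: the function names are hypothetical, and it assumes that the 48-hour window is measured from the moment consumption stopped (or from when the subscription was enabled, if nothing has been consumed yet), which the text above does not state explicitly.

```python
from datetime import datetime, timedelta, timezone

# Default retention period for unconsumed log data, as stated above.
DEFAULT_RETENTION = timedelta(hours=48)


def retention_deadline(idle_since, retention=DEFAULT_RETENTION):
    """Return the time at which unconsumed log data would expire.

    idle_since: assumed start of the retention window (hypothetical input).
    """
    return idle_since + retention


def is_expired(idle_since, now, retention=DEFAULT_RETENTION):
    """Return True once the retention window has fully elapsed."""
    return now >= retention_deadline(idle_since, retention)


if __name__ == "__main__":
    idle_since = datetime(2023, 5, 4, 0, 0, tzinfo=timezone.utc)
    print("Data expires at:", retention_deadline(idle_since).isoformat())
    print("Expired after 47 h:", is_expired(idle_since, idle_since + timedelta(hours=47)))
```

A check of this kind can help you decide when a suspended task must be resumed before the subscription is canceled and the retained data is deleted.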

Typical scenarios

  • Migration without downtime (HBase 1.x, HBase 2.x, ApsaraDB for HBase Performance-enhanced Edition, Phoenix 4.x, and Phoenix 5.x)

    • Data can be migrated without service interruption. LTS can migrate historical data and synchronize real-time incremental data in a single task.

    • When data is being migrated, LTS does not interact with the source HBase cluster. LTS reads data only from the HDFS of the source cluster. This minimizes the impact on the online business that runs on the source cluster.

    • LTS performs data replication based on files. Compared with data replication performed by calling API operations, this method can reduce the traffic generated during replication by more than 50%.

    • High efficiency. Each node can migrate data at a rate of up to 100 MB/s. You can add nodes for horizontal scaling to migrate terabytes or even petabytes of data.

    • High stability. LTS supports a robust mechanism for rerunning failed tasks. LTS monitors the synchronization rates and the progress of tasks in real time, and reports alerts if the tasks fail.

    • Data accuracy. LTS verifies the synchronized data.

    • Automatic schema synchronization. This ensures consistent partitions.

  • Online and offline workload decoupling

    • LTS allows you to synchronize online business data in real time to storage such as HDFS or OSS. You can then analyze the data with big data components such as Spark and MapReduce without affecting online business queries.

  • Active/standby disaster recovery

    • LTS supports two-way data synchronization between a primary cluster and a secondary cluster. When the primary cluster fails, you can switch your workloads to the secondary cluster to reduce the impact on the workloads. After the primary cluster recovers, you can use LTS to synchronize the incremental data from the secondary cluster to the primary cluster.

  • Historical data storage in ApsaraDB RDS databases

    • When historical data such as transaction orders is stored in ApsaraDB RDS databases, the ever-increasing data size may cause performance bottlenecks. Periodic data archiving or sharding is complex and costly. LTS allows you to synchronize ApsaraDB RDS data to ApsaraDB for HBase in real time, which separates hot data from cold data. ApsaraDB for HBase supports automatic horizontal scaling, high-concurrency queries, multi-dimensional indexing, and lightweight analysis. Streams allows you to subscribe to data updates in order. LTS also allows you to synchronize data from ApsaraDB for HBase to other analytics systems for complex data analysis.
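The per-node rate of up to 100 MB/s quoted under "Migration without downtime" can be turned into a rough capacity estimate. The following sketch is a back-of-the-envelope calculation, not an LTS tool. The function names are hypothetical, 1 TB is taken as 1,000,000 MB, and throughput is assumed to scale linearly with the number of nodes at the maximum rate, so the results are best-case figures.

```python
import math

# Upper bound on the per-node migration rate quoted in this document.
PER_NODE_MB_PER_S = 100
MB_PER_TB = 1_000_000  # decimal units, an assumption of this sketch


def migration_hours(data_tb, nodes, per_node_mb_s=PER_NODE_MB_PER_S):
    """Best-case hours needed to migrate data_tb terabytes with the given nodes."""
    total_mb = data_tb * MB_PER_TB
    seconds = total_mb / (per_node_mb_s * nodes)
    return seconds / 3600


def nodes_needed(data_tb, deadline_hours, per_node_mb_s=PER_NODE_MB_PER_S):
    """Smallest node count that finishes within deadline_hours in the best case."""
    total_mb = data_tb * MB_PER_TB
    per_node_capacity_mb = per_node_mb_s * deadline_hours * 3600
    return math.ceil(total_mb / per_node_capacity_mb)


if __name__ == "__main__":
    print(f"10 TB on 1 node: {migration_hours(10, 1):.1f} h")
    print(f"10 TB on 4 nodes: {migration_hours(10, 4):.1f} h")
    print("Nodes for 100 TB in 24 h:", nodes_needed(100, 24))
```

Real migrations are likely to run below the quoted maximum, so treat the output as a lower bound on time and node count when you size an LTS cluster.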