
An Introduction to the Core Technologies behind Hologres - Alibaba's Cloud-Native Real-Time Data Warehouse

This article gives an introduction to Alibaba's cloud-native real-time data warehouse, Hologres. It explains how Hologres has been implemented and which of Alibaba's business scenarios it supports.


By Jin, Xiaojun (Xianyin), Senior Technical Expert at Alibaba Cloud.
Released by Hologres

The final turnover from Tmall's 2020 Double 11 was 498.2 billion Chinese Yuan (CNY). Behind this number lies the greatest human-machine collaboration in human history, which presents an unprecedented challenge in the digital world. Hologres, Alibaba Cloud's next-generation cloud-native data warehouse, provided important technical support during Double 11. Real-time data generated by consumers searching, browsing, adding favorites, and purchasing flows into Hologres, where it is stored and cross-checked against the accumulated historical offline data.

During the 2020 Double 11 Global Shopping Festival, Hologres withstood a real-time data peak of 596 million records per second, with a single table storing up to 2.5 PB of data. It provided multi-dimensional analysis and services for external systems based on trillions of data records and returned results within 80 milliseconds for 99.99% of queries. Hologres unified real-time and offline data and supported online application services.

The Hologres project started in 2017. Over the past three-plus years, Hologres has achieved many breakthroughs:

  1. Business coverage increased from one business scenario to hundreds of business scenarios. Hologres now covers more than 90% of the business scenarios within the Alibaba Group, including Double 11 real-time livestreaming, smart recommendation, global data analysis of Umeng+, CCO intelligent customer service, the new retail data platform, Kaola, Eleme, the data platforms of Alimama, Alibaba Cloud International, Cainiao, and other Alibaba businesses.
  2. The number of cluster nodes grew from a handful at the beginning to nearly 10,000, with a relatively high utilization rate of the storage and computing clusters. After completing the three-level jump from productization to cloud migration to commercialization, Hologres now powers Alibaba Cloud's public cloud, private cloud, and financial cloud services.
  3. The concept of Hybrid Serving and Analytical Processing (HSAP) was proposed, integrating serving and analytics into one system so that the same data can meet computing requirements in real-time, offline, and online scenarios. HSAP simplifies the data warehouse architecture, reduces costs, and redefines the development trend of data warehouses.
  4. In addition, a paper giving a technical interpretation of Hologres, "Alibaba Hologres: A Cloud-Native Service for Hybrid Serving/Analytical Processing," was accepted at the VLDB conference.

Now that Hologres has been successfully used for Alibaba's Double 11 Tmall shopping festival, its underlying core technologies are being revealed to the public for the first time. This article gives an introduction to Hologres and explains how it has been implemented in Alibaba's core application scenarios.

1. Pain Points of Traditional Data Warehouses

1) Pain Points of Traditional Data Warehouses

Currently, big data-related business scenarios generally include real-time big screens, real-time BI reports, user profiling, and monitoring and alerting, as shown in the following figure.

  1. Real-Time Big Screen: It is generally used as an auxiliary tool for company leadership to make decisions and to present results to the outside world, for example, the real-time transaction volume screen for Double 11.
  2. Real-Time BI Report: It is the most commonly used business scenario for operations and product managers, and is suitable for most report analysis scenarios.
  3. User Profile: It is often used in advertising recommendation scenarios to label users through more detailed algorithms. By doing so, marketing activities can be delivered to targeted consumers more effectively.
  4. Monitoring and Alert Big Screen: It can monitor the traffic of websites and apps, and trigger alarms when the traffic reaches a certain threshold.

In the big data business scenarios mentioned above, the industry has long met these needs by building data warehouses. Traditionally, an offline data warehouse is constructed, as shown in the following figure. First, all kinds of data are collected and processed through ETL. Then, data aggregation, filtering, and other processing are performed through layer-by-layer modeling. Finally, when needed, the data is presented with application-layer tools or turned into reports.

Although this method can be used to connect multiple data sources, it has some obvious pain points:

  1. Complex ETL logic and high storage and time costs
  2. A long data processing pipeline
  3. No support for real-time or near-real-time data; data is processed on a T+1 basis only

2) Pain Points of Lambda Architecture

With the rise of real-time computing technology, the Lambda architecture emerged. The following figure shows how the Lambda architecture works: a layer for processing real-time data is added on top of the traditional offline data warehouse, and the offline data and the data generated by the real-time pipeline are merged in the serving layer. This allows users to query both offline and real-time data.
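To make the merge step concrete, here is a minimal Python sketch of the kind of serving-layer glue code a Lambda architecture forces applications to maintain. The batch_view and speed_view dictionaries and their reconciliation rule are invented for illustration; real systems typically merge results across different storage engines.

```python
# A minimal sketch of the serving-layer merge a Lambda architecture requires.
# batch_view: results precomputed by the offline (batch) pipeline, e.g. T+1 totals.
# speed_view: incremental results from the real-time pipeline since the last batch run.
# Both views and the merge rule (simple addition) are hypothetical.

def merge_views(batch_view: dict, speed_view: dict) -> dict:
    """Combine batch and real-time aggregates into one serving result."""
    merged = dict(batch_view)
    for key, realtime_value in speed_view.items():
        # The application must know how to reconcile the two views
        # (a sum here, but dedup/override logic is common and error-prone).
        merged[key] = merged.get(key, 0) + realtime_value
    return merged

batch_view = {"item_1001": 5_000, "item_1002": 120}   # from yesterday's batch job
speed_view = {"item_1001": 37, "item_1003": 8}        # from today's streaming job
print(merge_views(batch_view, speed_view))
# {'item_1001': 5037, 'item_1002': 120, 'item_1003': 8}
```

Every query path has to carry this kind of merge logic, which is one of the pain points listed below.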

Since 2011, the Lambda architecture has been adopted by many Internet companies and has solved some problems. However, growing data volumes and application complexity have gradually exposed problems in the Lambda architecture:

  1. The Lambda architecture is composed of multiple engines and systems, so the development, maintenance, and learning costs are high.
  2. Data is stored repeatedly in different views, which wastes space and causes data consistency issues.
  3. Batch processing, stream processing, merge operations, and query processing involve different development languages, which makes them difficult to use.
  4. The high learning cost increases the application cost.

Alibaba also encountered the problems mentioned above. The following figure shows the real-time data warehouse architecture developed by Alibaba from 2011 to 2016, which was essentially a Lambda architecture. However, as the business and data volumes grew, the complexity of the relationships and the costs increased dramatically as well. A more elegant solution was urgently needed.

2. HSAP: Service-Analysis Unification

Against this background and facing the pain points of the traditional approaches, the concept of Hybrid Serving and Analytical Processing (HSAP) was proposed. HSAP supports complex analytical queries and high write QPS within the same system.

What are the core requirements of HSAP implementation?

  1. A powerful storage system is required to store real-time and offline data in a unified manner.
  2. An efficient query service is also required. Under the same interface (such as SQL), it must support high-QPS queries, complex analysis, and federated queries and analytics.
  3. The system must connect directly to frontend applications, such as reports and online services, with no additional import or export. As a result, data services are unified and data movement is reduced.


3. About Hologres

Based on HSAP, a corresponding product needed to be developed and implemented, and thus Hologres was created.

Hologres is the best HSAP system on the market at this time. It is compatible with the PostgreSQL ecosystem and supports direct queries of MaxCompute data (offline data), real-time data ingestion, real-time queries, and real-time/offline federated analytics. It helps enterprises quickly build a real-time data warehouse featuring stream-batch unification at low cost and with high efficiency.

The name Hologres is a combination of Holographic and Postgres. Postgres means that Hologres is compatible with the PostgreSQL ecosystem, which is easy to understand. Holographic needs a more detailed explanation. Take a look at the following figure:

Holographic is translated as "全息" in Chinese, a term often seen in "3D holographic projection technology."

In physics, the Holographic Principle states that the description of a volume of space can be thought of as encoded on a lower-dimensional boundary of that region. The picture above shows an imaginary black hole. At a certain distance from the black hole lies a set of critical points that constitute the event horizon. Objects outside the event horizon can still overcome the black hole's gravity; in the picture, the shining circle represents the event horizon. The Holographic Principle holds that the information of all objects that fall into the black hole may be completely encoded on the surface of the event horizon.

By analogy, Hologres stores all of the information in the data "black hole" and performs various types of computing operations on it.

4. Introduction to the Core Technologies of Hologres

The Hologres architecture is very straightforward: it is a storage-compute separation architecture in which all data is stored in a shared distributed file system. The system architecture is shown in the following figure:

  1. The Backend serves as the service node that receives, stores, and queries data.
  2. The Frontend receives routed SQL statements and generates a logical execution plan. The optimizer then generates a distributed physical execution plan, which is delivered to the Backend for distributed execution.
  3. LBS performs load balancing on the connection side.
  4. The orange parts in the figure are all deployed in containers, so the entire distributed system is highly fault-tolerant.
  5. As mentioned earlier, Hologres is compatible with the PostgreSQL ecosystem. It can connect directly to open-source or commercial development/BI tools at the upper layer and can be used out of the box.


Storage-Compute Separation

Hologres uses a storage-compute separation architecture, allowing users to elastically scale storage or computing resources in or out based on their business needs.

In distributed storage, there are three common architectures:

  1. Shared Disk/Storage: Disks are mounted on a storage cluster, and each compute node can directly access these shared disks.
  2. Shared Nothing: Each compute node has its own attached storage, and nodes communicate with one another. However, disks are not shared among nodes, which wastes some resources.
  3. Storage Disaggregation: A storage cluster is treated as one large disk that every compute node can access. Each compute node also has a certain amount of local cache for recently accessed data, and compute nodes do not need to manage the storage cluster. This architecture enables flexible scaling and saves resources effectively (a minimal sketch of this idea follows the list).
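As a rough illustration of the disaggregated model, the Python sketch below shows a compute node reading data blocks from a shared remote store through a small local cache. The RemoteStore class, block names, and cache size are invented for demonstration and are not Hologres internals.

```python
from collections import OrderedDict

class RemoteStore:
    """Stands in for a shared distributed file system accessed over the network."""
    def __init__(self, blocks: dict):
        self.blocks = blocks

    def read_block(self, block_id: str) -> bytes:
        # In a real system this is a remote read with noticeable latency.
        return self.blocks[block_id]

class ComputeNode:
    """A compute node with a bounded local block cache (LRU eviction)."""
    def __init__(self, store: RemoteStore, cache_capacity: int = 2):
        self.store = store
        self.cache = OrderedDict()
        self.cache_capacity = cache_capacity

    def read(self, block_id: str) -> bytes:
        if block_id in self.cache:                 # cache hit: no remote I/O
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        data = self.store.read_block(block_id)     # cache miss: fetch from shared storage
        self.cache[block_id] = data
        if len(self.cache) > self.cache_capacity:  # evict the least recently used block
            self.cache.popitem(last=False)
        return data

store = RemoteStore({"b1": b"orders", "b2": b"users", "b3": b"items"})
node = ComputeNode(store)
node.read("b1"); node.read("b2"); node.read("b1"); node.read("b3")
```

Because the compute nodes hold no permanent state, either the storage tier or the compute tier can be scaled independently.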

Storage Based on Stream-Batch Unification

Hologres adopts unified storage based on stream-batch unification. In a typical Lambda architecture, real-time data is written to real-time storage by the real-time pipeline, and offline data is written to offline storage by the offline pipeline. Different queries then hit different storage systems and require merge operations, resulting in multiple storage overheads and complex merge logic at the application layer.

With Hologres, data can be collected and then processed through different pipelines, and the processing results can be written directly to Hologres with data consistency guaranteed. There is no need to distinguish between real-time tables and offline tables, which greatly reduces complexity as well as the learning cost for IT professionals.

Storage Engine

The underlying layer of Hologres supports two storage formats: row storage and column storage. Row storage is suitable for point queries based on the primary key (PK), and column storage is suitable for complex OLAP queries. Hologres handles the two formats slightly differently at the underlying layer, as shown in the following figure.

When data is ingested into Hologres, logs are written first and stored in the distributed file system to guarantee the integrity of all service data; even if a server breaks down, the data can be recovered from the logs in the distributed file system. After the log is written, the data is written to the MemTable, the in-memory table, at which point the system considers the write successful. The MemTable has a limited capacity; when it is full, its data is gradually flushed into files stored in the distributed file system. Row storage and column storage differ here: during flushing, row-store tables are flushed into a row-oriented file format and column-store tables into a column-oriented file format. After many flushes, many small files accumulate, and a background process compacts these small files into larger ones.
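This write path resembles a classic log-structured (LSM-style) design: log first, then MemTable, then flush, then compaction. The sketch below is a simplified illustration of that flow under stated assumptions (in-memory "log", a tiny MemTable limit, dictionaries standing in for files); it is not Hologres source code.

```python
import json

class StorageShard:
    """Simplified write path: write-ahead log -> MemTable -> flushed files -> compaction."""

    def __init__(self, memtable_limit: int = 3):
        self.wal = []                 # write-ahead log (durable in a real system)
        self.memtable = {}            # in-memory table, keyed by primary key
        self.flushed_files = []       # immutable files on the distributed file system
        self.memtable_limit = memtable_limit

    def write(self, pk, row):
        self.wal.append(json.dumps({"pk": pk, "row": row}))  # 1. log first for durability
        self.memtable[pk] = row                              # 2. apply to MemTable: write is now "successful"
        if len(self.memtable) >= self.memtable_limit:         # 3. flush when the MemTable is full
            self.flush()

    def flush(self):
        # Row-store tables would flush in a row-oriented format,
        # column-store tables in a column-oriented format.
        self.flushed_files.append(dict(self.memtable))
        self.memtable.clear()

    def compact(self):
        # Background compaction merges many small files into one larger file.
        merged = {}
        for f in self.flushed_files:
            merged.update(f)
        self.flushed_files = [merged]

shard = StorageShard()
for i in range(7):
    shard.write(i, {"value": i * 10})
shard.compact()
```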

Execution Engine

The Hologres execution engine is an all-round distributed query engine that focuses on optimizing high-concurrency, low-latency real-time queries. "All-round" means that all types of SQL queries can be expressed and executed efficiently by the Hologres execution engine. Some distributed query engines focus on optimizing common single-table real-time queries but do not perform well on complex queries across all tables; others support complex queries but perform poorly in real-time scenarios. Hologres aims to ensure high performance in all scenarios.

The Hologres execution engine can process various types of queries with high performance based on the following characteristics:

  1. The end-to-end asynchronous processing framework greatly alleviates the bottlenecks of high-concurrency systems, makes full use of resources, and hides the data-read latency introduced by storage-compute separation.
  2. A query is represented as an execution directed acyclic graph (DAG) composed of asynchronous operators, which makes it easy to connect to the query optimizer and to apply various query optimization techniques from the industry (a toy sketch follows this list).
  3. Vectorized execution is used as much as possible when processing data within an operator.
  4. Deep integration with the storage engine and a flexible execution model make full use of various indexes, and vectorization and delayed computation are maximized to avoid unnecessary data reads and computation.
  5. Adaptive incremental processing is applied to common real-time data query patterns.
  6. Unique optimizations are applied to certain query patterns.
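To give a feel for what a DAG of asynchronous operators looks like, here is a toy Python asyncio sketch of a scan -> filter -> aggregate pipeline. It only illustrates the asynchronous, operator-composed execution style; the data, operators, and scheduling are made up and bear no relation to the actual Hologres engine.

```python
import asyncio

async def scan(rows):
    """Leaf operator: asynchronously yields rows, as if reading from remote storage."""
    for row in rows:
        await asyncio.sleep(0)        # simulate asynchronous I/O from storage
        yield row

async def filter_op(source, predicate):
    """Intermediate operator: passes through rows matching the predicate."""
    async for row in source:
        if predicate(row):
            yield row

async def aggregate(source):
    """Root operator: sums the 'amount' column of all incoming rows."""
    total = 0
    async for row in source:
        total += row["amount"]
    return total

async def main():
    rows = [{"amount": a} for a in (5, 12, 7, 30)]
    dag = filter_op(scan(rows), lambda r: r["amount"] >= 10)  # compose the operator DAG
    print(await aggregate(dag))       # 42

asyncio.run(main())
```

Because every operator is asynchronous, a node can keep working on other queries while waiting on storage reads, which is what makes the end-to-end asynchronous framework effective under high concurrency.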

Optimizer

Hologres aims to provide an out-of-the-box experience: users can perform all daily business analysis through SQL statements without additional modeling or processing. Based on new hardware technologies, Hologres designs and implements its own computing and storage engines. The optimizer's role is to make the SQL statements users execute run efficiently on those computing engines.

The Hologres optimizer is a cost-based optimizer (CBO). It can generate complex federated-query execution plans and utilize the capabilities of multiple computing engines. In addition, many optimization methods have been developed through long-term experience with different business scenarios inside and outside Alibaba, so the Hologres computing engine can deliver top performance across different business scenarios.
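As a rough illustration of what "cost-based" means, the toy sketch below estimates the cost of two candidate join plans and picks the cheaper one. The candidate plans and the cost formulas are purely invented for demonstration and have nothing to do with the actual Hologres optimizer.

```python
# Toy cost-based plan selection: estimate the cost of candidate plans, pick the cheapest.

def estimate_cost(plan):
    if plan["join"] == "broadcast":
        # broadcast the small side to every node, then probe locally
        return plan["small_rows"] * plan["nodes"] + plan["big_rows"]
    if plan["join"] == "shuffle":
        # shuffle both sides by the join key
        return plan["small_rows"] + plan["big_rows"] * 2
    raise ValueError("unknown plan")

candidates = [
    {"join": "broadcast", "small_rows": 1_000, "big_rows": 10_000_000, "nodes": 32},
    {"join": "shuffle", "small_rows": 1_000, "big_rows": 10_000_000, "nodes": 32},
]
best = min(candidates, key=estimate_cost)
print(best["join"], estimate_cost(best))   # broadcast wins under this toy cost model
```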

HoloOS and HoloFlow

Blackhole, the core component of Hologres, is a storage and computing engine developed by Alibaba. It adopts an asynchronous programming model and is built on two underlying frameworks: HoloOS and HoloFlow.

HoloOS (HOS) is a flexible and efficient asynchronous framework extracted from the bottom layer of Blackhole. In addition to high performance, HOS implements load balancing and mitigates the long-tail problem during query execution. A variety of sharing and isolation mechanisms are also provided to use resources efficiently.

HOS has also been applied to distributed environments, on top of which HoloFlow, a distributed task scheduling framework, was developed. This brings the flexibility of single-machine scheduling to the distributed environment.

Frontend

As the access layer of Hologres, the Frontend is compatible with the PostgreSQL protocol and is responsible for accepting and processing user requests and managing metadata. However, PostgreSQL is a single-machine system and is less capable of handling highly concurrent user requests, whereas Hologres faces complex business scenarios and hundreds of millions of user requests. Therefore, the Frontend adopts a distributed architecture. It synchronizes information among multiple Frontends in real time through multi-version management and metadata synchronization, and it supports full linear scaling and ultra-high QPS through load balancing at the LBS layer.

Extended Execution Engine

Based on Frontend, Hologres also provides extended execution engines.

  1. P Query Engine (PQE): An executor that runs SQL statements and various functions. Hologres is compatible with the extension capabilities provided by Postgres and supports various extension components of the Postgres ecosystem, such as PostGIS and UDFs (PL/Java, PL/SQL, and PL/Python). It can meet the needs of different users in different scenarios and provide stronger computing capabilities.
  2. S Query Engine (SQE): It seamlessly connects to the MaxCompute (ODPS) executor to provide native access to MaxCompute. MaxCompute file formats can be accessed with high performance and full compatibility, with no data migration or importing required. It also supports complex tables, such as Hash/Range clustered tables, enabling interactive analysis on PB-scale offline data.

Ecosystem and Data Integration

Hologres is a real-time data warehouse featuring stream-batch unification. It supports real-time and offline data writing from multiple heterogeneous data sources, such as MySQL and DataHub, and sustains tens of millions of writes and queries per second. Moreover, data can be queried immediately after it is written. These capabilities are exposed through Hologres' JDBC interfaces.

On the interface side, Hologres is fully compatible with PostgreSQL, including its syntax, semantics, and protocol. The PostgreSQL JDBC driver can be used to connect to Hologres and read and write data. Currently, virtually all data tools on the market, such as BI and ETL tools, support the PostgreSQL JDBC driver, so Hologres was born with strong tool compatibility and a powerful ecosystem. Thus, Hologres provides a complete, closed-loop big data ecosystem from data processing to visualized data analysis.
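Because Hologres speaks the PostgreSQL protocol, any standard PostgreSQL client can connect to it. The snippet below is a minimal sketch using the Python psycopg2 driver; the endpoint, port, credentials, and the orders table are placeholders you would replace with your own, and the DDL is only illustrative.

```python
import psycopg2

# Connect to Hologres with a standard PostgreSQL driver.
# Host, port, database, and credentials below are placeholders.
conn = psycopg2.connect(
    host="your-hologres-endpoint",
    port=80,
    dbname="your_db",
    user="your_access_id",
    password="your_access_key",
)
conn.autocommit = True

with conn.cursor() as cur:
    # A hypothetical table used only for this example.
    cur.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id bigint PRIMARY KEY, amount numeric)"
    )
    # Data is queryable immediately after it is written.
    cur.execute("INSERT INTO orders (order_id, amount) VALUES (%s, %s)", (1001, 99.5))
    cur.execute("SELECT count(*), sum(amount) FROM orders")
    print(cur.fetchone())

conn.close()
```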

Online Service Optimization

Hologres is the best implementation practice of HSAP. In addition to analytical query processing, it has powerful online serving capabilities, such as key-value (KV) point queries and vector search. In the KV point-query scenario, Hologres easily and stably supports million-level QPS throughput with extremely low latency through SQL interfaces. In the vector search scenario, users can also use SQL statements to import vector data, create vector indexes, and run queries; no additional conversion is needed, and the performance is better than other products. Some non-analytical queries can also be applied in serving scenarios through reasonable table structures and the powerful indexing capabilities of Hologres.
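For the KV point-query pattern, the lookup is simply a primary-key SELECT issued over the same SQL interface. The sketch below reuses the psycopg2 connection style from the previous example with a hypothetical user_profile table; both the table and columns are assumptions for illustration.

```python
def get_user_profile(conn, user_id: int):
    """Point query by primary key; served with low latency when the table is row-oriented."""
    with conn.cursor() as cur:
        cur.execute(
            "SELECT user_id, tags, last_visit FROM user_profile WHERE user_id = %s",
            (user_id,),
        )
        return cur.fetchone()

# In a serving scenario this function would sit behind an online service
# and be called at high QPS, e.g.:
# profile = get_user_profile(conn, 123456789)
```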

5. Upgrades on the Data Warehouse Architecture

Based on Hologres, architecture upgrades have been made in multiple business scenarios, and the business architecture has been simplified, as shown in the following figure:

6. Summary

As a cloud-native, next-generation real-time data warehouse, Hologres, together with Flink, implemented "stream-batch unification" for the first time in core data business scenarios during the 2020 Double 11 Global Shopping Festival. It passed the tests of stability and performance and made the full business pipeline real-time. Its millisecond-level data processing capability provides merchants and consumers with a more intelligent consumption experience.

With the development of business and technologies, Hologres will continue to improve its core technological competitiveness, realize the goal of service-analysis unification, and bring more services and value to users.

More articles about the core technologies above will be released. Please stay tuned for more information.

About the Author

Xiaojun Jin (Xianyin) is a Senior Technical Expert at Alibaba Cloud. He has ten years of experience in the big data field and is currently engaged in the design and R&D of Hologres.
