AnalyticDB for PostgreSQL allows you to scale out coordinator nodes to provide better performance than that of the single-coordinator architecture. Multiple coordinator nodes work with scalable compute nodes to further improve system capabilities such as the number of connections and the read and write performance. The multi-coordinator architecture can better meet the requirements of real-time data warehousing and hybrid transaction/analytical processing (HTAP) scenarios.

Background information

HTAP is a term coined by Gartner, Inc. in 2014. It refers to a database system that combines online transaction processing (OLTP) and online analytical processing (OLAP) capabilities. Traditionally, OLTP and OLAP tasks are handled by separate database systems. The raw data stored in the OLTP system is asynchronously transmitted to the OLAP system by methods such as offline import, extract, transform, and load (ETL) jobs, or Data Transmission Service (DTS). The data can then be analyzed in the OLAP system.

In earlier versions, AnalyticDB for PostgreSQL played an important role in OLAP scenarios and also provided OLTP capabilities. As HTAP has become more popular, AnalyticDB for PostgreSQL has optimized its OLTP performance since V6.0. The multi-coordinator architecture removes the limit imposed by a single coordinator node. The scaled-out OLTP performance can better meet the requirements of real-time data warehousing and HTAP scenarios.

Multi-coordinator architecture design

Compared with the single-coordinator architecture, the multi-coordinator architecture contains a role named secondary coordinator node in addition to the primary and standby coordinator nodes. Secondary coordinator nodes provide capabilities similar to those of the primary coordinator node to process requests such as DDL and DML statements. You can add secondary coordinator nodes to improve overall system performance.

The following table describes the coordinator node roles and their capabilities.

| Role | Description |
| --- | --- |
| Global Transaction Manager (GTM) | Provides global transaction IDs and snapshots for each transaction. GTM is a core component to process distributed transactions. |
| Fault Tolerance Service (FTS) | Examines the health status of compute nodes and secondary coordinator nodes, and performs failover between the primary and secondary compute nodes when the primary compute node fails. |
| Catalog | Stores system tables that contain global metadata. |
| Primary coordinator node | Receives business requests and distributes workloads across different compute nodes for distributed processing. The primary coordinator node also hosts the GTM, FTS, and Catalog roles. |
| Standby coordinator node | Serves as a standby for the primary coordinator node. If the primary coordinator node fails, the standby coordinator node takes over as the new primary coordinator node. |
| Secondary coordinator node | Provides capabilities similar to those of the primary coordinator node to receive business requests and distribute workloads to compute nodes. GTM proxies on the secondary coordinator nodes communicate with the GTM on the primary coordinator node and all the compute nodes to process distributed transactions. |
Note: An upper-level Server Load Balancer (SLB) instance uses weight-based load balancing to distribute workloads between the primary and secondary coordinator nodes. If the primary coordinator node has the same specifications as the secondary coordinator nodes, you can assign it a smaller weight so that it receives fewer business requests than the secondary coordinator nodes. This reserves more processing capacity for the GTM and FTS.
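The effect of assigning a smaller weight to the primary coordinator node can be illustrated with a minimal sketch. The node names, the weights, and the use of weighted random selection are all assumptions made for illustration. The actual SLB algorithm and its configuration are not described here.

```python
import random
from collections import Counter

# Hypothetical node names and weights (assumptions for illustration only).
# The primary gets a smaller weight to leave headroom for the GTM and FTS.
COORDINATOR_WEIGHTS = {
    "primary": 1,
    "secondary-1": 3,
    "secondary-2": 3,
}


def pick_coordinator(weights, rng):
    """Pick one coordinator node with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def simulate(weights, requests, seed=0):
    """Route `requests` simulated connections and count them per node."""
    rng = random.Random(seed)
    return Counter(pick_coordinator(weights, rng) for _ in range(requests))
```

Calling `simulate(COORDINATOR_WEIGHTS, 7000)` routes roughly one seventh of the simulated connections to the primary coordinator node and about three sevenths to each secondary coordinator node.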

Comparison between the single-coordinator and multi-coordinator architectures

AnalyticDB for PostgreSQL uses coordinator nodes and compute nodes to handle different types of tasks.

  • Coordinator nodes: receive business requests, optimize queries, distribute workloads, and manage metadata and transactions.
  • Compute nodes: compute and store data.

When a coordinator node receives a query request, it parses and optimizes the SQL statement and then distributes an optimized execution plan to compute nodes. Compute nodes read the data stored on them, complete tasks such as data computation and data shuffling, and then return the computing results to coordinator nodes. When a coordinator node receives a request to create a table or write data, it manages metadata and transactions and then coordinates workloads between compute nodes.
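The query flow described above can be sketched as a scatter-gather pattern. This is a simplified illustration, not the actual execution engine: the node names, the data layout, and the `plan_query` helper are hypothetical, and data shuffling between compute nodes is omitted.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data layout: each compute node stores one slice of a table.
COMPUTE_NODES = {
    "compute-0": [3, 8, 1],
    "compute-1": [7, 2],
    "compute-2": [5, 9, 4],
}


def plan_query(sql):
    """Stand-in for parsing and optimization on the coordinator node.

    Only queries that start with 'SELECT sum' or 'SELECT count' are
    recognized in this sketch.
    """
    normalized = sql.strip().lower()
    if normalized.startswith("select sum"):
        return {"partial": sum, "final": sum}
    if normalized.startswith("select count"):
        return {"partial": len, "final": sum}
    raise ValueError(f"unsupported query in this sketch: {sql!r}")


def run_on_compute_node(rows, plan):
    """Each compute node applies the plan to the rows it stores locally."""
    return plan["partial"](rows)


def coordinator_execute(sql):
    """Plan once, fan out to every compute node, then combine the partial results."""
    plan = plan_query(sql)
    with ThreadPoolExecutor() as pool:
        partials = list(
            pool.map(lambda rows: run_on_compute_node(rows, plan),
                     COMPUTE_NODES.values())
        )
    return plan["final"](partials)
```

In this sketch, the coordinator node does only planning and the final merge, while the per-row work runs on the compute nodes in parallel.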

Figure 1. Single-coordinator architecture

AnalyticDB for PostgreSQL is developed based on Greenplum Database. Earlier versions of AnalyticDB for PostgreSQL used the same single-coordinator architecture as Greenplum Database. In this architecture, typically only the primary coordinator node processes requests, and the standby coordinator node is provided for high availability. If the primary coordinator node fails, the standby coordinator node takes over as the new primary coordinator node.

As real-time data warehousing and HTAP requirements increase, the single-coordinator architecture runs into several bottlenecks. The final stage of query processing consumes CPU and memory resources on the coordinator node. Inserting, updating, or deleting large amounts of data in real time requires high write performance. If a large number of connections are established concurrently, a single coordinator node may be unable to handle the load. Scaling up the coordinator node mitigates these issues but does not fundamentally resolve them.

Figure 2. Multi-coordinator architecture

The multi-coordinator architecture allows you to scale out coordinator nodes to increase processing capacity and meet the requirements of real-time data warehousing and HTAP scenarios. The preceding figure shows the multi-coordinator architecture, in which multiple secondary coordinator nodes are added to improve performance. The standby coordinator node is retained to ensure high availability.

To ensure OLTP capabilities, AnalyticDB for PostgreSQL includes the following optimizations for the multi-coordinator architecture:

  • Enhanced distributed transaction capabilities that work across multiple coordinator nodes.
  • New and modified algorithms for global deadlock detection, DDL operations, and distributed table-level locks.
  • Redesigned fault tolerance and high-availability capabilities for AnalyticDB for PostgreSQL instances.
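
To illustrate how distributed transactions can span multiple coordinator nodes, the following minimal sketch models a GTM that issues global transaction IDs and snapshots, with GTM proxies on secondary coordinator nodes that forward requests to it. The class names, method names, and visibility rule are assumptions for illustration. The actual protocol between the GTM proxies, the GTM, and the compute nodes is not described here.

```python
import threading


class GlobalTransactionManager:
    """Toy GTM: issues global transaction IDs and snapshots."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_xid = 1
        self._active = set()

    def begin(self):
        """Assign the next global transaction ID and mark it in progress."""
        with self._lock:
            xid = self._next_xid
            self._next_xid += 1
            self._active.add(xid)
            return xid

    def snapshot(self):
        """Return (next unassigned ID, IDs of transactions still in progress)."""
        with self._lock:
            return self._next_xid, frozenset(self._active)

    def commit(self, xid):
        """Mark a transaction as finished so that later snapshots can see it."""
        with self._lock:
            self._active.discard(xid)


class GtmProxy:
    """Toy GTM proxy on a secondary coordinator node: forwards calls to the GTM."""

    def __init__(self, gtm):
        self._gtm = gtm

    def begin(self):
        return self._gtm.begin()

    def snapshot(self):
        return self._gtm.snapshot()

    def commit(self, xid):
        self._gtm.commit(xid)


def is_visible(xid, snapshot):
    """Check whether a transaction had finished when the snapshot was taken.

    Simplification: every finished transaction is assumed to have committed.
    """
    next_xid, active = snapshot
    return xid < next_xid and xid not in active
```

Because every proxy forwards to the same GTM, two transactions that start on different coordinator nodes still receive distinct, ordered IDs and consistent snapshots.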