Architecture

Apsara PolarDB runs in a cluster architecture. Each cluster contains a primary node and one or more read-only nodes. Clients can connect to an Apsara PolarDB cluster by using two types of endpoints: the cluster endpoint and the primary endpoint. We recommend that you use the cluster endpoint because it allows all nodes in the cluster to share the same endpoint and achieve read/write splitting.

Figure: Read/write splitting of PolarDB for PostgreSQL

Replicating data from the primary node to read-only nodes is straightforward: you only need to asynchronously ship the write-ahead log (WAL) of the primary node to the read-only nodes. This replication allows read-only nodes to process queries, which reduces the workload on the primary node and ensures high availability.
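
As a quick way to tell which role a node plays, you can check whether it is still applying WAL in recovery mode. The following is a minimal sketch that assumes the standard PostgreSQL function pg_is_in_recovery() is exposed on the node that you are connected to.

-- Returns false on the primary node and true on a read-only node.
SELECT pg_is_in_recovery();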

However, if read-only nodes are used to process queries, you need to consider the following issues:

  1. Clients can connect to the primary node and read-only nodes through two different endpoints. You must specify the appropriate endpoint for each connection in your applications.
  2. PolarDB for PostgreSQL replicates data asynchronously. Data may not be synchronized to read-only nodes immediately after a client commits data modifications. As a result, data on read-only nodes may not be up-to-date, which means that data consistency is not guaranteed.

To solve issue 1, Apsara PolarDB provides a read/write splitting proxy. The proxy establishes connections with clients on behalf of the PolarDB for PostgreSQL cluster and parses each query that the clients send. It routes write requests, such as UPDATE, DELETE, INSERT, and CREATE statements, to the primary node, and read requests, such as SELECT statements, to read-only nodes, as the following snippet illustrates.
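
The statements below only illustrate this routing rule; the orders table is a hypothetical example and not part of any PolarDB schema.

-- Write requests: routed to the primary node by the proxy.
CREATE TABLE orders (id INT PRIMARY KEY, amount NUMERIC);
INSERT INTO orders (id, amount) VALUES (1, 42.0);
UPDATE orders SET amount = 50.0 WHERE id = 1;
-- Read requests: routed to a read-only node by the proxy.
SELECT amount FROM orders WHERE id = 1;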

Figure: WAL for read/write splitting in PolarDB for PostgreSQL

However, this does not resolve the data inconsistency caused by replication latency. If you execute a SELECT statement to retrieve data from a read-only node, the returned data may be inconsistent with the data stored on the primary node. If the load on a PolarDB for PostgreSQL cluster is light, the replication latency can be reduced to less than five seconds. Under heavy load, the latency may increase significantly, for example, when the cluster executes data definition language (DDL) statements that add columns to large tables or inserts a large amount of data.
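
To get a rough idea of the current replication latency, you can compare WAL positions on the primary node and a read-only node. This is a minimal sketch that assumes the standard PostgreSQL functions pg_current_wal_lsn(), pg_last_wal_replay_lsn(), and pg_wal_lsn_diff() and the pg_stat_replication view are exposed by your cluster.

-- On the primary node: the most recently generated WAL position.
SELECT pg_current_wal_lsn();
-- On a read-only node: the most recently replayed WAL position.
SELECT pg_last_wal_replay_lsn();
-- On the primary node: per-node replication lag in bytes.
SELECT application_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;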

Eventual consistency and session consistency

  • Eventual consistency: Apsara PolarDB synchronizes data from the primary node to read-only nodes through asynchronous physical replication. Updates on the primary node are replicated to read-only nodes, in most cases with a latency of a few milliseconds. The latency depends on the write load on the primary node. Asynchronous replication therefore provides eventual consistency.
  • Session consistency: Session consistency resolves the data inconsistency that can occur before eventual consistency is reached. Because physical replication is fast, Apsara PolarDB can forward each query to a read-only node that has already replayed the updates made in the session. For more information, see How it works.

Session consistency based on read/write splitting

Apsara PolarDB runs in a read/write splitting architecture. Traditional read/write splitting ensures only eventual consistency: because of the latency of data replication from the primary node to read-only nodes, different read-only nodes may return different results for the same query. For example, assume that you execute the following statements within a session:

INSERT INTO t1(id, price) VALUES(111, 96);
UPDATE t1 SET price = 100 WHERE id=111;
SELECT price FROM t1;

In this example, the last query may return a stale result because Apsara PolarDB may send the SELECT request to a read-only node to which the update has not yet been replicated. Without session consistency, you have to work around this problem on the client side. The most common solution is to divide your workloads: for workloads that require strong consistency, send all requests to the primary node; for workloads that can tolerate eventual consistency, send write requests to the primary node and read requests to read-only nodes. However, this makes application development more complicated, increases the load on the primary node, and reduces the benefit of read/write splitting.

To address this issue, Apsara PolarDB provides session consistency. Within a session, requests are sent to read-only nodes where data has been updated. In this way, queries only hit the up-to-date data on read-only nodes.

How it works

Apsara PolarDB uses a middle layer (the proxy) to achieve read/write splitting. The proxy tracks the WAL (redo) records that are applied on each node and records their log sequence numbers (LSNs). When data is updated on the primary node, the LSN of the new update is recorded as the session LSN. When a new read request arrives, the proxy compares the session LSN with the LSNs of the read-only nodes and forwards the request to a read-only node whose LSN is greater than or equal to the session LSN. This achieves session consistency. To keep synchronization efficient, a read-only node returns its result to the client while replication to the other read-only nodes is still in progress, which allows those nodes to catch up before subsequent read requests arrive. In most scenarios where reads outnumber writes, this mechanism provides session consistency, read/write splitting, and load balancing.
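
Conceptually, the check that the proxy performs can be expressed with standard PostgreSQL WAL functions. The sketch below is only an illustration under that assumption; the LSN literal '0/3000060' is a hypothetical session LSN, and the proxy performs an equivalent comparison internally rather than running these queries.

-- On the primary node, after a write in the session: record the session LSN.
SELECT pg_current_wal_lsn() AS session_lsn;
-- On a candidate read-only node: the node may serve this session's reads
-- if its replayed LSN is greater than or equal to the session LSN.
SELECT pg_wal_lsn_diff(pg_last_wal_replay_lsn(), '0/3000060') >= 0 AS can_serve_session;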

Best practices for consistency level selection

We recommend that you use session consistency. This consistency level has only a minimal effect on cluster performance and supports most scenarios.

If you require consistency across sessions, you can apply the following solution:

Use hints to forcibly redirect specific queries to the primary node.

Example: /*FORCE_MASTER*/ SELECT * FROM user;
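
For example, a query that must always observe the latest committed data, even when that data was written in a different session, can be pinned to the primary node with the same hint. The t1 table below reuses the earlier example.

-- The hint forces this query to the primary node, so it reflects the latest committed write.
/*FORCE_MASTER*/ SELECT price FROM t1 WHERE id = 111;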