This topic describes the following features of PolarProxy: consistency level, transaction splitting, offloading of reads from the primary node, connection pool, hint, and parallel query.
Apsara PolarDB architecture and overview
Apsara PolarDB adopts a cluster architecture. Each cluster contains a primary node and multiple read-only nodes. By default, Apsara PolarDB provides two types of endpoints: primary endpoints and cluster endpoints. PolarProxy provides the cluster endpoint feature. Cluster endpoints include both read/write endpoints and read-only endpoints. Read/write cluster endpoints support read/write splitting. Read-only cluster endpoints evenly distribute connections to read-only nodes. For more information about read/write splitting, see Read-write splitting. The following sections describe the features of PolarProxy.
MySQL uses asynchronous replication to synchronize data between the primary and secondary nodes. In read/write splitting mode, a read request that follows a write request may fail to obtain the latest write result, leading to data inconsistency. Apsara PolarDB provides three levels of consistency:
- Eventual consistency
Apsara PolarDB transmits data from a primary node to read-only nodes by using asynchronous physical replication. Updated data in the primary node is copied to read-only nodes with a latency of milliseconds to achieve eventual consistency. Latency may differ with write workloads upon Apsara PolarDB clusters. This level does not guarantee data consistency when you perform read/write splitting on an endpoint.
- Session consistency (default)
Apsara PolarDB uses fast physical replication to avoid inconsistent queries associated with eventual consistency. The internal database proxy sends queries to read-only nodes who have updated data to maintain session consistency.Note Session consistency can ensure the consistency of requests only within a session. Session consistency cannot guarantee the consistency between different sessions. If none of the read-only nodes meet the consistency requirement, read requests are routed to the primary node. This may place a heavy read workload on the primary node.
- Global consistency
The read/write splitting module of Apsara PolarDB provides session consistency to ensure the causal consistency of requests within the same session. However, in some cases, such as when short-lived connections or connection pools are used, logical dependency is also needed between sessions. In such cases, global consistency is required. PolarProxy tracks the replication endpoint to achieve global consistency. If the primary-to-secondary latency is high, more requests are routed to the primary node, which may place a heavy workload on the primary node and increase service latency.
At the default Read Committed isolation level, Apsara PolarDB does not start a transaction immediately after it receives a transactional statement. You can use BEGIN or SET AUTOCOMMIT=0 to check that Apsara PolarDB starts the transaction after a write operation occurs.
Typically, the database proxy sends all requests of a transaction to the primary node. However, requests are encapsulated in transactions due to the framework. As a result, the primary node is overloaded. To resolve this issue, you can enable the transaction splitting feature. When this feature is enabled, Apsara PolarDB proxy identifies the current transaction status and distributes read requests to read-only nodes by using Server Load Balancer (SLB) before the transaction is started.
- In read/write splitting mode, transaction splitting is enabled by default.
- If you enable transaction splitting, consistency cannot be ensured for some services. We recommend that you evaluate whether transaction splitting is suitable for your business before you enable it.
Offloading of reads from the primary node
If offload reads from primary node is enabled for the primary node, normal read requests are no longer routed to the primary node. In transactions, read requests that require consistency are routed to the primary node to meet your business needs. Additionally, if all read-only nodes fail, the read requests are routed to the primary node. If your business does not require high consistency, set Consistency Level to Eventual Consistency to reduce read requests that are routed to the primary node. You can also use transaction splitting to reduce read requests that are routed to the primary node before the transaction is started. Broadcast requests, such as set and prepare, are still routed to the primary node.
Apsara PolarDB provides session-level connection pool and transaction-level connection pool.
- Session-level connection pool
Session-level connection pools apply to to PHP short-lived connection scenarios.
The PHP short-lived connection optimization can reduce heavy MySQL loads by establishing fewer short-lived connections. If your connection is disconnected, the system determines whether the current connection is an idle connection. If the connection is idle, the system retains the connection in the connection pool for a short period of time. If a new connection needs to be established during this process, it is obtained from the connection pool when conditions such as the user, client IP, and database name are matched. The PHP short-lived connection optimization reduces the overhead needed to establish a connection to a database. If there is no available connection to the database, a new connection is established.Note This optimization does not reduce the number of concurrent connections to the database, but reduces only the frequency at which connections are established between applications and the database. This reduces the overhead of the main MySQL thread to more efficiently process business requests. However, idle connections in the connection pool account for a share of your connections for a temporary period of time.
- Transaction-level connection pool
Transaction-level connection pools apply to scenarios where tens of thousands of connections exist.
A transaction-level connection pool is used to both reduce the number of business connections and the load caused by frequent new short-lived connections. A client can establish a large number of connections to the proxy, but the proxy creates only a small number of connections to a database. When a client sends a connection request, the proxy selects a connection that meets conditions and sends the request to the database. Currently, system variable consistency is required. After the current transaction ends, the proxy stores the connection in the connection pool.
A transaction connection pool has the following limits:
- When you perform the following operations, the connection is locked until it is ended.
The connection is no longer stored in the connection pool for use by other users.
- Execute the PREPARE statement.
- Create a temporary table.
- Modify user variables.
- Send large packets (for example, 16 MB or more in size).
- Execute the LOCK TABLE statement.
- Execute multiple statements.
- Call stored procedures.
- The FOUND_ROWS, ROW_COUNT, and LAST_INSERT_ID functions can be called successfully, but the accuracy of the results cannot be guaranteed.
- For a connection with wait_timeout settings, the wait_timeout status on the client may not take effect because the connection is obtained from the connection pool for each request. After the period specified by wait_timeout elapses, only backend connections in the connection pool are disconnected, but the client remains connected.
- If the business depends on session-level system variables other than
time_zone, you must execute the SET statement in explicit mode on the client after connections are established. Otherwise, the connection pool may reuse connections whose system variables have been changed.
- Connections may be reused. Therefore, the actual thread ID of the current request
may be different from the one returned from
- Connections may be reused. Therefore, the IP address and port number returned from the SHOW PROCESSLIST statement may be different from the actual IP address and port number of the client.
- The database proxy merges and returns the results of the SHOW PROCESSLIST statement on all nodes. After the transaction-level connection pool is enabled, the thread IDs of frontend and backend connections cannot match with each other. The KILL statement may return an error even if it is executed. You can execute the SHOW PROCESSLIST statement to check whether the connection is disconnected.
- When you perform the following operations, the connection is locked until it is ended. The connection is no longer stored in the connection pool for use by other users.
/* FORCE_MASTER * / or
/* FORCE_SLAVE * / before an SQL statement to specify the direction to which you want to route the SQL
SELECT * FROM test is routed to a read-only node by default. If the SQL statement is changed to
/* FORCE_MASTER */ SELECT * FROM test, it is routed to the primary node.
- Hint has the highest level of routing priority and is not constrained by the consistency level or transaction splitting. Evaluate this configuration before using it.
- A hint must not contain statements that change environment variables such as
/*FORCE_SLAVE*/ SET NAMES utf8;. Otherwise, an error may occur in the subsequent procedure.