PolarDB provides the following consistency levels: eventual consistency, transaction consistency, and global consistency. You can select a consistency level based on your business requirements in different scenarios.
Issues and solutions
The primary-secondary replication feature of MySQL allows PolarDB to asynchronously synchronize the binary logs of the primary database to secondary databases. PolarDB also parses and applies binary logs in secondary databases in real time. This ensures data consistency in secondary databases, reduces the loads of the primary database, and ensures high availability.
Data in the secondary databases can be queried. However, the following issues occur:
- Typically, the primary database and secondary databases provide different endpoints. When you access different databases, you must modify the code in your applications to change the endpoint that is used to connect to databases.
- MySQL implements asynchronous replication. Therefore, data in the secondary databases is not the latest and replication delay exists. This causes data inconsistency between the primary database and secondary databases.
To fix the first issue, MySQL provides a proxy that supports read/write splitting. Typically, the proxy establishes connections from applications to MySQL and parses each Structured Query Language (SQL) statement. Write requests, such as UPDATE, DELETE, INSERT, and CREATE operations, are forwarded to the primary database by the proxy. The SELECT operations are forwarded to secondary databases.
However, the read/write splitting feature cannot fix the data inconsistency issue that is caused by a replication delay. The replication delay increases if the loads on databases are heavy. For example, when you execute data definition language (DDL) statements to add columns to a large table or insert a large amount of data. In this case, you cannot retrieve the latest data from read-only nodes.
PolarDB achieves data synchronization between the primary node and read-only nodes by implementing asynchronous physical replication. After the data on the primary node is updated, the updates are synchronized to read-only nodes. The replication delay depends on the write loads (in milliseconds) on the primary node. The asynchronous replication ensures eventual data consistency between the primary node and read-only nodes. PolarDB provides the following consistency levels to meet your business requirements in different scenarios:
PolarDB runs in a read/write splitting architecture. Traditional read/write splitting ensures only eventual consistency. The retrieved results from different nodes may be different due to a primary-secondary replication delay. For example, if you execute the following statements within a session, the SELECT statement may return different results. The actual result depends on the replication delay.
INSERT INTO t1(id, price) VALUES(111, 96); UPDATE t1 SET price = 100 WHERE id=111; SELECT price FROM t1;
To reduce loads on the primary node and send as many read requests to read-only nodes as possible, we recommend that you select eventual consistency.
To solve the data inconsistency issue caused by eventual consistency, requests are split in most cases. The requests that require high consistency are sent to the primary node. The requests that accept eventual consistency are sent to read-only nodes by using the read/write splitting feature. However, this increases the loads on the primary node, reduces the performance of read/write splitting, and complicates the application development.
To solve the issue, PolarDB provides session consistency that is also known as causal consistency. In a session, session consistency makes sure that you can obtain the data that is updated before read requests are sent. This ensures the monotonicity of data.
PolarDB uses a proxy to achieve read/write splitting. The proxy tracks redo logs that are applied on each node and records each log sequence number (LSN). When the data in the primary node is updated, PolarDB records the LSN of the new update as a session LSN. When a new read request arrives, PolarDB compares the session LSN with the LSN on each node and forwards the request to a node where the LSN is greater than or equal to the session LSN. This ensures session consistency. PolarDB implements efficient physical replication.
To ensure efficient synchronization, the read-only node returns the result to the client when replications to other read-only nodes are being processed. This allows read-only nodes to update data before subsequent read requests arrive. PolarDB achieves read consistency without the need to handle heavy loads on the primary node. In most scenarios that require more read requests than write requests, this mechanism ensures session consistency, read/write splitting, and load balancing.
A higher consistency level of PolarDB indicates heavier loads on the primary database and lower cluster performance. We recommend that you use session consistency. This consistency level minimizes the impact on cluster performance and meets the requirements of most scenarios.
In some scenarios, dependencies exist within a session and among sessions. For example, if you use a connection pool, the requests of the same thread may be sent by using different connections. These requests belong to different sessions in the database. However, these requests depend on each other in the business process. In this case, session consistency cannot ensure the data consistency. Therefore, PolarDB provides global consistency to solve this issue.
When each read request arrives at the PolarDB database proxy, the proxy first checks the latest LSN on the primary node. For example, the latest LSN is LSN0. Batch internal optimizations are performed to reduce the number of times that the proxy queries the latest LSN on the primary node. Then, after the LSNs of all read-only nodes are updated to LSN0, the proxy sends the read request to read-only nodes. This way, the latest data that is updated before the read request is sent can be returned.The following parameters for global consistency are available.
Parameter Description ConsistTimeout Global Consistency Timeout: The allowed amount of time that is consumed when the LSNs of read-only nodes are updated to the latest LSN of the primary node. If the consumed time exceeds the specified value, the PolarDB proxy performs the operation that is specified by the ConsistTimeoutAction parameter.
Valid values: 0 to 300000. Default value: 20. Unit: milliseconds.
ConsistTimeoutAction Global Consistency Timeout Policy: If the LSNs of read-only nodes fail to be updated to the latest LSN of the primary node within the time period specified by the ConsistTimeout parameter. The PolarDB proxy performs the operation that is specified by the ConsistTimeoutAction parameter.
Note For more information about how to modify Global Consistency Timeout and Global Consistency Timeout Policy, see Modify a cluster endpoint.
- 0: The proxy sends read requests to the primary node. This is the default value.
- 1: The proxy returns the error message of
wait replication complete timeout, please retryto the application.
If the primary-secondary replication delay is high, more requests may be forwarded to the primary node when you use global consistency. This increases the loads on the primary node, and may increase service latency. Therefore, in scenarios that require more read requests than write requests, we recommend that you use global consistency.
Best practices for consistency levels
- A higher consistency level of a PolarDB cluster indicates lower cluster performance. We recommend that you use session consistency. This consistency level minimizes the impact on cluster performance and meets the requirements of most scenarios.
- If you require high data consistency among different sessions, you can select one
of the following solutions:
- Use hints to forcibly send specific queries to the primary node, as shown in the following
/*FORCE_MASTER*/ select * from user;Note
- If you need to execute the preceding statement with a hint in MySQL command line interface (CLI), add the -c parameter. Otherwise, the hint becomes invalid because the CLI of MySQL ignores the hint. For more information, see mysql Client Options.
- Hints have the highest priority and are not limited by consistency levels and transaction splitting. Before you use hints, perform an evaluation.
- The hints cannot contain statements that change environment variables. For example,
if you use
/*FORCE_SLAVE*/ set names utf8;, errors may occur.
- Use global consistency.
- Use hints to forcibly send specific queries to the primary node, as shown in the following statement.