Automatic request distribution among row store and column store nodes - PolarDB

If your application sends requests to access both OLAP and OLTP services, and you want the requests to be automatically distributed based on execution costs, you must configure the read/write mode of the cluster endpoint and enable the automatic request distribution feature. When the automatic request distribution feature is enabled, the database proxy automatically distributes requests based on the estimated number of scanned rows to accelerate request processing. When the estimated number of rows scanned by an SQL request exceeds the threshold, the request is distributed to a column store node. When the estimated number of rows scanned by an SQL request does not exceed the threshold, the request is distributed to a row store node or the primary node.

Automatic distribution solution

The database proxy of PolarDB for MySQL determines whether to distribute an SQL request to a column store node based on whether the estimated number of rows scanned by the request exceeds the specified threshold. This can fully leverage the performance of the row store and column store nodes.

Request distribution rules:

OLTP service: generally includes read and write requests. All write requests are processed by the primary node. Read requests are processed by the read-only row store node or primary node.
OLAP service: generally includes only read requests. All read requests are processed by the read-only column store node.

Automatic request distribution solution:

Request distribution between the primary node and read-only column store node: Because the primary node is also in row store mode, it can also process OLTP read requests. In this solution, write requests and OLTP read requests are distributed to the primary node, and OLAP read requests are distributed to the read-only column store node.
Request distribution between the read-only row store node and read-only column store node: In this solution, write requests are distributed to the primary node, OLTP read requests are distributed to the read-only row store node or primary node, and OLAP read requests are distributed to the read-only column store node.

混合下分流

Limits

Your cluster contains at least one read-only column store node and one row store node.

Step 1: Enable automatic request distribution among column store and row store nodes

Log on to the PolarDB console.
In the upper-left corner of the console, select the region in which the cluster that you want to manage is deployed.
Find the cluster and click the cluster ID.
In the Enterprise Edition section of the Basic Information page, find the cluster endpoint and click Modify on the right side of the cluster endpoint.
Select the appropriate Read/Write.
- The read/write mode of the cluster endpoint is set to Read/Write (Automatic Read/Write Splitting).
- The read/write mode of the cluster endpoint is set to Read-only, and the load balancing policy is set to Active requests-based load balancing.
In the Node Settings section, select the primary node and read-only row store and column store nodes that are used to process requests. In the HTAP Optimization section, enable Transactional/Analytical Processing Splitting. Click OK.
Note
You must select at least one read-only column store node in Node Settings.
Example 1: In the following figure, one primary node, one read-only row store node, and two read-only column store nodes are selected. In this case, the system distributes queries based on the following policy:
- Write requests are distributed to the primary node.
- OLAP read requests are distributed to the read-only column store nodes.
- OLTP read requests are distributed to the read-only row store node. When Primary Node Accepts Read Requests in the SLB Settings section is set to Yes, OLTP read requests may also be distributed to the primary node.
Example 2: In the following figure, one primary node and one read-only column store node are selected. When the automatic request distribution feature is enabled, write requests and OLTP read requests are distributed to the primary node, and OLAP read requests are distributed to the read-only column store node.
Note
In Read/Write (Automatic Read/Write Splitting) mode, all write requests are distributed only to the primary node, regardless of whether the primary node is selected in the Node Settings section.

Step 2: Specify the thresholds for automatic request distribution

After you enable automatic request distribution, you must specify the threshold for the number of scanned rows for an SQL statement. After the threshold is specified, the database proxy determines to which nodes requests are distributed based on this threshold. When the estimated number of rows scanned by an SQL request exceeds the threshold, the request is distributed to a column store node. When the estimated number of rows scanned by an SQL request does not exceed the threshold, the request is distributed to a row store node or the primary node.

The parameters described in the following table determine the thresholds related to the execution costs. On the Parameters page of the cluster, you can modify the values of the parameters.

Parameter	Description
loose_imci_ap_threshold	The threshold that determines whether SQL statements are distributed to column store nodes. Default value: 50000. Note For example, when the default value is used, if the number of rows scanned by an SQL statement is estimated to be greater than 50,000, the statement is distributed to a column store node.
loose_cost_threshold_for_imci	The threshold that determines whether SQL statements are executed by using column store execution plans in column store nodes. Default value: 50000. Note For example, when the default value is used, if the number of rows scanned by an SQL statement is estimated to be greater than 50,000, the statement is executed by using a column store execution plan. Otherwise, a row store execution plan is used.

You can execute the SHOW STATUS LIKE 'Last_query_cost statement to query the estimated execution cost of the last SQL statement and use the queried value to modify the parameter values.

Note

If you use a cluster endpoint to connect to a database, you must add the /* ROUTE_TO_LAST_USED*/ hint before the SHOW STATUS LIKE 'Last_query_cost' statement to ensure that the estimated execution cost of the last SQL statement is queried on the expected node. Example:

/*ROUTE_TO_LAST_USED*/SHOW STATUS LIKE 'Last_query_cost';

Example:

SHOW STATUS LIKE 'Last_query_cost';

Sample result:

+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| Last_query_cost      | 2     |
+----------------------+-------+
1 row in set (0.01 sec)

The query result indicates that the execution cost of the SQL statement is 2.

To distribute the SQL statement to a column store node and execute it by using a column store execution plan, you must set the loose_imci_ap_threshold and loose_cost_threshold_for_imci parameters to 1.

Use `hints` to forcefully execute a request by using a column store execution plan or a row store execution plan

If the outcome of automatic request distribution does not meet your expectations, you can use hints to forcefully execute a request by using a column store execution plan or a row store execution plan.