All Products
Search
Document Center

PolarDB:Row-column routing mechanism

Last Updated:Sep 23, 2024

This topic describes how to use the row-column routing mechanism to separate the traffic of online transaction processing (OLTP) and online analytical processing (OLAP) services.

Background information

PolarDB-X provides the following three types of instances:

  • Primary instance: A primary instance is used to process the traffic of transaction processing (TP) workloads. By default, it allows you to access row-oriented data. The primary instance provides a cluster endpoint. You can use this cluster endpoint to configure read/write splitting for the traffic of TP workloads and forward the traffic of analytical processing (AP) workloads to read-only column store instances and regular read-only instances.

  • Read-only instance: A read-only instance is used to process the read traffic of TP workloads. By default, it allows you to access row-oriented data. The read-only instance provides a read-only endpoint and is physically isolated from the primary instance. This type of instance is used if you need to configure the read/write splitting of an application or perform extract, transform, and load (ETL) operations.

  • Read-only column store instance: A read-only column store instance is used to process the read traffic of AP workloads. It allows you to access only column-oriented data. The read-only column store instance provides a read-only endpoint and is physically isolated from the primary instance. This type of instance is used if you need to analyze data.

Note
  • A latency within seconds occurs if you synchronize row-oriented data to column-oriented data. By default, the primary and read-only instances allow you to access only row-oriented data. This prevents data inconsistency when the primary instance reads data.

  • To route an SQL statement to a read-only column store instance for execution, you can add the following prefix to the SQL statement:

    /*+TDDL: worload_type=ap enable_columnar_optimizer=true*/

Routing mechanism for a primary instance with a read-only column store instance

Hybrid workloads

If the SQL requests of your OLAP and OLTP services use the same connection to access a database, you can use the cluster endpoint provided by the primary instance to access the database. The system automatically routes traffic based on the scan costs of SQL requests. The traffic of OLAP services is automatically routed to the read-only column store instance. The traffic of OLTP services is automatically routed to the primary instance.

image
Note

Automatic routing rules:

  • If the estimated scan cost of an SQL request is higher than the specified threshold, the read traffic is determined as the traffic of AP workloads. The SQL request is automatically routed to the read-only column store instance.

  • If the estimated scan cost of an SQL request is lower than the specified threshold, the read traffic and write traffic are determined as the traffic of TP workloads. The SQL request is automatically routed to the primary instance.

  • If a primary instance has multiple read-only column store instances, the system automatically uses massively parallel processing (MPP) to route traffic.

To configure automatic routing, perform the following steps:

  1. Log on to the PolarDB-X console.

  2. In the top navigation bar, select the region where the instance is deployed.

  3. On the Instances page, click the PolarDB-X 2.0 tab.

  4. Click the ID of a primary instance to go to the Basic Information page.

  5. In the Connection Information section, click Configuration Management.

  6. In the Configuration Management dialog box, perform the following steps:

    1. In the Resource Configuration section, select a read-only column store instance in the Available Instances section and click the rightwards arrow to add the instance to the Selected Instances section.

    2. In the Read-only Column Store Instance Configurations section, select Enable. Then, click OK.

Important

Data synchronization between a read-only column store instance and the primary instance has a latency within seconds. If you require data consistency between the primary and read-only column store instances, we recommend that you do not use the automatic routing mechanism.

Separated workloads

If your OLAP and OLTP services use different applications to access a database, you can configure different endpoints for the applications to separate the traffic of OLAP and OLTP services. For example, configure the cluster endpoint of the primary instance for the OLTP application, and configure the read-only endpoint of the read-only column store instance for the OLAP application.

image

Routing mechanism for a primary instance with a read-only instance

Transparent read/write splitting

PolarDB-X provides transparent read/write splitting that ensures strong consistency. If you use this capability, you do not need to modify the code of your application. You need to only connect your application to the cluster endpoint of the primary instance. The system automatically routes a proportion of read traffic to the read-only instance. You can specify the proportion in advance.

image

To configure transparent read/write splitting, perform the following steps:

  1. Log on to the PolarDB-X console.

  2. In the top navigation bar, select the region where the instance is deployed.

  3. On the Instances page, click the PolarDB-X 2.0 tab.

  4. Click the ID of a primary instance to go to the Basic Information page.

  5. In the Connection Information section, click Configuration Management.

  6. In the Configuration Management dialog box, perform the following steps:

    1. In the Resource Configuration section, select a read-only instance in the Available Instances section and click the rightwards arrow to add the instance to the Selected Instances section.

    2. In the Read-only Instance Configurations section, configure the parameters.

      The following table describes the parameters.

      Parameter

      Valid value or parameter

      Description

      Strongly Consistent Read

      Strong Consistency

      Allows requests routed to a read-only data node (DN) to read the latest data of the cluster. If the latency of the read-only DN exceeds the specified threshold, the system automatically routes the requests to other read-only DNs with normal latency (if any) or to the primary instance. To route such requests to the primary instance, you must set the Read-only Traffic Switched Back to Primary Instance parameter to Enable.

      Weak Consistency

      Allows requests routed to a read-only DN to read the latest data of the read-only DN.

      Query Acceleration by Using MPP

      Enable or Disable

      Specifies whether to estimate the scan costs of SQL requests based on statistics and route the complex queries of AP workloads to the read-only instance, including compute nodes (CNs) and DNs, to accelerate the queries by using MPP.

      Read/Write Splitting

      Proportion of Traffic Routed to Read-only Instance

      The proportion of requests routed to the DNs of the read-only instance. The requests are originally routed to the DNs of the primary instance.

      Read-only Instance Latency Threshold (Seconds)

      The latency threshold of a read-only DN. If the latency of the read-only DN exceeds the specified threshold, the system automatically routes requests to other read-only DNs with normal latency (if any) or to the primary instance. To route such requests to the primary instance, you must set the Read-only Traffic Switched Back to Primary Instance parameter to Enable.

      Read-only Traffic Switched Back to Primary Instance

      Specifies whether to route read-only traffic to the primary instance if the latency of a read-only DN exceeds the specified threshold.

  7. Click OK.

Manual read/write splitting

If you want to implement read/write splitting in your application, you can configure different endpoints for read and write operations. For example, configure the cluster endpoint of the primary instance for write operations, and configure the read-only endpoint of the read-only instance for read operations. This way, read traffic and write traffic are isolated without affecting each other.

image

Routing mechanism for a single primary instance

If you have only the primary instance, your application can access only row-oriented data by default.

image

FAQ

  1. How do I use column-oriented storage to accelerate queries after I purchase a primary instance and a read-only column store instance?

    You can use the read-only column store instance in one of the following ways:

    • Connect to the read-only endpoint of the read-only column store instance to access column-oriented data. This is suitable if you can distinguish the queries that can be directly run on column-oriented data. For more information, see the Separated workloads section of this topic.

    • Connect to the cluster endpoint of the primary instance. In this case, the database automatically routes complex queries to the read-only column store instance based on the scan costs of SQL queries. This is more suitable for hybrid workload scenarios in which you do not need to manually distinguish query types. For more information, see the Hybrid workloads section of this topic.

  2. Are the scan costs estimated based on statistics reliable? What do I do if I want queries to be routed to a read-only column store instance as expected?

    Statistics are collected on a regular basis. The scan costs of SQL queries may be inaccurately estimated if statistics are not up-to-date. As a result, queries are not routed as expected. For complex queries that are particularly important, we recommend that you use a hint to forcibly execute SQL statements by using a column store execution plan. This ensures that the SQL queries are routed to read-only column store instances.

  3. How do I use a hint to forcibly execute SQL statements by using a column store execution plan without changing the code of an application?

    PolarDB-X provides the execution plan management feature. You can use a hint to forcibly execute SQL statements by using a column store execution plan. Then, the database automatically routes these SQL statements to read-only column store instances for execution.

  4. What do I do if the query performance of a read-only column store instance is not significantly improved compared with that of the primary instance?

    In most cases, read-only column store instances are used to run complex queries. These queries require high instance specifications. We recommend that you purchase a read-only column store instance that has at least 32 CPU cores for better performance. Such an instance is more cost-effective.