All Products
Search
Document Center

General issues in SQL tuning

Last Updated: Jun 18, 2021

A user writes an SQL query without following specifications of OceanBase Database

The way you write an SQL query determines the performance of its execution. When you use OceanBase Database, you need to keep in mind its development specifications.

Incorrect selection of the execution plan due to cost model defects

OceanBase Database uses its built-in cost model. The selection of the optimal execution plan depends on this cost model. Therefore, if an incorrect plan is selected due to possible defects in this cost model, you can only bind the "correct" execution plan to avoid selecting the incorrect plan.

Inaccurate data statistics

Query optimization relies on accurate data statistics. By default, the OceanBase optimizer collects some statistics during the major compaction of data, which may be inaccurate if you modify data a lot and causes latency in statistics updates. You can actively update the statistics through daily major compaction.

In addition to collecting statistics, the optimizer also samples data from the storage layer based on query conditions for subsequent optimization and selection. Currently, OceanBase Database only supports sampling from local storage. If the data partitions are stored on remote nodes, the optimizer can only estimate the cost based on the collected statistics, which may lead to inaccurate cost estimation.

Decrease in query performance due to physical design of the database

The query performance depends largely on the physical design of the database, including the schema information of the data to be accessed. For example, in a query that involves a secondary index but the required projected column is not included in the indexed columns, the cost of the query is much higher because of the TABLE ACCESS BY INDEX PRIMARY KEY operations. In this case, you can add the projected column to the indexed columns to form a "covering index" to avoid table access by index primary key.

The response of an SQL query is affected by the system workload.

The overall load of the system not only affects the overall throughput of the system but also causes changes in the response time of an SQL query. In OceanBase Database, the SQL engine processes queries linearly, which means that new queries are put in a queue if all threads are occupied until a thread completes the current query. You can see the queue time of a query in the (g)v$sql_audit view.

A routing feedback logic error occurred between the client and the server

One of the main functions of OBProxy is to route SQL queries to the right server. Specifically, if you do not specify week-consistency read in the query, OBProxy routes the query to the server where the leader replica of a partition is located to avoid secondary forwarding between servers. Otherwise, the OBProxy forwards the query to the right server based on the preset rules.

The loose coupling between OBProxy and the server may cause latencies in refreshing the physical distribution of data cached on the OBProxy, resulting in routing errors. The routing information may be changed in the following circumstances:

  • Re-election of the leader between servers due to network instability

  • Re-election of the leader due to server start/deletion and rotating major compactions

  • Re-election of the leader due to load balancing

You need to consider these circumstances when a large number of remote execution queries are found in the SQL audit or plan cache. Because of the routing feedback logic between the client and the server, the client refreshes the physical distribution of data when an error occurs. Then, the routing function is restored.