SQL tuning strategies - PolarDB - Alibaba Cloud Documentation Center

Three optimization strategies are available when SQL queries run slowly: create indexes for queries that skip indexes entirely, refine the index structure for queries with suboptimal query plans, or rewrite the SQL logic when indexes alone cannot improve performance. Use EXPLAIN ANALYZE to identify which strategy applies before making changes.

Create indexes

When EXPLAIN ANALYZE shows a sequential scan on a large table, the query is reading every row to find matches. Add an index on the filtered column to switch the query to an index scan.

The following example shows the performance difference before and after adding an index.

Create test data.

CREATE TABLE a(id int);
INSERT INTO a SELECT generate_series(1,1000000);

Run EXPLAIN ANALYZE without an index.

EXPLAIN ANALYZE SELECT * FROM a WHERE id = 10;

Sample output:

QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..12315.50 rows=5875 width=4) (actual time=0.574..52.157 rows=1 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on a  (cost=0.00..10728.00 rows=2448 width=4) (actual time=9.118..26.290 rows=0 loops=3)
         Filter: (id = 10)
         Rows Removed by Filter: 333333
 Planning Time: 0.473 ms
 Execution Time: 52.188 ms
(8 rows)

What to look for: Parallel Seq Scan means the query scanned all 1,000,000 rows and discarded 333,333 that did not match. The execution time is 52.188 ms.

Create an index and run the query again.

CREATE INDEX ON a(id);
EXPLAIN ANALYZE SELECT * FROM a WHERE id = 10;

Sample output:

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Index Only Scan using a_id_idx on a  (cost=0.42..1.54 rows=1 width=4) (actual time=0.117..0.118 rows=1 loops=1)
   Index Cond: (id = 10)
   Heap Fetches: 0
 Planning Time: 0.141 ms
 Execution Time: 0.133 ms
(5 rows)

What to look for: Index Only Scan confirms the query now uses the index. Execution time dropped from 52.188 ms to 0.133 ms — a ~390x improvement.

Use an optimal query plan

When EXPLAIN ANALYZE shows a Bitmap Heap Scan filtering a large number of rows after the index lookup, a single-column index is not selective enough for the query. Replace it with a composite index covering all columns in the WHERE clause.

The following example shows the difference between a single-column index and a composite index on a two-column filter.

Create test data.

CREATE TABLE test(id int, name int);
INSERT INTO test SELECT random()*(id/100),random()*(id/100) FROM generate_series(1,100000) t(id);

Create a single-column index on id and run the query.

CREATE INDEX ON test(id);
EXPLAIN ANALYZE SELECT * FROM test WHERE id = 1 and name =10;

Sample output:

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=5.14..337.45 rows=2 width=8) (actual time=0.073..0.338 rows=8 loops=1)
   Recheck Cond: (id = 1)
   Filter: (name = 10)
   Rows Removed by Filter: 678
   Heap Blocks: exact=199
   ->  Bitmap Index Scan on test_id_idx  (cost=0.00..5.14 rows=500 width=0) (actual time=0.028..0.028 rows=686 loops=1)
         Index Cond: (id = 1)
 Planning Time: 0.111 ms
 Execution Time: 0.361 ms
(9 rows)

What to look for: The index lookup on id returned 686 rows, but 678 were removed by the name = 10 filter. The index is not selective enough — the query still reads most of the matching heap pages.

Create a composite index on (id, name) and run the query again.

CREATE INDEX ON test(id,name);
EXPLAIN ANALYZE SELECT * FROM test WHERE id = 1 and name =10;

Sample output:

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using test_id_name_idx on test  (cost=0.29..3.63 rows=2 width=8) (actual time=0.066..0.076 rows=8 loops=1)
   Index Cond: ((id = 1) AND (name = 10))
   Heap Fetches: 8
 Planning Time: 0.086 ms
 Execution Time: 0.100 ms
(5 rows)

What to look for: The query now uses Index Only Scan on the composite index. The filter is resolved entirely within the index — no rows are removed after the lookup. Execution time dropped from 0.361 ms to 0.100 ms.

Modify SQL statements

When indexes cannot improve query efficiency, rewrite the SQL logic itself. A common case is pagination queries that use DISTINCT and Oracle-style rownum for row limiting, which forces a full sort of the result set before any rows are returned.

Note The following example uses a real-world table structure (dm_device). No test data is provided — apply the same pattern to your environment.

Problem: full-table sort caused by DISTINCT and rownum

The original SQL uses SELECT DISTINCT and nested rownum filters to return rows 11–1000:

SELECT *
FROM (
    SELECT row_.*, rownum AS rownum_
    FROM (
        SELECT DISTINCT
            device0_.id AS id1_1_,
            device0_.app_id AS app_id2_1_,
            device0_.app_version AS app_version3_1_,
            device0_.created_time AS created_time4_1_,
            device0_.fcm_token AS fcm_token5_1_,
            device0_.member_id AS member_id12_1_,
            device0_.name AS name6_1_,
            device0_.notification AS notification7_1_,
            device0_.sys_version AS sys_version8_1_,
            device0_.token AS token9_1_,
            device0_.type AS type10_1_,
            device0_.updated_time AS updated_time11_1_
        FROM dm_device device0_
        WHERE device0_.notification = 1
          AND device0_.token IS NOT NULL
        ORDER BY device0_.id ASC
    ) row_
    WHERE rownum <= 1000
)
WHERE rownum_ > 10;

Sample output from EXPLAIN ANALYZE:

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan on _unnamed_subquery_0 (cost=417184.72..473136.18 rows=133218 width=723) (actual time=2000.039..2888.102 rows=990 loops=1)
  Filter: (_unnamed_subquery_0.rownum_ > 10)
  Rows Removed by Filter: 10
  Buffers: shared hit=52736, temp read=33458 written=33473
  -> Subquery Scan on row_ (cost=417184.72..468140.52 rows=399653 width=723) (actual time=2000.022..2887.974 rows=1000 loops=1)
      Filter: (rownum <= 1000)
      Rows Remove by Filter: 1190532
      Buffers: shared hit=52736, temp read=33458 written=33473
      -> Unique (cost=417184.72..453153.52 rows=1198960 width=715) (actual time=2000.016..2804.040 rows=1191532 loops=1)
          Buffers: shared hit=52736, temp read=33458 written=33473
          -> Sort (cost=417184.72..420182.12 rows=1198960 width=715) (actual time=2000.014..2416.502 rows=1191532 loops=1)
              Sort Key: device0_.id, device0_.app_id, device0_.app_version, device0_.created_time, device0_.fcm_token, device0_.number_id, device0_.name, device0_.sys_version, device0_.token,device0_.type, device0_.updated_time
              Sort Method: external merge Disk: 267664kB
              Buffers: shared hit=52736, temp read=33458 written=33473
              -> Seq Scan on dm_device device0_ (cost=0.00..72905.11 rows=1198960 width=715) (actual time=0.100..608.593 rows=1191532 loops=1)
                  Filter: ((token IS NOT NULL) AND (notification = '1'::numeric))
                  Rows Remove by Filter: 421997
                  Buffers: shared hit=52736
Planning Time: 22.127 ms
Execution Time: 2954.789 ms
(20 rows)

What to look for: Sort Method: external merge Disk: 267664kB means the database spilled ~261 MB of intermediate data to disk to sort 1.19 million rows before applying the row limit. This is the root cause of the 2,954 ms execution time.

Solution: replace DISTINCT and rownum with LIMIT

Remove SELECT DISTINCT and replace the inner rownum with LIMIT:

SELECT *
FROM (
    SELECT row_.*, rownum AS rownum_
    FROM (
        SELECT
            device0_.id AS id1_1_,
            device0_.app_id AS app_id2_1_,
            device0_.app_version AS app_version3_1_,
            device0_.created_time AS created_time4_1_,
            device0_.fcm_token AS fcm_token5_1_,
            device0_.member_id AS member_id12_1_,
            device0_.name AS name6_1_,
            device0_.notification AS notification7_1_,
            device0_.sys_version AS sys_version8_1_,
            device0_.token AS token9_1_,
            device0_.type AS type10_1_,
            device0_.updated_time AS updated_time11_1_
        FROM dm_device device0_
        WHERE device0_.notification = 1
        AND device0_.token IS NOT NULL
        ORDER BY device0_.id ASC
        LIMIT 1000
    ) row_
    WHERE rownum <= 1000
)
WHERE rownum_ > 10;

Sample output from EXPLAIN ANALYZE:

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Subquery Scan on _unnamed_subquery_0 (cost=0.43..93.61 rows=111 width=723) (actual time=0.032..1.071 rows=990 loops=1)
  Filter: (_unnamed_subquery_0.rownum_ > 10)
  Rows Removed by Filter: 10
  Buffers: shared hit=59
  -> Subquery Scan on row_ (cost=0.43..89.45 rows=333 width=723) (actual time=0.020..0.942 rows=1000 loops=1)
      Filter: (rownum <= 1000)
      Buffers: shared hit=59
      -> Limit (cost=0.43..76.95 rows=1000 width=715) (actual time=0.019..0.771 rows=1000 loops=1)
          Buffers: shared hit=59
          -> Index Scan using dm_device_pkey on dm_device device0_ (cost=0.43..91742.47 rows=1198960 width=715) (actual time=0.018..0.675 rows=1000 loops=1)
              Filter: ((token IS NOT NULL) AND (notification = '1'::numeric))
              Rows Remove by Filter: 431
              Buffers: shared hit=59
Planning Time: 1.459 ms
Execution Time: 1.147 ms
(15 rows)

What to look for: Index Scan using dm_device_pkey replaced the sequential scan, and the Limit node stops reading after 1,000 rows — no disk sort occurs. Buffer usage dropped from 52,736 to 59 pages. Execution time dropped from 2,954 ms to 1.147 ms.

Two changes drive this improvement:

`LIMIT` replaces `rownum`: LIMIT is evaluated at the database engine level during row retrieval, so the query stops as soon as it has enough rows. rownum, by contrast, is applied after the full result set is built — the database sorts and deduplicates all matching rows before counting them.
Removing `DISTINCT`: SELECT DISTINCT forces a sort and deduplication pass across all selected columns before any row limit is applied. If the data is already unique by primary key, DISTINCT is redundant and eliminates the query optimizer's ability to use LIMIT as an early stop condition.