All Products
Search
Document Center

:Use Elastic Parallel Query to accelerate index creation

Last Updated:Nov 27, 2023

You can use the Elastic Parallel Query feature to accelerate the process of creating B-tree indexes and the process of creating global indexes of B-tree indexes. This topic describes why the feature can be used to accelerate the process of creating B-tree indexes and the process of creating global indexes of B-tree indexes. This topic also describes how to use the feature to create B-tree indexes and global indexes of B-tree indexes.

How it works

When PolarDB for PostgreSQL creates indexes, this service scans the heap table for which you want to create indexes and creates indexed entries. Then, this service creates an index B-tree based on the indexed entries.

When you use this feature to accelerate the process of creating indexes, the system automatically creates a query coordinator (QC) process to scan the heap table in parallel. Then, the process used to create indexes receives the scan results sent by the QC process and creates the indexes.

Precautions

  • You can use this feature to accelerate the process of creating indexes only for general columns. You cannot use this feature to accelerate the process of creating indexes when you use syntax such as CONCURRENTLY or INCLUDE to create indexes.

  • You cannot create indexes for columns that are specified by expressions.

Parameters

If you want to use the Elastic Parallel Query feature to accelerate the process of creating indexes, configure the following parameters.

Parameter

Description

polar_px_enable_btbuild

Specifies whether to use this feature to accelerate index creation. Default value: off. Valid values:

  • off

  • on

polar_px_dop_per_node

Specifies the degree of parallelism that is in effect when you use this feature to accelerate the process of creating indexes. The default value is 1. We recommend that you set the value to 8 or 16.

This parameter also specifies the degree of parallelism for Elastic Parallel Query. For more information, see Use the Elastic Parallel Query feature to run analytical queries.

You can also set this parameter to on for a specified database role.

polar_px_enable_replay_wait

If you set polar_px_enable_btbuild to on, the polar_px_enable_replay_wait parameter automatically takes effect for the current session. This ensures that the most recent entries in the table are indexed. After the indexes are created, the system resets the parameter to the default value in the database.

polar_bt_write_page_buffer_size

Specifies the policy on the write I/O operations when indexes are created. By default, the parameter is set to 0. This value indicates that the indexed entries are flushed to a disk by block when you create indexes. The unit of measurement is block. The maximum value is 8192. We recommend that you set the value to 4096.

  • If you set this parameter to 0, all index entries on an index page are flushed to a disk block by block when the index page is fully loaded during index creation.

  • If you set this parameter to a value that is not 0, the indexed entries to be flushed are stored in a kernel buffer of the size that is indicated by this parameter. When the buffer is fully loaded, all indexed entries in the buffer are flushed to a disk at a time. This prevents performance overheads that result from frequent I/O scheduling. This parameter helps you reduce the time that is required to create indexes by 20%.

Examples

Background information

Execute the following statement to create a table named test:

CREATE TABLE test(id int,id2 int);

Query the table schema:

\d test
               Table "public.test"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 id     | integer |           |          |
 id2    | integer |           |          |

To use Elastic Parallel Query to accelerate the process of creating indexes for the test table, perform the following steps:

  1. Set the polar_px_enable_btbuild parameter to on to use the Elastic Parallel Query feature to accelerate the process of creating indexes.

    Execute the following statement:

    SET polar_px_enable_btbuild=on;

    Check whether the configuration takes effect.

    SHOW polar_px_enable_btbuild;

    The following result is returned:

      polar_px_enable_btbuild
    -------------------------
     on
    (1 row)
  2. Execute the following statement to create indexes:

    CREATE INDEX t ON test(id) WITH(px_build=on);

    Query the table schema:

    \d test
                   Table "public.test"
     Column |  Type   | Collation | Nullable | Default
    --------+---------+-----------+----------+---------
     id     | integer |           |          |
     id2    | integer |           |          |
    Indexes:
        "t" btree (id) WITH (px_build=finish)
Note

If you want to use the Elastic Parallel Query feature to accelerate the process of creating indexes, add the px_build option in the CREATE INDEX statement.

After the indexes are created, the index type information contains the (px_build=finish) field. This field indicates that the indexes are created by using the Elastic Parallel Query feature.

If polar_px_enable_btbuild is set to on but the px_build option is not added in the CREATE INDEX statement, the native method provided by PolarDB for PostgreSQL is used to create the indexes. Example:

CREATE INDEX t ON test(id);

Query the table schema:

\d test
                Table "public.test"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 id     | integer |           |          |
Indexes:
    "t" btree (id)

Performance data

It is approximately five times faster to create indexes for a large table by using the Elastic Parallel Query feature than by using the native method provided by PolarDB for PostgreSQL.