All Products
Search
Document Center

AnalyticDB for PostgreSQL:Performance parameter optimization

Last Updated:Sep 18, 2023

AnalyticDB for PostgreSQL is a ready-to-use service applicable to most scenarios. You can also configure database parameters to improve performance of AnalyticDB for PostgreSQL and meet specific business requirements.

Parameter levels

AnalyticDB for PostgreSQL provides four parameter levels. Specific parameters can be set only at their specific levels.

Level

Effective scope

Setting method

System level

All users of all databases in an instance

Submit a ticket to contact O&M personnel.

Database level

All sessions in the specified database

Execute the following statement:

ALTER DATABASE name SET parameter { TO | = } { value | DEFAULT };

Role level

All sessions of the specified user

Execute the following statement:

ALTER ROLE name SET parameter  { TO | = } { value | DEFAULT };

Session level

The current session

Execute the following statement:

SET parameter  { TO | = } { value | DEFAULT };

The following table lists the system-level parameters that can be modified by submitting a ticket.

Parameter

Default value

Data type/Unit

Valid value

gp_autostats_mode

ON_NO_STATS

STRING

NONE, ON_CHANGE, and ON_NO_STATS

gp_autostats_mode_in_functions

NONE

STRING

NONE, ON_CHANGE, and ON_NO_STATS

gp_max_slices

50

INT

0 to 1000

log_rotation_size

102400

KB

0 to 4194304

master.rds_enable_vmem_protect

on

STRING

on and off

master.rds_max_non_super_conns

500

INT

10 to 1000

max_stack_depth

2048

KB

100 to 2048000

max_statement_mem

2048000

KB

32768 to 2147483647

optimizer

on

STRING

on and off

random_page_cost

4

DOUBLE

0 to 1000

rds.rds_enable_aliyun_oss_endpoint

on

STRING

on and off

rds.rds_enable_oss_endpoint_whitelist_check

on

STRING

on and off

rds_max_super_conns

50

INT

10 to 100

segment.rds_enable_vmem_proctect

off

STRING

on and off

segment.rds_max_non_super_conns

1000

INT

10 to 3000

statement_mem

2047000

KB

50 to 2147483647

statement_timeout

10800000

INT

0 to 2147483647

For more information about the preceding parameters, see Configuration Parameters.

Note

We recommend that you prioritize parameters at the session level, and modify parameters at higher levels only when you are sure that the modification does not affect other users.

Parameter description

The following tables describe the performance-related parameters of AnalyticDB for PostgreSQL.

  • Connection-related parameters

    Parameter

    Condition to take effect

    Dependency

    Description

    max_connections

    Instance restart

    The value of this parameter on compute nodes must be three to five times that of this parameter on coordinator nodes.

    The maximum number of concurrent connections between a node and the database server. This parameter must be set separately for each of the coordinator and compute nodes. We recommend that you adjust the rds_max_non_super_conns value instead of the max_connections value. In scenarios where short-lived connections are highly concurrent, you can modify the rds_max_non_super_conns parameter to a value between 500 and the max_connections value. The rds_max_super_conns parameter is related to superusers and does not need to be adjusted in most cases.

    max_prepared_transactions

    Instance restart

    • We recommend that you set this parameter to a value the same as the max_connections value.

    • Two times the max_prepared_transactions value must be greater than the rds_max_non_super_conns value plus the rds_max_super_conns value.

    The number of transactions in the two-phase commit (2PC) state. We recommend that you set this parameter to a value the same as the max_connections value. If errors occur in high concurrency scenarios, you must increase this value.

    rds_max_non_super_conns

    Reload configuration

    • This value must be less than the max_connections value.

    • The rds_max_non_super_conns value plus the rds_max_super_conns value must be less than or equal to the max_connections value.

    • The value of this parameter on compute nodes must be three to five times that of this parameter on coordinator nodes.

    The maximum number of connections that can be used by non-superusers.

    rds_max_super_conns

    Reload configuration

    • This value must be less than the max_connections value.

    • The rds_max_non_super_conns value plus the rds_max_super_conns value must be less than or equal to the max_connections value.

    • The value of this parameter on compute nodes must be three to five times that of this parameter on coordinator nodes.

    The maximum number of connections that can be used by superusers.

  • Memory control parameters

    Parameter

    Valid value/Unit

    Condition to take effect

    Dependency

    Description

    shared_buffers

    KB

    Instance restart

    If most data is stored in row-oriented tables, we recommend that you set this value to 25% of the instance memory size or slightly higher for compute nodes.

    The size of the shared buffer. This parameter must be set separately for each of the coordinator and compute nodes. The cache page on coordinator nodes contains data dictionary tables, and the cache page on compute nodes contains row-oriented tables. Append-optimized column-oriented (AOCO) tables are not affected by the size of the shared buffer. In most cases, this parameter does not need to be modified. If more than 100,000 tables are created, we recommend that you increase this value for the coordinator and compute nodes. If most data is stored in row-oriented tables, we recommend that you increase this value for compute nodes.

    temp_buffers

    KB

    Reload configuration

    None

    The size of the temporary buffer. The temporary buffer is used for database sessions to access temporary tables. The default value of this parameter is 8 MB. To improve the performance of access to large temporary tables, you can modify this parameter for a specific session.

    gp_vmem_protect_limit

    KB

    Instance restart

    gp_vmem_protect_limit = gp_vmem/acting_primary_segments

    The maximum amount of memory that can be used by all postgres processes on compute nodes. Unit: MB. A query fails if a memory size greater than this value is required. This parameter must be set separately for each of the primary and secondary compute nodes.

    You can estimate an appropriate value for this parameter based on the following formulas:

    gp_vmem = ((SWAP + RAM) - (7.5 GB + 0.05 × RAM))/1.7
    gp_vmem_protect_limit = gp_vmem/acting_primary_segments

    The acting_primary_segment value can be the number of primary compute nodes or that of primary and secondary compute nodes combined. If your query fails, you must increase this value.

    Note

    If you want to set a memory limit for resource queues, we recommend that you modify the resource queue parameter. For more information, see Use resource queues for load management.

    gp_vmem_protect_segworker_cache_limit

    KB

    Instance restart

    None

    The maximum cache of the query executor process. If you have a large number of connections or idle processes, you may need to decrease this value to release more memory from compute nodes. This parameter must be set separately for each of the compute nodes.

    gp_resqueue_memory_policy

    none, auto, and eager_free

    Reload configuration

    None

    The memory management policy used in AnalyticDB for PostgreSQL. A query plan contains several stages. If this parameter is set to eager_free, AnalyticDB for PostgreSQL releases the memory allocated to a stage at the end of this stage, and allocates memory to the next stage. The eager_free value reduces the possibility of insufficient memory but incurs additional performance overheads. If this parameter is set to auto, the query memory usage is controlled based on statement_mem and resource queue memory limits.

    statement_mem

    KB

    Reload configuration

    • This parameter is valid when gp_resqueue_memory_policy is set to auto.

    • This value cannot be greater than the max_statement_mem value.

    The host memory allocated to each query. The amount of memory allocated based on this parameter cannot exceed max_statement_mem or the memory limit on the resource queue that is used to submit a query. If gp_resqueue_memory_policy is set to auto, the query memory usage is controlled based on statement_mem and resource queue memory limits.

    If more memory is required for a query, temporary spill files on disk are used.

    You can estimate an appropriate value for this parameter based on the following formula:

    (gp_vmem_protect_limit GB × 0.9)/max_expected_concurrent_queries

    If gp_vmem_protect_limit is set to 8192 MB (8 GB) and a maximum of 40 concurrent queries with a 10% buffer is supposed, the following value can be calculated:

    (8GB × 0.9)/40 = 184 MB

    max_statement_mem

    KB

    Reload configuration

    (seghost_physical_memory)/(average_number_concurrent_queries)

    The maximum amount of memory that can be allocated to a query. This parameter helps avoid out-of-memory errors on compute nodes during query processing if statement_mem is set to a high value. If gp_resqueue_memory_policy is set to auto, the query memory usage is controlled based on statement_mem and resource queue memory limits. This value can be calculated based on the following formula in consideration of configurations of a single compute node host:

    (seghost_physical_memory)/(average_number_concurrent_queries)

  • Workload-related parameters

    Parameter

    Valid value/Data type

    Condition to take effect

    Dependency

    Description

    gp_resqueue_priority

    BOOLEAN

    Instance restart

    None

    Specifies whether to enable or disable query prioritization. If query prioritization is disabled, existing priority settings are not evaluated at query runtime. We recommend that you disable the priority queue in scenarios that require high query performance.

    gp_resqueue_priority_cpucores_per_segment

    0.1 to 512.0

    Instance restart

    None

    The number of CPU cores allocated to each compute node of an instance, which is specified when you create the instance. If this parameter is inappropriately configured, CPU may be underused, and query prioritization may not work properly.

    gp_resqueue_priority_sweeper_interval

    500 ms to 15000 ms

    Instance restart

    gp_resqueue_priority

    The interval at which CPU utilization is recalculated for all statements. When a statement is being executed, its CPU share is calculated based on the priority and gp_resqueue_priority_cpucores_per_segment. The smaller the interval and the more frequent the calculation, the better the results achieved by the priority settings and the larger the overheads.