pg_profile is a third-party extension supported by PolarDB for PostgreSQL. It collects statistics on resource-intensive activities in a target database to help you analyze and optimize the database. The extension is based on PostgreSQL statistical views and several community extensions. It is written entirely in PL/pgSQL and requires no external libraries or software. Use it with pg_cron to schedule periodic sampling and diagnose historical workloads — for example, to answer questions like "what was causing the slowdown two hours ago?"
Supported versions
| PostgreSQL version | Minimum minor engine version |
|---|---|
| PostgreSQL 16 | 2.0.16.9.9.0 |
| PostgreSQL 15 | 2.0.15.14.6.0 |
| PostgreSQL 14 | 2.0.14.10.18.0 |
View the minor engine version in the console or by running SHOW polardb_version;. If your cluster does not meet the version requirement, upgrade the minor engine version.How it works
pg_profile uses two dedicated databases within the same cluster:
Server database: The database being monitored. It has the statistics extensions installed and accepts sampling connections.
Profile database: The database where
pg_profileis installed. It connects to the server database overdblink, pulls the latest statistics, and stores them locally.
Because of the dblink dependency, both databases must reside in the same PolarDB for PostgreSQL cluster.
Once you have two or more samples, call get_report() to generate an HTML report that summarizes resource-intensive SQL statements and objects between any two sample points.
Prerequisites
Before you begin, make sure you have:
A PolarDB for PostgreSQL cluster running a supported version
A privileged account for the cluster
pg_stat_statementsadded to theshared_preload_librariesparameter (added by default)
To check or modify the shared_preload_libraries parameter, see . Modifying this parameter restarts the cluster. Plan your operations accordingly.Optional: Install pg_stat_kcache to collect more information. Without it, pg_profile still works but reports fewer metrics.
Usage notes
Each sampling operation takes several seconds. Take samples no more than once or twice per hour to avoid excessive overhead.
Sample data consumes disk space.
pg_profilehas an automatic cleanup mechanism, but monitor your data retention to avoid unexpected disk usage.At least two samples are required before you can generate a report.
Set up pg_profile
Step 1: Create the server database
Connect to your cluster and create the server database:
CREATE DATABASE server;Connect to the server database and install the statistics extensions:
\c server
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
CREATE EXTENSION IF NOT EXISTS pg_stat_kcache;pg_stat_kcache is optional. Omit it if you do not need the additional metrics it provides in your reports.Step 2: Create the profile database
Create the profile database:
CREATE DATABASE profile;Connect to the profile database and install pg_profile:
\c profile
CREATE EXTENSION IF NOT EXISTS pg_profile CASCADE;Step 3: Create a sampling connection
Run the following from the profile database to register the server database as a monitoring target:
SELECT create_server(
'server',
'dbname=server channel_name=localhost user=<privileged_account_of_the_server_database> password=<password_for_the_account>'
);| Parameter | Example | Description |
|---|---|---|
server (first argument) | server | A user-defined name for this connection. |
dbname | server | The name of the server database. |
channel_name | localhost | The connection channel. Only localhost is supported. |
user | test_user | A privileged account for the cluster where the server database resides. |
password | T123456! | The password for the account. |
The expected output is similar to:
show_servers
-------------------------------------------------------------------------------------
(server,"dbname=server channel_name=localhost user=test_user password=****",t,)
(1 row)To update the connection information later, use set_server_connstr():
SELECT set_server_connstr(
'server',
'dbname=server channel_name=localhost user=<privileged_account_of_the_profile_database> password=<password_for_the_account>'
);Do not change the first argument (server). It identifies the existing connection by name.Step 4: Take an initial sample
Run the following from the profile database to take the first sample:
SELECT take_sample();The output has three columns: the server name, the result (OK or an error message), and the elapsed time:
take_sample
-------------------------
(server,OK,00:00:02.81)
(1 row)Step 5: Generate a workload and take a second sample
To capture a meaningful workload difference between samples, run a stress test on the server database. The following example uses pgbench:
pgbench -s 100 -i server -U test_user -h pc-****.pg.rds.aliyuncs.com -p 5432| Parameter | Example | Description |
|---|---|---|
-s | 100 | Scale factor. 100 creates 10,000,000 rows in the pgbench_accounts table. |
-i | server | The database to initialize. |
-U | test_user | A privileged account for the cluster where the server database resides. |
-h | pc-****.pg.rds.aliyuncs.com | The endpoint of the cluster where the server database resides. |
-p | 5432 | The port. Use the port shown in the console. |
After the test completes, take a second sample from the profile database:
SELECT take_sample();View all collected samples:
SELECT * FROM show_samples('server');The expected output is similar to:
sample | sample_time | sizes_collected | dbstats_reset | bgwrstats_reset | archstats_reset
--------+------------------------+-----------------+---------------+-----------------+-----------------
1 | 2024-02-23 07:29:53+00 | t | | |
2 | 2024-02-23 08:11:39+00 | t | | |Step 6: Generate a report
Export the data between two samples to an HTML report using one of the following methods.
Method 1: Use the psql interactive command line to write the output to a local file.
\o report_1_2.html
SELECT get_report('server',1,2);Method 2: Run psql from the command line to write the output directly to a file.
psql -Aqtc "SELECT get_report('server',1,2)" \
-o report_server_1_2.html -d profile \
-h <endpoint_of_the_profile_database_cluster> \
-p <port_of_the_database_cluster> \
-U <privileged_account_for_the_profile_database_cluster>Replace the placeholders with the actual values for your profile database cluster.
What's next
Schedule periodic sampling: Use
pg_cronto automatically take samples on a schedule. The recommended frequency is one to two times per hour.