You can use the AP acceleration engine (rds_duckdb) for ApsaraDB RDS for PostgreSQL to obtain fast results from complex queries. This engine provides column-oriented tables and vectorized execution to significantly speed up complex queries without changing the original query statements. This ensures that you can retrieve results easily and efficiently.
Join the ApsaraDB RDS for PostgreSQL extension DingTalk group (ID: 103525002795) to ask questions, share ideas, give feedback, and learn more about the extension.
Introduction
rds_duckdb integrates the efficient and resource-friendly DuckDB into ApsaraDB RDS for PostgreSQL to enhance analytical query capabilities. This extension lets you export local tables from ApsaraDB RDS for PostgreSQL into column-oriented tables to enable Analytical Processing (AP) query acceleration. This feature significantly improves the execution speed of complex queries to better meet the demands of analytical workloads.
Prerequisites
The major version of your instance is ApsaraDB RDS for PostgreSQL 12 or later.
The minor engine version of your instance is 20241030 or later.
You have added rds_duckdb to the Parameter Value of shared_preload_libraries.
For more information about how to configure parameters, see Set instance parameters. For example, set Running Parameter Value to
'pg_stat_statements,auto_explain,rds_duckdb'.
If the major version of your instance is ApsaraDB RDS for PostgreSQL 15 and the minor version is 20250228, automatic incremental synchronization of column-oriented table data is enabled by default.
Create and delete the extension
Use a privileged account to create and delete the extension.
Create a plugin
CREATE EXTENSION rds_duckdb;View the DuckDB version used by the extension
SELECT rds_duckdb.duckdb_version();Delete the extension
DROP EXTENSION rds_duckdb;Manage column-oriented tables
Create a column-oriented table
Use the following command to export a local ApsaraDB RDS for PostgreSQL table, such as a user table, materialized view, or foreign table, to a column-oriented table to accelerate analytical queries.
If the major version of your instance is ApsaraDB RDS for PostgreSQL 15 and the minor version is 20250228, automatic incremental synchronization of column-oriented table data is enabled by default. Before you create a column-oriented table, you must configure the target ApsaraDB RDS for PostgreSQL instance and local table to enable automatic incremental synchronization. For more information, see Set up automatic incremental synchronization for column-oriented tables.
SELECT rds_duckdb.create_duckdb_table('local_table_name');Refresh the column-oriented table
Use the following command to refresh the exported column-oriented table with the latest data from the local ApsaraDB RDS for PostgreSQL table. This command updates both the table schema and the data.
SELECT rds_duckdb.refresh_duckdb_table('local_table_name');View the size of a column-oriented table
SELECT rds_duckdb.duckdb_table_size('local_table_name');View the total size of all exported tables in the current database
SELECT rds_duckdb.duckdb_database_size();Delete a column-oriented table
SELECT rds_duckdb.drop_duckdb_table('local_table_name');Manage AP acceleration
rds_duckdb currently accelerates only read-only queries. After you enable AP acceleration, if a query involves only tables that have corresponding DuckDB column-oriented tables, DuckDB executes the SQL statement. If the SQL statement involves unsupported DML or DDL operations, or tables without a corresponding column-oriented table, the query falls back to ApsaraDB RDS for PostgreSQL for execution.
For SQL statements that fall back to ApsaraDB RDS for PostgreSQL, the system issues a warning in the following format: WARNING: Trying to execute an operation with non-duckdb tables(test), fallback to PG. The table names in parentheses are the ApsaraDB RDS for PostgreSQL tables that do not have a corresponding DuckDB column-oriented table.
Non-read-only SQL queries also trigger a warning: WARNING: Modification operations on DuckDB tables are currently not supported, fallback to PG.
Enable AP acceleration
SET rds_duckdb.execution = on;Set AP acceleration parameters
You can adjust parameter settings in a session to control AP acceleration performance. For example:
SET rds_duckdb.worker_threads = 32;
SET rds_duckdb.memory_limit = 16384;Parameter name | Description | Recommended value |
rds_duckdb.worker_threads | The number of worker threads used for AP acceleration. Range: 1 to 255. Default: 1. This means there is only one worker thread. |
|
rds_duckdb.memory_limit | The memory limit for AP acceleration. Unit: MB. Do not add the unit when you configure the parameter. Range: 1 to INT32_MAX. Default: 100. This means the upper limit is 100 MB. |
|
For more information about DuckDB parameters, see DuckDB.
Disable AP acceleration
SET rds_duckdb.execution = off;Set up automatic incremental synchronization for column-oriented tables
If the major version of your instance is ApsaraDB RDS for PostgreSQL 15 and the minor version is 20250228, automatic incremental synchronization of column-oriented table data is enabled by default. Before you create a column-oriented table, you must perform the following steps to configure the target ApsaraDB RDS for PostgreSQL instance and local table.
In Extension Management, check the version of rds_duckdb and upgrade it to 1.3.
Set instance parameters to change the value of the wal_level parameter to logical.
(Optional) If the target local table does not have a primary key, run the following command to set a REPLICA IDENTITY index as the replication key for the table.
ALTER TABLE <local_table_name> REPLICA IDENTITY USING INDEX <index_name>;Use an account with replication permissions or a privileged account to create the column-oriented table.
View the synchronization status and offset progress of column-oriented tables
SELECT * FROM rds_duckdb.duckdb_sync_stat;View SQL execution plans
Use the EXPLAIN statement to view the execution plan of an SQL statement with AP acceleration enabled and disabled. For example:
The following execution plan is for the SQL statement after AP acceleration is enabled.
The following execution plan is for the SQL statement after AP acceleration is disabled.
Performance test
A standard TPC-H test in a Linux environment is used to evaluate the performance improvement that rds_duckdb provides for complex queries. For more information, see Performance test for the AP acceleration engine (rds_duckdb).