The rds_duckdb extension for ApsaraDB RDS for PostgreSQL accelerates complex analytical queries. By providing column-oriented tables and vectorization, it significantly improves query performance without requiring changes to your original SQL statements.
For questions, discussions, or feedback about this extension, join the ApsaraDB RDS for PostgreSQL Extensions DingTalk group (ID: 103525002795).
Overview
The rds_duckdb extension integrates the efficient, resource-friendly DuckDB into ApsaraDB RDS for PostgreSQL to enhance its analytical query capabilities. This extension lets you export local tables into column-oriented tables and enable AP query acceleration.
Prerequisites
Your ApsaraDB RDS for PostgreSQL instance runs major version PostgreSQL 13 or later.
Your instance runs minor engine version 20260130 or later.
You have added rds_duckdb to the running value of the shared_preload_libraries parameter.
For more information about how to configure parameters, see Set instance parameters. For example, you can change the running value to
'pg_stat_statements,auto_explain,rds_duckdb'.
For minor engine version 20250228 and later, automatic incremental data synchronization for column-oriented tables is enabled by default.
Create and drop the extension
Use a privileged account to create or drop the extension.
Create the extension
CREATE EXTENSION rds_duckdb;Check the DuckDB version
SELECT rds_duckdb.duckdb_version();Drop the extension
DROP EXTENSION rds_duckdb;Manage column-oriented tables
Create a column-oriented table
Run the following command to export a local table from your ApsaraDB RDS for PostgreSQL instance into a column-oriented table. Local tables can include user tables, materialized views, or foreign tables.
If your instance runs PostgreSQL 15 with minor engine version 20250228 or later, automatic incremental synchronization for column-oriented tables is enabled by default. Before creating a column-oriented table, configure the target ApsaraDB RDS for PostgreSQL instance and local table for this feature. For more information, see Configure automatic incremental synchronization for column-oriented tables.
SELECT rds_duckdb.create_duckdb_table('local_table_name');Refresh a column-oriented table
Run the following command to refresh the exported column-oriented table with the latest data from the source local table. This command updates both the table schema and its data.
SELECT rds_duckdb.refresh_duckdb_table('local_table_name');Check the table size
SELECT rds_duckdb.duckdb_table_size('local_table_name');Check the total size of exported tables
SELECT rds_duckdb.duckdb_database_size();Drop a column-oriented table
SELECT rds_duckdb.drop_duckdb_table('local_table_name');Manage AP acceleration
The rds_duckdb extension currently accelerates read-only queries. When AP query acceleration is enabled, any query that exclusively uses tables with corresponding DuckDB column-oriented tables is offloaded to DuckDB. If a query involves unsupported operations, such as DML or DDL, or references tables without a column-oriented counterpart, it automatically falls back to the standard ApsaraDB RDS for PostgreSQL engine.
For queries that fall back to the PostgreSQL engine, the system returns a warning in the following format: WARNING: Trying to execute an operation with non-duckdb tables(test), fallback to PG. The tables listed in parentheses are those that lack a corresponding DuckDB column-oriented table.
Non-read-only queries also trigger a warning: WARNING: Modification operations on DuckDB tables are currently not supported, fallback to PG.
Enable AP acceleration
SET rds_duckdb.execution = on;Configure AP acceleration parameters
Adjust parameters in your session to control the performance of AP query acceleration. For example:
SET rds_duckdb.worker_threads = 32;
SET rds_duckdb.memory_limit = 16384;Parameter | Description | Suggestion |
rds_duckdb.worker_threads | The number of worker threads for AP query acceleration. Valid values: 1 to 255. Default value: 1. |
|
rds_duckdb.memory_limit | The memory limit for AP query acceleration. Unit: MB. Do not include the unit when you set the parameter value. Valid values: 1 to INT32_MAX. Default value: 100 (which indicates 100 MB). |
|
For more information about DuckDB parameters, see the official DuckDB documentation.
Disable AP acceleration
SET rds_duckdb.execution = off;Configure auto-sync for column-oriented tables
If your instance runs PostgreSQL 15 with minor engine version 20250228 or later, automatic incremental synchronization for column-oriented tables is enabled by default. Before creating a column-oriented table, complete the following steps.
In extension management, check the
rds_duckdbversion and upgrade it to 1.3 or later.Set instance parameters to change the running value of the wal_level parameter to logical.
(Optional) If the target local table does not have a primary key, run the following command to set a REPLICA IDENTITY index as the replication key.
ALTER TABLE <local_table_name> REPLICA IDENTITY USING INDEX <index_name>;Use an account with replication permission or a privileged account to create the column-oriented table.
Check the synchronization status and progress of column-oriented tables
SELECT * FROM rds_duckdb.duckdb_sync_stat;View SQL execution plans
Use the EXPLAIN statement to compare the execution plans of a SQL query with and without AP query acceleration enabled. For example:
The following example shows the execution plan when AP query acceleration is enabled.
The following example shows the execution plan when AP query acceleration is disabled.
Performance testing
To evaluate the performance improvements that rds_duckdb provides for complex queries, refer to our standard TPC-H benchmark tests conducted in a Linux environment. For detailed procedures and results, see Performance test for the rds_duckdb extension.