By Baotiao
We see that currently there are 3 mainstream solutions for PostgreSQL integration with DuckDB running: pg_duckdb, pg_mooncake, and pg_lake.
pg_duckdb is the official DuckDB plugin. It can only provide the migration of existing row-oriented tables to DuckDB and cannot synchronize incremental data to DuckDB tables. Therefore, the applicable scenarios are relatively limited.
pg_mooncake, acquired by Databricks, can support existing and incremental data synchronization. It copies PostgreSQL data through logical replication via an additional pg_moonlink process and writes the data in Iceberg format. Subsequently, if there is a replication query, the query needs to follow the request method of postgresql => pg_moonlink => s3 (Iceberg).
pg_lake, acquired by Snowflake, also does not support incremental data synchronization. It only supports full data import and export. It feels more like a solution for a data archiving scenario.
We can see several problems. One is that the PostgreSQL logical replication capability is not mature enough, far inferior to its native physical replication capability, and cannot connect PostgreSQL and DuckDB read-only instances through logical replication. Another problem is that PostgreSQL does not support good pluggable storage engine capabilities. Although PostgreSQL provides table access method as the storage engine API, it does not provide capabilities such as primary-standby replication or crash recovery. In many scenarios, data consistency cannot be guaranteed.
This problem is well resolved in MySQL.
First, MySQL is naturally a pluggable storage engine design. In the early days, the default engine of MySQL was MyISAM. Later, because InnoDB supported row-level Multi-Version Concurrency Control (MVCC), MySQL transformed the default engine to the InnoDB engine. In MySQL, there were originally column store solutions such as InfoBright, but they did not become popular. Therefore, supporting the column store DuckDB in MySQL and adding a column store engine is a very natural thing. It does not require a solution like PostgreSQL where written data needs to be written to the row store first and then transformed into the column store.
Another aspect is the MySQL binary logging (Binlog) mechanism. The double log mechanism of MySQL has disadvantages and advantages. The existence of Binlog and redo log will definitely have an impact on write performance, but Binlog provides excellent support for the upstream and downstream of the MySQL ecosystem. Binlog provides complete SQL statements, which makes it very convenient to copy to the downstream. This is also the reason why Online Analytical Processing (OLAP) applications in the MySQL ecosystem are so popular, such as ClickHouse, StarRocks, and SelectDB.
In the scenario where MySQL uses DuckDB as a storage engine, the MySQL Binlog ecosystem is completely compatible and has not been destroyed. Therefore, it can serve as a data warehouse node. The data written to this data warehouse node can still stream the Binlog out. In the scenario of hybrid transactional and analytical processing (HTAP), the primary node MySQL InnoDB engine sends Binlog to the downstream MySQL DuckDB engine, thereby achieving completely compatible streaming.
How the PolarDB-X JDBC Driver Implements High Availability Detection
ApsaraDB - November 18, 2025
ApsaraDB - September 10, 2025
ApsaraDB - September 27, 2025
ApsaraDB - January 15, 2026
ApsaraDB - November 26, 2025
ApsaraDB - March 20, 2026
PolarDB for PostgreSQL
Alibaba Cloud PolarDB for PostgreSQL is an in-house relational database service 100% compatible with PostgreSQL and highly compatible with the Oracle syntax.
Learn More
PolarDB for MySQL
Alibaba Cloud PolarDB for MySQL is a cloud-native relational database service 100% compatible with MySQL.
Learn More
AnalyticDB for PostgreSQL
An online MPP warehousing service based on the Greenplum Database open source program
Learn More
AnalyticDB for MySQL
AnalyticDB for MySQL is a real-time data warehousing service that can process petabytes of data with high concurrency and low latency.
Learn MoreMore Posts by ApsaraDB