ApsaraDB RDS: DuckDB-based analytical instances

Last Updated: Dec 11, 2025

ApsaraDB RDS for MySQL DuckDB-based analytical instances use columnar storage and vectorization as their core technologies to improve the performance of complex analytical queries by a hundredfold. Columnar storage compression technology significantly reduces storage costs. These instances provide businesses with cost-effective, real-time analytics for large-scale data scenarios to enhance data-driven decision-making.

Features

DuckDB-based analytical instances include the following types:

  • DuckDB-based analytical primary instance

    • Definition: A standalone database instance that is similar to a regular RDS for MySQL primary instance. This type is available for RDS Cluster Edition and is deployed with multiple standby nodes that can handle read requests.

    • Core features:

      • It retains all basic features of a regular RDS for MySQL instance and is compatible with related tools and ecosystems. For more information, see Overview of a DuckDB-based analytical primary instance.

      • It deeply integrates the DuckDB analytical engine into the MySQL kernel and uses columnar storage and vectorized execution as its core technologies to provide both strong transaction support and high-performance complex analytical processing.

  • DuckDB-based analytical read-only instance

    • Definition: A read-only instance that is attached to a regular RDS for MySQL instance running High-availability Edition.

    • Core features:

      • Seamless data flow: It uses native replication based on binary logging (binlog) to automatically synchronize data and transform table schemas. This eliminates the need to maintain a separate data synchronization link.

      • Integrated HTAP: Analytical requests can be sent directly to the DuckDB-based analytical read-only instance or automatically routed to it through a database proxy, as sketched below. This architecture provides integrated hybrid transactional and analytical processing (HTAP): the primary instance retains its transaction processing (TP) performance while data analytics becomes more efficient.
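
As a minimal sketch of these two access paths, the following Python snippet sends the same analytical query directly to the DuckDB-based analytical read-only instance and through a database proxy. It assumes the pymysql client library; the endpoint placeholders, credentials, and the orders table are hypothetical and must be replaced with your own values.

```python
import pymysql

# Hypothetical endpoints; copy the real values from your instance's connection page.
PROXY_ENDPOINT = "<database-proxy-endpoint>"              # proxy routes analytical SQL automatically
ANALYTIC_RO_ENDPOINT = "<analytical-readonly-endpoint>"   # direct connection for analytics

def run_query(host: str, sql: str):
    """Execute one statement over the standard MySQL protocol and return all rows."""
    conn = pymysql.connect(host=host, port=3306, user="app_user",
                           password="***", database="demo")
    try:
        with conn.cursor() as cur:
            cur.execute(sql)
            return cur.fetchall()
    finally:
        conn.close()

ANALYTICAL_SQL = "SELECT region, SUM(amount) FROM orders GROUP BY region"

run_query(PROXY_ENDPOINT, ANALYTICAL_SQL)        # option 1: let the proxy route the request
run_query(ANALYTIC_RO_ENDPOINT, ANALYTICAL_SQL)  # option 2: target the analytical instance directly
```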

The following table compares DuckDB-based analytical instances with a regular RDS for MySQL read-only instance and a standalone OLAP database.

| Comparison item | DuckDB-based analytical primary instance | DuckDB-based analytical read-only instance | RDS for MySQL read-only instance | OLAP database |
| --- | --- | --- | --- | --- |
| Scenario | Complex analytic queries | Complex analytic queries | Transaction processing | Complex analytic queries |
| Analytic query performance | Strong | Strong | Weak | Strong |
| Data synchronization method | DTS data synchronization link | Native replication based on binary logging | Native replication based on binary logging | DTS data synchronization link |
| MySQL compatibility: data types | Fully compatible | Fully compatible | Fully compatible | Incompatible (requires field mapping) |
| MySQL compatibility: SQL syntax | Highly compatible (> 99.9%) | Highly compatible (> 99.9%) | Fully compatible | Incompatible (requires SQL rewrite) |
| MySQL compatibility: DDL | Highly compatible | Highly compatible (supports automatic rebuilding) | Fully compatible | Partially compatible |
| O&M costs | Low | Low | Low | High |
| Deployment model | Standalone deployment | Attached to a regular RDS for MySQL instance | Attached to a regular RDS for MySQL instance | Standalone deployment |
| Supported edition | Cluster Edition | High-availability Edition | Basic Edition, High-availability Edition | - |

Scenarios

  • Aggregate analysis: DuckDB-based analytical instances execute aggregate queries efficiently, which makes them well suited to real-time aggregate analysis, such as log analytics.

  • Multi-table join queries: For query workloads that involve multi-table JOIN operations, these instances significantly improve the analytical performance of MySQL.
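
For illustration, the following hypothetical queries show the two scenario types described above; the access_log, orders, customers, and products tables are invented and do not come from this document.

```python
# Scenario 1: real-time aggregate analysis over log data (hypothetical schema).
AGGREGATE_SQL = """
SELECT status_code, COUNT(*) AS requests, AVG(latency_ms) AS avg_latency_ms
FROM access_log
WHERE log_time >= NOW() - INTERVAL 1 HOUR
GROUP BY status_code
"""

# Scenario 2: multi-table JOIN query (hypothetical schema).
JOIN_SQL = """
SELECT c.region, p.category, SUM(o.amount) AS revenue
FROM orders o
JOIN customers c ON c.id = o.customer_id
JOIN products  p ON p.id = o.product_id
GROUP BY c.region, p.category
"""
```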

Technical principles

What is DuckDB?

DuckDB is a single-node, in-process online analytical processing (OLAP) database designed for embedded scenarios. Its core architecture balances high-performance analytics with transaction processing:

  • High-performance analytics: Columnar storage significantly accelerates aggregate analysis queries, and the vectorized execution engine efficiently processes data in batches.

  • Strong transaction support: It provides full atomicity, consistency, isolation, and durability (ACID) transaction capabilities. It uses a Multi-Version Concurrency Control (MVCC) mechanism to enable efficient concurrent read and write operations in a standalone environment.
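
As a rough, self-contained illustration of this embedded model (independent of ApsaraDB RDS), the sketch below uses the open-source duckdb Python package; the table and data are invented.

```python
import duckdb

# DuckDB runs in-process: connect() without a path opens an in-memory database.
con = duckdb.connect()

con.execute("CREATE TABLE orders (id INTEGER, region VARCHAR, amount DOUBLE)")
con.execute("INSERT INTO orders VALUES (1, 'east', 10.5), (2, 'west', 20.0), (3, 'east', 7.25)")

# Columnar storage plus vectorized execution keeps aggregates like this cheap.
print(con.execute("SELECT region, SUM(amount) FROM orders GROUP BY region").fetchall())

# Writes are fully transactional (ACID with MVCC); this update is rolled back.
con.execute("BEGIN")
con.execute("UPDATE orders SET amount = 0 WHERE region = 'east'")
con.execute("ROLLBACK")
```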

Technical optimizations

ApsaraDB RDS deeply integrates the DuckDB engine to combine its high-performance analytics with the MySQL ecosystem. The resulting architecture balances analytical performance with transaction processing, meeting enterprise needs for efficient analysis of large-scale data while ensuring enterprise-grade data reliability and consistency.

General technical optimizations (for both primary and read-only analytical instances)

  • Storage engine optimization

    • It encapsulates DuckDB's columnar storage as a transactional storage engine within MySQL and uses the standard MySQL data dictionary to manage metadata. This ensures full compatibility with the transaction semantics of the primary instance.

    • It enhances the native synchronization capability based on binary logging and uses DuckDB's write-ahead logging (WAL) for transaction persistence. This ensures data consistency between the ApsaraDB RDS for MySQL instance and the DuckDB-based analytical instance.

  • Compute engine enhancement

    • It deeply integrates DuckDB's core components, including the optimizer, vectorized execution engine, and compiler, to support just-in-time (JIT) compilation and vectorization. This improves complex query performance by two orders of magnitude compared with InnoDB.

    • It adapts the SQL parser to support 99.9% of MySQL syntax and functions. This ensures that existing query statements can be executed directly without modification.

    • The Result Translator component automatically converts DuckDB's computation results into a MySQL protocol-compatible format. This allows for seamless client integration without requiring adjustments to existing application logic.

  • Query acceleration

    • Columnar storage: Data is stored locally in DuckDB's native columnar format. This improves aggregate query performance by more than 100 times. For example, the response time of a SUM query on the same amount of data is reduced to 1/100th of the time required by InnoDB. (A way to observe this on your own data is sketched after this list.)

    • Automatic caching of hot data: The system uses DuckDB's Buffer Pool mechanism to automatically cache frequently queried data. This provides stable support for high-concurrency business scenarios.
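
One rough way to observe the columnar-versus-row-store difference on your own workload is to time the same aggregate against a regular instance and against a DuckDB-based analytical instance. The sketch below assumes the pymysql library; the endpoints, credentials, and orders table are placeholders, and this is not an official benchmark.

```python
import time
import pymysql

def time_sum(host: str) -> float:
    """Time a single SUM aggregation; table and columns are hypothetical."""
    conn = pymysql.connect(host=host, port=3306, user="app_user",
                           password="***", database="demo")
    try:
        with conn.cursor() as cur:
            start = time.perf_counter()
            cur.execute("SELECT SUM(amount) FROM orders")
            cur.fetchall()
            return time.perf_counter() - start
    finally:
        conn.close()

innodb_seconds = time_sum("<regular-rds-endpoint>")            # row store (InnoDB)
analytic_seconds = time_sum("<analytical-instance-endpoint>")  # columnar (DuckDB engine)
print(f"InnoDB: {innodb_seconds:.3f}s  DuckDB-based: {analytic_seconds:.3f}s")
```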

Additional technical optimizations for DuckDB-based analytical read-only instances

  • Data synchronization performance improvement

    • It introduces a batching mechanism that combines high-frequency small transactions into larger transactions, which significantly reduces write latency (see the conceptual sketch after this list).

    • Idempotent replay: The system performs idempotence checks on binary log events to ensure strong consistency in data synchronization.

  • DDL synchronization optimization

    • The system automatically identifies Data Definition Language (DDL) operations that are natively supported by DuckDB, such as creating tables and adding or removing fields. These operations are routed directly to the DuckDB engine for execution without conversion. This ensures data synchronization performance.

    • For DDL operations not supported by DuckDB, the system automatically triggers a table rebuild before execution. This prevents execution errors from interrupting replication and ensures the stability of the synchronization link.

  • Resource isolation: The DuckDB-based analytical read-only instance is isolated from the ApsaraDB RDS for MySQL primary instance. Therefore, analytical queries do not affect online transaction processing, ensuring the stability of core business operations.
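
To make the batching and idempotent-replay ideas above concrete, here is a purely conceptual Python sketch; the event structure and the apply_batch callback are invented, and this is not the actual replication code.

```python
def replay_binlog_events(events, applied_positions, apply_batch, batch_size=256):
    """Conceptual sketch: merge small binlog transactions into batches and
    skip events that were already applied (idempotent replay)."""
    batch = []
    for event in events:
        if event["position"] in applied_positions:   # idempotence check
            continue                                  # already applied; skip safely
        batch.append(event)
        if len(batch) >= batch_size:                  # combine small transactions
            apply_batch(batch)                        # one larger transactional write
            applied_positions.update(e["position"] for e in batch)
            batch = []
    if batch:                                         # flush the final partial batch
        apply_batch(batch)
        applied_positions.update(e["position"] for e in batch)
```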

Workflow of a DuckDB-based analytical instance

  1. SQL parsing: A user's SQL query is parsed by the standard MySQL parser.

  2. SQL execution: The parsed query is routed to the DuckDB compute engine for execution.

  3. Data retrieval: Columnar data is retrieved from the DuckDB storage engine.

  4. Format conversion: The computation result is converted by the Result Translator component into the MySQL protocol format and then sent to the client.
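
From the client's point of view, these four steps are invisible: an analytical query is sent over the ordinary MySQL protocol and the result arrives as a standard MySQL result set. A minimal sketch, assuming the pymysql library and a hypothetical endpoint and table:

```python
import pymysql

# This query passes through all four steps (parsing, DuckDB execution, columnar
# retrieval, result translation) transparently; the endpoint and table are hypothetical.
conn = pymysql.connect(host="<analytical-instance-endpoint>", port=3306,
                       user="app_user", password="***", database="demo")
with conn.cursor() as cur:
    cur.execute("SELECT region, SUM(amount) AS total FROM orders GROUP BY region")
    print([col[0] for col in cur.description])  # column metadata in MySQL protocol format
    for row in cur.fetchall():                  # rows already translated from DuckDB output
        print(row)
conn.close()
```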

How to use