All Products
Search
Document Center

ApsaraDB RDS:Create a DuckDB-based analytical read-only instance

Last Updated:Dec 05, 2025

ApsaraDB RDS for MySQL DuckDB-based analytical read-only instances use column store and vectorization to improve the performance of complex analytic queries by up to 100 times. These instances provide real-time analytics for large-scale data scenarios and help you make data-driven decisions faster.

Product introduction

DuckDB-based analytical read-only instances have a built-in DuckDB engine with the following features:

  • High-performance analytics: Supports features such as column store, just-in-time (JIT) compilation, vectorization, efficient memory management, and parallel processing.

  • High compatibility with MySQL: Highly compatible with MySQL syntax and data formats. You can obtain results efficiently without modifying your original query statements.

  • Data synchronization mechanism:

    • Historical data synchronization: When you create a DuckDB-based analytical read-only instance, the system synchronizes historical data from the primary instance and automatically converts it to the DuckDB engine format.

    • Incremental data synchronization: After the DuckDB-based analytical read-only instance is created, incremental data from the primary instance is synchronized in real time to the analytic read-only instance using the native MySQL binary logging (binlog) replication mechanism. Data is always transferred within RDS and does not require external data synchronization tools. For more information about the technology, see How DuckDB analytic instances work.

Scenarios

  • Aggregation analysis: For aggregation analysis, such as log data analysis, DuckDB-based analytical read-only instances provide efficient aggregate queries.

  • Multi-table join query: Significantly improves MySQL's analytical performance for workloads with multi-table JOINs.

Comparison between DuckDB-based analytical read-only instances, read-only instances, and OLAP databases

Compared with using primary instances, read-only instances, or OLAP databases for complex queries, DuckDB-based analytical read-only instances have the following advantages:

  • High performance: Delivers two orders of magnitude better performance for complex queries compared to the InnoDB engine.

  • High compatibility: 100% compatible with the MySQL protocol and data types, and highly compatible with SQL syntax and Data Definition Language (DDL).

  • Efficient and stable data synchronization: Uses a native binlog replication channel for a more stable and efficient synchronization link. No additional data synchronization fees are charged.

  • Resource isolation: DuckDB-based analytical read-only instances handle complex analytic queries, while primary and regular read-only instances handle transaction processing. This isolates resources for each workload, preventing them from affecting each other.

Item

DuckDB-based analytical read-only instance

Read-only instance

OLAP database

Suitable services

Complex analytic queries

Transaction processing

Complex analytic queries

Analytic query performance

Strong

Low

Strong

Data synchronization method

Native binlog replication

Native binlog replication

DTS data synchronization link

MySQL compatibility

Data types

Fully compatible

Fully compatible

Incompatible (requires field mapping)

SQL syntax

Highly compatible (more than 99.9%)

Fully compatible

Incompatible (requires SQL rewrite)

DDL

Highly compatible

Fully compatible

Partially compatible

O&M costs

Low (integrated instance)

Low (integrated instance)

High (requires extra maintenance for the database and synchronization link)

Scenarios

To create a DuckDB-based analytical read-only instance, the attached ApsaraDB RDS for MySQL primary instance must meet the following requirements:

  • Major version: MySQL 8.0

  • Series: High-availability Series

  • Storage type: Premium ESSD or enterprise SSD (ESSD)

  • Storage capacity: 10 GB to 16,000 GB

  • The data archiving feature is disabled.

Billing

The billing for DuckDB-based analytical read-only instances is the same as for regular read-only instances. The cost is affected by factors such as the product series, instance type, storage type, and storage capacity. The actual cost is displayed on the purchase page.

Notes

  • Instance creation time: When a DuckDB-based analytical read-only instance is created, the engines of all tables in the instance are automatically converted to the DuckDB engine. Therefore, it takes longer to create an analytic read-only instance than a regular read-only instance. The creation time depends on the data volume of the primary instance.

  • Foreign key deletion: Foreign keys are automatically deleted during the creation of a DuckDB-based analytical read-only instance. DuckDB-based analytical read-only instances do not have foreign key constraints.

  • Table schema limits: DuckDB-based analytical read-only instances support only tables that use the UTF8 or UTF8MB4 character set. Partitioned tables and tables without a business primary key are not supported. A business primary key is a defined primary key or a UNIQUE constraint on a non-null field.

    Note

    For unsupported tables, the engine remains the InnoDB engine in the DuckDB-based analytical read-only instance. You can run the following SQL statement to query the list of incompatible tables.

    SELECT table_schema, table_name, engine 
    FROM information_schema.tables 
    WHERE table_schema NOT IN ('mysql', 'sys') AND engine = 'InnoDB';
  • Binlog subscription limits: DuckDB-based analytical read-only instances do not support enabling the binary logging feature. Therefore, you cannot subscribe to their binlogs. To subscribe to binlogs, select the primary instance.

  • Compatibility limits: For more information, see Compatibility of DuckDB analytic instances.

Create a DuckDB-based analytical read-only instance

  1. Log on to the ApsaraDB RDS console. In the top navigation bar, select a region. Then, click the ID of the target instance.

  2. On the Basic Information page, go to the Instance Distribution section. Click Add to the right of DuckDB Analytic Instance.

    image

  3. Configure the DuckDB-based analytical read-only instance. The following table describes the main parameters.

    Parameter

    Description

    Billing Method

    Subscription and pay-as-you-go are supported. Serverless is not supported.

    Series

    Only High-availability Series is supported.

    Product Type

    Only Standard Edition is supported.

    Primary Zone

    You can select only a zone in the same region as the primary instance.

    Deployment Method

    • Single-zone Deployment: The primary and secondary nodes of the analytic read-only instance are in the same zone.

    • Multi-zone Deployment: The primary and secondary nodes of the analytic read-only instance are in different zones within the same region. This provides cross-zone disaster recovery at no extra charge. If you select this option, you must also configure the zone for the secondary node.

    Instance Type

    Dedicated instance types are supported. For more information about the instance types, see DuckDB-based analytical read-only instance types.

    Database Proxy

    After you enable the General-purpose database proxy (free of charge), the system can automatically distribute hybrid transactional and analytical processing (HTAP) requests. Online analytical processing (OLAP) query requests are routed to the DuckDB-based analytical read-only instance, and online transactional processing (OLTP) requests are routed to the primary instance or a regular read-only instance.

    Storage Capacity

    The storage capacity of the analytic read-only instance must be at least half the storage capacity of the primary instance.

  4. Click Next: Instance Configurations. Configure the network and resource group for the instance. The following table describes the main parameters.

    Parameter

    Description

    VPC

    The VPC is the same as that of the primary instance by default and cannot be changed.

    VSwitch of Primary Node

    Use the default primary vSwitch or select one as needed.

    Resource Group

    The resource group is the same as that of the primary instance by default and cannot be changed.

  5. Click Next: Confirm Order.

  6. Confirm the Parameter Settings, select a Quantity, click Confirm Order, and complete the payment.

    If you select the subscription billing method, you must also select a Subscription Duration. We recommend that you select Auto-renewal to prevent service interruptions that are caused by an overdue payment.

  7. After a period of time, find the primary instance on the Instances page. Click the arrow to the left of the instance ID to view the DuckDB-based analytical read-only instance that is attached to the primary instance.

    image

Connect to a DuckDB-based analytical read-only instance

(Recommended) Connect through a database proxy

If your business involves both high-concurrency online transactional processing (OLTP) and complex online analytical processing (OLAP), you can use a database proxy to enable automatic request distribution for HTAP. The database proxy node estimates the execution cost of SQL statements. It automatically routes OLAP query requests to the DuckDB-based analytical read-only instance and OLTP query requests to the primary instance or a regular read-only instance. You do not need to manually classify database requests on the client side.

  1. Add a DuckDB-based analytical read-only instance to the ApsaraDB RDS for MySQL primary instance.

  2. Enable the General-purpose database proxy for the primary instance.

  3. Enable automatic request distribution for HTAP for the primary instance.

  4. Assign a read weight to the DuckDB-based analytical read-only instance.

  5. Connect to the database using the database proxy endpoint. The database proxy automatically distributes requests to the most suitable instance type based on the estimated execution cost of SQL queries.

Connect directly

A DuckDB-based analytical read-only instance has an independent endpoint. If you only need to process OLAP query requests, you can use this endpoint to connect directly to the instance.

  1. Log on to the ApsaraDB RDS console. In the navigation pane on the left, click Instances. In the top navigation bar, select a region.

  2. On the Instances page, find the primary instance. Click the arrow to the left of the instance ID to view the DuckDB-based analytical read-only instance that is attached to the primary instance.

  3. Click the ID of the DuckDB-based analytical read-only instance to go to the details page. In the Basic Information > Network Type section, click View Endpoint Details to obtain the endpoint.

    By default, ApsaraDB RDS instances provide an internal endpoint. To access the instance over the internet, first apply for a public endpoint, and then connect to the DuckDB-based analytical read-only instance.

References and API operations

Appendix: Performance test for a DuckDB-based analytical read-only instance

The following section uses the standard TPC-H benchmark test, which is used to evaluate the performance of complex database queries, to demonstrate the analytic query performance of a DuckDB-based analytical read-only instance.

Test environment

  • Dataset: The TPC-H sf-100 dataset is used. The data volume is 100 GB. The data is from the official DuckDB website.

  • Running environment: Each database runs on an ECS instance with the same specifications: a 32-core CPU, 128 GB of memory, and an ESSD. This ensures a consistent test environment.

  • Comparison objects: A regular ApsaraDB RDS for MySQL instance (version 8.0.36, InnoDB engine) and ClickHouse Community Edition (version 25.3).

  • Test method: To ensure consistent query results, each database first runs three warm-up rounds, the results of which are not recorded. Then, three official test rounds are run. The average value of the three official test rounds is used as the final performance result.

Test results

The following table compares the running time of each query on a DuckDB-based analytical read-only instance, an RDS MySQL instance (InnoDB engine), and ClickHouse in the TPC-H sf-100 scenario.

Query ID

Running time (seconds)

DuckDB analytical read-only instance

RDS MySQL standard instance (InnoDB engine)

ClickHouse

q1

0.92

1134.25

3.47

q2

0.15

1800

1.52

q3

0.53

802.94

3.65

q4

0.46

1000.45

2.77

q5

0.5

1800

5.38

q6

0.22

566.73

0.73

q7

0.59

1800

6.06

q8

0.68

1800

6.99

q9

1.44

1800

13.29

q10

0.91

894.35

3.22

q11

0.11

79.63

1.1

q12

0.44

734.35

1.69

q13

1.59

454.15

5.85

q14

0.38

574.07

0.83

q15

0.31

568.43

1.53

q16

0.32

63.56

0.52

q17

0.89

1800

7.96

q18

1.59

1800

3.11

q19

0.8

1800

2.96

q20

0.51

1800

3.38

q21

1.64

1800

out of memory

q22

0.33

361.4

4

Total

15.31

25234.31

80.01

Test conclusions

  • Comparison between a DuckDB-based analytical read-only instance and a regular ApsaraDB RDS for MySQL instance (InnoDB engine)

    image

    image

    In the TPC-H test, the total running time for the ApsaraDB RDS for MySQL instance (InnoDB engine) was 25,234.31 seconds, while the total running time for the DuckDB-based analytical read-only instance was only 15.31 seconds. The performance difference is more than three orders of magnitude. The ApsaraDB RDS for MySQL instance (InnoDB engine) showed significant deficiencies in handling complex queries. Queries q5, q7, q8, q9, q17, q18, q19, q20, and q21 failed to complete due to a timeout. The timeout period is uniformly counted as 1,800 seconds. This indicates that the ApsaraDB RDS for MySQL DuckDB-based analytical read-only instance performs excellently for large-scale data analytic query tasks and significantly improves the efficiency of complex queries.

  • Comparison between a DuckDB-based analytical read-only instance and a ClickHouse instance

    image

    image

    The total running time for the ClickHouse instance was 80.01 seconds. In comparison, the total running time for the DuckDB-based analytical read-only instance was only 15.31 seconds, showing a significant lead in overall performance. The ClickHouse instance reported a Memory limit exceeded error when running query q21 and could not complete the query. For the other queries, the DuckDB-based analytical read-only instance performed significantly better than the ClickHouse instance. This demonstrates its superior query performance and stability in complex analytic scenarios.