ApsaraDB RDS for MySQL DuckDB-based analytical read-only instances use column store and vectorization to improve the performance of complex analytic queries by up to 100 times. These instances provide real-time analytics for large-scale data scenarios and help you make data-driven decisions faster.
Product introduction
DuckDB-based analytical read-only instances have a built-in DuckDB engine with the following features:
High-performance analytics: Supports features such as column store, just-in-time (JIT) compilation, vectorization, efficient memory management, and parallel processing.
High compatibility with MySQL: Highly compatible with MySQL syntax and data formats. You can obtain results efficiently without modifying your original query statements.
Data synchronization mechanism:
Historical data synchronization: When you create a DuckDB-based analytical read-only instance, the system synchronizes historical data from the primary instance and automatically converts it to the DuckDB engine format.
Incremental data synchronization: After the DuckDB-based analytical read-only instance is created, incremental data from the primary instance is synchronized in real time to the analytic read-only instance using the native MySQL binary logging (binlog) replication mechanism. Data is always transferred within RDS and does not require external data synchronization tools. For more information about the technology, see How DuckDB analytic instances work.
Scenarios
Aggregation analysis: For aggregation analysis, such as log data analysis, DuckDB-based analytical read-only instances provide efficient aggregate queries.
Multi-table join query: Significantly improves MySQL's analytical performance for workloads with multi-table
JOINs.
Comparison between DuckDB-based analytical read-only instances, read-only instances, and OLAP databases
Compared with using primary instances, read-only instances, or OLAP databases for complex queries, DuckDB-based analytical read-only instances have the following advantages:
High performance: Delivers two orders of magnitude better performance for complex queries compared to the InnoDB engine.
High compatibility: 100% compatible with the MySQL protocol and data types, and highly compatible with SQL syntax and Data Definition Language (DDL).
Efficient and stable data synchronization: Uses a native binlog replication channel for a more stable and efficient synchronization link. No additional data synchronization fees are charged.
Resource isolation: DuckDB-based analytical read-only instances handle complex analytic queries, while primary and regular read-only instances handle transaction processing. This isolates resources for each workload, preventing them from affecting each other.
Item | DuckDB-based analytical read-only instance | Read-only instance | OLAP database | |
Suitable services | Complex analytic queries | Transaction processing | Complex analytic queries | |
Analytic query performance | Strong | Low | Strong | |
Data synchronization method | Native binlog replication | Native binlog replication | DTS data synchronization link | |
MySQL compatibility | Data types | Fully compatible | Fully compatible | Incompatible (requires field mapping) |
SQL syntax | Highly compatible (more than 99.9%) | Fully compatible | Incompatible (requires SQL rewrite) | |
DDL | Highly compatible | Fully compatible | Partially compatible | |
O&M costs | Low (integrated instance) | Low (integrated instance) | High (requires extra maintenance for the database and synchronization link) | |
Scenarios
To create a DuckDB-based analytical read-only instance, the attached ApsaraDB RDS for MySQL primary instance must meet the following requirements:
Major version: MySQL 8.0
Series: High-availability Series
Storage type: Premium ESSD or enterprise SSD (ESSD)
Storage capacity: 10 GB to 16,000 GB
The data archiving feature is disabled.
Billing
The billing for DuckDB-based analytical read-only instances is the same as for regular read-only instances. The cost is affected by factors such as the product series, instance type, storage type, and storage capacity. The actual cost is displayed on the purchase page.
Notes
Instance creation time: When a DuckDB-based analytical read-only instance is created, the engines of all tables in the instance are automatically converted to the DuckDB engine. Therefore, it takes longer to create an analytic read-only instance than a regular read-only instance. The creation time depends on the data volume of the primary instance.
Foreign key deletion: Foreign keys are automatically deleted during the creation of a DuckDB-based analytical read-only instance. DuckDB-based analytical read-only instances do not have foreign key constraints.
Table schema limits: DuckDB-based analytical read-only instances support only tables that use the
UTF8orUTF8MB4character set. Partitioned tables and tables without a business primary key are not supported. A business primary key is a defined primary key or a UNIQUE constraint on a non-null field.NoteFor unsupported tables, the engine remains the InnoDB engine in the DuckDB-based analytical read-only instance. You can run the following SQL statement to query the list of incompatible tables.
SELECT table_schema, table_name, engine FROM information_schema.tables WHERE table_schema NOT IN ('mysql', 'sys') AND engine = 'InnoDB';Binlog subscription limits: DuckDB-based analytical read-only instances do not support enabling the binary logging feature. Therefore, you cannot subscribe to their binlogs. To subscribe to binlogs, select the primary instance.
Compatibility limits: For more information, see Compatibility of DuckDB analytic instances.
Create a DuckDB-based analytical read-only instance
Log on to the ApsaraDB RDS console. In the top navigation bar, select a region. Then, click the ID of the target instance.
On the Basic Information page, go to the Instance Distribution section. Click Add to the right of DuckDB Analytic Instance.

Configure the DuckDB-based analytical read-only instance. The following table describes the main parameters.
Parameter
Description
Billing Method
Subscription and pay-as-you-go are supported. Serverless is not supported.
Series
Only High-availability Series is supported.
Product Type
Only Standard Edition is supported.
Primary Zone
You can select only a zone in the same region as the primary instance.
Deployment Method
Single-zone Deployment: The primary and secondary nodes of the analytic read-only instance are in the same zone.
Multi-zone Deployment: The primary and secondary nodes of the analytic read-only instance are in different zones within the same region. This provides cross-zone disaster recovery at no extra charge. If you select this option, you must also configure the zone for the secondary node.
Instance Type
Dedicated instance types are supported. For more information about the instance types, see DuckDB-based analytical read-only instance types.
Database Proxy
After you enable the General-purpose database proxy (free of charge), the system can automatically distribute hybrid transactional and analytical processing (HTAP) requests. Online analytical processing (OLAP) query requests are routed to the DuckDB-based analytical read-only instance, and online transactional processing (OLTP) requests are routed to the primary instance or a regular read-only instance.
Storage Capacity
The storage capacity of the analytic read-only instance must be at least half the storage capacity of the primary instance.
Click Next: Instance Configurations. Configure the network and resource group for the instance. The following table describes the main parameters.
Parameter
Description
VPC
The VPC is the same as that of the primary instance by default and cannot be changed.
VSwitch of Primary Node
Use the default primary vSwitch or select one as needed.
Resource Group
The resource group is the same as that of the primary instance by default and cannot be changed.
Click Next: Confirm Order.
Confirm the Parameter Settings, select a Quantity, click Confirm Order, and complete the payment.
If you select the subscription billing method, you must also select a Subscription Duration. We recommend that you select Auto-renewal to prevent service interruptions that are caused by an overdue payment.
After a period of time, find the primary instance on the Instances page. Click the arrow to the left of the instance ID to view the DuckDB-based analytical read-only instance that is attached to the primary instance.

Connect to a DuckDB-based analytical read-only instance
(Recommended) Connect through a database proxy
If your business involves both high-concurrency online transactional processing (OLTP) and complex online analytical processing (OLAP), you can use a database proxy to enable automatic request distribution for HTAP. The database proxy node estimates the execution cost of SQL statements. It automatically routes OLAP query requests to the DuckDB-based analytical read-only instance and OLTP query requests to the primary instance or a regular read-only instance. You do not need to manually classify database requests on the client side.
Add a DuckDB-based analytical read-only instance to the ApsaraDB RDS for MySQL primary instance.
Enable the General-purpose database proxy for the primary instance.
Enable automatic request distribution for HTAP for the primary instance.
Assign a read weight to the DuckDB-based analytical read-only instance.
Connect to the database using the database proxy endpoint. The database proxy automatically distributes requests to the most suitable instance type based on the estimated execution cost of SQL queries.
Connect directly
A DuckDB-based analytical read-only instance has an independent endpoint. If you only need to process OLAP query requests, you can use this endpoint to connect directly to the instance.
Log on to the ApsaraDB RDS console. In the navigation pane on the left, click Instances. In the top navigation bar, select a region.
On the Instances page, find the primary instance. Click the arrow to the left of the instance ID to view the DuckDB-based analytical read-only instance that is attached to the primary instance.
Click the ID of the DuckDB-based analytical read-only instance to go to the details page. In the section, click View Endpoint Details to obtain the endpoint.
By default, ApsaraDB RDS instances provide an internal endpoint. To access the instance over the internet, first apply for a public endpoint, and then connect to the DuckDB-based analytical read-only instance.
References and API operations
DuckDB-based analytical read-only instance types
Related API operations
Create a DuckDB-based analytical read-only instance by calling the CreateReadOnlyDBInstance operation.
Query the details of a specific DuckDB-based analytical read-only instance by calling the DescribeDBInstanceAttribute operation.
Query the basic information about all ApsaraDB RDS instances, including DuckDB-based analytical read-only instances, that belong to your account by calling the DescribeDBInstances operation.
Appendix: Performance test for a DuckDB-based analytical read-only instance
The following section uses the standard TPC-H benchmark test, which is used to evaluate the performance of complex database queries, to demonstrate the analytic query performance of a DuckDB-based analytical read-only instance.
Test environment
Dataset: The TPC-H sf-100 dataset is used. The data volume is 100 GB. The data is from the official DuckDB website.
Running environment: Each database runs on an ECS instance with the same specifications: a 32-core CPU, 128 GB of memory, and an ESSD. This ensures a consistent test environment.
Comparison objects: A regular ApsaraDB RDS for MySQL instance (version 8.0.36, InnoDB engine) and ClickHouse Community Edition (version 25.3).
Test method: To ensure consistent query results, each database first runs three warm-up rounds, the results of which are not recorded. Then, three official test rounds are run. The average value of the three official test rounds is used as the final performance result.
Test results
The following table compares the running time of each query on a DuckDB-based analytical read-only instance, an RDS MySQL instance (InnoDB engine), and ClickHouse in the TPC-H sf-100 scenario.
Query ID | Running time (seconds) | ||
DuckDB analytical read-only instance | RDS MySQL standard instance (InnoDB engine) | ClickHouse | |
q1 | 0.92 | 1134.25 | 3.47 |
q2 | 0.15 | 1800 | 1.52 |
q3 | 0.53 | 802.94 | 3.65 |
q4 | 0.46 | 1000.45 | 2.77 |
q5 | 0.5 | 1800 | 5.38 |
q6 | 0.22 | 566.73 | 0.73 |
q7 | 0.59 | 1800 | 6.06 |
q8 | 0.68 | 1800 | 6.99 |
q9 | 1.44 | 1800 | 13.29 |
q10 | 0.91 | 894.35 | 3.22 |
q11 | 0.11 | 79.63 | 1.1 |
q12 | 0.44 | 734.35 | 1.69 |
q13 | 1.59 | 454.15 | 5.85 |
q14 | 0.38 | 574.07 | 0.83 |
q15 | 0.31 | 568.43 | 1.53 |
q16 | 0.32 | 63.56 | 0.52 |
q17 | 0.89 | 1800 | 7.96 |
q18 | 1.59 | 1800 | 3.11 |
q19 | 0.8 | 1800 | 2.96 |
q20 | 0.51 | 1800 | 3.38 |
q21 | 1.64 | 1800 | out of memory |
q22 | 0.33 | 361.4 | 4 |
Total | 15.31 | 25234.31 | 80.01 |
Test conclusions
Comparison between a DuckDB-based analytical read-only instance and a regular ApsaraDB RDS for MySQL instance (InnoDB engine)


In the TPC-H test, the total running time for the ApsaraDB RDS for MySQL instance (InnoDB engine) was 25,234.31 seconds, while the total running time for the DuckDB-based analytical read-only instance was only 15.31 seconds. The performance difference is more than three orders of magnitude. The ApsaraDB RDS for MySQL instance (InnoDB engine) showed significant deficiencies in handling complex queries. Queries q5, q7, q8, q9, q17, q18, q19, q20, and q21 failed to complete due to a timeout. The timeout period is uniformly counted as 1,800 seconds. This indicates that the ApsaraDB RDS for MySQL DuckDB-based analytical read-only instance performs excellently for large-scale data analytic query tasks and significantly improves the efficiency of complex queries.
Comparison between a DuckDB-based analytical read-only instance and a ClickHouse instance


The total running time for the ClickHouse instance was 80.01 seconds. In comparison, the total running time for the DuckDB-based analytical read-only instance was only 15.31 seconds, showing a significant lead in overall performance. The ClickHouse instance reported a
Memory limit exceedederror when running query q21 and could not complete the query. For the other queries, the DuckDB-based analytical read-only instance performed significantly better than the ClickHouse instance. This demonstrates its superior query performance and stability in complex analytic scenarios.