All Products
Search
Document Center

ApsaraDB for SelectDB:Materialized view overview

Last Updated:Apr 24, 2026

A materialized view is a database object that pre-computes and stores query results to significantly boost the performance of complex queries. SelectDB provides two types of materialized views, a sync materialized view and an async materialized view, each suited for different business scenarios.

Use cases

Materialized views are primarily used in the following three scenarios:

  • Query acceleration: For frequent queries involving complex aggregations and multi-table JOINs, you can pre-compute and materialize the results. This allows queries to read the materialized data directly, avoiding repeated calculations and significantly reducing query latency.

  • Lightweight ETL (data processing): A materialized view can act as a lightweight ETL tool to pre-aggregate and transform raw data. This provides a processed data layer for downstream reports or analyses, enabling the construction of a layered data warehouse.

  • Lakehouse acceleration: For external data stored in a data lake (such as Hive or Iceberg), you can use an async materialized view to cache frequently queried results in SelectDB's local storage. This leverages SelectDB's high-performance engine to accelerate data lake queries.

Types of materialized views

SelectDB provides two types of materialized views for different use cases:

Sync materialized view

A sync materialized view updates in real time as data is written to the base table, guaranteeing strong consistency between the materialized view and the base table. Its features include:

  • Real-time consistency: Every write operation to the base table synchronously updates the materialized view, ensuring that query results are always consistent with the base table.

  • Single-table only: A sync materialized view can only be created on a single base table and does not support multi-table JOINs.

  • Automatic transparent rewriting: The query optimizer can automatically identify and use a sync materialized view to accelerate a query without requiring any changes to the SQL statement.

  • Applicable scenarios: Aggregation acceleration queries that require high data freshness, such as real-time dashboards and online reports.

Async materialized view

An async materialized view uses an asynchronous refresh mechanism. A flexible combination of a refresh method (full or partitioned incremental) and a trigger method (manual, scheduled, or on commit) meets various data freshness requirements. Its features include:

  • Multi-table JOIN support: An async materialized view can be created on multiple tables, including internal and external tables, supporting pre-computation of complex JOIN queries.

  • Flexible refresh strategies: It supports three trigger methods: manual refresh (ON MANUAL), scheduled refresh (ON SCHEDULE), and refresh on commit (ON COMMIT). You can configure these flexibly based on data freshness requirements.

  • Lakehouse support: Supports creating materialized views on external data sources like Hive and Iceberg to accelerate data lake queries.

  • Transparent rewriting: The query optimizer supports transparent rewriting based on the SPJG algorithm. It automatically rewrites the query to use the materialized view without modifying the business SQL.

  • Applicable scenarios: Analytical scenarios with relatively relaxed data freshness requirements (e.g., T+1) and high query complexity, such as data warehouse layering and multi-dimensional reporting.

Sync vs. async materialized views

Dimension

Sync materialized view

Async materialized view

Data consistency

Strong consistency (real-time sync)

Eventual consistency (asynchronous refresh)

Number of supported base tables

Single table only

Multiple tables (including JOINs)

External table support

Not supported

Supported (Hive, Iceberg, etc.)

Refresh method

Synchronously updated on write

Full refresh (COMPLETE) or partitioned incremental refresh (AUTO)

Trigger method

Automatic (synchronously triggered on write)

Manual refresh (ON MANUAL), scheduled refresh (ON SCHEDULE), or refresh on commit (ON COMMIT)

Transparent rewriting

Supported

Supported (SPJG algorithm)

Applicable scenarios

High data freshness requirements, single-table aggregation acceleration

T+1 analytics, multi-table JOIN pre-computation, Lakehouse acceleration

Async materialized view refresh mechanism

The refresh of an async materialized view is determined by two independent dimensions: the refresh method (the scope of each refresh) and the trigger method (when to trigger a refresh). You can combine these as needed and specify them using the REFRESH clause when creating a materialized view.

Refresh method

The refresh method determines the data scope for each refresh:

Refresh method

Refresh scope

Description

Full refresh (COMPLETE)

All data

Each refresh recalculates all data for the materialized view based on its defining query and overwrites the existing data. This is suitable for non-partitioned tables or scenarios requiring a full data rebuild.

Partitioned incremental refresh (AUTO)

Changed partitions

Refreshes only the partitions where data has changed since the last refresh, resulting in lower overhead. If the system cannot detect partition changes, it automatically falls back to a full refresh. This is ideal for partitioned tables.

Trigger method

The trigger method determines when a refresh task starts. The following three trigger methods can be combined with any of the refresh methods described above:

Trigger method

Trigger time

Description

Scenarios

Manual refresh (ON MANUAL)

User-initiated

Manually trigger a refresh by using the REFRESH MATERIALIZED VIEW statement.

Scenarios that require precise control over refresh timing.

Scheduled refresh (ON SCHEDULE)

At a set interval

Automatically triggers a refresh at a specified interval (e.g., minute, hour, or day).

Scenarios with periodic data updates, such as daily reports.

Refresh on commit (ON COMMIT)

After base table data changes

Automatically triggers a refresh after data is committed to the base table, providing the lowest data latency.

Scenarios where the base table changes infrequently, but data freshness is a high priority.

Note

The refresh method and trigger method are two independent dimensions. For example, you can configure a "full refresh triggered on a schedule" to rebuild the entire view at a set time each day, or a "partitioned incremental refresh triggered on commit" to refresh only the affected partitions after the base table is updated.