All Products
Search
Document Center

ApsaraDB for SelectDB:Asynchronous materialized view

Last Updated:Apr 24, 2026

An Asynchronous Materialized View is a feature in SelectDB that supports multi-table joins and asynchronous refreshes. It is widely used for accelerating complex queries, building data warehouse layers, and improving query performance in a Lakehouse architecture.

Use cases

The main use cases for an Asynchronous Materialized View include:

  • Query acceleration: For frequent queries involving complex aggregations and multi-table joins, you can pre-compute the results. When a query is submitted, the system can transparently rewrite it to use the materialized view, which significantly reduces query latency.

  • Simplifying ETL pipelines: In data warehousing, you can use a materialized view to perform layered aggregation and ETL processing on raw data. This helps build data layers such as ODS, DWD, DWS, and ADS while reducing redundant computations.

  • Lakehouse query acceleration: For external data stored in a data lake like Hive or Iceberg, you can use an Asynchronous Materialized View to cache hot query results locally in SelectDB. This leverages the high-performance engine of SelectDB to dramatically improve query performance.

  • Improving write efficiency: By reducing resource contention, a materialized view can optimize the data ingestion process, improving write efficiency and ensuring data consistency and integrity.

Limitations

Keep the following limitations in mind when using an Asynchronous Materialized View:

  • Eventual consistency: Data in an Asynchronous Materialized View lags behind its base tables, depending on the refresh policy. It does not guarantee strong consistency. For scenarios that require real-time data, consider using synchronous materialized views.

  • Window function restrictions: If the SQL definition of a materialized view contains a window function, the current version does not support transparent rewrite. However, you can still query the materialized view directly.

  • Join table count limit: For a transparent rewrite to be successful, the number of tables joined in the materialized view definition must not exceed the number of tables joined in the query. Otherwise, the rewrite will fail.

  • Limitations on partitioned incremental refresh for external tables: Hive external tables support partitioned incremental refresh starting from version 2.1. Iceberg, Paimon, and Hudi external tables support this feature starting from version 3.1. JDBC and ES external tables do not support partitioned incremental refresh. To refresh these tables, you must specify a COMPLETE full refresh.

  • Data model of the materialized view: A materialized view itself must use the duplicate key model, regardless of the data models of its base tables.

  • Limitations on set operations and special syntax: If a materialized view contains set operations such as UNION ALL, LIMIT, ORDER BY, or CROSS JOIN, the materialized view can be created successfully, but it cannot be used for transparent rewriting.

How it works

MTMV (Multi-Table Materialized View)

The Asynchronous Materialized View in SelectDB is built on the Multi-Table Materialized View (MTMV) framework. This framework decouples the materialized view's definition (the query statement) from its base tables. It runs refresh tasks asynchronously in the background to update the materialized view's data. Each materialized view has an independent refresh job, which a scheduler manages.

Refresh mechanism

The refresh process for an Asynchronous Materialized View is as follows:

  1. A refresh job is triggered (manually, on a schedule, or ON COMMIT).

  2. The system checks if data in the base tables has changed since the last refresh. This check is only supported for internal tables and Hive external tables.

  3. For a partitioned materialized view, the system refreshes only the partitions with data changes (partitioned incremental refresh). For a non-partitioned materialized view, it refreshes all data (full refresh).

  4. After the refresh is complete, the version information and status (such as NORMAL or SCHEMA_CHANGE) of the materialized view are updated.

Transparent rewrite (SPJG algorithm)

The SelectDB query optimizer, Nereids, supports transparent rewrite based on the SPJG (SELECT-PROJECT-JOIN-GROUP-BY) pattern. When a user submits a query, the optimizer automatically analyzes whether an existing materialized view can satisfy it. If so, the optimizer rewrites the query to use the materialized view instead of executing the original query. This accelerates the query without requiring any changes to the application's SQL.

The logic for transparent rewrite includes checking:

  • Whether the output columns of the query can be derived from the columns of the materialized view.

  • Whether the join relationships in the query are covered by the materialized view.

  • Whether the filter conditions of the query are within the data range of the materialized view.

  • Whether the data in the materialized view is in a valid state (NORMAL).

Note

Only materialized views with a NORMAL status can participate in transparent rewrite. If the status of a materialized view changes to SCHEMA_CHANGE due to reasons such as a schema change in a base table, the status is restored to NORMAL after the next refresh task is successfully completed.

Create materialized views from a data lake

SelectDB allows you to create an Asynchronous Materialized View based on external data sources (accessed via a catalog) to accelerate queries in a Lakehouse. The syntax is identical to creating a view from an internal table, but keep the following in mind:

  • The refresh process relies on the data lake's cached metadata (such as partition version information) within SelectDB, not on fetching it directly from the external environment. As a result, the data after a refresh is consistent with what you would get by querying the data lake through SelectDB, but it may differ from results obtained through other engines, depending on the cache's freshness.

  • If the underlying data in a Hive table is modified by an external process not controlled by SelectDB (such as a Spark, Hive, or Flink job) without updating the metadata (for example, by running INSERT OVERWRITE), the materialized view might assume its data is in sync with the base table when it is not. This causes inconsistent query results. To resolve this, manually force a refresh of the materialized view.

  • When creating a partitioned materialized view from an Iceberg table, the table must have a single partition column. Partition evolution is supported to a limited extent. For example, changes in the time range of a time-based partition are supported, but changes to the partition field itself are not and will cause the refresh to fail.

  • When you create a materialized view from a Hudi table, the system cannot detect changes in the base table's data. Once the materialized view (or a partition) has been refreshed, it is considered in sync with the base table. Therefore, materialized views on Hudi tables are best suited for manual, on-demand refresh scenarios.

Data lake refresh support

The following table shows the refresh capabilities for different table and catalog types.

Table type

Catalog type

Full refresh

Partition refresh

Auto trigger

Internal table

Internal

Supported in v2.1

Supported in v2.1

Supported in v2.1.4

Hive

Hive

Supported in v2.1

Supported in v2.1

Not supported

Iceberg

Iceberg

Supported in v2.1

Supported in v3.1

Not supported

Paimon

Paimon

Supported in v2.1

Supported in v3.1

Not supported

Hudi

Hudi

Supported in v2.1

Supported in v3.1

Not supported

JDBC

JDBC

Supported in v2.1

Not supported

Not supported

ES

ES

Supported in v2.1

Not supported

Not supported

Data lake transparent rewrite support

Currently, the transparent rewrite feature for an Asynchronous Materialized View supports the following table and catalog types. "Real-time base table data awareness" means the materialized view detects changes in a base table and uses the latest data for queries.

Table type

Catalog type

Transparent rewrite

Data change awareness

Internal table

Internal

Supported

Supported

Hive

Hive

Supported

Supported in v3.1

Iceberg

Iceberg

Supported

Supported in v3.1

Paimon

Paimon

Supported

Supported in v3.1

Hudi

Hudi

Supported

Not supported

JDBC

JDBC

Supported

Not supported

ES

ES

Supported

Not supported

Important

By default, a materialized view that uses an external table is not eligible for transparent rewrite. To enable transparent rewrite for external tables, run SET materialized_view_rewrite_enable_contain_external_table = true. For external tables that do not support change detection (such as JDBC and ES), you must explicitly specify REFRESH MATERIALIZED VIEW mvName COMPLETE when refreshing the view. Otherwise, an automatic refresh might not update the materialized view's data.

Relationship with internal OLAP tables

The SQL definition of an Asynchronous Materialized View has no restrictions on the data model of its base tables; it supports the duplicate key model, primary key model (both Merge-on-Write and Merge-on-Read), and aggregate model. The underlying implementation of a materialized view is based on an OLAP table with a duplicate key model and theoretically supports all core features of that model. However, to ensure stable and efficient data refreshes, the following restrictions apply:

  • SelectDB automatically creates and manages partitions for a materialized view based on its base tables. You cannot perform partition operations directly on the materialized view.

  • Because a materialized view has associated background jobs, you cannot use DROP TABLE or RENAME TABLE commands on it. Instead, you must use materialized view-specific commands, such as DROP MATERIALIZED VIEW.

  • The system automatically infers the data types for the materialized view's columns from the query statement at creation, and you cannot modify them. Changing them could cause the refresh job to fail.

  • You must modify properties specific to a materialized view using materialized view-specific commands. You can modify other common properties with the ALTER TABLE command.

References

  • To create, query, and maintain an Asynchronous Materialized View, see Create, Query, and Maintain Asynchronous Materialized Views.

  • For best practices, see Asynchronous Materialized View Best Practices.

  • For frequently asked questions, see Asynchronous Materialized View FAQ.