All Products
Search
Document Center

E-MapReduce:Asynchronous materialized view

Last Updated:Nov 06, 2025

Compared with synchronous materialized views, asynchronous materialized views support multi-table joins and a richer set of aggregate operators. Asynchronous materialized views can be refreshed through manual invocation or scheduled tasks, and support refreshing partial partitions, which can significantly reduce refresh costs. Additionally, asynchronous materialized views support various query rewrite scenarios, enabling automatic and transparent query acceleration. This topic describes how to create, use, and manage asynchronous materialized views in StarRocks.

Background

Applications in data warehouse environments frequently execute complex queries based on multiple large tables, often involving joins and aggregations of billions of rows of data between multiple tables. Processing such queries typically consumes a large amount of system resources and time, resulting in extremely high query costs.

You can solve these problems with asynchronous materialized views in StarRocks. An asynchronous materialized view is a special physical table that stores pre-computed results based on specific query statements on base tables. When you execute complex queries on base tables, StarRocks can directly reuse the pre-computed results, avoiding repeated calculations and thereby improving query performance. The higher the query frequency or the more complex the query statement, the more significant the performance gain.

You can also use asynchronous materialized views to model your data warehouse, providing a unified data interface to upper-layer applications, hiding the underlying implementation, and protecting the security of base table detailed data.

Materialized view comparison

StarRocks synchronous materialized views (Rollup) can provide better data freshness and lower refresh costs. However, synchronous materialized views have many limitations in terms of scenarios, can only be created based on a single base table, and support only limited aggregate operators. StarRocks asynchronous materialized views can be created based on multiple base tables and support a richer set of aggregate operators.

The following table compares asynchronous materialized views and synchronous materialized views (Rollup) in StarRocks from the perspective of supported features.

Materialized view

Single-table aggregation

Multi-table join

Query rewrite

Refresh strategy

Base table

Asynchronous materialized view

Yes

Yes

Yes

  • Asynchronous refresh

  • Manual refresh

Supports multi-table construction. Base tables can come from:

  • Default Catalog

  • External Catalog

  • Existing asynchronous materialized views

  • Existing views

Synchronized materialized view (Rollup)

Only some aggregate functions

No

Yes

Synchronized refresh during import

Only supports single-table construction based on Default Catalog

Terms

Concept

Description

Base Table

The driving table of a materialized view.

For StarRocks asynchronous materialized views, base tables can be internal tables in the Default Catalog, tables in external data catalogs, or even existing asynchronous materialized views or views. StarRocks supports creating asynchronous materialized views on all StarRocks table types.

Refresh

After creating an asynchronous materialized view, its data only reflects the state of the base table at the time of creation. When data in the base table changes, you need to refresh the asynchronous materialized view to update the data changes.

Currently, StarRocks supports two asynchronous refresh strategies:

  • ASYNC: Asynchronous refresh, where the materialized view automatically triggers refresh tasks based on the specified refresh interval whenever data in the base table changes.

  • MANUAL: Manual refresh trigger. The materialized view does not refresh automatically, and users need to manually maintain refresh tasks.

Query Rewrite

Query rewrite refers to the system automatically determining whether pre-computed results in materialized views can be reused when querying base tables that have materialized views built on them. If reuse is possible, the system will directly read pre-computed results from the relevant materialized views to avoid consuming system resources and time through repeated calculations.

StarRocks supports automatic and transparent query rewrite based on SPJG-type asynchronous materialized views. SPJG-type materialized views refer to materialized views whose Plan only includes Scan, Filter, Project, and Aggregate type operators.

Note

Asynchronous materialized views built based on JDBC Catalog and Hudi Catalog tables do not currently support query rewrite.

Use cases

We recommend creating asynchronous materialized views if your data warehouse environment has the following requirements.

  • Accelerate repeated aggregate queries

    Suppose your data warehouse environment has many queries containing the same aggregate function subqueries, consuming a large amount of computing resources. You can establish an asynchronous materialized view based on that subquery, computing and storing all results of that subquery. After successful establishment, the system will automatically rewrite query statements to directly query the intermediate results in the asynchronous materialized view, thereby reducing load and accelerating queries.

  • Periodic multi-table join queries

    Suppose you need to regularly join multiple tables in your data warehouse to generate a new wide table. You can establish an asynchronous materialized view for these tables and set regular refresh rules, thereby avoiding manual scheduling of join tasks. After the asynchronous materialized view is established, queries will directly return results based on the asynchronous materialized view, avoiding the latency caused by join operations.

  • Data layer category

    Suppose your base tables contain a large amount of raw data, and queries require complex ETL operations. You can implement data warehouse layering by establishing multiple layers of asynchronous materialized views on the data. This can decompose complex queries into multiple layers of simple queries, both reducing repeated calculations and helping maintenance personnel quickly locate problems. In addition, data warehouse layering can decouple raw data from statistical data, thereby protecting sensitive raw data.

  • Lakehouse acceleration

    Querying data lakes may be slow due to network latency and throughput limitations of object storage. You can improve query performance by building asynchronous materialized views on top of data lakes. Additionally, StarRocks can intelligently rewrite queries to use existing materialized views, eliminating the hassle of manually modifying queries.

For specific use cases of asynchronous materialized views, see:

Create asynchronous materialized views

Supported data sources

StarRocks supports creating asynchronous materialized views on the following data sources:

  • StarRocks internal tables (base tables support all StarRocks table types)

  • Tables in External Catalog

    • Hive Catalog

    • Hudi Catalog

    • Iceberg Catalog

    • JDBC Catalog

  • Existing asynchronous materialized views

  • Existing views

Preparations

The following example is based on two base tables in Default Catalog:

  • Table goods contains product ID item_id1, product name item_name, and product price price.

  • Table order_list contains order ID order_id, customer ID client_id, and product ID item_id2.

Where item_id1 is equivalent to item_id2.

Create tables and import the following data.

CREATE DATABASE IF NOT EXISTS test_db;
USE test_db;
CREATE TABLE goods(
    item_id1          INT,
    item_name         STRING,
    price             FLOAT
) DISTRIBUTED BY HASH(item_id1);

INSERT INTO goods
VALUES
    (1001,"apple",6.5),
    (1002,"pear",8.0),
    (1003,"potato",2.2);

CREATE TABLE order_list(
    order_id          INT,
    client_id         INT,
    item_id2          INT,
    order_date        DATE
) DISTRIBUTED BY HASH(order_id);

INSERT INTO order_list
VALUES
    (10001,101,1001,"2022-03-13"),
    (10001,101,1002,"2022-03-13"),
    (10002,103,1002,"2022-03-13"),
    (10002,103,1003,"2022-03-14"),
    (10003,102,1003,"2022-03-14"),
    (10003,102,1001,"2022-03-14");

This example business scenario requires frequent analysis of order totals, so queries need to join the two tables and call the sum() function to generate a new table based on order ID and total amount. In addition, this business scenario requires refreshing order totals daily.

The query statement is as follows.

USE test_db;
SELECT
    order_id,
    sum(goods.price) as total
FROM order_list INNER JOIN goods ON goods.item_id1 = order_list.item_id2
GROUP BY order_id;

Create an asynchronous materialized view based on a query statement

You can create a materialized view for a specific query statement using the CREATE MATERIALIZED VIEW statement.

Example

The following example creates an asynchronous materialized view based on the above query statement, using tables goods and order_list to "group by order ID and sum the prices of all products in the order", and sets its refresh mode to ASYNC, automatically refreshing every day.

USE test_db;
CREATE MATERIALIZED VIEW order_mv
DISTRIBUTED BY HASH(`order_id`)
REFRESH ASYNC START('2022-09-01 10:00:00') EVERY (interval 1 day)
AS SELECT
    order_list.order_id,
    sum(goods.price) as total
FROM order_list INNER JOIN goods ON goods.item_id1 = order_list.item_id2
GROUP BY order_id;
Note
  • When creating an asynchronous materialized view, you must specify at least one of either bucketing or refresh strategy.

  • You can set different partition and bucketing strategies for asynchronous materialized views than their base tables, but the partition key columns and bucketing columns of the asynchronous materialized view must be in the query statement.

  • Asynchronous materialized views support partition roll-up. For example, if the base table is partitioned by day, you can set the asynchronous materialized view to be partitioned by month.

  • Asynchronous materialized views do not currently support using List partition strategy, nor do they support creation based on base tables using List partitioning.

  • The query statement for creating a materialized view does not support non-deterministic functions, including rand(), random(), uuid(), and sleep().

  • Asynchronous materialized views support various data types.

  • By default, after executing the CREATE MATERIALIZED VIEW statement, StarRocks will immediately start the refresh task, which will consume certain system resources. To defer the refresh time, add the REFRESH DEFERRED parameter.

Notes

  • Asynchronous materialized view refresh mechanism

    Currently, StarRocks supports two ON DEMAND refresh strategies: asynchronous refresh (ASYNC) and manual refresh (MANUAL).

    On this basis, asynchronous materialized views support various refresh mechanisms to control refresh overhead and ensure refresh success rate:

    • Support for setting the maximum number of partitions to refresh. When an asynchronous materialized view has many partitions, a single refresh will consume a lot of resources. You can use this refresh mechanism to specify the maximum number of partitions for a single refresh, thereby splitting the refresh task to ensure that materialized views with large amounts of data can be refreshed in batches and stably.

    • Support for specifying Time to Live (TTL) for partitions of asynchronous materialized views, thereby reducing the storage space occupied by asynchronous materialized views.

    • Support for specifying the refresh range, refreshing only the most recent few partitions, reducing refresh overhead.

    • Support for specifying base tables where data changes will not trigger automatic refresh of the corresponding materialized view.

    • Support for setting resource groups for refresh tasks.

    You can also use ALTER MATERIALIZED VIEW to modify the refresh mechanism of existing asynchronous materialized views.

    Note

    To avoid full refresh tasks depleting system resources and causing task failure, it is recommended that you create partitioned materialized views based on partitioned base tables, ensuring that when data in a base table partition is updated, only the corresponding partition of the materialized view is refreshed, rather than refreshing the entire materialized view. For detailed information, see Partition modeling.

  • Nested materialized views

    StarRocks supports nested asynchronous materialized views, that is, building new asynchronous materialized views based on existing asynchronous materialized views. The refresh method of each asynchronous materialized view only affect the current materialized view. Currently, StarRocks does not restrict the number of nesting levels. In production environments, it is recommended that the nesting depth does not exceed three layers.

  • External Catalog materialized views

    StarRocks supports building asynchronous materialized views based on Hive Catalog, Hudi Catalog, Iceberg Catalog, and JDBC Catalog. External data catalog materialized views are created in the same way as regular asynchronous materialized views, but have usage limitations. For detailed information, see Using materialized views to accelerate data lake queries.

Manually refresh asynchronous materialized views

You can manually refresh a specified asynchronous materialized view using the REFRESH MATERIALIZED VIEW command.

-- Asynchronously invoke the refresh task.
REFRESH MATERIALIZED VIEW test_db.order_mv;
-- Synchronously invoke the refresh task.
REFRESH MATERIALIZED VIEW test_db.order_mv WITH SYNC MODE;

You can cancel an asynchronously invoked refresh task using CANCEL REFRESH MATERIALIZED VIEW.

-- Cancel the refresh task.
CANCEL REFRESH MATERIALIZED VIEW test_db.order_mv;
-- Force cancel the refresh task.
CANCEL REFRESH MATERIALIZED VIEW test_db.order_mv FORCE;

Directly query asynchronous materialized views

An asynchronous materialized view is essentially a physical table that stores the complete result set pre-computed according to a specific query statement. After the materialized view is refreshed for the first time, you can directly query the materialized view.

SELECT * FROM test_db.order_mv;
Note

You can directly query asynchronous materialized views, but due to the asynchronous refresh mechanism, their results may not be consistent with the results you query from the base tables.

Use asynchronous materialized views to rewrite and accelerate queries

StarRocks supports automatic transparent rewrite of SPJG-type asynchronous materialized view queries. Its query rewrite includes single-table rewrite, Join rewrite, aggregate rewrite, Union rewrite, and nested materialized view rewrite. Starting from v3.3.0, StarRocks further supports text-based materialized view rewrite. For detailed information, see Materialized view query rewrite.

Currently, StarRocks supports query rewrite for asynchronous materialized views based on Default Catalog, Hive Catalog, Hudi Catalog, and Iceberg Catalog. When querying Default Catalog data, StarRocks ensures strong consistency between the rewritten query results and the original query results by excluding materialized views where data is inconsistent with the base tables. When materialized view data is outdated, it will not be considered as a candidate materialized view. When querying external catalog data, StarRocks cannot detect data changes in external catalog partitions, so it does not guarantee strong consistency of results. For information about asynchronous materialized views based on External Catalog, see Using materialized views to accelerate data lake queries.

Note

Asynchronous materialized views built based on JDBC Catalog tables do not currently support query rewrite.

Manage asynchronous materialized views

Modify asynchronous materialized views

You can modify asynchronous materialized view properties using the ALTER MATERIALIZED VIEW command.

  • Enable a disabled asynchronous materialized view (set the materialized view's status to Active).

    ALTER MATERIALIZED VIEW test_db.order_mv ACTIVE;
  • Change the asynchronous materialized view name to order_total.

    ALTER MATERIALIZED VIEW test_db.order_mv RENAME order_total;
  • Modify the maximum refresh interval of the asynchronous materialized view to 2 days.

    ALTER MATERIALIZED VIEW test_db.order_mv REFRESH ASYNC EVERY(INTERVAL 2 DAY);

View asynchronous materialized views

You can use SHOW MATERIALIZED VIEWS or query system metadata views in the Information Schema to view asynchronous materialized views in the database.

  • View all asynchronous materialized views in the current data warehouse.

    USE test_db;
    SHOW MATERIALIZED VIEWS;
  • View a specific asynchronous materialized view.

    USE test_db;
    SHOW MATERIALIZED VIEWS WHERE NAME = "order_mv";
  • View asynchronous materialized views by name matching.

    USE test_db;
    SHOW MATERIALIZED VIEWS WHERE NAME LIKE "order%";
  • View all asynchronous materialized views through the system metadata view materialized_views in Information Schema.

    USE test_db;
    SELECT * FROM information_schema.materialized_views;

View asynchronous materialized view creation statements

You can view the creation statement of an asynchronous materialized view using the SHOW CREATE MATERIALIZED VIEW command.

SHOW CREATE MATERIALIZED VIEW test_db.order_mv;

View the execution status of asynchronous materialized views

You can view the execution status of asynchronous materialized views in the following two ways.

  • Through the Materialized View Management > Asynchronous Materialized View page in EMR StarRocks Manager.

  • By querying the tasks and task_runs metadata views in StarRocks' Information Schema to view the execution (build or refresh) status of asynchronous materialized views.

    Examples are as follows.

    1. View the TASK_NAME of the latest task in the tasks table.

      SELECT * FROM information_schema.tasks  ORDER BY CREATE_TIME DESC limit 1;
    2. View the execution status in the task_runs table based on the queried TASK_NAME.

      SELECT * FROM information_schema.task_runs WHERE task_name='mv-59299' ORDER BY CREATE_TIME;

Delete asynchronous materialized views

You can delete created asynchronous materialized views using the DROP MATERIALIZED VIEW command.

DROP MATERIALIZED VIEW test_db.order_mv;