All Products
Search
Document Center

DataWorks:Materialized views

Last Updated:Mar 26, 2026

When multiple MaxCompute tasks share expensive join or aggregation subqueries, each task recomputes the same result independently. Data Asset Governance analyzes your scheduling history, identifies these redundant subqueries, and recommends materialized views you can create in a few clicks. After creation, DataWorks rewires the scheduling graph so downstream tasks read from the pre-computed view instead of rerunning the subquery.

Creating a materialized view adds a refresh node to your scheduling graph that incurs compute costs each time it runs. For pay-as-you-go projects, saved capacity = amount of input data × SQL complexity; refresh cost = compute capacity × unit priceCompute-intensive jobs (pay-as-you-go). Evaluate the cost trade-off before creating a view.

When to use materialized view recommendations

Materialized view recommendations are a good fit when your MaxCompute workloads have one or more of the following characteristics:

  • Recurring expensive subqueries: Multiple auto-triggered tasks share the same join or aggregation subquery, and recomputing it daily wastes compute resources.

  • High-volume inputs: A shared subquery processes more than 1,000,000 input rows per run, and the result does not change between downstream jobs.

  • BI or reporting pipelines: Downstream nodes fan out from a common data preparation step, and scheduling them all to recompute that step is redundant.

How it works

Materialized view scheduling graph

When DataWorks creates a materialized view, it synchronously creates a node that generates view data and deploys it to the production environment. Nodes with similar subqueries become descendant nodes of the materialized view refresh node. DataWorks schedules the refresh node first based on scheduling dependencies. When descendant nodes need the shared subquery result, they read from the already-refreshed materialized view instead of recomputing, reducing computing frequency.

This feature relies on MaxCompute's materialized view recommendation and management capability. For more information, see Recommendations and management of materialized views.

Available regions: China (Hangzhou), China (Shanghai), China (Shenzhen), China (Beijing), and China (Chengdu).

Prerequisites

Before you begin, make sure you have:

Step 1: Enable materialized view intelligent recommendation

Only users assigned the Workspace Administrator role can enable or disable materialized view intelligent recommendation.
  1. Go to the Data Asset Governance page. Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Governance > Data Asset Governance, then click Go to Data Asset Governance.

  2. In the left-side navigation pane, choose Governance > Automation > MV.

  3. On the MV page, select a workspace from the Workspace drop-down list and click Settings for Materialized View Recommendation. In the dialog box, turn on the switches in the Intelligent Materialized View Recommendation column for the projects you want to enable.

    Settings for Materialized View Recommendation dialog

After you enable the feature for a MaxCompute project associated with DataWorks DataStudio, the system begins analyzing scheduling history. Recommendations appear on the Materialized View Recommendation tab only when all three conditions are met:

  • An auto-triggered task has run for more than three consecutive days

  • The number of input rows in a shared subquery exceeds 1,000,000

  • The shared subquery contains data reorganization operators, such as join or aggregation operators

Step 2: Review materialized view recommendations

  1. Select a workspace from the Workspace drop-down list and click the Materialized View Recommendation tab.

  2. Set filter conditions such as Project and Analysis Time Interval to find recommendations. Each recommendation row includes the following fields:

    Field Description
    Materialized view recommendation Identifies the recommendation. Format: Aggregate: xx; Inputs: xx, xx;. Aggregate indicates whether the shared subquery contains aggregation. Inputs lists all source tables the subquery reads from.
    Number of possibly affected SQL statements The number of jobs that use the shared subquery
    Possibly affected node The number of scheduling nodes that use the shared subquery
    Recommendation rating Derived from subquery repetition, complexity, and input data volume. Higher ratings indicate greater potential benefit.

    Materialized View Recommendation tab

  3. Click the identifier in the Materialized view recommendation column to view full details of the shared subquery, including source table information, the shared subquery SQL, associated jobs, and scheduling tasks.

    Shared subquery details

Evaluating a recommendation: Before creating a view, check that the recommendation rating and affected node count justify the refresh overhead. A high rating with many affected nodes typically indicates the most savings. If the recommendation covers a subquery that runs infrequently or whose source data changes rarely, the refresh cost may outweigh the benefit.

Step 3: Create a materialized view

  1. On the Materialized View Recommendation tab, find the recommendation you want to act on and click Create Materialized View in the Actions column.

  2. On the Create Materialized View page, configure the parameters. Create materialized view section — define the view itself: (Optional) Create materialized view refresh node section — create a scheduling node to keep the view current: Configure the Materialized View Refresh Node Name and Node Running Timeout Period. When the node that generates the source table produces new output, this refresh node dynamically updates the materialized view. (Optional) Add materialized view refresh node as ancestor node section — wire the refresh node into existing scheduling graphs: Adding the refresh node as an ancestor of nodes with similar computing logic increases the query hit rate for incremental data. The list includes tasks from the current workspace and other workspaces. Click View Details to inspect individual tasks before adding them.

    Parameter Description
    Input table Auto-obtained from the recommendation. No modification needed.
    Create materialized view or not Defaults to Create Materialized View. If a view with the same input tables already exists, set this to Select Existing Materialized View.
    Materialized view name Specify a name based on your business requirements.
    Lifecycle How long the view data is retained. Options: 1 Day, 7 Days, Custom Days.
    Refresh type Controls when the view data is updated. See the table below.
    Preview statement The SQL statements that define the materialized view.

    Choosing a refresh type:

    Refresh type When DataWorks selects it When to use it
    Time Management Cron Mode Default when the source table is the output of a DataWorks task Use when downstream nodes depend on the freshest output from the ancestor table. The view refreshes at the most recent output time of the ancestor node.
    Refresh at Fixed Interval Default when the source table is not a DataWorks task output Use for tables updated on an external schedule
    Do Not Refresh Never selected automatically Use when the source data is static

    Create Materialized View page

    Add materialized view refresh node as ancestor node section

    Ancestor node configuration

  3. Click Create and Execute. DataWorks starts creating the materialized view. Track progress in the Operation Status column on the Materialized View Recommendation tab, or click View Details in the Actions column.

    Status Meaning
    To be created The materialized view has not been created by DataWorks yet
    Creating Creation is in progress
    Created Creation succeeded
    Create failed Creation was initiated but failed

    Operation Status column

After creation, if you configured both the Create Materialized View Refresh Node and Add Materialized View Refresh Node as Ancestor Node sections, the refresh node is automatically set as an ancestor node of the nodes with similar computing logic. The owner of the refresh node is the account used to create the materialized view.

Step 4: Manage materialized views

  1. Select a workspace from the Workspace drop-down list and click the Materialized View Management tab.

  2. Set filter conditions such as Project to list materialized views created from recommendations.

    Important

    For pay-as-you-go MaxCompute projects, saved compute capacity = amount of input data × SQL complexity; fee = compute capacity × unit price. For pricing details, see Computing pricing (pay-as-you-go).

    Materialized View Management tab

  3. Click the name of a materialized view in the MV column to view its details.

    Materialized view details

  4. To remove a materialized view whose output no longer meets your requirements, find it in the list and click Delete in the Actions column.