This topic describes how to create materialized views.

Required permissions

  • You must have CREATE permissions on databases and tables.
  • You must have INSERT permissions on databases and tables.
  • You must have SELECT permissions on the relevant columns in all tables that are involved in materialized views.
  • If you want to enable the auto-refresh feature, you must have the permissions to refresh views on the 127.0.0.1 IP address. The system acts as a proxy creator and logs on from the local server (127.0.0.1) to perform auto-refresh. This way, if you do not have the permissions, the system cannot refresh. This helps prevent unauthorized access. If user permissions are granted on all IP addresses ('%'), unauthorized access does not exist, because all IP addresses include the 127.0.0.1 IP address.

Syntax

CREATE MATERIALIZED VIEW mv_name
[MV DEFINITION]
[REFRESH [FAST|COMPLETE] [ON <DEMAND|OVERWRITE>] [START WITH date] [NEXT date]]
[<ENABLE|DISABLE> QUERY REWRITE]
AS 
<QUERY BODY> 

Keyword description

MV DEFINITION
This keyword is used to define the properties of a materialized view.

A materialized view uses a common table schema to store data. Therefore, you can define some table-related properties, such as primary keys, partition keys, and indexes. They are consistent with the properties that are used to define tables. However, you cannot define columns that do not have output in the query.

REFRESH

This keyword is used to specify the refresh mode of a materialized view.

Materialized views can be refreshed in full (COMPLETE) or incremental (FAST) refresh mode. If you do not specify the refresh mode, full refresh is performed.

Note Only the default full refresh mode is supported.
In full refresh mode, the following trigger methods can be specified:
  • ON DEMAND: A refresh is triggered on demand. You can manually trigger the refresh or configure auto-refresh. This method is used by default.
  • ON OVERWRITE: A refresh is triggered when the base table that is referenced by a materialized view is overwritten by an INSERT OVERWRITE statement. This method is suitable for T+1 scenarios.

Auto-refresh

The START WITH and NEXT keywords can be used to configure the auto-refresh feature. START WITH specifies the first refresh time. NEXT specifies the next refresh time. The time is accurate to seconds. Milliseconds are truncated. If START WITH is not specified, the current time is used.

For more information, see Refresh materialized views.

QUERY REWRITE

This keyword is used to enable or disable the query rewrite feature.

After the feature is enabled, business queries can be rewritten to a materialized view. The materialized view can be used as a cache.

  • DISABLE QUERY REWRITE: disables the query rewrite feature on a materialized view. This value is used by default.
  • ENABLE QUERY REWRITE: enables the query rewrite feature on a materialized view.
Note The query rewrite feature is in the experimental phase.

For more information, see Query materialized views.

QUERY BODY
This keyword is the query principal of a materialized view.
  • You must define aliases for the expression columns in the query output such as sum(price). We recommend that you use descriptive aliases such as sum(price) as total_price.
  • The base tables involved in the query of materialized views cannot be deleted. The columns of these base tables cannot be deleted or modified.
  • You can use the WITH clause to query data based on logical views or materialized views.

Examples

# Refresh every 5 minutes.
CREATE MATERIALIZED VIEW myview
REFRESH NEXT now() + interval 5 minute
AS
SELECT count(*) as cnt FROM base

# Refresh at 02:00:00 every Monday.
CREATE MATERIALIZED VIEW myview
REFRESH 
 START WITH DATE_FORMAT(now() + interval 7 - weekday(now()) day, '%Y-%m-%d 02:00:00') 
 NEXT DATE_FORMAT(now() + interval 7 - weekday(now()) day, '%Y-%m-%d 02:00:00')
AS
SELECT count(*) as cnt FROM base

# Refresh at 02:00:00 every night.
CREATE MATERIALIZED VIEW myview
REFRESH 
 START WITH DATE_FORMAT(now() + interval 1 day, '%Y-%m-%d 02:00:00')
 NEXT DATE_FORMAT(now() + interval 1 day, '%Y-%m-%d 02:00:00')
AS
SELECT count(*) as cnt FROM base

# Refresh at 02:00:00 on the first day of every month.
CREATE MATERIALIZED VIEW myview
REFRESH NEXT DATE_FORMAT(last_day(now()) + interval 1 day, '%Y-%m-%d 02:00:00')
AS
SELECT count(*) as cnt FROM base

# Refresh only once.
CREATE MATERIALIZED VIEW myview
REFRESH START WITH now() + interval 1 day
AS 
SELECT count(*) as cnt FROM base

# Do not use auto-refresh. You must manually refresh.
CREATE MATERIALIZED VIEW myview (
  PRIMARY KEY (id)
) DISTRIBUTED BY HASH (id)
AS
SELECT id, name FROM base

# Create indexes on specified columns. By default, all columns are indexed.
CREATE MATERIALIZED VIEW myview (
  INDEX (name),
  PRIMARY KEY (id)
) DISTRIBUTED BY HASH (id)
AS
SELECT id, name, age FROM base

Limits

  • You cannot perform INSERT, DELETE, and UDPATE operations on materialized views.
  • Base tables referenced by materialized views cannot be deleted or renamed. Columns in these base tables referenced by materialized views cannot be deleted or renamed. To modify base tables, you must first delete materialized views.

Recommendations

  • When you create materialized views, we recommend that you keep the design to be consistent with the design of common tables. Specify partition keys and primary keys to improve the performance of subsequent queries.
  • In clusters that have lower specifications, we recommend that you do not refresh all materialized views at the same time. This way, the cluster stability is not affected.
  • By default, materialized views are indexed for all columns, which is the same as common tables. To reduce overheads, you can specify the INDEX keyword to index specific columns if materialized views are not indexed for all columns.