AnalyticDB for PostgreSQL provides real-time materialized views. Compared with non-real-time materialized views, real-time materialized views can implement automatic refresh in response to data changes without the need to execute REFRESH statements.
A real-time materialized view can be automatically updated with changes made to its base tables. In the beta stage, real-time materialized views of AnalyticDB for PostgreSQL support only statement-level automatic refresh. When an INSERT, COPY, UPDATE, or DELETE statement is executed on a base table, real-time materialized views created on this base table are updated in real time to ensure strong data consistency.
For more information about non-real-time materialized views, see Manage materialized views.
When a statement is executed on a base table, real-time materialized views created on this base table are updated in real time to ensure data consistency. Such statements include INSERT, COPY, UPDATE, and DELETE. Real-time materialized views are updated based on the following logic:
- The database kernel first updates base tables, and then updates their materialized views. If base tables fail to be updated, no data changes are made to their materialized views.
- If materialized views fail to be updated, no data changes are made to their base tables, and an error is returned for the executed statement.
For an explicit transaction that starts with the BEGIN TRANSACTION statement and ends with the COMMIT statement, after base tables are updated, data changes are made to their materialized views also in this transaction.
- If AnalyticDB for PostgreSQL uses the default READ COMMITTED isolation level, before this transaction is committed, updates of materialized views are invisible to other transactions.
- If this transaction is rolled back, base tables and their materialized views are also rolled back.
AnalyticDB for PostgreSQL imposes the following limits on the query statements that are used to create real-time materialized views:
- A single query statement can reference up to two tables.
- If a query statement references two tables, only INNER JOIN can be used to join the two tables. SELF JOIN is not supported.
- Query statements can contain most filtering and projection operations.
- Only the following aggregate operations are supported in a query statement: COUNT, SUM, AVG, MAX, and MIN. The HAVING clause is not supported.
- Complex statements such as those that contain subqueries and common table expressions (CTEs) are not supported.
After you create a real-time materialized view on a base table, DDL statements that you execute on the base table are subject to the following limits:
- When you execute the TRUNCATE statement on the base table, the real-time materialized view is not synchronously updated. You must manually refresh the real-time materialized view or create another materialized view.
- You must specify the CASCADE option to execute the DROP TABLE statement on the base table.
- ALTER TABLE statements on the base table cannot be used to delete or modify fields referenced by the materialized view.
In the beta stage, the real-time materialized view feature has the following limits:
- Real-time materialized views support only heap tables. Append-optimized (AO) tables are not supported.
We recommend that you use real-time materialized views in the following scenarios:
- The number of rows or columns in the query results is much smaller than that of the base tables. For example, the query statement may contain a WHERE clause that effectively narrows down the results or may contain an aggregate function that consolidates multiple values into a single value.
- Large amounts of computations are required to obtain the query results of the following
- Semi-structured data analysis
- Aggregate operations that take a long time to complete
- Base tables are not frequently changed.
Real-time materialized views can be used in all scenarios where materialized views are suitable. Compared with non-real-time materialized views, real-time materialized views are highly consistent with their base tables. When a base table changes, its real-time materialized views synchronously change with minimal performance cost. However, when you use non-real-time materialized views, you must manually update them each time their base tables change. Therefore, when large amounts of data are changed on a base table or a streaming update is performed, real-time materialized views have great advantages over non-real-time materialized views.
Disadvantages of real-time materialized views
Real-time materialized views are similar to indexes maintained in real time. They can significantly optimize query performance but also reduce write performance.
When you create a real-time materialized view that contains only a single table, the write performance of the related database is reduced because the data in the materialized view must be synchronously updated. The write latency can be up to three times longer compared with writing data to the base table but not updating the materialized view. We recommend that you do not create more than five real-time materialized views on the same base table.
Batch data writes help reduce the maintenance overhead of real-time materialized views. When you execute COPY or INSERT statements, we recommend that you increase the number of batch processed rows within a single statement. This significantly reduces the maintenance overhead of real-time materialized views.
When the query statement used to create a real-time materialized view contains a JOIN clause to join two tables, you must optimize the write performance of the real-time materialized view. If you do not have relevant experience or encounter low performance when you test the execution, we recommend that you use a real-time materialized view that contains only a single table. The following suggestions are available in scenarios where two tables are joined:
- Use the join key of each base table as their respective distribution keys.
- Create an index for the join key of each base table.
Create or delete a real-time materialized view
- Execute the following
CREATE INCREMENTAL MATERIALIZED VIEWstatement to create a real-time materialized view named
CREATE INCREMENTAL MATERIALIZED VIEW mv AS SELECT * FROM base WHERE id > 40;
- Execute the following
DROP MATERIALIZED VIEWstatement to delete the
DROP MATERIALIZED VIEW mv;
- Execute the following statement to create a base table:
CREATE TABLE test (a int, b int) DISTRIBUTED BY (a);
- Execute the following statement to create a real-time materialized view:
CREATE INCREMENTAL MATERIALIZED VIEW mv AS SELECT * FROM TEST WHERE b > 40 DISTRIBUTED BY (a);
- Execute the following statement to insert data to the base table:
INSERT INTO test VALUES (1, 30), (2, 40), (3, 50), (4, 60);
- Execute the following statement to view data in the base table:
SELECT * FROM test;
The following result is returned:
a | b ---+---- 1 | 30 2 | 40 3 | 50 4 | 60 (4 rows)
- Execute the following statement to view data in the materialized view:
SELECT * FROM mv;
The following result is returned, which indicates that the materialized view is updated:
a | b ---+---- 3 | 50 4 | 60 (2 rows)