This topic describes the real-time materialized views of PolarDB for PostgreSQL (Compatible with Oracle) clusters.
Background information
Unlike common views, materialized views can store query results. In complex query scenarios, using materialized views to save query results can significantly improve query efficiency. However, the data of materialized views does not change with the data in the base tables. This means that you may not always obtain the most up-to-date results when you use materialized views for querying.
To address this issue, PolarDB introduces the concept of real-time materialized views. Compared with materialized views, real-time materialized views have the following benefits:
Real-time materialized views support statement-level updates. After DML statements (INSERT, DELETE, and UPDATE) are executed on base tables, data in the materialized views is automatically updated to remain consistent with data in the base tables.
Real-time materialized views make maximum use of the incremental data in the base tables. By contrast, refreshing a common materialized view re-runs the query over all data. Compared with frequently refreshing common materialized views, real-time materialized views provide better performance.
Real-time materialized views can greatly improve query performance and ensure data consistency with base tables.
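The staleness of common materialized views described above can be reproduced with standard PostgreSQL syntax. The following is a minimal sketch; the table and view names (sales, sales_total) are hypothetical and chosen only for illustration.

```sql
-- Hypothetical base table for illustration.
CREATE TABLE sales (id INT, amount INT);
INSERT INTO sales VALUES (1, 100), (2, 200);

-- A common (non-real-time) materialized view stores the query result once.
CREATE MATERIALIZED VIEW sales_total AS
SELECT sum(amount) AS total FROM sales;

-- A later write to the base table is not reflected in the view.
INSERT INTO sales VALUES (3, 300);
SELECT total FROM sales_total;   -- still the value computed at creation time

-- A manual refresh re-runs the whole query to catch up.
REFRESH MATERIALIZED VIEW sales_total;
SELECT total FROM sales_total;

-- Clean up.
DROP MATERIALIZED VIEW sales_total;
DROP TABLE sales;
```

A real-time materialized view is meant to remove the manual refresh step in this sequence.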
Terms
Base table: a common table used in the definition of a materialized view.
Delta: a collection of data that is added or removed when the data in the base table changes, compared with the data in the materialized view.
Refresh: maintains a materialized view so that the data in the materialized view is consistent with the data obtained by querying the current base table based on the view definition.
Apply Delta: inserts the calculated incremental data into a real-time materialized view, or deletes it from the view, to maintain data consistency between the real-time materialized view and the base table.
Limits
Real-time materialized views are subject to the following limits:
The base table must be a common table instead of a partitioned table, an inherited table, or a column store table.
Only INNER JOIN is supported. Other JOIN types are not supported.
Only IMMUTABLE functions are supported.
Only view definitions that contain simple queries, projections, DISTINCT, and specific aggregate functions are supported. View definitions that contain complex queries are not supported. The complex queries include subqueries, [NOT] EXISTS, [NOT] IN, LIMIT, HAVING, DISTINCT ON, WITH (CTE), ORDER BY, window functions, GROUPING SETS, CUBE, ROLLUP, UNION, INTERSECT, and EXCEPT.
When the GROUP BY clause is used, the group specified in the GROUP BY clause must be in the projection.
Only the following built-in aggregate functions are supported: MIN, MAX, SUM, AVG, and COUNT.
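As an illustration of these limits, the following sketch shows one view definition intended to stay within them and, in comments, one that uses listed unsupported features. The tables (customers, orders) and the view name are hypothetical, and whether a given definition is accepted should be confirmed on your own cluster.

```sql
-- Hypothetical common (non-partitioned) base tables.
CREATE TABLE customers (id INT PRIMARY KEY, region VARCHAR);
CREATE TABLE orders (id INT PRIMARY KEY, customer_id INT, amount INT);

-- Intended to stay within the limits: INNER JOIN only, the GROUP BY
-- column (region) is in the projection, and only COUNT and SUM are used.
CREATE MATERIALIZED VIEW region_sales REFRESH FAST ON COMMIT AS
SELECT c.region, count(*) AS order_count, sum(o.amount) AS total_amount
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id
GROUP BY c.region;

-- By contrast, a definition like the following uses features listed as
-- unsupported (LEFT JOIN, HAVING, ORDER BY), so it is expected to be rejected:
--   SELECT c.region, sum(o.amount)
--   FROM customers c LEFT JOIN orders o ON o.customer_id = c.id
--   GROUP BY c.region HAVING sum(o.amount) > 0 ORDER BY c.region;
```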
Performance degradation
Real-time materialized views greatly improve query performance but have a significant impact on the write performance of the base tables. If the number of read operations is greater than the number of write operations, we recommend that you use real-time materialized views.
The impact of real-time materialized views on the write performance of the base tables depends on factors such as the view definitions and the write loads, structures, and indexes of the base tables. Before you create real-time materialized views in the production environment, we recommend that you first test the impact of real-time materialized views on the write performance of the base tables in the test environment. You can use real-time materialized views in the production environment if the write performance meets the requirements.
The following methods can be used to reduce the costs of real-time materialized views:
Create only a small number of real-time materialized views on the same base table.
Batch write data to the base table. For example, you can execute the COPY or INSERT INTO SELECT statement to batch import data.
Create primary keys for all base tables, and include the primary keys of all base tables in the projected columns of the real-time materialized view definitions.
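The last two recommendations could look as follows in practice. This is a sketch under assumptions: the tables (products, stock) and the view name are hypothetical, and generate_series is used only to produce sample rows.

```sql
-- Hypothetical base tables, each with a primary key.
CREATE TABLE products (id INT PRIMARY KEY, name VARCHAR);
CREATE TABLE stock (id INT PRIMARY KEY, product_id INT, qty INT);

-- The projection includes the primary keys of both base tables.
CREATE MATERIALIZED VIEW product_stock REFRESH FAST ON COMMIT AS
SELECT p.id AS product_id, s.id AS stock_id, p.name, s.qty
FROM products p
INNER JOIN stock s ON s.product_id = p.id;

-- One batched statement per table instead of many single-row INSERTs,
-- so the view is maintained once per statement rather than once per row.
INSERT INTO products
SELECT g, 'product_' || g FROM generate_series(1, 1000) AS g;

INSERT INTO stock
SELECT g, g, 10 FROM generate_series(1, 1000) AS g;
```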
How it works
Create a real-time materialized view
Rewrite the query of the materialized view and calculate the hidden columns required to maintain the real-time materialized view.
Create a trigger for the base table to refresh the real-time materialized view.
Create unique indexes for the real-time materialized view when specified conditions are met to accelerate delta refresh.
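The triggers and indexes created in these steps can probably be observed through the standard PostgreSQL catalogs. The queries below are a sketch that assumes a base table named t and a real-time materialized view named mv already exist; the names of the generated triggers and indexes are implementation details that this topic does not specify.

```sql
-- Assumes a base table t and a real-time materialized view mv already exist.

-- Triggers attached to the base table (internal triggers are included).
SELECT tgname, tgenabled
FROM pg_trigger
WHERE tgrelid = 't'::regclass;

-- Indexes defined on the view, including any that were created automatically.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'mv';
```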
Refresh a real-time materialized view
Data changes in the base table activate the trigger.
Obtain incremental data from the base table by using the trigger.
Calculate the delta of the real-time materialized view based on the definitions and the incremental data of the current base table.
Apply the calculated incremental data to the real-time materialized view to implement delta refresh.
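The idea behind these steps can be illustrated by hand with plain tables. The following sketch is not the internal implementation: a regular table (summary_demo) stands in for the view, another table (new_rows_demo) stands in for the incremental data captured by the trigger, all names are hypothetical, and it assumes that INSERT ... ON CONFLICT is available on the cluster.

```sql
-- Hand-written illustration only; a real-time materialized view
-- performs the equivalent work automatically.
CREATE TABLE t_demo (a INT, b VARCHAR);

-- Plain table standing in for a view defined as:
--   SELECT b, count(*) AS cnt, sum(a) AS total FROM t_demo GROUP BY b
CREATE TABLE summary_demo (b VARCHAR PRIMARY KEY, cnt BIGINT, total BIGINT);

-- Incremental data: the rows written by one INSERT statement.
CREATE TABLE new_rows_demo (a INT, b VARCHAR);
INSERT INTO new_rows_demo VALUES (1, 'x'), (2, 'x'), (5, 'y');
INSERT INTO t_demo SELECT a, b FROM new_rows_demo;

-- Calculate the delta from the incremental data only, then apply it
-- to the stored result instead of recomputing the whole aggregate.
INSERT INTO summary_demo (b, cnt, total)
SELECT b, count(*), sum(a) FROM new_rows_demo GROUP BY b
ON CONFLICT (b) DO UPDATE
SET cnt = summary_demo.cnt + EXCLUDED.cnt,
    total = summary_demo.total + EXCLUDED.total;

-- The stored result should now match a full recomputation.
SELECT b, count(*) AS cnt, sum(a) AS total FROM t_demo GROUP BY b ORDER BY b;
SELECT b, cnt, total FROM summary_demo ORDER BY b;
```

Only the rows in new_rows_demo are scanned to compute the delta, which is why delta refresh tends to cost less than a full refresh when the base table is large.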
Delete a real-time materialized view
Delete the delta refresh trigger from the base table of the real-time materialized view.
Delete the real-time materialized view.
Usage notes
Prerequisites
You can directly use real-time materialized views in clusters that run a minor engine version of V1.1.27 (released in September 2022) or later. To use real-time materialized views in a cluster that runs a minor engine version earlier than V1.1.27, update the minor engine version to V1.1.27 and install the polar_ivm extension in the cluster.
    CREATE EXTENSION polar_ivm WITH SCHEMA pg_catalog;
Note
You can execute the following statement to query the minor engine version of your PolarDB for PostgreSQL (Compatible with Oracle) cluster:
    show polar_version;
Create a real-time materialized view
    CREATE MATERIALIZED VIEW table_name [ (column_name [, ...] ) ]
        [ BUILD DEFERRED|IMMEDIATE ]
        REFRESH FAST ON COMMIT
        AS query
The following table describes the parameters.
| Parameter | Description |
| --- | --- |
| table_name | The name of the real-time materialized view that you want to create, which can be schema-qualified. |
| column_name | The name of a column in the real-time materialized view. If you do not specify column names, the materialized view uses the column names from the result set of the query specified in the AS clause. |
| BUILD DEFERRED | Creates the view, but populates data later, such as upon the first refresh. When you query the view, no error message appears. However, no data is returned until you execute the REFRESH MATERIALIZED VIEW statement on the view. |
| BUILD IMMEDIATE | Populates data immediately after you create the view. This is the default option. |
| query | The SQL query whose results are stored in the materialized view, which can be a SELECT, TABLE, or VALUES statement. The query runs within a security-restricted operation. |
Incrementally refresh a real-time materialized view
    REFRESH MATERIALIZED VIEW table_name
Note
table_name: the name of the real-time materialized view that you want to refresh.
You do not need to manually refresh a real-time materialized view that is created by using the BUILD IMMEDIATE option.
After you refresh a real-time materialized view that is created by using the BUILD DEFERRED option, the view is populated with data based on the view definition, and subsequent modifications performed on the base table are synchronized to the view in real time.
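The BUILD DEFERRED workflow described in the note can be sketched as follows. The table (events) and view (event_counts) are hypothetical names used only for illustration.

```sql
-- Hypothetical base table.
CREATE TABLE events (id INT PRIMARY KEY, kind VARCHAR);
INSERT INTO events VALUES (1, 'click'), (2, 'view');

-- BUILD DEFERRED: the view is created but not populated yet.
CREATE MATERIALIZED VIEW event_counts BUILD DEFERRED REFRESH FAST ON COMMIT AS
SELECT kind, count(*) AS cnt FROM events GROUP BY kind;

-- Per the description above, this returns no rows and no error.
SELECT * FROM event_counts;

-- The first refresh populates the view ...
REFRESH MATERIALIZED VIEW event_counts;

-- ... and later writes are synchronized without further refreshes.
INSERT INTO events VALUES (3, 'click');
SELECT * FROM event_counts ORDER BY kind;
```

With BUILD IMMEDIATE, which is the default, the REFRESH statement in this sequence would not be needed.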
Delete a real-time materialized view
    DROP MATERIALIZED VIEW [ IF EXISTS ] table_name [, ...] [ CASCADE | RESTRICT ]
The following table describes the parameters.
| Parameter | Description |
| --- | --- |
| IF EXISTS | If the real-time materialized view does not exist, a notice is returned instead of an error message. |
| table_name | The name of the real-time materialized view that you want to delete, which can be schema-qualified. |
| CASCADE | Automatically deletes the objects that depend on the view, such as other materialized views or regular views, and all objects that depend on the automatically deleted objects. |
| RESTRICT | If an object depends on the real-time materialized view, the view is not deleted. This is the default option. |
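The options above can be combined as in the following sketch. The view names mv, mv_a, and mv_b are hypothetical.

```sql
-- No error is raised if the view does not exist.
DROP MATERIALIZED VIEW IF EXISTS mv;

-- Drop several views in one statement.
DROP MATERIALIZED VIEW IF EXISTS mv_a, mv_b;

-- Also drop dependent objects, such as views built on top of mv.
DROP MATERIALIZED VIEW IF EXISTS mv CASCADE;
```

Because RESTRICT is the default, the first two statements fail if other objects still depend on the views.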
Example
Create the polar_ivm extension that real-time materialized views depend on.
    CREATE EXTENSION IF NOT EXISTS polar_ivm WITH SCHEMA pg_catalog;
Create the base table and populate the table with data.
    CREATE TABLE t( a INT, b VARCHAR);
    INSERT INTO t VALUES (1,'a'), (2,'b'), (3,'c'), (4,'d'), (5,'e');
Create the real-time materialized view.
    CREATE MATERIALIZED VIEW mv REFRESH FAST ON COMMIT AS SELECT max(a),min(a),b FROM t GROUP BY b;
Execute DML statements on the base table.
Query the data of the real-time materialized view.
    SELECT * FROM mv ORDER BY b;
Sample result:
     max | min | b
    -----+-----+---
       1 |   1 | a
       2 |   2 | b
       3 |   3 | c
       4 |   4 | d
       5 |   5 | e
    (5 rows)
The result shows that the data of the real-time materialized view is consistent with the data of the base table.
Insert new data into the base table and query the data of the real-time materialized view.
    INSERT INTO t VALUES(6,'f');
    SELECT * FROM mv ORDER BY b;
Sample result:
     max | min | b
    -----+-----+---
       1 |   1 | a
       2 |   2 | b
       3 |   3 | c
       4 |   4 | d
       5 |   5 | e
       6 |   6 | f
    (6 rows)
The result shows that the data of the real-time materialized view is consistent with the data of the base table.
Delete data from the base table and query the data of the real-time materialized view.
    DELETE FROM t WHERE a = 2;
    SELECT * FROM mv ORDER BY b;
Sample result:
     max | min | b
    -----+-----+---
       1 |   1 | a
       3 |   3 | c
       4 |   4 | d
       5 |   5 | e
       6 |   6 | f
    (5 rows)
The result shows that the data of the real-time materialized view is consistent with the data of the base table.
Update the data of the base table and query the data of the real-time materialized view.
    UPDATE t SET a = a + 1;
    SELECT * FROM mv ORDER BY b;
Sample result:
     max | min | b
    -----+-----+---
       2 |   2 | a
       4 |   4 | c
       5 |   5 | d
       6 |   6 | e
       7 |   7 | f
    (5 rows)
The result shows that the data of the real-time materialized view is consistent with the data of the base table.
Delete the real-time materialized view.
    DROP MATERIALIZED VIEW mv;
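The consistency observed in the steps above can also be checked mechanically while the view still exists, that is, before the final DROP statement is executed. The following sketch uses only standard SQL set operations on the t and mv objects from this example. It assumes that the view exposes the columns max, min, and b, as shown in the sample results.

```sql
-- Rows present in mv but not in the defining query, and vice versa.
-- An empty result means that the view and the base table agree.
(SELECT "max", "min", b FROM mv
 EXCEPT
 SELECT max(a), min(a), b FROM t GROUP BY b)
UNION ALL
(SELECT max(a), min(a), b FROM t GROUP BY b
 EXCEPT
 SELECT "max", "min", b FROM mv);
```

Running this query after each DML statement is a simple way to confirm in a test environment that the view stays synchronized.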