All Products
Search
Document Center

PolarDB:Real-time materialized views

Last Updated:Oct 08, 2023

This topic describes real-time materialized views of PolarDB for PostgreSQL (Compatible with Oracle).

Background information

Unlike common views, materialized views can store query results. In complex query scenarios, using materialized views to save query results can significantly improve query efficiency. However, the data of materialized views does not change with the data in the base tables. This means that you may not always obtain the most up-to-date results when you use materialized views for querying.

To address this issue, PolarDB introduces the concept of real-time materialized views. Compared with materialized views, real-time materialized views have the following benefits:

  • Real-time materialized views support statement-level updates. After DML statements (INSERT, DELETE, and UPDATE) are executed on base tables, data in the materialized views is automatically updated to remain consistent with data in the base tables.

  • Real-time materialized views make maximum use of the incremental data in the base tables. When you refresh a materialized view, all data in the materialized view is queried. Compared with frequent refresh of materialized views, real-time materialized views provide better performance.

  • Real-time materialized views can greatly improve query performance and ensure data consistency with base tables.

Terms

  • Base table: a common table used in the definition of a materialized view.

  • Delta: a collection of data that is added or removed when the data in the base table changes, compared with the data in the materialized view.

  • Refresh: maintains a materialized view so that the data in the materialized view is consistent with the data obtained by querying the current base table based on the view definition.

  • Apply Delta: inserts or deletes calculated incremental data from a real-time materialized view to maintain data consistency between the real-time materialized view and the base table.

Limits

Real-time materialized views are subject to the following limits:

  • The base table cannot be a partitioned table, column-oriented table or inherited table, but must be a common table.

  • Only INNER JOIN is supported. Other JOIN types are not supported.

  • Only mutable functions are supported.

  • Only view definitions that contain simple queries, projections, DISTINCT, and some aggregate functions are supported. View definitions that contain complex queries are not supported. The complex queries include subqueries, [NOT] EXISTS, [NOT] IN, LIMIT, HAVING, DISTINCT ON, WITH(CTE), ORDER BY, window functions, GROUPING SETS, CUBE, ROLLUP, UNION, INTERSECT, and EXCEPT.

  • When the GROUP BY clause is used, the group specified in the GROUP BY clause must be in the projection.

  • Only the following built-in aggregate functions are supported: MIN, MAX, SUM, AVG, and COUNT.

Performance degradation

  • Real-time materialized views greatly improve query performance but have a significant impact on the write performance of the base tables. If the number of read operations is greater than the number of write operations, we recommend that you use real-time materialized views.

  • The impact of real-time materialized views on the write performance of the base tables depends on factors such as the view definitions and the write loads, structures, and indexes of the base tables. Before you create real-time materialized views in the production environment, we recommend that you first test the impact of real-time materialized views on the write performance of the base tables in the test environment. You can use real-time materialized views in the production environment if the write performance meets the requirements.

  • The following methods can be used to reduce the costs of real-time materialized views:

    • Create a few real-time materialized views on the same base table.

    • Batch write data to the base table. For example, you can execute the COPY or INSERT INTO SELECT statement to batch import data.

    • Create primary keys for all base tables, and include the primary keys for all base tables in the projected column for the definitions of real-time materialized views.

How it works

  • Create a real-time materialized view

    • Rewrite the query of the materialized view and calculate the hidden columns required to maintain the real-time materialized view.

    • Create a trigger for the base table to refresh the real-time materialized view.

    • Create unique indexes for the real-time materialized view when specified conditions are met to accelerate delta refresh.

  • Refresh a real-time materialized view

    • Data changes in the base table activate the trigger.

    • Obtain incremental data from the base table by using the trigger.

    • Calculate the delta of the real-time materialized view based on the definitions and the incremental data of the current base table.

    • Apply the calculated incremental data to the real-time materialized view to implement delta refresh.

  • Delete a real-time materialized view

    • Delete the delta refresh trigger from the base table of the real-time materialized view.

    • Delete the real-time materialized view.

Usage

  • Prerequisites

    For clusters that use the minor engine version 1.1.27 released in September 2022, you can create real-time materialized views. For existing clusters, you must update the minor engine version to 1.1.27 and then create the polar_ivm plug-in in the cluster.

    CREATE EXTENSION polar_ivm WITH SCHEMA pg_catalog;
    Note

    You can execute the following statement to view the minor engine version of your PolarDB for PostgreSQL (Compatible with Oracle) cluster:

    show polar_version;
  • Create a real-time materialized view

    CREATE MATERIALIZED VIEW table_name[ (column_name [, ...] ) ]
    [ BUILD DEFERRED|IMMEDIATE ]
    REFRESH FAST
    ON COMMIT
    AS query
    [ WITH [ NO ] DATA ]

    The following table describes the required parameters.

    Parameter

    Description

    table_name

    The name of the real-time materialized view, which can be schema-qualified.

    column_name

    The name of a column in the real-time materialized view. If no column name is provided, it is obtained from the queried column names.

    BUILD DEFERRED

    Only the structure of the real-time materialized view is created, but no data is populated. No real-time refresh is performed on the real-time materialized view.

    When you query the real-time materialized view, no error is reported but no data is found before the REFRESH MATERIALIZED VIEW statement is executed on the real-time materialized view.

    BUILD IMMEDIATE

    The default option. A complete real-time materialized view is immediately created.

    query

    The definition of the real-time materialized view, which can be a SELECT statement, a TABLE expression, or a VALUES expression. The query runs within a secure and restricted operation.

  • Refresh a real-time materialized view

    REFRESH MATERIALIZED VIEW table_name
    Note

    • table_name: the name of the real-time materialized view.

    • For a real-time materialized view created by using the BUILD IMMEDIATE and WITH DATA parameters, no manual refresh is required to ensure data consistency between the view and the base table.

    • For a real-time materialized view created by using the BUILD DEFERRED and WITH NO DATA parameters, manual refresh generates data based on the definition and enables real-time refresh for subsequent modifications of the base table.

  • Delete a real-time materialized view

    DROP MATERIALIZED VIEW [ IF EXISTS ] table_name [, ...] [ CASCADE | RESTRICT ]

    The following table describes the required parameters.

    Parameter

    Description

    IF EXISTS

    If the real-time materialized view does not exist, a prompt, instead of an error, is returned.

    table_name

    The name of the real-time materialized view, which can be schema-qualified.

    CASCADE

    The objects that depend on the real-time materialized view (such as other materialized views or regular views) are automatically deleted, and then all objects that depend on automatically deleted objects are deleted.

    RESTRICT

    If any object depends on the real-time materialized view, the real-time materialized view is not deleted. This parameter is selected by default.

Performance test

  1. Create the dependency plug-in for the real-time materialized view.

    CREATE EXTENSION IF NOT EXISTS polar_ivm WITH SCHEMA pg_catalog ;
  2. Create the base table and populate the table with data.

    CREATE TABLE t( a INT, b VARCHAR);
    INSERT INTO t VALUES
      (1,'a'),
      (2,'b'),
      (3,'c'),
      (4,'d'),
      (5,'e');
  3. Create the real-time materialized view.

    CREATE MATERIALIZED VIEW mv
    REFRESH FAST
    ON COMMIT
    AS
    SELECT max(a),min(a),b FROM t GROUP BY b;
  4. Execute a DML statement on the base table.

    • Query the data of the real-time materialized view.

      SELECT * FROM mv ORDER BY b;

      Sample result:

       max | min | b
      -----+-----+---
         1 |   1 | a
         2 |   2 | b
         3 |   3 | c
         4 |   4 | d
         5 |   5 | e
      (5 rows)

      The result shows that the data of the real-time materialized view is consistent with the data of the base table.

    • Insert new data into the base table and query the data of the real-time materialized view.

      INSERT INTO t VALUES(6,'f');
      
      SELECT * FROM mv ORDER BY b;

      Sample result:

       max | min | b
      -----+-----+---
         1 |   1 | a
         2 |   2 | b
         3 |   3 | c
         4 |   4 | d
         5 |   5 | e
         6 |   6 | f
      (6 rows)

      The result shows that the data of the real-time materialized view is consistent with the data of the base table.

    • Delete the data of the base table and query the data of the real-time materialized view.

      DELETE FROM t WHERE a = 2;
      
      SELECT * FROM mv ORDER BY b;

      Sample result:

       max | min | b
      -----+-----+---
         1 |   1 | a
         3 |   3 | c
         4 |   4 | d
         5 |   5 | e
         6 |   6 | f
      (5 rows)

      The result shows that the data of the real-time materialized view is consistent with the data of the base table.

    • Update the data of the base table and query the data of the real-time materialized view.

      UPDATE t SET a = a + 1;
      
      SELECT * FROM mv ORDER BY b;

      Sample result:

       max | min | b
      -----+-----+---
         2 |   2 | a
         4 |   4 | c
         5 |   5 | d
         6 |   6 | e
         7 |   7 | f
      (5 rows)

      The result shows that the data of the real-time materialized view is consistent with the data of the base table.

  5. Delete the real-time materialized view.

    DROP MATERIALIZED VIEW mv;