Multi-table joins and complex aggregate operations, which are time-consuming and demand significant computing resources, can be efficiently addressed using AnalyticDB for MySQL materialized views. These views precompute and store the results of user-defined queries. By directly accessing the stored results from materialized views, you can significantly enhance query response times. This topic describes the process for creating a materialized view.
Prerequisites
The kernel version is 3.1.3.4 or later.
Create a materialized view
Permission requirements
Before you create a materialized view, ensure you have the following permissions:
-
The CREATE permission on tables in the database where you want to create the materialized view.
-
The SELECT permission on specific or all columns of all base tables referenced in the materialized view.
-
If you want to create a materialized view that can be automatically refreshed, you must also have the following permissions:
-
You can connect to
'%'
from any IP address, which means access to AnalyticDB for MySQL is permitted from any location. -
The INSERT permission on the materialized view or all tables in the database where the materialized view resides. Otherwise, the data in the materialized view cannot be refreshed.
-
Prepare base tables for creating a materialized view
The following example uses the customer and sales tables to guide you in quickly creating a materialized view.
/*+ RC_DDL_ENGINE_REWRITE_XUANWUV2=false */ -- Specify the engine of the table as the XUANWU engine.
CREATE TABLE customer (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(255),
is_vip Boolean
);
/*+ RC_DDL_ENGINE_REWRITE_XUANWUV2=false */ -- Specify the engine of the table as the XUANWU engine.
CREATE TABLE sales (
sale_id INT PRIMARY KEY,
product_id INT,
customer_id INT,
price DECIMAL(10, 2),
quantity INT,
sale_date TIMESTAMP
);
The following example of creating a materialized view does not specify a resource group. If no resource group is specified, the computing resources of the default interactive resource group are used to create and refresh the materialized view. If you want to understand the differences between the interactive resource group and the job resource group when creating and refreshing materialized views, or if you want to use the job resource group to create and refresh materialized views, see Use elastic resources to create or refresh materialized views.
Create a full refresh materialized view
You can create a full refresh materialized view, hereafter referred to as a full materialized view, using internal tables, external tables, existing materialized views, and views from AnalyticDB for MySQL.
This topic uses the base tables customer
and sales
as examples to create a full materialized view join_mv
with a manual refresh trigger mechanism.
CREATE MATERIALIZED VIEW join_mv
REFRESH COMPLETE ON DEMAND
AS
SELECT
sale_id,
SUM(price * quantity) AS price
FROM customer
INNER JOIN (SELECT sale_id,customer_id,price,quantity FROM sales) sales ON customer.customer_id = sales.customer_id
GROUP BY customer.customer_id;
To update the data in the materialized view, manually execute REFRESH MATERIALIZED VIEW join_mv;
.
Create an incremental refresh materialized view
You can create an incremental materialized view based on AnalyticDB for MySQL internal tables.
The following section guides you to create an incremental materialized view with multiple table associations.
Before you create an incremental materialized view, complete the following preparations:
-
Check whether the kernel version of your cluster is 3.1.9.0 or later.
-
Enable the binary logging feature for the cluster and the base tables.
If an error occurs when you enable the binary logging feature for the base tables, see Query execution error: : Can not create FAST materialized view, because demotable doesn't support getting incremental data for solutions.
SET ADB_CONFIG BINLOG_ENABLE=true; ALTER TABLE customer binlog=true; ALTER TABLE sales binlog=true;
Create an incremental materialized view sales_mv_incre
. This materialized view involves only query calculations on the sales
table.
CREATE MATERIALIZED VIEW sales_mv_incre
REFRESH FAST NEXT now() + INTERVAL 3 minute
AS
SELECT
sale_id,
SUM(price * quantity) AS price
FROM sales
GROUP BY sale_id;
If the kernel version of your cluster is 3.2.1.0 or later, you can also create an incremental materialized view join_mv_incre
with multiple table associations. The refresh interval is set to automatically refresh every 3 minutes.
CREATE MATERIALIZED VIEW join_mv_incre
REFRESH FAST NEXT now() + INTERVAL 3 minute
AS
SELECT
customer.customer_id,
SUM(sales.price) AS price
FROM customer
INNER JOIN (SELECT customer_id,price FROM sales) sales ON customer.customer_id = sales.customer_id
GROUP BY customer.customer_id;
For more information about the syntax and more examples of materialized views, see CREATE MATERIALIZED VIEW.
Query the materialized views that are being created
You can query the materialized views that are being created by executing SHOW PROCESSLIST WHERE info LIKE '%CREATE MATERIALIZED VIEW%';
.
Each row in the query results indicates a materialized view that is being created. The user
field indicates the database account that creates the materialized view. The status
field indicates the creation status. The Info
field contains the creation statement of the materialized view. For more information about the fields in the query results, see SHOW PROCESSLIST.
If no records are returned by SHOW PROCESSLIST
, the materialized view has been created (including the table structure of the created materialized view and the initial data loaded in the materialized view).
How to write base table queries when creating a materialized view
Base table queries for full materialized views
For full refresh materialized views, the base tables can include internal tables, external tables, existing materialized views, and views from AnalyticDB for MySQL. There are no restrictions on the queries of base tables. For more information about the query syntax, see SELECT.
Base table queries for incremental materialized views
For incremental refresh of materialized views, the base tables must be internal tables from AnalyticDB for MySQL. Base table queries are subject to the following rules:
SELECT output columns
-
For grouping and aggregation queries (aggregate queries with GROUP BY), the output columns of SELECT must include all grouping columns in the GROUP BY clause.
-
For non-grouping and aggregation queries (aggregate queries without GROUP BY), SELECT can only output constant columns and aggregated columns or only output aggregated columns.
For non-aggregation queries, SELECT must output all primary key columns of the base table.
The output expression columns must be aliased. It is recommended to use meaningful aliases, such as
SUM(price) AS total_price
.
Other limitations
Use elastic resources to create or refresh materialized views
Elastic resources refer to the computing resources in job resource groups. Creating and subsequently refreshing materialized views consume the computing resources of the cluster. If a resource group is not specified (the MV_PROPERTIES='{"mv_resource_group":"rg_name"}'
is not defined), creating and refreshing materialized views will use the reserved computing resources of the default interactive resource group (the resource group named user_default).
If you specify a job resource group when you create a materialized view, the job resource group is used to create and refresh the materialized view. You can also define the amount of resources that can be used by the materialized view by using "mv_refresh_hints":{"elastic_job_max_acu":"value"}
. For more information, see mv_properties.
The difference between using the default interactive resource group and the job resource group is that the job resource group does not require you to purchase resources in advance, which effectively prevents resource waste and reduces costs. However, the refresh performance of the default interactive resource group is better than that of a job resource group because each time you refresh materialized views, the job resource group requires several seconds or minutes to provision computing resources.
If your cluster meets the following conditions, you can use elastic resources to create and refresh materialized views.
The edition of the cluster is Enterprise Edition, Basic Edition, or Data Lakehouse Edition.
The minor version of the cluster is 3.1.9.3 or later.
Take the customer
table as an example. Use the elastic resources of the job resource group my_job_rg to create and refresh materialized views with high priority.
CREATE MATERIALIZED VIEW job_mv
MV_PROPERTIES='{
"mv_resource_group":"my_job_rg",
"mv_refresh_hints":{"query_priority":"HIGH"}
}'
REFRESH COMPLETE ON DEMAND
START WITH now()
NEXT now() + INTERVAL 1 DAY
AS
SELECT * FROM customer;
Refresh trigger mechanisms of materialized views
Materialized views display data from the most recent refresh and may not reflect the latest data of base tables. Materialized views do not automatically refresh data after each query. To ensure the timeliness and accuracy of the data in materialized views, you can configure refresh trigger mechanisms for materialized views, such as automatic refresh at scheduled intervals, automatic refresh when base tables are overwritten, and manual refresh.
Materialized views are refreshed based on their refresh policies: complete refresh and fast refresh.
For more information about the differences between refresh trigger mechanisms and refresh policies, common scenarios, and more details, see Refresh materialized views.
Limits
General limitations
General limitations apply to all materialized views, including full materialized views and incremental materialized views.
Operations such as
INSERT
,DELETE
, orUPDATE
are not supported on materialized views.You cannot delete or rename base tables that are referenced in a materialized view or columns in the base tables. Before you modify the base tables, you must delete the materialized view.
By default, an AnalyticDB for MySQL cluster supports a maximum number of materialized views as follows:
V3.1.4.7 or later: up to 64 materialized views.
Versions earlier than V3.1.4.7: up to eight materialized views.
NoteIf the number of materialized views already reaches the upper limit, you can contact technical support to increase the maximum number of materialized views.
Limitations of full materialized views
Asynchronous tasks cannot be performed when the number of reserved nodes in a cluster is increased or decreased. Full refresh of materialized views is an asynchronous task. Therefore, full refresh cannot be performed at this time. Incremental refresh is not affected.
Limitations of incremental materialized views
Limitations on base tables of incremental materialized views:
XUANWU_V2 tables cannot be used as base tables for incremental materialized views.
The reason is that tables of the XUANWU_V2 engine do not support enabling binary logging.
In clusters with kernel versions earlier than 3.2.3.0, partitioned tables cannot be used as base tables for incremental materialized views.
In clusters with kernel versions earlier than 3.2.3.1, base tables of incremental materialized views do not support INSERT OVERWRITE and TRUNCATE operations, which will cause errors.
When the query_body uses aggregate functions such as MAX, MIN, APPROX_DISTINCT, or COUNT(DISTINCT), base tables of incremental materialized views only support INSERT operations. Operations that delete data, such as DELETE, UPDATE, REPLACE, and INSERT ON DUPLICATE KEY UPDATE, are not supported.
Limitations on refresh trigger mechanisms of incremental materialized views:
Only scheduled automatic refresh is supported. Manual refresh is not supported. The minimum refresh interval is 5 seconds (s), and the maximum refresh interval is 5 minutes (min).
Rules and limitations on query_body of incremental materialized views:
Nondeterministic expressions, such as NOW() and RAND(), are not supported.
ORDER BY operations are not supported.
HAVING clauses are not supported.
Window functions are not supported.
Set operations such as UNION, EXCEPT, and INTERSECT are not supported.
JOIN operations support only INNER JOIN. Foreign key fields must meet all the following conditions: They must be original fields of the table. Their data types must be the same, and they must have INDEX indexes. A maximum of five tables can be joined.
If you need to join more tables, please to contact technical support.
Only the following aggregate functions are supported: COUNT, SUM, MAX, MIN, AVG, APPROX_DISTINCT, and COUNT(DISTINCT).
The AVG() function does not support the DECIMAL type.
The COUNT(DISTINCT) function supports only the INTEGER type.
FAQ
How do I retain the data of only the most recent year in a materialized view?
When you define a materialized view, you can use the date column as the partition key and specify a lifecycle to retain the data of only the most recent year.
For example, new sales data is generated in the sales table every day. When you create a materialized view based on the sales table, you can use the sale_date column as the partition key and set the lifecycle to 365. This way, the materialized view retains the data of only the most recent year.
CREATE MATERIALIZED VIEW sales_mv_lifecycle
PARTITION BY VALUE(DATE_FORMAT(sale_date, '%Y%m%d')) LIFECYCLE 365
REFRESH FAST NEXT now() + INTERVAL 100 second
AS
SELECT
sale_date,
SUM(price * quantity) AS price
FROM sales
GROUP BY sale_date;
Common errors and troubleshooting
Query execution error: : Can not create FAST materialized view, because demotable doesn't support getting incremental data
Cause of error: The binary logging feature is not enabled for the base table demotable of the incremental refresh materialized view. The binary logging feature must be enabled for all base tables involved in the incremental refresh materialized view.
Solution: Execute ALTER TABLE demotable binlog=true;
to enable the binary logging feature for the base table.
If an error occurs when you enable the binary logging feature for the base table, and the error message is XUANWU_V2 engine not support ALTER_BINLOG_ENABLE now
, it indicates that the engine of the base table demotable
is XUANWU_V2, and the XUANWU_V2 engine does not support the binary logging feature. Because the engine of a table cannot be changed, you need to disable the XUANWU_V2 engine switch, recreate the table with the XUANWU engine, migrate the data from the XUANWU_V2 table to the XUANWU table, and create an incremental refresh materialized view based on the XUANWU table.
To disable the XUANWU_V2 engine switch, perform the following steps:
If the base table is automatically created by tools such as DTS, seamless integration, and console data synchronization, you can disable the XUANWU_V2 engine switch at the cluster level. After the switch is disabled, new tables will be created with the XUANWU engine.
Disable the XUANWU_V2 engine at the global level of the cluster:
SET adb_config RC_DDL_ENGINE_REWRITE_XUANWUV2=false;
If the base table is manually created by using the CREATE TABLE statement, and the number of base tables for which you want to create incremental materialized views is small, you can disable the XUANWU_V2 engine switch in the current CREATE TABLE statement. In this case, only the tables created by the current CREATE TABLE statement will use the XUANWU engine. Other tables created by other CREATE TABLE statements will still use the XUANWU_V2 engine.
Disable the XUANWU_V2 engine at the table level:
/*+ RC_DDL_ENGINE_REWRITE_XUANWUV2=false */CREATE TABLE ...
Query execution error: : PRIMARY key ID must output to MV.
Cause of error: The query_body of the incremental refresh materialized view is a non-aggregation query without GROUP BY. In this case, the primary key of the materialized view must be the same as the primary key of the base table, but the SELECT clause of the materialized view query does not output the primary key column of the base table.
The following sample code provides an example of how a signed URL is incorrectly used:
CREATE MATERIALIZED VIEW wrong_example1
REFRESH FAST ON DEMAND
NEXT now() + interval 200 second
AS
SELECT product_id,price -- The primary key of the sales base table is sale_id. However, sale_id is not contained in the SELECT clause.
FROM sales;
Solution: Output the primary key column of the base table in the SELECT clause of the materialized view query.
The following code provides an example on how to correctly specify the X-Amz-Algorithm parameter:
CREATE MATERIALIZED VIEW correct_example1
REFRESH FAST ON DEMAND
NEXT now() + interval 200 second
AS
SELECT sale_id,product_id, price -- Add the primary key of the base table, sale_id, to the SELECT clause.
FROM sales;
Query execution error: : MV PRIMARY KEY must be equal to base table PRIMARY KEY.
Cause of error: The query_body of the incremental refresh materialized view is a non-aggregation query without GROUP BY. In this case, the primary key of the materialized view must be the same as the primary key of the base table, but the definition of the materialized view (mv_definition) defines columns that are not primary keys of the base table as primary keys of the materialized view.
The following code provides a wrong example on how to use a signed URL:
CREATE MATERIALIZED VIEW wrong_example2
(PRIMARY KEY(sale_id,product_id)) -- product_id is a non-primary-key column of the sales base table.
REFRESH FAST ON DEMAND
NEXT now() + interval 200 second
AS
SELECT sale_id,product_id,price
FROM sales;
Solution: Modify the primary key of the materialized view to remove columns that are not primary keys of the base table.
The following code provides an example on how to correctly specify the X-Amz-SignedHeaders parameter:
CREATE MATERIALIZED VIEW correct_example2
(PRIMARY KEY(sale_id)) -- Remove the product_id column from the primary key of the materialized view.
REFRESH FAST ON DEMAND
NEXT now() + interval 200 second
AS
SELECT sale_id,product_id,price
FROM sales;
Query execution error: : FAST materialized view must define PRIMARY KEY
Cause of error: The error may be caused by the following two reasons:
No correct primary key is specified for a materialized view that supports fast refresh.
A function is invoked on the primary key of the base table in the materialized view query.
Solution: Choose an appropriate solution based on the cause of the error.
If no correct primary key is specified: Modify the definition of the materialized view (mv_definition) to ensure that the primary key meets the following requirements.
If you specify a GROUP BY clause, the primary key must be the columns referenced in the GROUP BY clause. For example, if
GROUP BY a,b
is specified, the primary key must be columns a and b.If you specify an aggregate function without GROUP BY, the primary key must be a globally unique constant.
If you specify a non-aggregate query without GROUP BY, the primary key must be the same as the primary key of the base table.
If a function is invoked on the primary key of the base table: Modify the query of the materialized view to remove the function.
Query execution error: : The join graph is not supported.
Cause of error: The query_body of the incremental refresh materialized view contains multiple table joins, and the data types of the join columns are different. For example, if sales INNER JOIN customer ON customer.id=sales.id
is specified, and the data types of customer.id and sales.id are different, this error will occur.
Solution: Execute ALTER TABLE tablename MODIFY COLUMN columnname newtype;
to modify the data types of the join columns in multiple table joins. For more information, see Change the data type of a column.
Query execution error: : Unable to use index join to refresh this fast MV.
Cause of error: The query_body of the incremental refresh materialized view contains multiple table joins, and the join columns do not have INDEX indexes. For example, if sales INNER JOIN customer ON customer.id=sales.id
is specified, and the customer.id column in the customer base table or the sales.id column in the sales base table does not have an INDEX index, this error will occur.
Solution: Execute ALTER TABLE tablename ADD KEY idx_name(columnname);
to add an index to the column in the base table. For more information, see Add an index.
Query execution error: : Query exceeded reserved memory limit
Cause of error: The memory used by the query on a single node exceeds the limit.
Solution: SQL statements with high memory usage generally contain operators such as Aggregation, TopN, Window, and Join. You can use the SQL diagnosis feature to query the stages and operators with high memory usage, and then perform SQL optimization on the operators with high memory usage. For more information, see Memory metrics and Analyze queries using stage and task details.
References
Materialized views: Understand the concept, common scenarios, and attribute change records of materialized views.
CREATE MATERIALIZED VIEW: Understand the syntax details of materialized views.
Refresh materialized views: Understand the refresh policies and trigger mechanisms of materialized views and how to manually refresh materialized views.
Manage materialized views: Query the definitions and refresh records of materialized views, query the list of all materialized views, and delete materialized views.
Query materialized views: Query materialized views.