By Junwei Xia, Senior Product Expert of Alibaba Cloud, and Junzheng Zheng, Senior Technical Expert of Alibaba Cloud
A materialized view in MaxCompute is a data object that pre-computes and stores result data. It behaves like a virtual table within a MaxCompute project and contains aggregated, filtered, and joined results from one or more tables. Materialized views significantly reduce query processing time and save job computing resources. When a job can reuse a materialized view, the automatic query rewriting capability of the MaxCompute optimizer replaces the matching complex operations with operations that read the stored results.
Using materialized views effectively requires understanding how they work, how the business data behaves, and which scenarios they suit. This poses a challenge for regular users.
MaxCompute offers the Artificial Intelligence Recommendation of materialized views, which allows users to leverage materialized views seamlessly. With this feature enabled, MaxCompute automatically analyzes business data usage scenarios, recommends suitable materialized views, and visualizes their impact. This greatly lowers the barrier to using materialized views and expands the range of scenarios where they can be applied.
· Easy to use. Instead of having to understand the intricate details of materialized views, users can simply select their projects and enable the automatic intelligent analysis feature.
· Intelligent. MaxCompute automatically analyzes users' historical jobs, identifies recurring jobs, intelligently extracts common computational logic from the job collection as materialized view computational logic, and presents it to the user in an easily understandable SQL text format, sorted by recommendation degree.
· Easy to manage. The MaxCompute console offers a comprehensive solution for activating, managing, and displaying the effectiveness of materialized views.
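The analysis described in the list above can be pictured with a small sketch. It is not MaxCompute's actual algorithm, which this article does not detail. It assumes each historical job has already been reduced to a set of normalized sub-expressions with an estimated cost; the names `Job` and `recommend`, the sample expressions, and the scoring rule (cost saved by computing a shared expression once instead of once per job) are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Job:
    """A historical job reduced to its sub-expressions (hypothetical model)."""
    name: str
    # Maps a normalized sub-expression to its estimated compute cost.
    subexpressions: dict


def recommend(jobs, min_jobs=2):
    """Rank sub-expressions that appear in at least `min_jobs` jobs.

    The score is an illustrative assumption: the cost saved if the
    expression is computed once instead of once per job.
    """
    used_by = defaultdict(set)
    cost = {}
    for job in jobs:
        for expr, expr_cost in job.subexpressions.items():
            used_by[expr].add(job.name)
            cost[expr] = max(cost.get(expr, 0.0), expr_cost)

    candidates = []
    for expr, names in used_by.items():
        if len(names) >= min_jobs:
            saved = (len(names) - 1) * cost[expr]
            candidates.append((saved, expr, sorted(names)))
    # Highest estimated saving first; ties are broken by expression text.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return candidates


jobs = [
    Job("tab4", {"join(orders, users)": 10.0, "sum(amount) by region": 4.0}),
    Job("tab5", {"join(orders, users)": 10.0, "count(*) by day": 2.0}),
    Job("tab6", {"join(orders, users)": 10.0, "sum(amount) by region": 4.0}),
]
for saved, expr, names in recommend(jobs):
    print(f"{expr}: used by {names}, estimated saving {saved}")
```

In this toy input the join shared by all three jobs ranks first, and the expression used by a single job is not recommended at all, which mirrors the idea of sorting candidates by recommendation degree.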
With the growth of enterprise business, there is an increase in business data, and each department has diverse data analysis needs. In daily operations, there is often cross-usage of data among different departments, resulting in a significant amount of repeated calculations with the same logic.
Finding these repeated calculations is challenging for regular users or big data platform administrators because the repetitive part may only be a fraction of the overall computational logic. Modifying these repeated calculations is also difficult. If a table with repeated calculations is re-abstracted, it requires modifying all downstream dependent jobs and going through testing before relaunching. This additional workload makes it difficult to promote efficient data governance.
By using the Artificial Intelligence Recommendation of materialized views, MaxCompute automatically analyzes common computational logic within projects and provides recommendations for creating materialized views. With materialized views, you can leverage the optimizer's powerful rewriting capabilities to automatically apply the calculation results to jobs without modifying the original logic.
For example, as shown in the following figure, without a materialized view the logic represented by the rhombuses and circles is calculated once for Tab4 and again for Tab5, that is, twice in total.
After the materialized view MV1 is created, the logic of rhombuses and circles is calculated only once. This saves computing resources and improves the computation speed.
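The saving can be reproduced in miniature. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in: SQLite has no materialized views and no automatic query rewriting, so a plain table named `mv1` plays the role of MV1, and the rewrite that the MaxCompute optimizer would perform automatically is done by hand. The tables, columns, and data are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (user_id INTEGER, amount REAL);
    CREATE TABLE users  (user_id INTEGER, region TEXT);
    INSERT INTO orders VALUES (1, 10.0), (1, 5.0), (2, 7.0), (3, 3.0);
    INSERT INTO users  VALUES (1, 'east'), (2, 'west'), (3, 'east');
""")

# Logic shared by both downstream results (the repeated part in the figure).
SHARED = """
    SELECT u.region AS region, o.user_id AS user_id, SUM(o.amount) AS total
    FROM orders o JOIN users u ON o.user_id = u.user_id
    GROUP BY u.region, o.user_id
"""


def tab4(source):
    # Total amount per region.
    return conn.execute(
        f"SELECT region, SUM(total) FROM ({source}) GROUP BY region ORDER BY region"
    ).fetchall()


def tab5(source):
    # Number of ordering users per region.
    return conn.execute(
        f"SELECT region, COUNT(*) FROM ({source}) GROUP BY region ORDER BY region"
    ).fetchall()


# Without the view: the shared logic runs once per downstream query.
direct = (tab4(SHARED), tab5(SHARED))

# With the stand-in view: the shared logic runs once and its result is stored.
conn.execute(f"CREATE TABLE mv1 AS {SHARED}")
reused = (tab4("SELECT * FROM mv1"), tab5("SELECT * FROM mv1"))

assert direct == reused  # same answers, shared work done only once
print(reused)
```

Both paths return identical results; the difference is that the second path evaluates the join and aggregation once rather than once per downstream table.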
In traditional big data processing, the first step is for data analysis experts with both technical and business knowledge to construct a data warehouse and divide it into layers. A typical model consists of four layers: the operational data store (ODS), data warehouse detail (DWD), data warehouse service (DWS), and application data service (ADS) layers. However, traditional modeling methods have the following drawbacks:
With the Artificial Intelligence Recommendation of materialized views, users no longer need to rely on experts for advance modeling. Intelligent and automatic modeling can be achieved. After users utilize the data, the backend automatically analyzes the repeated computational logic. MaxCompute then recommends and creates materialized views to achieve flexible and fast automatic modeling. Users do not need to worry about data storage or the efficiency of computing resources, allowing them to focus more on business development. This feature is particularly beneficial for small and medium-sized companies as they do not need to hire data modelers. They can rely on the Artificial Intelligence Recommendation of materialized views provided by MaxCompute.
This feature also accelerates users' intelligent BI reports and dashboards. MaxCompute automatically analyzes data that is frequently refreshed and recommends creating materialized views for it. With materialized views, the data required for reports or dashboards is pre-computed. When a report or dashboard is opened, MaxCompute automatically rewrites its queries to read from the materialized views, greatly reducing the response time.
Using the Artificial Intelligence Recommendation of Materialized Views is very simple and can be done by following these steps:
The data middle platform team at Alibaba Group is responsible for building the common layer of the entire Alibaba data warehouse. Their goal is to consolidate the logic of repeated calculations, allowing multiple downstream businesses to access the same result table and save computing and storage resources. However, with the rapid growth of data volume and business complexity, it has become challenging for the traditional common layer to maintain its original state. The main reasons for this are as follows:
• Difficulty in identifying numbers.
• Similar logic, but the result table is not fully accessible.
• Difficulty in manually identifying common logic.
The Artificial Intelligence Recommendation feature of materialized views provided by MaxCompute can address these challenges. The data middle platform team converts the recommendations into materialized views, significantly reducing repeated calculations between downstream jobs and saving computing resources.
While materialized views can bring positive benefits to users in most cases, it is important to note that they may not solve all problems. Users should keep the following points in mind: