全部產品
Search
文件中心

E-MapReduce:物化視圖查詢改寫

更新時間:Oct 16, 2024

StarRocks的非同步物化視圖採用了主流的基於SPJG(select-project-join-group-by)模式透明查詢改寫演算法。在不修改查詢語句的前提下,StarRocks可以自動將在基表上的查詢改寫為在物化視圖上的查詢,通過其中包含的預計算結果,協助您顯著降低計算成本,並大幅加速查詢執行。本文為您介紹如何利用StarRocks的非同步物化視圖來改寫並加速查詢。

適用情境

StarRocks的非同步物化視圖查詢改寫功能,主要應用於以下情境:

  • 指標預彙總

    如果您需要處理高維度資料,可以使用物化視圖來建立預彙總指標層。

  • 寬表Join

    物化視圖允許您在複雜情境下透明加速包含大寬表Join的查詢。

  • 湖倉加速

    構建基於External Catalog的物化視圖可以輕鬆加速針對資料湖中資料的查詢。

說明

基於JDBC Catalog表構建的非同步物化視圖暫不支援查詢改寫。

功能特點

StarRocks的非同步物化視圖自動查詢改寫功能具有以下特點:

  • 強資料一致性:如果基表是StarRocks內表,StarRocks可以保證通過物化視圖查詢改寫獲得的結果與直接查詢基表的結果一致。

  • Staleness rewrite:StarRocks支援Staleness rewrite,即允許容忍一定程度的資料到期,以應對資料變更頻繁的情況。

  • 多表Join:StarRocks的非同步物化視圖支援各種類型的Join,包括一些複雜的Join情境,如View Delta Join和Join派生改寫,可用於加速涉及大寬表的查詢情境。

  • 彙總改寫:StarRocks可以改寫帶有彙總操作的查詢,以提高報表效能。

  • 嵌套物化視圖:StarRocks支援基於嵌套物化視圖改寫複雜查詢,擴充了可改寫的查詢範圍。

  • Union改寫:您可以將Union改寫特性與物化視圖分區的存留時間(TTL)相結合,實現冷熱資料的分離,允許您從物化視圖查詢熱資料,從基表查詢歷史資料。

  • 基於視圖構建物化視圖:您可以在基於視圖建模的情景下加速查詢。

  • 基於External Catalog構建物化視圖:您可以通過該特性加速資料湖中的查詢。

  • 複雜運算式改寫:支援在運算式中調用函數和算術運算,滿足複雜分析和計算需求。

使用限制

單物化視圖查詢改寫能力,StarRocks目前存在以下限制:

  • StarRocks不支援非確定性函數的改寫,包括rand、random、uuid以及sleep。

  • StarRocks不支援視窗函數的改寫。

  • 如果物化視圖定義語句中包含LIMIT、ORDER BY、UNION、EXCEPT、INTERSECT、MINUS、GROUPING SETS、WITH CUBE或WITH ROLLUP,則無法用於改寫。

  • 基於External Catalog的物化視圖不保證查詢結果強一致。

  • 基於JDBC Catalog表構建的非同步物化視圖暫不支援查詢改寫。

針對基於視圖的物化視圖查詢改寫,StarRocks目前存在以下限制:

  • StarRocks不支援分區Union改寫。

  • 如果視圖包含隨機函數,則不支援查詢改寫,包括rand()、random()、uuid()和sleep()。

  • 如果視圖包含具有相同名稱的列,則不支援查詢改寫。您必須為具有相同名稱的列設定不同的別名。

  • 用於建立物化視圖的視圖必須至少包含以下資料類型之一的列:

    • 整數類型

    • 日期類型

    • 字串類型

Join改寫

StarRocks 支援改寫具有各種類型 Join 的查詢,包括 Inner Join、Cross Join、Left Outer Join、Full Outer Join、Right Outer Join、Semi Join 和 Anti Join。

以下樣本展示 Join 查詢的改寫。建立以下資料庫及基表:

CREATE DATABASE IF NOT EXISTS test_db;
USE test_db;
CREATE TABLE customer (
  c_custkey     INT(11)     NOT NULL,
  c_name        VARCHAR(26) NOT NULL,
  c_address     VARCHAR(41) NOT NULL,
  c_city        VARCHAR(11) NOT NULL,
  c_nation      VARCHAR(16) NOT NULL,
  c_region      VARCHAR(13) NOT NULL,
  c_phone       VARCHAR(16) NOT NULL,
  c_mktsegment  VARCHAR(11) NOT NULL
) ENGINE=OLAP
DUPLICATE KEY(c_custkey)
DISTRIBUTED BY HASH(c_custkey) BUCKETS 12;

CREATE TABLE lineorder (
  lo_orderkey         INT(11) NOT NULL,
  lo_linenumber       INT(11) NOT NULL,
  lo_custkey          INT(11) NOT NULL,
  lo_partkey          INT(11) NOT NULL,
  lo_suppkey          INT(11) NOT NULL,
  lo_orderdate        INT(11) NOT NULL,
  lo_orderpriority    VARCHAR(16) NOT NULL,
  lo_shippriority     INT(11) NOT NULL,
  lo_quantity         INT(11) NOT NULL,
  lo_extendedprice    INT(11) NOT NULL,
  lo_ordtotalprice    INT(11) NOT NULL,
  lo_discount         INT(11) NOT NULL,
  lo_revenue          INT(11) NOT NULL,
  lo_supplycost       INT(11) NOT NULL,
  lo_tax              INT(11) NOT NULL,
  lo_commitdate       INT(11) NOT NULL,
  lo_shipmode         VARCHAR(11) NOT NULL
) ENGINE=OLAP
DUPLICATE KEY(lo_orderkey)
DISTRIBUTED BY HASH(lo_orderkey) BUCKETS 48;

基於上述基表,建立以下物化視圖。

USE test_db;
CREATE MATERIALIZED VIEW join_mv1
DISTRIBUTED BY HASH(lo_orderkey)
AS
SELECT lo_orderkey, lo_linenumber, lo_revenue, lo_partkey, c_name, c_address
FROM lineorder INNER JOIN customer
ON lo_custkey = c_custkey;

該物化視圖可以改寫以下查詢。

SELECT lo_orderkey, lo_linenumber, lo_revenue, lo_partkey, c_name, c_address
FROM lineorder INNER JOIN customer
ON lo_custkey = c_custkey;

其原始查詢計劃和改寫後的計劃如下。

image

StarRocks支援改寫具有複雜運算式的Join查詢,如算術運算、字串函數、日期函數、CASE WHEN運算式和謂詞OR等。例如,上述物化視圖可以改寫以下查詢:

SELECT 
    lo_orderkey, 
    lo_linenumber, 
    (2 * lo_revenue + 1) * lo_linenumber, 
    upper(c_name), 
    substr(c_address, 3)
FROM lineorder INNER JOIN customer
ON lo_custkey = c_custkey;

除了常規情境,StarRocks還支援在更複雜的情景下改寫Join查詢。

Query Delta Join改寫

Query Delta Join是指查詢中Join的表是物化視圖中Join的表的超集的情況。例如,以下查詢Join了表lineorder、表customer和表part。如果物化視圖join_mv1僅包含lineordercustomer的Join,StarRocks可以使用join_mv1來改寫查詢。

樣本:

SELECT lo_orderkey, lo_linenumber, lo_revenue, c_name, c_address, p_name
FROM
    lineorder INNER JOIN customer ON lo_custkey = c_custkey
    INNER JOIN part ON lo_partkey = p_partkey;

其原始查詢計劃和改寫後的計劃如下:

image

View Delta Join改寫

View Delta Join指的是查詢中Join的表是物化視圖中Join的表的子集的情況。通常在涉及大寬表的情景中使用此功能。例如,在Star Schema Benchmark (SSB)的背景下,您可以通過建立物化視圖,Join所有表以提高查詢效能。測試發現在通過物化視圖透明改寫查詢後,多表Join的查詢效能可以達到與查詢相應大寬表相同的效能水平。

要啟用View Delta Join改寫,必須確保物化視圖包含了與查詢中所有相關表存在的1:1的Cardinality Preservation Join。滿足以下約束條件的九種Join都被視為Cardinality Preservation Join,可以用於啟用View Delta Join改寫。

image

以SSB測試為例,建立以下資料庫和基表。

CREATE DATABASE IF NOT EXISTS test_db;
USE test_db;
CREATE TABLE customer (
  c_custkey         INT(11)       NOT NULL,
  c_name            VARCHAR(26)   NOT NULL,
  c_address         VARCHAR(41)   NOT NULL,
  c_city            VARCHAR(11)   NOT NULL,
  c_nation          VARCHAR(16)   NOT NULL,
  c_region          VARCHAR(13)   NOT NULL,
  c_phone           VARCHAR(16)   NOT NULL,
  c_mktsegment      VARCHAR(11)   NOT NULL
) ENGINE=OLAP
DUPLICATE KEY(c_custkey)
DISTRIBUTED BY HASH(c_custkey) BUCKETS 12
PROPERTIES (
"unique_constraints" = "c_custkey"   -- 指定唯一鍵。
);

CREATE TABLE dates (
  d_datekey          DATE          NOT NULL,
  d_date             VARCHAR(20)   NOT NULL,
  d_dayofweek        VARCHAR(10)   NOT NULL,
  d_month            VARCHAR(11)   NOT NULL,
  d_year             INT(11)       NOT NULL,
  d_yearmonthnum     INT(11)       NOT NULL,
  d_yearmonth        VARCHAR(9)    NOT NULL,
  d_daynuminweek     INT(11)       NOT NULL,
  d_daynuminmonth    INT(11)       NOT NULL,
  d_daynuminyear     INT(11)       NOT NULL,
  d_monthnuminyear   INT(11)       NOT NULL,
  d_weeknuminyear    INT(11)       NOT NULL,
  d_sellingseason    VARCHAR(14)   NOT NULL,
  d_lastdayinweekfl  INT(11)       NOT NULL,
  d_lastdayinmonthfl INT(11)       NOT NULL,
  d_holidayfl        INT(11)       NOT NULL,
  d_weekdayfl        INT(11)       NOT NULL
) ENGINE=OLAP
DUPLICATE KEY(d_datekey)
DISTRIBUTED BY HASH(d_datekey) BUCKETS 1
PROPERTIES (
"unique_constraints" = "d_datekey"   -- 指定唯一鍵。
);

CREATE TABLE supplier (
  s_suppkey          INT(11)       NOT NULL,
  s_name             VARCHAR(26)   NOT NULL,
  s_address          VARCHAR(26)   NOT NULL,
  s_city             VARCHAR(11)   NOT NULL,
  s_nation           VARCHAR(16)   NOT NULL,
  s_region           VARCHAR(13)   NOT NULL,
  s_phone            VARCHAR(16)   NOT NULL
) ENGINE=OLAP
DUPLICATE KEY(s_suppkey)
DISTRIBUTED BY HASH(s_suppkey) BUCKETS 12
PROPERTIES (
"unique_constraints" = "s_suppkey"   -- 指定唯一鍵。
);

CREATE TABLE part (
  p_partkey          INT(11)       NOT NULL,
  p_name             VARCHAR(23)   NOT NULL,
  p_mfgr             VARCHAR(7)    NOT NULL,
  p_category         VARCHAR(8)    NOT NULL,
  p_brand            VARCHAR(10)   NOT NULL,
  p_color            VARCHAR(12)   NOT NULL,
  p_type             VARCHAR(26)   NOT NULL,
  p_size             TINYINT(11)   NOT NULL,
  p_container        VARCHAR(11)   NOT NULL
) ENGINE=OLAP
DUPLICATE KEY(p_partkey)
DISTRIBUTED BY HASH(p_partkey) BUCKETS 12
PROPERTIES (
"unique_constraints" = "p_partkey"   -- 指定唯一鍵。
);

CREATE TABLE lineorder (
  lo_orderdate       DATE          NOT NULL, -- 指定為 NOT NULL。
  lo_orderkey        INT(11)       NOT NULL,
  lo_linenumber      TINYINT       NOT NULL,
  lo_custkey         INT(11)       NOT NULL, -- 指定為 NOT NULL。
  lo_partkey         INT(11)       NOT NULL, -- 指定為 NOT NULL。
  lo_suppkey         INT(11)       NOT NULL, -- 指定為 NOT NULL。
  lo_orderpriority   VARCHAR(100)  NOT NULL,
  lo_shippriority    TINYINT       NOT NULL,
  lo_quantity        TINYINT       NOT NULL,
  lo_extendedprice   INT(11)       NOT NULL,
  lo_ordtotalprice   INT(11)       NOT NULL,
  lo_discount        TINYINT       NOT NULL,
  lo_revenue         INT(11)       NOT NULL,
  lo_supplycost      INT(11)       NOT NULL,
  lo_tax             TINYINT       NOT NULL,
  lo_commitdate      DATE          NOT NULL,
  lo_shipmode        VARCHAR(100)  NOT NULL
) ENGINE=OLAP
DUPLICATE KEY(lo_orderdate,lo_orderkey)
PARTITION BY RANGE(lo_orderdate)
(PARTITION p1 VALUES [("0000-01-01"), ("1993-01-01")),
PARTITION p2 VALUES [("1993-01-01"), ("1994-01-01")),
PARTITION p3 VALUES [("1994-01-01"), ("1995-01-01")),
PARTITION p4 VALUES [("1995-01-01"), ("1996-01-01")),
PARTITION p5 VALUES [("1996-01-01"), ("1997-01-01")),
PARTITION p6 VALUES [("1997-01-01"), ("1998-01-01")),
PARTITION p7 VALUES [("1998-01-01"), ("1999-01-01")))
DISTRIBUTED BY HASH(lo_orderkey) BUCKETS 48
PROPERTIES (
"foreign_key_constraints" = "(
    (lo_custkey) REFERENCES customer(c_custkey),
    (lo_partkey) REFERENCES part(p_partkey),
    (lo_suppkey) REFERENCES supplier(s_suppkey)
    )"  -- 指定外鍵約束。
);

建立Join表lineorder、表customer、表supplier、表part和表dates的物化視圖lineorder_flat_mv

USE test_db;
CREATE MATERIALIZED VIEW lineorder_flat_mv
DISTRIBUTED BY HASH(LO_ORDERDATE, LO_ORDERKEY) BUCKETS 48
PARTITION BY LO_ORDERDATE
REFRESH MANUAL
PROPERTIES (
    "partition_refresh_number"="1"
)
AS SELECT /*+ SET_VAR(query_timeout = 7200) */     -- 設定重新整理逾時時間。
       l.LO_ORDERDATE        AS LO_ORDERDATE,
       l.LO_ORDERKEY         AS LO_ORDERKEY,
       l.LO_LINENUMBER       AS LO_LINENUMBER,
       l.LO_CUSTKEY          AS LO_CUSTKEY,
       l.LO_PARTKEY          AS LO_PARTKEY,
       l.LO_SUPPKEY          AS LO_SUPPKEY,
       l.LO_ORDERPRIORITY    AS LO_ORDERPRIORITY,
       l.LO_SHIPPRIORITY     AS LO_SHIPPRIORITY,
       l.LO_QUANTITY         AS LO_QUANTITY,
       l.LO_EXTENDEDPRICE    AS LO_EXTENDEDPRICE,
       l.LO_ORDTOTALPRICE    AS LO_ORDTOTALPRICE,
       l.LO_DISCOUNT         AS LO_DISCOUNT,
       l.LO_REVENUE          AS LO_REVENUE,
       l.LO_SUPPLYCOST       AS LO_SUPPLYCOST,
       l.LO_TAX              AS LO_TAX,
       l.LO_COMMITDATE       AS LO_COMMITDATE,
       l.LO_SHIPMODE         AS LO_SHIPMODE,
       c.C_NAME              AS C_NAME,
       c.C_ADDRESS           AS C_ADDRESS,
       c.C_CITY              AS C_CITY,
       c.C_NATION            AS C_NATION,
       c.C_REGION            AS C_REGION,
       c.C_PHONE             AS C_PHONE,
       c.C_MKTSEGMENT        AS C_MKTSEGMENT,
       s.S_NAME              AS S_NAME,
       s.S_ADDRESS           AS S_ADDRESS,
       s.S_CITY              AS S_CITY,
       s.S_NATION            AS S_NATION,
       s.S_REGION            AS S_REGION,
       s.S_PHONE             AS S_PHONE,
       p.P_NAME              AS P_NAME,
       p.P_MFGR              AS P_MFGR,
       p.P_CATEGORY          AS P_CATEGORY,
       p.P_BRAND             AS P_BRAND,
       p.P_COLOR             AS P_COLOR,
       p.P_TYPE              AS P_TYPE,
       p.P_SIZE              AS P_SIZE,
       p.P_CONTAINER         AS P_CONTAINER,
       d.D_DATE              AS D_DATE,
       d.D_DAYOFWEEK         AS D_DAYOFWEEK,
       d.D_MONTH             AS D_MONTH,
       d.D_YEAR              AS D_YEAR,
       d.D_YEARMONTHNUM      AS D_YEARMONTHNUM,
       d.D_YEARMONTH         AS D_YEARMONTH,
       d.D_DAYNUMINWEEK      AS D_DAYNUMINWEEK,
       d.D_DAYNUMINMONTH     AS D_DAYNUMINMONTH,
       d.D_DAYNUMINYEAR      AS D_DAYNUMINYEAR,
       d.D_MONTHNUMINYEAR    AS D_MONTHNUMINYEAR,
       d.D_WEEKNUMINYEAR     AS D_WEEKNUMINYEAR,
       d.D_SELLINGSEASON     AS D_SELLINGSEASON,
       d.D_LASTDAYINWEEKFL   AS D_LASTDAYINWEEKFL,
       d.D_LASTDAYINMONTHFL  AS D_LASTDAYINMONTHFL,
       d.D_HOLIDAYFL         AS D_HOLIDAYFL,
       d.D_WEEKDAYFL         AS D_WEEKDAYFL
   FROM lineorder            AS l
       INNER JOIN customer   AS c ON c.C_CUSTKEY = l.LO_CUSTKEY
       INNER JOIN supplier   AS s ON s.S_SUPPKEY = l.LO_SUPPKEY
       INNER JOIN part       AS p ON p.P_PARTKEY = l.LO_PARTKEY
       INNER JOIN dates      AS d ON l.LO_ORDERDATE = d.D_DATEKEY;    

SSB Q2.1樣本如下。

USE test_db;
SELECT
    SUM(lo_revenue) AS lo_revenue,
    d_year,
    p_brand
FROM lineorder
    JOIN dates ON lo_orderdate = d_datekey
    JOIN part ON lo_partkey = p_partkey
    JOIN supplier ON lo_suppkey = s_suppkey
WHERE p_category = 'MFGR#12' AND s_region = 'AMERICA'
GROUP BY d_year, p_brand
ORDER BY d_year, p_brand;

SSB Q2.1涉及四個表的Join,但與物化視圖lineorder_flat_mv相比,缺少了customer表。在lineorder_flat_mv中,lineorder INNER JOIN customer本質上是一個Cardinality Preservation Join。因此邏輯上,可以消除該Join而不影響查詢結果。因此,Q2.1可以使用lineorder_flat_mv進行改寫。

其原始查詢計劃和改寫後的計劃如下。

image

同樣,SSB中的其他查詢也可以通過使用lineorder_flat_mv進行透明改寫,從而最佳化查詢效能。

Join派生改寫

Join派生是指物化視圖和查詢中的Join類型不一致,但物化視圖的Join結果包含查詢Join結果的情況。

目前支援以下兩種情境:

三表或以上Join

假設物化視圖包含表t1和表t2之間的Left Outer Join,以及表t2和表t3之間的Inner Join。兩個Join的條件都包括來自表t2的列。

而查詢則包含t1t2之間的Inner Join,以及t2t3之間的Inner Join。兩個Join的條件都包括來自表t2的列。

在這種情況下,上述查詢可以通過物化視圖改寫。這是因為在物化視圖中,首先執行Left Outer Join,然後執行Inner Join。Left Outer Join產生的右表沒有匹配結果(即右表中的列為NULL)。這些結果在執行Inner Join期間被過濾掉。因此,物化視圖和查詢的邏輯是等效的,可以對查詢進行改寫。

例如,建立以下物化視圖join_mv5

USE test_db;
CREATE MATERIALIZED VIEW join_mv5
PARTITION BY lo_orderdate
DISTRIBUTED BY hash(lo_orderkey)
PROPERTIES (
  "partition_refresh_number" = "1"
)
AS
SELECT lo_orderkey, lo_orderdate, lo_linenumber, lo_revenue, c_custkey, c_address, p_name
FROM customer LEFT OUTER JOIN lineorder
ON c_custkey = lo_custkey
INNER JOIN part
ON p_partkey = lo_partkey;

join_mv5可改寫以下查詢。

USE test_db;
SELECT lo_orderkey, lo_orderdate, lo_linenumber, lo_revenue, c_custkey, c_address, p_name
FROM customer INNER JOIN lineorder
ON c_custkey = lo_custkey
INNER JOIN part
ON p_partkey = lo_partkey;

其原始查詢計劃和改寫後的計劃如下。

image

同樣,如果物化視圖定義為t1 INNER JOIN t2 INNER JOIN t3,而查詢為LEFT OUTER JOIN t2 INNER JOIN t3,那麼查詢也可以被改寫。而且,在涉及超過三個表的情況下,也具備上述的改寫能力。

兩表Join

兩表Join的派生改寫支援以下幾種細分情境。

image

在情境1至9中,需要向改寫結果補償過濾謂詞,以確保語義等效性。例如,建立以下物化視圖。

USE test_db;
CREATE MATERIALIZED VIEW join_mv3
DISTRIBUTED BY hash(lo_orderkey)
AS
SELECT lo_orderkey, lo_linenumber, lo_revenue, c_custkey, c_address
FROM lineorder LEFT OUTER JOIN customer
ON lo_custkey = c_custkey;

join_mv3可以改寫以下查詢,其查詢結果需補償謂詞c_custkey IS NOT NULL

USE test_db;
SELECT lo_orderkey, lo_linenumber, lo_revenue, c_custkey, c_address
FROM lineorder INNER JOIN customer
ON lo_custkey = c_custkey;

其原始查詢計劃和改寫後的計劃如下。

image

在情境10中,需要Left Outer Join查詢中包含右表中IS NOT NULL的過濾謂詞,如=<>><<=>=LIKEINNOT LIKENOT IN。例如,建立以下物化視圖。

USE test_db;
CREATE MATERIALIZED VIEW join_mv4
DISTRIBUTED BY hash(lo_orderkey)
AS
SELECT lo_orderkey, lo_linenumber, lo_revenue, c_custkey, c_address
FROM lineorder INNER JOIN customer
ON lo_custkey = c_custkey;

join_mv4可以改寫以下查詢,其中customer.c_address = "Sb4gxKs7"IS NOT NULL謂詞。

USE test_db;
SELECT lo_orderkey, lo_linenumber, lo_revenue, c_custkey, c_address
FROM lineorder LEFT OUTER JOIN customer
ON lo_custkey = c_custkey
WHERE customer.c_address = "Sb4gxKs7";

其原始查詢計劃和改寫後的計劃如下。

image

彙總改寫

StarRocks非同步物化視圖的多表彙總查詢改寫支援所有彙總函式,包括bitmap_union、hll_union和percentile_union等。例如,建立以下物化視圖。

USE test_db;
CREATE MATERIALIZED VIEW agg_mv1
DISTRIBUTED BY hash(lo_orderkey)
AS
SELECT 
  lo_orderkey, 
  lo_linenumber, 
  c_name, 
  sum(lo_revenue) AS total_revenue, 
  max(lo_discount) AS max_discount 
FROM lineorder INNER JOIN customer
ON lo_custkey = c_custkey
GROUP BY lo_orderkey, lo_linenumber, c_name;

該物化視圖可以改寫以下查詢。

USE test_db;
SELECT
  lo_orderkey, 
  lo_linenumber, 
  c_name, 
  sum(lo_revenue) AS total_revenue, 
  max(lo_discount) AS max_discount 
FROM lineorder INNER JOIN customer
ON lo_custkey = c_custkey
GROUP BY lo_orderkey, lo_linenumber, c_name;

其原始查詢計劃和改寫後的計劃如下。

image

以下詳細闡述了彙總改寫功能可用的情境。

彙總上卷改寫

StarRocks支援通過彙總上卷改寫查詢,即StarRocks可以使用通過GROUP BY a,b子句建立的非同步物化視圖改寫帶有GROUP BY a子句的彙總查詢。例如,agg_mv1可以改寫以下查詢。

USE test_db;
SELECT 
  lo_orderkey, 
  c_name, 
  sum(lo_revenue) AS total_revenue, 
  max(lo_discount) AS max_discount 
FROM lineorder INNER JOIN customer
ON lo_custkey = c_custkey
GROUP BY lo_orderkey, c_name;

其原始查詢計劃和改寫後的計劃如下。

image

說明

當前暫不支援grouping set、grouping set with rollup以及grouping set with cube的改寫。

僅有部分彙總函式支援彙總上卷查詢改寫。下表展示了原始查詢中的彙總函式與用於構建物化視圖的彙總函式之間的對應關係。您可以根據自己的業務情境,選擇相應的彙總函式構建物化視圖。

原始查詢彙總函式

支援 Aggregate Rollup 的物化視圖構建彙總函式

sum

sum

count

count

min

min

max

max

avg

sum/count

bitmap_union、bitmap_union_count、count(distinct)

bitmap_union

hll_raw_agg、hll_union_agg、ndv、approx_count_distinct

hll_union

percentile_approx、percentile_union

percentile_union

沒有相應GROUP BY列的DISTINCT彙總無法使用彙總上卷查詢改寫。但是,從StarRocks v3.1開始,如果彙總上卷對應DISTINCT彙總函式的查詢沒有GROUP BY列,但有等價的謂詞,該查詢也可以被相關物化視圖改寫,因為StarRocks可以將等價謂詞轉換為GROUP BY常量運算式。

在以下樣本中,StarRocks可以使用物化視圖order_agg_mv1改寫對應查詢Query。

USE test_db;
CREATE MATERIALIZED VIEW order_agg_mv1
DISTRIBUTED BY HASH(`order_id`) BUCKETS 12
REFRESH ASYNC START('2022-09-01 10:00:00') EVERY (interval 1 day)
AS
SELECT
    order_date,
    count(distinct client_id) 
FROM order_list 
GROUP BY order_date;


-- Query
USE test_db;
SELECT
    order_date,
    count(distinct client_id) 
FROM order_list WHERE order_date='2023-07-03';

彙總下推

從v3.3.0版本開始,StarRocks支援物化視圖查詢改寫的彙總下推功能。啟用此功能後,彙總函式將在查詢執行期間下推至Scan Operator,並在執行Join Operator之前被物化視圖改寫。此舉可以緩解Join操作導致的資料膨脹,從而提高查詢效能。

系統預設禁用該功能。要啟用此功能,必須將系統變數enable_materialized_view_agg_pushdown_rewrite設定為true

假設需要加速以下基於SSB的查詢SQL1

-- SQL1
USE test_db;
SELECT 
    LO_ORDERDATE, sum(LO_REVENUE), max(LO_REVENUE), count(distinct LO_REVENUE)
FROM lineorder l JOIN dates d 
ON l.LO_ORDERDATE = d.d_date 
GROUP BY LO_ORDERDATE 
ORDER BY LO_ORDERDATE;

SQL1包含lineorder表內的彙總以及lineorderdates表之間的Join。彙總發生在lineorder內部,與dates的Join僅用於資料過濾。所以SQL1在邏輯上等同於以下SQL2

-- SQL2
USE test_db;
SELECT 
    LO_ORDERDATE, sum(sum1), max(max1), bitmap_union_count(bitmap1)
FROM 
 (SELECT
  LO_ORDERDATE,  sum(LO_REVENUE) AS sum1, max(LO_REVENUE) AS max1, bitmap_union(to_bitmap(LO_REVENUE)) AS bitmap1
  FROM lineorder 
  GROUP BY LO_ORDERDATE) l JOIN dates d 
ON l.LO_ORDERDATE = d.d_date 
GROUP BY LO_ORDERDATE 
ORDER BY LO_ORDERDATE;

SQL2將彙總提前,大量減少Join的資料量。您可以基於SQL2的子查詢建立物化視圖,並啟用彙總下推以改寫和加速彙總。

-- 建立物化視圖 mv0
USE test_db;
CREATE MATERIALIZED VIEW mv0 REFRESH MANUAL AS
SELECT
  LO_ORDERDATE, 
  sum(LO_REVENUE) AS sum1, 
  max(LO_REVENUE) AS max1, 
  bitmap_union(to_bitmap(LO_REVENUE)) AS bitmap1
FROM lineorder 
GROUP BY LO_ORDERDATE;

-- 啟用彙總下推
SET enable_materialized_view_agg_pushdown_rewrite=true;

此時,SQL1將通過物化視圖進行改寫和加速。改寫後的查詢如下。

USE test_db;
SELECT 
    LO_ORDERDATE, sum(sum1), max(max1), bitmap_union_count(bitmap1)
FROM 
 (SELECT LO_ORDERDATE, sum1, max1, bitmap1 FROM mv0) l JOIN dates d 
ON l.LO_ORDERDATE = d.d_date 
GROUP BY LO_ORDERDATE
ORDER BY LO_ORDERDATE;

請注意,只有部分支援彙總上卷改寫的彙總函式可以下推。目前支援下推的彙總函式有:

  • MIN

  • MAX

  • COUNT

  • COUNT DISTINCT

  • SUM

  • BITMAP_UNION

  • HLL_UNION

  • PERCENTILE_UNION

  • BITMAP_AGG

  • ARRAY_AGG_DISTINCT

說明
  • 下推後的彙總函式需要進行上卷才能對齊原始語義。有關彙總上卷的更多說明,請參閱彙總上卷改寫

  • 彙總下推支援基於Bitmap或HLL函數的Count Distinct上卷改寫。

  • 彙總下推僅支援將查詢中的彙總函式下推至Join、Filter、Where Operator之下的Scan Operator之上。

  • 彙總下推僅支援基於單張表構建的物化視圖進行查詢改寫和加速。

COUNT DISTINCT改寫

StarRocks支援將COUNT DISTINCT計算改寫為BITMAP類型的計算,從而使用物化視圖實現高效能、精確的去重。例如,建立以下物化視圖。

USE test_db;
CREATE MATERIALIZED VIEW distinct_mv
DISTRIBUTED BY hash(lo_orderkey)
AS
SELECT lo_orderkey, bitmap_union(to_bitmap(lo_custkey)) AS distinct_customer
FROM lineorder
GROUP BY lo_orderkey;

該物化視圖可以改寫以下查詢。

USE test_db;
SELECT lo_orderkey, count(distinct lo_custkey) 
FROM lineorder 
GROUP BY lo_orderkey;

嵌套物化視圖改寫

StarRocks支援使用嵌套物化視圖改寫查詢。例如,建立以下物化視圖join_mv2agg_mv2agg_mv3

USE test_db;
CREATE MATERIALIZED VIEW join_mv2
DISTRIBUTED BY hash(lo_orderkey)
AS
SELECT lo_orderkey, lo_linenumber, lo_revenue, lo_discount, c_name, c_address
FROM lineorder INNER JOIN customer
ON lo_custkey = c_custkey;


CREATE MATERIALIZED VIEW agg_mv2
DISTRIBUTED BY hash(lo_orderkey)
AS
SELECT 
  lo_orderkey, 
  lo_linenumber, 
  c_name, 
  SUM(lo_revenue) AS total_revenue, 
  MAX(lo_discount) AS max_discount 
FROM join_mv2
GROUP BY lo_orderkey, lo_linenumber, c_name;

CREATE MATERIALIZED VIEW agg_mv3
DISTRIBUTED BY hash(lo_orderkey)
AS
SELECT 
  lo_orderkey, 
  SUM(total_revenue) AS total_revenue, 
  MAX(max_discount) AS max_discount 
FROM agg_mv2
GROUP BY lo_orderkey;

其關係如下。

image

agg_mv3可改寫以下查詢。

USE test_db;
SELECT 
  lo_orderkey, 
  SUM(lo_revenue) AS total_revenue, 
  MAX(lo_discount) AS max_discount 
FROM lineorder INNER JOIN customer
ON lo_custkey = c_custkey
GROUP BY lo_orderkey;

其原始查詢計劃和改寫後的計劃如下。

image

Union改寫

謂詞Union改寫

當物化視圖的謂詞範圍是查詢的謂詞範圍的子集時,可以使用UNION操作改寫查詢。

例如,建立以下物化視圖。

USE test_db;
CREATE MATERIALIZED VIEW agg_mv4
DISTRIBUTED BY hash(lo_orderkey)
AS
SELECT 
  lo_orderkey, 
  SUM(lo_revenue) AS total_revenue, 
  MAX(lo_discount) AS max_discount 
FROM lineorder
WHERE lo_orderkey < 300000000
GROUP BY lo_orderkey;

該物化視圖可以改寫以下查詢。

USE test_db;
SELECT 
  lo_orderkey, 
  SUM(lo_revenue) AS total_revenue, 
  MAX(lo_discount) AS max_discount 
FROM lineorder
GROUP BY lo_orderkey;

其原始查詢計劃和改寫後的計劃如下。

image

其中,agg_mv4包含lo_orderkey < 300000000的資料,lo_orderkey >= 300000000的資料通過直接查詢表lineorder得到,最終通過Union操作之後再彙總,擷取最終結果。

分區Union改寫

假設基於分區表建立了一個分區物化視圖。當查詢掃描的分區範圍是物化視圖最新分區範圍的超集時,查詢可被UNION改寫。

例如,有如下的物化視圖agg_mv5。基表lineorder當前包含分區p1p7,物化視圖也包含分區p1p7

USE test_db;
CREATE MATERIALIZED VIEW agg_mv5
DISTRIBUTED BY hash(lo_orderkey)
PARTITION BY RANGE(lo_orderdate)(
  START ("1993-01-01") END ("1999-01-01") EVERY (INTERVAL 1 YEAR)
)
REFRESH MANUAL
AS
SELECT 
  lo_orderdate, 
  lo_orderkey, 
  sum(lo_revenue) AS total_revenue, 
  max(lo_discount) AS max_discount 
FROM lineorder
GROUP BY lo_orderdate,lo_orderkey;

如果lineorder新增分區p8,其範圍為[("19990101"), ("20000101")),則以下查詢可被UNION改寫。

USE test_db;
SELECT 
  lo_orderdate, 
  lo_orderkey, 
  sum(lo_revenue) AS total_revenue, 
  max(lo_discount) AS max_discount 
FROM lineorder
GROUP BY lo_orderdate,lo_orderkey;

其原始查詢計劃和改寫後的計劃如下。

image

如上所示,agg_mv5包含來自分區p1p7的資料,而分區p8的資料來源於lineorder。最後,這兩組資料使用UNION操作合并。

基於視圖的物化視圖查詢改寫

自v3.1.0起,StarRocks支援基於視圖建立物化視圖。如果基於視圖的查詢為SPJG類型,StarRocks將會內聯展開查詢,然後進行改寫。預設情況下,對視圖的查詢會自動延伸為對視圖的基表的查詢,然後進行透明匹配和改寫。

然而,在實際情境中,您可能會基於複雜的嵌套視圖進行資料建模,這些視圖無法直接展開。因此,基於這些視圖建立的物化視圖無法改寫查詢。為了改進在上述情況下的能力,從v3.3.0開始,StarRock最佳化了基於視圖的物化視圖查詢改寫邏輯。

基本原理

在前面的查詢改寫邏輯中,StarRocks會將基於視圖的查詢展開為針對視圖基表的查詢。如果展開後查詢的執行計畫與SPJG模式不匹配,物化視圖將無法改寫查詢。

為瞭解決這個問題,StarRocks引入了一個新的運算元LogicalViewScanOperator,該運算元用於簡化執行計畫樹的結構,且無需展開查詢,使查詢執行計畫樹盡量滿足SPJG模式,從而最佳化查詢改寫。

以下樣本展示了一個包含彙總子查詢的查詢,一個建立在子查詢之上的視圖,基於視圖展開之後的查詢,以及建立在視圖之上的物化視圖。

-- 原始查詢。
SELECT 
  v1.a,
  t2.b,
  v1.total
FROM(
  SELECT 
    a,
    sum(c) AS total
  FROM t1
  GROUP BY a
) v1
INNER JOIN t2 ON v1.a = t2.a;

-- 建立在子查詢之上的視圖。
CREATE VIEW view_1 AS
SELECT 
  t1.a,
  sum(t1.c) AS total
FROM t1
GROUP BY t1.a;
    
-- 基於視圖的查詢。
SELECT 
  v1.a,
  t2.b,
  v1.total
FROM view_1 v1
JOIN t2 ON v1.a = t2.a;
    
-- 基於視圖的物化視圖。
CREATE MATERIALIZED VIEW mv1
DISTRIBUTED BY hash(a)
REFRESH MANUAL
AS
SELECT 
  v1.a,
  t2.b,
  v1.total
FROM view_1 v1
JOIN t2 ON v1.a = t2.a;

原始查詢的執行計畫如下圖左側所示。由於JOIN內的LogicalAggregateOperator與SPJG模式不匹配,StarRocks不支援這種情況下的查詢改寫。然而,如果將子查詢定義為一個視圖,原始查詢可以展開為針對該視圖的查詢。通過LogicalViewScanOperator,StarRocks可以將不匹配的部分轉換為SPJG模式,從而允許改寫查詢。

image

配置項

StarRocks預設禁用基於視圖的物化視圖查詢改寫。

要啟用此功能,您必須設定以下變數。

SET enable_view_based_mv_rewrite = true;

使用情境

基於單個視圖的物化視圖查詢改寫

StarRocks支援通過基於單個視圖的物化視圖進行查詢改寫,包括彙總查詢。

例如,您可以為TPC-H Query 18構建以下視圖和物化視圖。

USE test_db;
CREATE VIEW q18_view
AS
SELECT
  c_name,
  c_custkey,
  o_orderkey,
  o_orderdate,
  o_totalprice,
  sum(l_quantity)
FROM
  customer,
  orders,
  lineitem
WHERE
  o_orderkey IN (
    SELECT
      l_orderkey
    FROM
      lineitem
    GROUP BY
      l_orderkey having
        sum(l_quantity) > 315
  )
  AND c_custkey = o_custkey
  AND o_orderkey = l_orderkey
GROUP BY
    c_name,
    c_custkey,
    o_orderkey,
    o_orderdate,
    o_totalprice;

CREATE MATERIALIZED VIEW q18_mv
DISTRIBUTED BY hash(c_custkey, o_orderkey)
REFRESH MANUAL
AS
SELECT * FROM q18_view;

物化視圖可以改寫以下兩個查詢。

USE test_db;
-- SQL1
EXPLAIN LOGICAL SELECT * FROM q18_view;
+-------------------------------------------------------------------------------------------------------+
| Explain String                                                                                        |
+-------------------------------------------------------------------------------------------------------+
| - Output => [2:c_name, 1:c_custkey, 9:o_orderkey, 10:o_orderdate, 13:o_totalprice, 52:sum]            |
|     - SCAN [q18_mv] => [1:c_custkey, 2:c_name, 52:sum, 9:o_orderkey, 10:o_orderdate, 13:o_totalprice] |
|             MaterializedView: true                                                                    |
|             Estimates: {row: 9, cpu: 486.00, memory: 0.00, network: 0.00, cost: 243.00}               |
|             partitionRatio: 1/1, tabletRatio: 96/96                                                   |
|             1:c_custkey := 60:c_custkey                                                               |
|             2:c_name := 59:c_name                                                                     |
|             52:sum := 64:sum(l_quantity)                                                              |
|             9:o_orderkey := 61:o_orderkey                                                             |
|             10:o_orderdate := 62:o_orderdate                                                          |
|             13:o_totalprice := 63:o_totalprice                                                        |
+-------------------------------------------------------------------------------------------------------+
-- SQL2
EXPLAIN LOGICAL SELECT c_name, sum(`sum(l_quantity)`) FROM q18_view GROUP BY c_name;
+-----------------------------------------------------------------------------------------------------+
| Explain String                                                                                      |
+-----------------------------------------------------------------------------------------------------+
| - Output => [2:c_name, 59:sum]                                                                      |
|     - AGGREGATE(GLOBAL) [2:c_name]                                                                  |
|             Estimates: {row: 9, cpu: 306.00, memory: 306.00, network: 0.00, cost: 1071.00}          |
|             59:sum := sum(59:sum)                                                                   |
|         - EXCHANGE(SHUFFLE) [2]                                                                     |
|                 Estimates: {row: 9, cpu: 30.60, memory: 0.00, network: 30.60, cost: 306.00}         |
|             - AGGREGATE(LOCAL) [2:c_name]                                                           |
|                     Estimates: {row: 9, cpu: 61.20, memory: 30.60, network: 0.00, cost: 244.80}     |
|                     59:sum := sum(52:sum)                                                           |
|                 - SCAN [q18_mv] => [2:c_name, 52:sum]                                               |
|                         MaterializedView: true                                                      |
|                         Estimates: {row: 9, cpu: 306.00, memory: 0.00, network: 0.00, cost: 153.00} |
|                         partitionRatio: 1/1, tabletRatio: 96/96                                     |
|                         2:c_name := 60:c_name                                                       |
|                         52:sum := 65:sum(l_quantity)                                                |
+-----------------------------------------------------------------------------------------------------+

基於視圖的物化視圖改寫JOIN查詢

StarRocks支援對包含視圖之間或視圖與表之間的 JOIN 的查詢進行改寫,包括在JOIN上進行彙總。

例如,您可以建立以下視圖和物化視圖。

USE test_db;
CREATE VIEW view_1 AS
SELECT 
  l_partkey,
  l_suppkey,
  sum(l_quantity) AS total_quantity
FROM lineitem
GROUP BY 
  l_partkey,
  l_suppkey;


CREATE VIEW view_2 AS
SELECT 
  l_partkey,
  l_suppkey,
   sum(l_tax) AS total_tax
FROM lineitem
GROUP BY 
  l_partkey,
  l_suppkey;


CREATE MATERIALIZED VIEW mv_1 
DISTRIBUTED BY hash(l_partkey, l_suppkey) 
REFRESH MANUAL AS
SELECT 
  v1.l_partkey,
  v2.l_suppkey,
  total_quantity,
  total_tax
FROM view_1 v1
JOIN view_2 v2 ON v1.l_partkey = v2.l_partkey
  AND v1.l_suppkey = v2.l_suppkey;

物化視圖可以改寫以下兩個查詢。

USE test_db;
-- SQL1
EXPLAIN LOGICAL SELECT v1.l_partkey,
       v2.l_suppkey,
       total_quantity,
       total_tax
FROM view_1 v1
JOIN view_2 v2 ON v1.l_partkey = v2.l_partkey
  AND v1.l_suppkey = v2.l_suppkey;
+--------------------------------------------------------------------------------------------------------+
| Explain String                                                                                         |
+--------------------------------------------------------------------------------------------------------+
| - Output => [4:l_partkey, 25:l_suppkey, 17:sum, 37:sum]                                                |
|     - SCAN [mv_1] => [17:sum, 4:l_partkey, 37:sum, 25:l_suppkey]                                       |
|             MaterializedView: true                                                                     |
|             Estimates: {row: 799541, cpu: 31981640.00, memory: 0.00, network: 0.00, cost: 15990820.00} |
|             partitionRatio: 1/1, tabletRatio: 96/96                                                    |
|             17:sum := 43:total_quantity                                                                |
|             4:l_partkey := 41:l_partkey                                                                |
|             37:sum := 44:total_tax                                                                     |
|             25:l_suppkey := 42:l_suppkey                                                               |
+--------------------------------------------------------------------------------------------------------+
-- SQL2
EXPLAIN LOGICAL SELECT v1.l_partkey,
       sum(total_quantity),
       sum(total_tax)
FROM view_1 v1
JOIN view_2 v2 ON v1.l_partkey = v2.l_partkey
   AND v1.l_suppkey = v2.l_suppkey
group by v1.l_partkey;
+--------------------------------------------------------------------------------------------------------------------+
| Explain String                                                                                                     |
+--------------------------------------------------------------------------------------------------------------------+
| - Output => [4:l_partkey, 41:sum, 42:sum]                                                                          |
|     - AGGREGATE(GLOBAL) [4:l_partkey]                                                                              |
|             Estimates: {row: 196099, cpu: 4896864.00, memory: 3921980.00, network: 0.00, cost: 29521223.20}        |
|             41:sum := sum(41:sum)                                                                                  |
|             42:sum := sum(42:sum)                                                                                  |
|         - EXCHANGE(SHUFFLE) [4]                                                                                    |
|                 Estimates: {row: 136024, cpu: 489686.40, memory: 0.00, network: 489686.40, cost: 19228831.20}      |
|             - AGGREGATE(LOCAL) [4:l_partkey]                                                                       |
|                     Estimates: {row: 136024, cpu: 5756695.20, memory: 489686.40, network: 0.00, cost: 18249458.40} |
|                     41:sum := sum(17:sum)                                                                          |
|                     42:sum := sum(37:sum)                                                                          |
|                 - SCAN [mv_1] => [17:sum, 4:l_partkey, 37:sum]                                                     |
|                         MaterializedView: true                                                                     |
|                         Estimates: {row: 799541, cpu: 28783476.00, memory: 0.00, network: 0.00, cost: 14391738.00} |
|                         partitionRatio: 1/1, tabletRatio: 96/96                                                    |
|                         17:sum := 45:total_quantity                                                                |
|                         4:l_partkey := 43:l_partkey                                                                |
|                         37:sum := 46:total_tax                                                                     |
+--------------------------------------------------------------------------------------------------------------------+

基於視圖的物化視圖改寫外表查詢

您可以在External Catalog中的外表上構建視圖,然後基於這些視圖構建物化視圖來改寫查詢。其使用方式類似於內部表。

基於External Catalog構建物化視圖

StarRocks支援基於Hive Catalog、Hudi Catalog、Iceberg Catalog和Paimon Catalog的外部資料源上構建非同步物化視圖,並支援透明地改寫查詢。基於External Catalog的物化視圖支援大多數查詢改寫功能,但存在以下限制:

  • 基於Hudi、Paimon和JDBC Catalog建立的物化視圖不支援Union改寫。

  • 基於Hudi、Paimon和JDBC Catalog建立的物化視圖不支援View Delta Join改寫。

  • 基於Hudi和JDBC Catalog建立的物化視圖不支援分區增量重新整理。

基於文本的物化視圖改寫

自v3.3.0起,StarRocks支援基於文本的物化視圖改寫,極大地拓展了自身的查詢改寫能力。

基本原理

為實現基於文本的物化視圖改寫,StarRocks將對查詢(或其子查詢)的抽象文法樹與物化視圖定義的抽象文法樹進行比較。當雙方匹配時,StarRocks就可以基於物化視圖改寫該查詢。基於文本的物化視圖改寫簡單高效,與常規的SPJG類型物化視圖查詢改寫相比限制更少。正確使用此功能可顯著增強查詢效能。

基於文本的物化視圖改寫不僅支援SPJG類型運算元,還支援Union、Window、Order、Limit和CTE等運算元。

配置項

配置名稱

預設值

描述

enable_materialized_view_text_match_rewrite

true

用於控制是否啟用基於文本的物化視圖改寫。此功能預設啟用。將此項設定為false來手動禁用此功能。

enable_materialized_view_text_based_rewrite

true

用於控制是否在建立非同步物化視圖時構建抽象文法樹。此功能預設啟用。將此項設定為false將在系統層級禁用基於文本的物化視圖改寫。

materialized_view_subuqery_text_match_max_count

4

用於控制系統比對子查詢是否與物化視圖定義匹配的最大次數。預設值為4。增加此值同時也會增加最佳化器的耗時。

說明

只有當物化視圖滿足時效性(資料一致性)要求時,才能用於基於文本的查詢改寫。您可以在建立物化視圖時通過屬性query_rewrite_consistency手動設定一致性檢查規則。

使用情境

符合以下情況的查詢可以被改寫:

  • 原始查詢與物化視圖的定義一致。

  • 原始查詢的子查詢與物化視圖的定義一致。

與常規的SPJG類型物化視圖查詢改寫相比,基於文本的物化視圖改寫支援更複雜的查詢,例如多層彙總。

說明
  • 建議您將需要匹配的查詢封裝至原始查詢的子查詢中。

  • 請不要在物化視圖的定義或原始查詢的子查詢中封裝ORDER BY子句,否則查詢將無法被改寫。這是由於子查詢中的ORDER BY子句會預設被消除。

例如,您可以構建以下物化視圖。

USE test_db;
CREATE MATERIALIZED VIEW mv1 REFRESH MANUAL AS
SELECT 
  user_id, 
  count(1) 
FROM (
  SELECT 
    user_id, 
    time, 
    bitmap_union(to_bitmap(tag_id)) AS a
  FROM user_tags 
  GROUP BY 
    user_id, 
    time) t
GROUP BY user_id;

該物化視圖可以改寫以下兩個查詢。

USE test_db;
-- SQL1
SELECT 
  user_id, 
  count(1) 
FROM (
  SELECT 
    user_id, 
    time, 
    bitmap_union(to_bitmap(tag_id)) AS a
  FROM user_tags 
  GROUP BY 
    user_id, 
    time) t
GROUP BY user_id;

-- SQL2
SELECT count(1)
FROM
(
    SELECT 
      user_id, 
      count(1) 
    FROM (
      SELECT 
        user_id, 
        time, 
        bitmap_union(to_bitmap(tag_id)) AS a
      FROM user_tags 
      GROUP BY 
        user_id, 
        time) t
    GROUP BY user_id
)m;

但是該物化視圖無法改寫以下包含ORDER BY子句的查詢。

USE test_db;
SELECT 
  user_id, 
  count(1) 
FROM (
  SELECT 
    user_id, 
    time, 
    bitmap_union(to_bitmap(tag_id)) AS a
  FROM user_tags 
  GROUP BY 
    user_id, 
    time) t
GROUP BY user_id
ORDER BY user_id;

設定物化視圖查詢改寫

您可以通過以下Session變數設定非同步物化視圖查詢改寫。

變數

預設值

描述

enable_materialized_view_union_rewrite

true

是否開啟物化視圖Union改寫。

enable_rule_based_materialized_view_rewrite

true

是否開啟基於規則的物化視圖查詢改寫功能,主要用於處理單表查詢改寫。

nested_mv_rewrite_max_level

3

可用於查詢改寫的嵌套物化視圖的最大層數。類型:INT。取值範圍:[1, +∞)。取值為1表示只可使用基於基表建立的物化視圖來進行查詢改寫。

驗證查詢改寫是否生效

您可以使用EXPLAIN語句查看對應Query Plan。如果其中OlapScanNode專案下的TABLE為對應非同步物化視圖名稱,則表示該查詢已基於非同步物化視圖改寫。

USE test_db;
EXPLAIN SELECT
    order_id, sum(goods.price) AS total 
    FROM order_list INNER JOIN goods 
    ON goods.item_id1 = order_list.item_id2 
    GROUP BY order_id;
+------------------------------------+
| Explain String                     |
+------------------------------------+
| PLAN FRAGMENT 0                    |
|  OUTPUT EXPRS:1: order_id | 8: sum |
|   PARTITION: RANDOM                |
|                                    |
|   RESULT SINK                      |
|                                    |
|   1:Project                        |
|   |  <slot 1> : 9: order_id        |
|   |  <slot 8> : 10: total          |
|   |                                |
|   0:OlapScanNode                   |
|      TABLE: order_mv               |
|      PREAGGREGATION: ON            |
|      partitions=1/1                |
|      rollup: order_mv              |
|      tabletRatio=0/12              |
|      tabletList=                   |
|      cardinality=3                 |
|      avgRowSize=4.0                |
|      numNodes=0                    |
+------------------------------------+
20 rows in set (0.01 sec)

禁用查詢改寫

StarRocks預設開啟基於Default Catalog建立的非同步物化視圖查詢改寫。您可以通過將Session變數enable_materialized_view_rewrite設定為false禁用該功能。

對於基於External Catalog建立的非同步物化視圖,你可以通過ALTER MATERIALIZED VIEW將物化視圖 Propertyforce_external_table_query_rewrite設定為false來禁用此功能。