This topic uses TPC-H queries as examples to describe how to use parallel query. In all the examples, the amount of data used is SF = 100 GB that is defined in the TPC-H benchmark. The tested PolarDB node is a primary node that has an 88-core CPU and 710 GB of memory.

Support for GROUP BY and ORDER BY

When parallel query is disabled, it requires 1,563.32 seconds to run the query. After parallel query is enabled, it only requires 49.65 seconds to run the query, which is 31.48 times shorter than the original query time.

The following shows an example of the original SQL statement.
SELECT   l_returnflag, 
         l_linestatus, 
         Sum(l_quantity)                                       AS sum_qty, 
         Sum(l_extendedprice)                                  AS sum_base_price, 
         Sum(l_extendedprice * (1 - l_discount))               AS sum_disc_price, 
         Sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge, 
         Avg(l_quantity)                                       AS avg_qty, 
         Avg(l_extendedprice)                                  AS avg_price, 
         Avg(l_discount)                                       AS avg_disc, 
         Count(*)                                              AS count_order 
FROM     lineitem
WHERE    l_shipdate <= date '1998-12-01' - INTERVAL '93' day 
GROUP BY l_returnflag, 
         l_linestatus 
ORDER BY l_returnflag, 
         l_linestatus ;
When parallel query is disabled, it requires 1,563.32 seconds to run the query.不打开
After parallel query is enabled, it only requires 49.65 seconds to run the query, which is 31.48 times shorter than the original query time.打开并行查询

Support for AGGREGATE functions (SUM/AVG/COUNT)

When parallel query is disabled, it requires 1,563.32 seconds to run the query. After parallel query is enabled, it only requires 49.65 seconds to run the query, which is 31.48 times shorter than the original query time.

The following shows an example of the original SQL statement.
SELECT   l_returnflag, 
         l_linestatus, 
         Sum(l_quantity)                                       AS sum_qty, 
         Sum(l_extendedprice)                                  AS sum_base_price, 
         Sum(l_extendedprice * (1 - l_discount))               AS sum_disc_price, 
         Sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge, 
         Avg(l_quantity)                                       AS avg_qty, 
         Avg(l_extendedprice)                                  AS avg_price, 
         Avg(l_discount)                                       AS avg_disc, 
         Count(*)                                              AS count_order 
FROM     lineitem
WHERE    l_shipdate <= date '1998-12-01' - INTERVAL '93' day 
GROUP BY l_returnflag, 
         l_linestatus 
ORDER BY l_returnflag, 
         l_linestatus ;
When parallel query is disabled, it requires 1,563.32 seconds to run the query. off
After parallel query is enabled, it only requires 49.65 seconds to run the query, which is 31.48 times shorter than the original query time.on

Support for JOIN

When parallel query is disabled, it requires 21.73 seconds to run the query. After parallel query is enabled, it only requires 1.37 seconds to run the query, which is 15.86 times shorter than the original query time.

The following shows an example of the original SQL statement.
select sum(l_extendedprice* (1 - l_discount)) as revenue 
from   lineitem,   part 
where ( p_partkey = l_partkey and p_brand = 'Brand#12'
        and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') 
        and l_quantity >= 6 and l_quantity <= 6 + 10 
        and p_size between 1 and 5 
        and l_shipmode in ('AIR', 'AIR REG') 
        and l_shipinstruct = 'DELIVER IN PERSON' ) 
    or ( p_partkey = l_partkey and p_brand = 'Brand#13'
        and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') 
        and l_quantity >= 10 and l_quantity <= 10 + 10 
        and p_size between 1 and 10 
        and l_shipmode in ('AIR', 'AIR REG') 
        and l_shipinstruct = 'DELIVER IN PERSON' ) 
    or ( p_partkey = l_partkey and p_brand = 'Brand#24'
        and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') 
        and l_quantity >= 21 and l_quantity <= 21 + 10 
        and p_size between 1 and 15 
        and l_shipmode in ('AIR', 'AIR REG') 
        and l_shipinstruct = 'DELIVER IN PERSON' ); 
When parallel query is disabled, it requires 21.73 seconds to run the query.off
After parallel query is enabled, it only requires 1.37 seconds to run the query, which is 15.86 times shorter than the original query time.on

Support for the BETWEEN function and IN function

When parallel query is disabled, it requires 21.73 seconds to run the query. After parallel query is enabled, it only requires 1.37 seconds to run the query, which is 15.86 times shorter than the original query time.

The following shows an example of the original SQL statement.
select sum(l_extendedprice* (1 - l_discount)) as revenue 
from   lineitem,   part 
where ( p_partkey = l_partkey and p_brand = 'Brand#12'
        and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') 
        and l_quantity >= 6 and l_quantity <= 6 + 10 
        and p_size between 1 and 5 
        and l_shipmode in ('AIR', 'AIR REG') 
        and l_shipinstruct = 'DELIVER IN PERSON' ) 
    or ( p_partkey = l_partkey and p_brand = 'Brand#13'
        and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') 
        and l_quantity >= 10 and l_quantity <= 10 + 10 
        and p_size between 1 and 10 
        and l_shipmode in ('AIR', 'AIR REG') 
        and l_shipinstruct = 'DELIVER IN PERSON' ) 
    or ( p_partkey = l_partkey and p_brand = 'Brand#24'
        and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG') 
        and l_quantity >= 21 and l_quantity <= 21 + 10 
        and p_size between 1 and 15 
        and l_shipmode in ('AIR', 'AIR REG') 
        and l_shipinstruct = 'DELIVER IN PERSON' ); 
When parallel query is disabled, it requires 21.73 seconds to run the query.off
After parallel query is enabled, it only requires 1.37 seconds to run the query, which is 15.86 times shorter than the original query time.on

Support for LIMIT

When parallel query is disabled, it requires 339.22 seconds to run the query. After parallel query is enabled, it only requires 29.31 seconds to run the query, which is 11.57 times shorter than the original query time.

The following shows an example of the original SQL statement.
select l_shipmode, sum(case when o_orderpriority = '1-URGENT' or o_orderpriority = '2-HIGH' then 1 
    else 0 
end) as high_line_count, sum(case when o_orderpriority <> '1-URGENT' and o_orderpriority <> '2-HIGH' then 1 
else 0 
end) as low_line_count 
from   orders,   lineitem 
where o_orderkey = l_orderkey 
and l_shipmode in ('MAIL', 'TRUCK') 
and l_commitdate < l_receiptdate 
and l_shipdate < l_commitdate 
and l_receiptdate >= date '1996-01-01' 
and l_receiptdate < date '1996-01-01' + interval '1' year 
group by l_shipmode 
order by l_shipmode limit 10; 
When parallel query is disabled, it requires 339.22 seconds to run the query.off
After parallel query is enabled, it only requires 29.31 seconds to run the query, which is 11.57 times shorter than the original query time.on

Support for the INTERVAL function

When parallel query is disabled, it requires 220.87 seconds to run the query. After parallel query is enabled, it only requires 7.75 seconds to run the query, which is 28.5 times shorter than the original query time.

The following shows an example of the original SQL statement.
select 
    100.00 * sum(case when p_type like 'PROMO%' then l_extendedprice * (1 - l_discount) 
    else 0 
end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue 
from   lineitem,   part 
where l_partkey = p_partkey
and l_shipdate >= date '1996-01-01' 
and l_shipdate < date '1996-01-01' + interval '1' month limit 10; 
When parallel query is disabled, it requires 220.87 seconds to run the query.off
After parallel query is enabled, it only requires 7.75 seconds to run the query, which is 28.5 times shorter than the original query time.on

Support for CASE WHEN

When parallel query is disabled, it requires 220.87 seconds to run the query. After parallel query is enabled, it only requires 7.75 seconds to run the query, which is 28.5 times shorter than the original query time.

The following shows an example of the original SQL statement.
select 
    100.00 * sum(case when p_type like 'PROMO%' then l_extendedprice * (1 - l_discount) 
    else 0 
end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue 
from   lineitem,   part 
where l_partkey = p_partkey
and l_shipdate >= date '1996-01-01' 
and l_shipdate < date '1996-01-01' + interval '1' month limit 10; 
When parallel query is disabled, it requires 220.87 seconds to run the query.off
After parallel query is enabled, it only requires 7.75 seconds to run the query, which is 28.5 times shorter than the original query time.on

Support for LIKE

When parallel query is disabled, it requires 427.46 seconds to run the query. After parallel query is enabled, it only requires 33.72 seconds to run the query, which is 12.68 times shorter than the original query time.

The following shows an example of the original SQL statement.
select s_name, s_address from
 supplier,  nation where
s_suppkey in 
    ( select ps_suppkey from  partsupp where
             ps_partkey in ( select p_partkey from  part where p_name like 'dark%')
            and ps_availqty>(select 0.0005 * sum(l_quantity) as col1
     from   lineitem,   partsupp
     where l_partkey = ps_partkey and l_suppkey = ps_suppkey
     and l_shipdate >= date '1993-01-01' and l_shipdate < date '1993-01-01' + interval '1' year)
    )
and s_nationkey = n_nationkey and n_name = 'JORDAN'
order by s_name limit 10; 
When parallel query is disabled, it requires 427.46 seconds to run the query.
After parallel query is enabled, it only requires 33.72 seconds to run the query, which is 12.68 times shorter than the original query time.