Saat menjalankan kueri yang melibatkan sejumlah besar data, lapisan SQL mengonsumsi banyak sumber daya komputasi karena membaca data dari lapisan mesin penyimpanan dan melakukan komputasi. Untuk meningkatkan kinerja kueri, PolarDB for MySQL mendorong filter Bloom ke lapisan mesin penyimpanan. Topik ini menjelaskan fitur penurunan filter Bloom.
Prasyarat
- Versi kluster PolarDB for MySQL Anda adalah 8.0 dengan revisi versi 8.0.2.2.3 atau lebih baru. Untuk informasi lebih lanjut tentang cara memeriksa versi kluster, lihat Kueri versi mesin.
- Data bertipe INT.
Latar Belakang
Filter Bloom adalah metode terbukti yang dapat mengurangi pembacaan data dari mesin penyimpanan dan meningkatkan efisiensi komputasi. PolarDB for MySQL mempercepat hash join menggunakan filter Bloom. Saat tabel hash dibuat, sistem membuat filter Bloom dan mendorongnya ke mesin probe. Mesin probe menggunakan filter Bloom untuk menyaring data yang tidak relevan bagi kueri. Metode ini secara signifikan meningkatkan kinerja kueri dengan mengurangi jumlah data yang ditransmisikan antara lapisan SQL dan lapisan mesin penyimpanan serta mengurangi beban komputasi di lapisan SQL.
Menggunakan fitur penurunan filter Bloom
| Parameter | Tingkat | Deskripsi |
| loose_bloom_filter_enabled | Global dan Sesi | Menentukan apakah filter Bloom diaktifkan. Nilai default: ON. Nilai valid:
|
Pengujian Kinerja
hash join with bloom filter pada parameter Extra menunjukkan bahwa filter Bloom digunakan. - Q3
EXPLAIN SELECT l_orderkey, SUM(l_extendedprice * (1 - l_discount)) AS revenue, o_orderdate, o_shippriority FROM customer, orders, lineitem WHERE c_mktsegment = 'MACHINERY' AND c_custkey = o_custkey AND l_orderkey = o_orderkey AND o_orderdate < '1995-03-10' AND l_shipdate > '1995-03-10' GROUP BY l_orderkey, o_orderdate, o_shippriority ORDER BY revenue DESC, o_orderdate LIMIT 10\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: customer partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 148.463 filtered: 10.00 Extra: Using where; Using temporary; Using filesort *************************** 2. row *************************** id: 1 select_type: SIMPLE table: orders partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1.486.962 filtered: 3.33 Extra: Using where; Using join buffer (hash join with bloom filter) *************************** 3. row *************************** id: 1 select_type: SIMPLE table: lineitem partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5.948.979 filtered: 3.33 Extra: Using where; Using join buffer (hash join with bloom filter) - Q11
EXPLAIN SELECT ps_partkey, SUM(ps_supplycost * ps_availqty) AS value FROM partsupp, supplier, nation WHERE ps_suppkey = s_suppkey AND s_nationkey = n_nationkey AND n_name = 'INDIA' GROUP BY ps_partkey HAVING SUM(ps_supplycost * ps_availqty) > (SELECT SUM(ps_supplycost * ps_availqty) * 0.0001000000 FROM partsupp, supplier, nation WHERE ps_suppkey = s_suppkey AND s_nationkey = n_nationkey AND n_name = 'INDIA' ) ORDER BY value DESC\G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: nation partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 25 filtered: 10.00 Extra: Using where; Using temporary; Using filesort *************************** 2. row *************************** id: 1 select_type: PRIMARY table: supplier partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 10.000 filtered: 10.00 Extra: Using where; Using join buffer (hash join with bloom filter) *************************** 3. row *************************** id: 1 select_type: PRIMARY table: partsupp partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 791.815 filtered: 10.00 Extra: Using where; Using join buffer (hash join with bloom filter) *************************** 4. row *************************** id: 2 select_type: SUBQUERY table: nation partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 25 filtered: 10.00 Extra: Using where *************************** 5. row *************************** id: 2 select_type: SUBQUERY table: supplier partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 10.000 filtered: 10.00 Extra: Using where; Using join buffer (hash join with bloom filter) *************************** 6. row *************************** id: 2 select_type: SUBQUERY table: partsupp partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 791.815 filtered: 10.00 Extra: Using where; Using join buffer (hash join with bloom filter) - Q16
EXPLAIN SELECT p_brand, p_type, p_size, COUNT(DISTINCT ps_suppkey) AS supplier_cnt FROM partsupp, part WHERE p_partkey = ps_partkey AND p_brand <> 'Brand#33' AND p_type NOT LIKE 'PROMO POLISHED%' AND p_size IN (34, 45, 33, 42, 9, 24, 26, 7) AND ps_suppkey NOT IN (SELECT s_suppkey FROM supplier WHERE s_comment LIKE '%Customer%Complaints%' ) GROUP BY p_brand, p_type, p_size ORDER BY supplier_cnt DESC, p_brand, p_type, p_size\G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: part partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 198.116 filtered: 40.00 Extra: Using where; Using temporary; Using filesort *************************** 2. row *************************** id: 1 select_type: PRIMARY table: partsupp partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 791.815 filtered: 10.00 Extra: Using where; Using join buffer (hash join with bloom filter) *************************** 3. row *************************** id: 2 select_type: SUBQUERY table: supplier partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 10.000 filtered: 11.11 Extra: Using where
Hasil Pengujian
