全部产品
Search
文档中心

PolarDB:Penurunan Filter Bloom

更新时间:Jul 06, 2025

Saat menjalankan kueri yang melibatkan sejumlah besar data, lapisan SQL mengonsumsi banyak sumber daya komputasi karena membaca data dari lapisan mesin penyimpanan dan melakukan komputasi. Untuk meningkatkan kinerja kueri, PolarDB for MySQL mendorong filter Bloom ke lapisan mesin penyimpanan. Topik ini menjelaskan fitur penurunan filter Bloom.

Prasyarat

  • Versi kluster PolarDB for MySQL Anda adalah 8.0 dengan revisi versi 8.0.2.2.3 atau lebih baru. Untuk informasi lebih lanjut tentang cara memeriksa versi kluster, lihat Kueri versi mesin.
  • Data bertipe INT.

Latar Belakang

Filter Bloom adalah metode terbukti yang dapat mengurangi pembacaan data dari mesin penyimpanan dan meningkatkan efisiensi komputasi. PolarDB for MySQL mempercepat hash join menggunakan filter Bloom. Saat tabel hash dibuat, sistem membuat filter Bloom dan mendorongnya ke mesin probe. Mesin probe menggunakan filter Bloom untuk menyaring data yang tidak relevan bagi kueri. Metode ini secara signifikan meningkatkan kinerja kueri dengan mengurangi jumlah data yang ditransmisikan antara lapisan SQL dan lapisan mesin penyimpanan serta mengurangi beban komputasi di lapisan SQL.

Menggunakan fitur penurunan filter Bloom

Anda dapat menggunakan parameter loose_bloom_filter_enabled untuk mengaktifkan filter Bloom. Untuk informasi lebih lanjut, lihat Konfigurasi Parameter Kluster dan Node.
ParameterTingkatDeskripsi
loose_bloom_filter_enabledGlobal dan SesiMenentukan apakah filter Bloom diaktifkan. Nilai default: ON. Nilai valid:
  • ON
  • OFF

Pengujian Kinerja

Dalam pengujian kinerja ini, skema TPC-H tanpa kunci utama dan indeks digunakan. Berikut ini adalah rencana eksekusi untuk Q3, Q11, dan Q16 dari TPC-H. hash join with bloom filter pada parameter Extra menunjukkan bahwa filter Bloom digunakan.
  • Q3
    EXPLAIN
    SELECT l_orderkey,
           SUM(l_extendedprice * (1 - l_discount)) AS revenue,
           o_orderdate,
           o_shippriority
    FROM customer,
         orders,
         lineitem
    WHERE c_mktsegment = 'MACHINERY'
      AND c_custkey = o_custkey
      AND l_orderkey = o_orderkey
      AND o_orderdate < '1995-03-10'
      AND l_shipdate > '1995-03-10'
    GROUP BY l_orderkey,
             o_orderdate,
             o_shippriority
    ORDER BY revenue DESC,
             o_orderdate
    LIMIT 10\G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: customer
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 148.463
         filtered: 10.00
            Extra: Using where; Using temporary; Using filesort
    *************************** 2. row ***************************
               id: 1
      select_type: SIMPLE
            table: orders
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 1.486.962
         filtered: 3.33
            Extra: Using where; Using join buffer (hash join with bloom filter)
    *************************** 3. row ***************************
               id: 1
      select_type: SIMPLE
            table: lineitem
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 5.948.979
         filtered: 3.33
            Extra: Using where; Using join buffer (hash join with bloom filter)
  • Q11
    EXPLAIN
    SELECT ps_partkey,
           SUM(ps_supplycost * ps_availqty) AS value
    FROM partsupp,
         supplier,
         nation
    WHERE ps_suppkey = s_suppkey
      AND s_nationkey = n_nationkey
      AND n_name = 'INDIA'
    GROUP BY ps_partkey
    HAVING SUM(ps_supplycost * ps_availqty) >
      (SELECT SUM(ps_supplycost * ps_availqty) * 0.0001000000
       FROM partsupp,
            supplier,
            nation
       WHERE ps_suppkey = s_suppkey
         AND s_nationkey = n_nationkey
         AND n_name = 'INDIA' )
    ORDER BY value DESC\G
    *************************** 1. row ***************************
               id: 1
      select_type: PRIMARY
            table: nation
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 25
         filtered: 10.00
            Extra: Using where; Using temporary; Using filesort
    *************************** 2. row ***************************
               id: 1
      select_type: PRIMARY
            table: supplier
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 10.000
         filtered: 10.00
            Extra: Using where; Using join buffer (hash join with bloom filter)
    *************************** 3. row ***************************
               id: 1
      select_type: PRIMARY
            table: partsupp
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 791.815
         filtered: 10.00
            Extra: Using where; Using join buffer (hash join with bloom filter)
    *************************** 4. row ***************************
               id: 2
      select_type: SUBQUERY
            table: nation
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 25
         filtered: 10.00
            Extra: Using where
    *************************** 5. row ***************************
               id: 2
      select_type: SUBQUERY
            table: supplier
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 10.000
         filtered: 10.00
            Extra: Using where; Using join buffer (hash join with bloom filter)
    *************************** 6. row ***************************
               id: 2
      select_type: SUBQUERY
            table: partsupp
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 791.815
         filtered: 10.00
            Extra: Using where; Using join buffer (hash join with bloom filter)
  • Q16
    EXPLAIN SELECT p_brand,
           p_type,
           p_size,
           COUNT(DISTINCT ps_suppkey) AS supplier_cnt
    FROM partsupp,
         part
    WHERE p_partkey = ps_partkey
      AND p_brand <> 'Brand#33'
      AND p_type NOT LIKE 'PROMO POLISHED%'
      AND p_size IN (34,
                     45,
                     33,
                     42,
                     9,
                     24,
                     26,
                     7)
      AND ps_suppkey NOT IN
        (SELECT s_suppkey
         FROM supplier
         WHERE s_comment LIKE '%Customer%Complaints%' )
    GROUP BY p_brand,
             p_type,
             p_size
    ORDER BY supplier_cnt DESC,
             p_brand,
             p_type,
             p_size\G
    *************************** 1. row ***************************
               id: 1
      select_type: PRIMARY
            table: part
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 198.116
         filtered: 40.00
            Extra: Using where; Using temporary; Using filesort
    *************************** 2. row ***************************
               id: 1
      select_type: PRIMARY
            table: partsupp
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 791.815
         filtered: 10.00
            Extra: Using where; Using join buffer (hash join with bloom filter)
    *************************** 3. row ***************************
               id: 2
      select_type: SUBQUERY
            table: supplier
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 10.000
         filtered: 11.11
            Extra: Using where

Hasil Pengujian

Pengujian TPC-H dilakukan dengan faktor skala 1. Gambar berikut menunjukkan perbedaan kinerja antara mengaktifkan dan menonaktifkan filter Bloom.Bloom Filter