This topic describes how to use hash joins in parallel queries in PolarDB.
The PolarDB cluster version is PolarDB for MySQL 8.0 and the revision version is 188.8.131.52.0 or later.
Use hash joins in parallel queries
PolarDB allows you to execute only the
EXPLAIN FORMAT=TREEstatement to check whether hash joins are used.
In the following example, two tables are created and some data is inserted into the two tables.
CREATE TABLE t1 (c1 INT, c2 INT); CREATE TABLE t2 (c1 INT, c2 INT); INSERT INTO t1 VALUES (1,1),(2,2),(3,3),(5,5); INSERT INTO t2 VALUES (1,1),(2,2),(3,3),(5,5);
In the following example, the query plan for joining the preceding two tables is returned.
EXPLAIN FORMAT=TREE EXPLAIN | -> Gather: 4 worker(s) (cost=10.60 rows=4) -> Inner hash join (t2.c1 = t1.c1) (cost=0.81 rows=1) -> Table scan on t2 (cost=0.09 rows=4) -> Hash -> Table scan on t1 with 0 parallel partitions (cost=0.16 rows=1)
In the preceding parallel query plan, the degree of parallelism (DOP) of 4 is used. This indicates that PolarDB uses four workers to perform parallel queries. Four workers scan table t1 in parallel. Each worker uses only part of data in table t1 to create a hash table. Then, a JOIN operation is performed between the four hash tables and table t2. A leader gathers the joined results to generate final query results.