All Products
Search
Document Center

Join algorithms

Last Updated: Jun 18, 2021

OceanBase Database supports three join algorithms: Nested Loop Join, Hash Join, and Merge Join.

Hash Join and Merge Join apply only to equivalent join conditions. Nested Loop Join applies to all join conditions.

Nested Loop Join

The Nested Loop Join scans a table (the outer table) and each time when a record in the table is read, it scans another table (the inner table) to find data that meets the conditions.

The scan can either be a fast index scan or a full table scan. Generally, the performance of a full table scan is low, so the optimizer does not use Nested Loop Join if the column specified in the join condition does not have an index. In OceanBase Database, the execution plan shows whether it is possible to use a fast index scan.

In the following example, the first plan executes a full table scan for the inner table because the join condition is t1.c = t2.c, but table t2 does not have an index on the c column. For the second plan, the index is used to quickly locate the matching rows in the inner table. This is because that the join condition is t1.b = t2.b, and Index k1 created on column b is selected for table t2 as the access path, which means that for every value in column b in each row of table t1, the matching row in table t2 can be quickly located.

obclient>CREATE TABLE t1(a INT PRIMARY KEY, b INT, c INT, KEY k1(b));
Query OK, 0 rows affected (0.24 sec)

obclient>>CREATE TABLE t2(a INT PRIMARY KEY, b INT, c INT, KEY k1(b));
Query OK, 0 rows affected (0.29 sec)

obclient> EXPLAIN EXTENDED_NOADDR SELECT/*+USE_NL(t1 t2)*/ * FROM t1, t2 
      WHERE t1.c = t2.c;
+-----------------------------------------------------------------+
| Query Plan                                                                              |
+-----------------------------------------------------------------+
| ===========================================
|ID|OPERATOR        |NAME|EST. ROWS|COST  |
-------------------------------------------
|0 |NESTED-LOOP JOIN|    |1980     |623742|
|1 | TABLE SCAN     |t1  |1000     |455   |
|2 | TABLE SCAN     |t2  |2        |622   |
===========================================

Outputs & filters:
-------------------------------------
  0 - output([t1.a], [t1.b], [t1.c], [t2.a], [t2.b], [t2.c]), filter(nil),
      conds(nil), nl_params_([t1.c])
  1 - output([t1.c], [t1.a], [t1.b]), filter(nil),
      access([t1.c], [t1.a], [t1.b]), partitions(p0),
      is_index_back=false,
      range_key([t1.a]), range(MIN ; MAX)always true
  2 - output([t2.c], [t2.a], [t2.b]), filter([? = t2.c]),
      access([t2.c], [t2.a], [t2.b]), partitions(p0),
      is_index_back=false, filter_before_indexback[false],
      range_key([t2.a]), range(MIN ; MAX)

obclient> EXPLAIN EXTENDED_NOADDR SELECT/*+USE_NL(t1 t2)*/ * FROM t1, t2 
      WHERE t1.b = t2.b;
+-----------------------------------------------------------------+
| Query Plan                                                                              |
+-----------------------------------------------------------------+
| ============================================
|ID|OPERATOR        |NAME  |EST. ROWS|COST |
--------------------------------------------
|0 |NESTED-LOOP JOIN|      |1980     |94876|
|1 | TABLE SCAN     |t1    |1000     |455  |
|2 | TABLE SCAN     |t2(k1)|2        |94   |
============================================

Outputs & filters:
-------------------------------------
  0 - output([t1.a], [t1.b], [t1.c], [t2.a], [t2.b], [t2.c]), filter(nil),
      conds(nil), nl_params_([t1.b])
  1 - output([t1.b], [t1.a], [t1.c]), filter(nil),
      access([t1.b], [t1.a], [t1.c]), partitions(p0),
      is_index_back=false,
      range_key([t1.a]), range(MIN ; MAX)always true
  2 - output([t2.b], [t2.a], [t2.c]), filter(nil),
      access([t2.b], [t2.a], [t2.c]), partitions(p0),
      is_index_back=true,
      range_key([t2.b], [t2.a]), range(MIN ; MAX),
      range_cond([? = t2.b])

The Nested Loop Join algorithm is used to perform multiple full table scans on the inner table, because each scan requires an iteration in the storage layer, which leads to relatively high costs. Therefore, OceanBase Database allows the materialization of the results from inner table scanning in the memory, so that data in the memory is directly scanned the next time, without multiple scans in the storage layer. However, materialization in memory is not costless. Therefore, the OceanBase optimizer determines whether to perform inner table materialization based on the costs.

As an optimized variant of the Nested Loop Join algorithm, the Blocked Nested Loop Join algorithm is used to read a block-sized row from the outer table at a time and scan the inner table to find the matching data. This reduces the number of reads for the inner table.

The Nested Loop Join algorithm is typically used when the inner table has a limited number of rows and the outer table has an index on the column specified in the join condition. In this case, the matching data for each row in the inner table can be quickly located by using the index.

OceanBase Database also enables you to use the Nested Loop Join algorithm to join multiple tables by adding the hint /*+ USE_NL(table_name_list) */ to the query. In the following example, system uses the Hash Join algorithm. The user can add the preceding hint to the query to make the system use the Nested Loop Join algorithm.

obclient>CREATE TABLE t1(c1 INT, c2 INT);
Query OK, 0 rows affected (0.97 sec)

obclient>CREATE TABLE t2(c1 INT, c2 INT);
Query OK, 0 rows affected (0.29 sec)

obclient>EXPLAIN SELECT * FROM t1,t2 WHERE t1.c1 = t2.c1;
+-----------------------------------------------------------------+
| Query Plan                                                                              |
+-----------------------------------------------------------------+
| ========================================
|ID|OPERATOR   |NAME|EST. ROWS|COST    |
----------------------------------------
|0 |HASH JOIN  |    |98010000 |66774608|
|1 | TABLE SCAN|T1  |100000   |68478   |
|2 | TABLE SCAN|T2  |100000   |68478   |
========================================

Outputs & filters: 
-------------------------------------
  0 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2]), filter(nil), 
      equal_conds([T1.C1 = T2.C1]), other_conds(nil)
  1 - output([T1.C1], [T1.C2]), filter(nil), 
      access([T1.C1], [T1.C2]), partitions(p0)
  2 - output([T2.C1], [T2.C2]), filter(nil), 
      access([T2.C1], [T2.C2]), partitions(p0)

obclient>EXPLAIN SELECT /*+USE_NL(t1, c2)*/* FROM  t1, t2 WHERE t1.c1 = t2.c1;
+-----------------------------------------------------------------+
| Query Plan                                                                              |
+-----------------------------------------------------------------+
| ===============================================
|ID|OPERATOR        |NAME|EST. ROWS|COST      |
-----------------------------------------------
|0 |NESTED-LOOP JOIN|    |98010000 |4595346207|
|1 | TABLE SCAN     |T1  |100000   |68478     |
|2 | MATERIAL       |    |100000   |243044    |
|3 |  TABLE SCAN    |T2  |100000   |68478     |
===============================================

Outputs & filters: 
-------------------------------------
  0 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2]), filter(nil), 
      conds([T1.C1 = T2.C1]), nl_params_(nil)
  1 - output([T1.C1], [T1.C2]), filter(nil), 
      access([T1.C1], [T1.C2]), partitions(p0)
  2 - output([T2.C1], [T2.C2]), filter(nil)
  3 - output([T2.C1], [T2.C2]), filter(nil), 
      access([T2.C1], [T2.C2]), partitions(p0)

Nested Loop Join is also implemented by the following two algorithms:

  • Blocked Nested Loop Join

    The Blocked Nested Loop Join algorithm is implemented in OceanBase Database in the form of the Batch Nested Loop Join algorithm, which reads data rows from the outer table in batches (1,000 rows by default) before scanning the inner table to find the matching data. This algorithm matches data in the outer table and that in the inner table in batches. This approach reduces the number of reads from the inner table and the number of inner loops.

    In the following example, the batch_join=true field indicates specifies to use the Batch Nested Loop Join algorithm in the query.

    obclient>CREATE TABLE t1(c1 INT PRIMARY KEY);
    Query OK, 0 rows affected (0.97 sec)
    
    obclient>CREATE TABLE t2(c1 INT PRIMARY KEY);
    Query OK, 0 rows affected (0.97 sec)
    
    obclient>EXPLAIN EXTENDED_NOADDR SELECT /*+USE_NL(t1,t2)*/*  FROM t1,t2 
              WHERE t1.c1=t2.c1\G;
    *************************** 1. row ***************************
    Query Plan: 
    ============================================
    |ID|OPERATOR        |NAME|EST. ROWS|COST   |
    --------------------------------------------
    |0 |NESTED-LOOP JOIN|    |100001   |3728786|
    |1 | TABLE SCAN     |t1  |100000   |59654  |
    |2 | TABLE GET      |t2  |1        |36     |
    ============================================
    
    Outputs & filters: 
    -------------------------------------
      0 - output([t1.c1], [t2.c1]), filter(nil), 
          conds(nil), nl_params_([t1.c1]), inner_get=false, self_join=false, batch_join=true
      1 - output([t1.c1]), filter(nil), 
          access([t1.c1]), partitions(p0), 
          is_index_back=false, 
          range_key([t1.c1]), range(MIN ; MAX)always true
      2 - output([t2.c1]), filter(nil), 
          access([t2.c1]), partitions(p0), 
          is_index_back=false, 
          range_key([t2.c1]), range(MIN ; MAX), 
          range_cond([? = t2.c1])
  • Index Nested Loop Join

    This is an index-based join algorithm that directly matches the index of the inner table with matching conditions of the outer table. It avoids comparison with every record of the inner table, and reduces the time required to match the inner table.

    As shown in the following example, where a join condition is t1.c1 = t2.c1, the Index Nested Loop Join is usable when column c1 of table t2 or column c1 of table t1 is indexed.

    obclient>CREATE TABLE t1(c1 INT PRIMARY KEY);
    Query OK, 0 rows affected (0.97 sec)
    
    obclient>CREATE TABLE t2(c1 INT ,c2 INT);
    Query OK, 0 rows affected (0.97 sec)
    
    obclient>EXPLAIN SELECT /*+ORDERED USE_NL(t2,t1)*/ * FROM t2,
              (SELECT /*+NO_MERGE*/ * FROM t1)t1 
              WHERE t1.c1 = t2.c1 AND t2.c2 = 1\G;
    *************************** 1. row ***************************
    Query Plan: 
    =========================================== 
    |ID|OPERATOR |NAME|EST. ROWS|COST | 
    ------------------------------------------- 
    |0 |NESTED-LOOP JOIN| |981 |117272| 
    |1 | TABLE SCAN |t2 |990 |80811 | 
    |2 | SUBPLAN SCAN |t1 |1 |37 | 
    |3 | TABLE GET |t1 |1 |36 | 
    =========================================== 
    Outputs & filters: 
    ------------------------------------- 
    0 - output([t2.c1], [t2.c2], [t1.c1]), filter(nil), conds(nil), nl_params_([t2.c1]) 
    1 - output([t2.c1], [t2.c2]), filter([t2.c2 = 1]), access([t2.c1], [t2.c2]), partitions(p0) 
    2 - output([t1.c1]), filter(nil), access([t1.c1]) 
    3 - output([t1.c1]), filter(nil), access([t1.c1]), partitions(p0)

    In the Outputs & filters section, parameter [t2.c1] of nl_param indicates the execution of conditional pushdown optimization. For more information, see JOIN.

    In most cases of query optimization, the OceanBase optimizer prioritizes the Index Nested Loop Join algorithm before it uses the Batch Nested Loop Join algorithm. These two algorithms can be used together. Finally, the Nested Loop Join algorithm is used.

Merge Join

To join two tables, the Merge Join algorithm is used to sort them by join fields before scanning them for joining. External sorting is required if the memory is insufficient.

Merge Join starts by taking a record from each table for matching. If the record meets the join condition, it is put into the result set. Otherwise, the optimizer discards the record that has a smaller join field value and continues by taking the next record from the table of that discarded record, until the matching record is found.

Temporary space is required for merging two tables with multiple JOIN fields. For example, when Merge Join is applied for A JOIN B, if A and B both have join fields of multiple records, such as A1, A2, and An for A and B1, B2, and Bn for B. Then, every record of Amust be matchedwith its counterparts in B. In this case, the pointer needs to move from B1 to Bn many times, and read a recordeach time. So,it is faster if the records from B1 to Bn are fetched to an in-memory temporary table in advance than reading data on its original storage page or disk. In some scenarios, sorting is skipped if an index is applied to join fields in a consistent order.

Generally, Merge Join is suitable for ordered tables. Otherwise, Hash Join would be better. The following example displays two plans that use the Merge Join algorithm. Sorting is required for the first one, but not for the second one. This is because the two k1 indexes are used as the access paths of the two tables, and the k1 indexes are sorted by column b.

obclient> CREATE TABLE t1(a INT PRIMARY KEY, b INT, c INT, KEY k1(b));
Query OK, 0 rows affected (0.24 sec)

obclient> CREATE TABLE t2(a INT PRIMARY KEY, b INT, c INT, KEY k1(b));
Query OK, 0 rows affected (0.29 sec)

obclient> EXPLAIN SELECT/*+USE_MERGE(t1 t2)*/ * FROM t1, t2 WHERE t1.c = t2.c;
*************************** 1. row ***************************
Query Plan: 
| =====================================
|ID|OPERATOR    |NAME|EST. ROWS|COST|
-------------------------------------
|0 |MERGE JOIN  |    |1980     |6011|
|1 | SORT       |    |1000     |2198|
|2 |  TABLE SCAN|t1  |1000     |455 |
|3 | SORT       |    |1000     |2198|
|4 |  TABLE SCAN|t2  |1000     |455 |
=====================================

Outputs & filters:
-------------------------------------
  0 - output([t1.a], [t1.b], [t1.c], [t2.a], [t2.b], [t2.c]), filter(nil),
      equal_conds([t1.c = t2.c]), other_conds(nil)
  1 - output([t1.a], [t1.b], [t1.c]), filter(nil), sort_keys([t1.c, ASC])
  2 - output([t1.c], [t1.a], [t1.b]), filter(nil),
      access([t1.c], [t1.a], [t1.b]), partitions(p0)
  3 - output([t2.a], [t2.b], [t2.c]), filter(nil), sort_keys([t2.c, ASC])
  4 - output([t2.c], [t2.a], [t2.b]), filter(nil),
      access([t2.c], [t2.a], [t2.b]), partitions(p0)

 
obclient>EXPLAIN SELECT/*+USE_MERGE(t1 t2),INDEX(t1 k1),INDEX(t2 k1)*/ * 
        FROM t1, t2 WHERE t1.b = t2.b;
*************************** 1. row ***************************
Query Plan: 
| =======================================
|ID|OPERATOR   |NAME  |EST. ROWS|COST |
---------------------------------------
|0 |MERGE JOIN |      |1980     |12748|
|1 | TABLE SCAN|t1(k1)|1000     |5566 |
|2 | TABLE SCAN|t2(k1)|1000     |5566 |
=======================================

Outputs & filters:
-------------------------------------
  0 - output([t1.a], [t1.b], [t1.c], [t2.a], [t2.b], [t2.c]), filter(nil),
      equal_conds([t1.b = t2.b]), other_conds(nil)
  1 - output([t1.b], [t1.a], [t1.c]), filter(nil),
      access([t1.b], [t1.a], [t1.c]), partitions(p0)
  2 - output([t2.b], [t2.a], [t2.c]), filter(nil),
      access([t2.b], [t2.a], [t2.c]), partitions(p0)

OceanBase Database also enables you to use the Merge Join algorithm to joinmultiple tables by adding the hint /*+ USE_MERGE(table_name_list) */ to the query. In the following example, system uses the Hash Join algorithm. The user can add the preceding hint to the query to make the system use the Merge Join algorithm.

obclient>CREATE TABLE t1(c1 INT, c2 INT);
Query OK, 0 rows affected (0.97 sec)

obclient>CREATE TABLE t2(c1 INT, c2 INT);
Query OK, 0 rows affected (0.29 sec)

obclient>EXPLAIN SELECT * FROM t1,t2 WHERE t1.c1 = t2.c1;
*************************** 1. row ***************************
Query Plan: 
| ========================================
|ID|OPERATOR   |NAME|EST. ROWS|COST    |
----------------------------------------
|0 |HASH JOIN  |    |98010000 |66774608|
|1 | TABLE SCAN|T1  |100000   |68478   |
|2 | TABLE SCAN|T2  |100000   |68478   |
========================================

Outputs & filters: 
-------------------------------------
  0 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2]), filter(nil), 
      equal_conds([T1.C1 = T2.C1]), other_conds(nil)
  1 - output([T1.C1], [T1.C2]), filter(nil), 
      access([T1.C1], [T1.C2]), partitions(p0)
  2 - output([T2.C1], [T2.C2]), filter(nil), 
      access([T2.C1], [T2.C2]), partitions(p0)
 
 obclient>EXPLAIN SELECT /*+USE_MERGE(t1,t2)*/* FROM t1, t2 WHERE t1.c1 = t2.c1;
*************************** 1. row ***************************
Query Plan: 
 | =========================================
|ID|OPERATOR    |NAME|EST. ROWS|COST    |
-----------------------------------------
|0 |MERGE JOIN  |    |98010000 |67488837|
|1 | SORT       |    |100000   |563680  |
|2 |  TABLE SCAN|T1  |100000   |68478   |
|3 | SORT       |    |100000   |563680  |
|4 |  TABLE SCAN|T2  |100000   |68478   |
=========================================

Outputs & filters: 
-------------------------------------
  0 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2]), filter(nil), 
      equal_conds([T1.C1 = T2.C1]), other_conds(nil)
  1 - output([T1.C1], [T1.C2]), filter(nil), sort_keys([T1.C1, ASC])
  2 - output([T1.C1], [T1.C2]), filter(nil), 
      access([T1.C1], [T1.C2]), partitions(p0)
  3 - output([T2.C1], [T2.C2]), filter(nil), sort_keys([T2.C1, ASC])
  4 - output([T2.C1], [T2.C2]), filter(nil), 
      access([T2.C1], [T2.C2]), partitions(p0)

Hash Join

To join two tables, Hash Join uses the smaller table, or "the build table" to create a hash table based on join conditions, and scans the larger table, or "the probe table", line by line, to find the matching row while probing the hash table. If the build table is so large that the hash table cannot be stored in memory, Oceanbase Database splits the build table and the probe table into multiple partitions by join conditions, each includes a standalone pair of build table and probe table. In this way, a large Hash Join task is divided into several separate subtasks, so that Hash Join can be executed in the memory for every partition. In most cases, the Hash Join algorithm has higher efficiency than other join algorithms.

The following example shows a plan that uses the Hash Join algorithm.

obclient> CREATE TABLE t1(a INT PRIMARY KEY, b INT, c INT, KEY k1(b));
Query OK, 0 rows affected (0.24 sec)

obclient>CREATE TABLE t2(a INT PRIMARY KEY, b INT, c INT, KEY k1(b));
Query OK, 0 rows affected (0.29 sec)

obclient> EXPLAIN SELECT/*+USE_HASH(t1 t2)*/ * FROM t1, t2 WHERE t1.c = t2.c;
*************************** 1. row ***************************
Query Plan: 
| ====================================
|ID|OPERATOR   |NAME|EST. ROWS|COST|
------------------------------------
|0 |HASH JOIN  |    |1980     |4093|
|1 | TABLE SCAN|t1  |1000     |455 |
|2 | TABLE SCAN|t2  |1000     |455 |
====================================

Outputs & filters:
-------------------------------------
  0 - output([t1.a], [t1.b], [t1.c], [t2.a], [t2.b], [t2.c]), filter(nil),
      equal_conds([t1.c = t2.c]), other_conds(nil)
  1 - output([t1.c], [t1.a], [t1.b]), filter(nil),
      access([t1.c], [t1.a], [t1.b]), partitions(p0)
  2 - output([t2.c], [t2.a], [t2.b]), filter(nil),
      access([t2.c], [t2.a], [t2.b]), partitions(p0)

OceanBase Database also enables you to use the Hash Join algorithm to join multiple tables by adding the hint /*+ USE_HASH(table_name_list) */ to the query. In the following example, system uses the Merge Join algorithm. The user can add the preceding hint to the query to make the system use the Hash Join algorithm.

obclient>CREATE TABLE t1(c1 INT, c2 INT, PRIMARY KEY(c1));
Query OK, 0 rows affected (0.31 sec)

obclient>CREATE TABLE t2(c1 INT, c2 INT, PRIMARY KEY(c1));
Query OK, 0 rows affected (0.33 sec)

obclient>EXPLAIN SELECT * FROM t1, t2 WHERE t1.c1 = t2.c1;
*************************** 1. row ***************************
Query Plan: 
| ======================================
|ID|OPERATOR   |NAME|EST. ROWS|COST  |
--------------------------------------
|0 |MERGE JOIN |    |100001   |219005|
|1 | TABLE SCAN|T1  |100000   |61860 |
|2 | TABLE SCAN|T2  |100000   |61860 |
======================================

Outputs & filters: 
-------------------------------------
  0 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2]), filter(nil), 
      equal_conds([T1.C1 = T2.C1]), other_conds(nil)
  1 - output([T1.C1], [T1.C2]), filter(nil), 
      access([T1.C1], [T1.C2]), partitions(p0)
  2 - output([T2.C1], [T2.C2]), filter(nil), 
      access([T2.C1], [T2.C2]), partitions(p0)
      
obclient>EXPLAIN SELECT /*+USE_HASH(t1, t2)*/ * FROM t1, t2 WHERE t1.c1 = t2.c1;
*************************** 1. row ***************************
Query Plan: 
 | ======================================
|ID|OPERATOR   |NAME|EST. ROWS|COST  |
--------------------------------------
|0 |HASH JOIN  |    |100001   |495180|
|1 | TABLE SCAN|T1  |100000   |61860 |
|2 | TABLE SCAN|T2  |100000   |61860 |
======================================

Outputs & filters: 
-------------------------------------
  0 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2]), filter(nil), 
      equal_conds([T1.C1 = T2.C1]), other_conds(nil)
  1 - output([T1.C1], [T1.C2]), filter(nil), 
      access([T1.C1], [T1.C2]), partitions(p0)
  2 - output([T2.C1], [T2.C2]), filter(nil), 
      access([T2.C1], [T2.C2]), partitions(p0)