When you perform a hash join for multiple tables, a large amount of resources are consumed. To resolve this issue, AnalyticDB for MySQL provides Nested Loop Join. This topic describes how to use Nested Loop Join to join multiple tables.

Background information

By default, AnalyticDB for MySQL uses Hash Join to join tables. Hash Join is commonly used but consumes a large amount of resources. Hash Join is not ideal for joining tables that are small in size and have good query performance. To resolve this issue, AnalyticDB for MySQL provides Nested Loop Join. Nested Loop Join is also called Index Join. This is because the right table in Nested Loop Join is accessed through indexes.

Scenarios

Nested Loop Join is ideal for scenarios where only a small amount of data is filtered out in the left table and the right table has better query performance in the joined columns.

Precautions

  • Nested Loop Join supports only inner joins and left joins and does not support right joins or full outer joins.
  • When you use Nested Loop Join, the right table must have an index and does not support multi-partition columns.
  • The data types of the joined columns must be the same.
    • Supported data types: INT, BOOLEAN, VARCHAR, TIME, DATE, TIMESTAMP, and DATETIME.
    • Non-supported data types: FLOAT, DOUBLE, DECIMAL, JSON, MULTIVALUE, GEO2D, BYTES, ARRAY, BLOB, BINARY, NULL, and OTHER.
  • When you join fact tables with dimension tables, take note of the following items:
    • A dimension table can contain up to 20 million records.
    • Join conditions are not limited.
    • The left table for left joins cannot be a dimension table.
  • When you join fact tables, join conditions must contain partition keys.

Usage

You can enable Nested Loop Join through a hint.

  • /*+nested_loop_join=true*/

    The data sizes of tables are not determined. Nested Loop Join is performed for all joins that meet the corresponding join conditions.

  • /*+nlj_index_join_small_table_max_row, nlj_index_join_large_table_max_row*/

    The data sizes of the left and right tables are determined. If the data size of a small or big table exceeds the corresponding threshold, a hash join is used.

Example

/*+nested_loop_join=true */
SELECT  t1.c1 
FROM  T1 INNER JOIN T2 
ON T1.C1=T2.C1;