All Products
Search
Document Center

ApsaraDB for SelectDB:Use Bucket Shuffle Join

Last Updated:Mar 28, 2026

Bucket Shuffle Join optimizes distributed Join queries in ApsaraDB for SelectDB by routing right-table data to the storage nodes where the left table's data already lives — instead of shuffling both tables across the cluster. Compared to Broadcast Join and Shuffle Join, this reduces both network and memory overhead to the size of the right table alone, regardless of cluster scale.

For the design background, see ISSUE 4394.

How it works

ApsaraDB for SelectDB supports two common distributed Join strategies:

  • Broadcast Join: sends the full right table to every HashJoinNode that holds left-table data. With three HashJoinNodes, that is 3x the right table's data volume in both network and memory.

  • Shuffle Join: hashes both tables and redistributes all rows across cluster nodes. Network overhead equals the combined size of both tables; memory overhead equals the right table's size.

image.png

The frontend (FE) node stores the data distribution information for every table. When a Join's equality condition matches the left table's bucket column, the FE uses that distribution map to send right-table data directly to the left table's storage nodes — no redistribution of the left table required.

The result: both network and memory overhead equal only the right table's size. This advantage grows when the FE can also apply partition pruning or bucket pruning on the left table, further reducing the data each node needs to process.

How Bucket Shuffle Join differs from Colocate Join: Bucket Shuffle Join requires no knowledge of or changes to either table's data distribution, and it avoids data skew because it imposes no matching distribution rules between the two tables. It also gives Join Reorder more paths to explore during query planning.

Key concepts

TermDefinition
Left tableThe left-hand table in a Join query, used in the probe operation. Join Reorder may change which table is on the left.
Right tableThe right-hand table in a Join query, used in the build operation. Join Reorder may change which table is on the right.
Bucket columnThe column by which a table's rows are distributed across buckets at table creation time.
FE nodeThe frontend node that stores each table's data distribution information and generates query execution plans.

Enable Bucket Shuffle Join

Set the session variable enable_bucket_shuffle_join to true. The FE then automatically selects Bucket Shuffle Join for any eligible query.

set enable_bucket_shuffle_join = true;

When planning a distributed query, the FE applies this priority order:

Colocate Join > Bucket Shuffle Join > Broadcast Join > Shuffle Join

To override the automatic selection, use a hint:

SELECT * FROM test JOIN [shuffle] baseall ON test.k1 = baseall.k1;

Verify that Bucket Shuffle Join is used

Run explain on a query to inspect its execution plan. Look for BUCKET_SHUFFLE in the join op field.

Bucket Shuffle Join applied:

|   2:HASH JOIN                                                       |
|   |  join op: INNER JOIN (BUCKET_SHUFFLE)                          |
|   |  hash predicates:                                              |
|   |  colocate: false, reason: table not in the same group          |
|   |  equal join conjunct: `test`.`k1` = `baseall`.`k1`            |

Fallback to another Join method (Bucket Shuffle Join did not apply):

|   2:HASH JOIN                                                       |
|   |  join op: INNER JOIN (BROADCAST)                               |
|   |  hash predicates:                                              |
|   |  colocate: false, reason: table not in the same group          |
|   |  equal join conjunct: `test`.`k1` = `baseall`.`k1`            |

If the plan shows BROADCAST or SHUFFLE instead of BUCKET_SHUFFLE, check the planning rules below to identify why the optimization did not apply.

Planning rules

Bucket Shuffle Join routes data based on hash-computed bucket assignments. All planning requirements follow from this core constraint: the FE must be able to determine, at planning time, exactly which storage nodes hold each row of the left table.

RequirementDetail
Equality join conditionThe Join must use an equality predicate (=). Range or non-equality conditions cannot be used for hash-based routing.
Left table's bucket column in the join conditionThe equality predicate must include the bucket column of the left table.
Matching data typesThe bucket column of the left table and the corresponding column of the right table must have the same data type. Mismatched types prevent planning.
Native OLAP tables onlyThe left table must be a native online analytical processing (OLAP) table. External tables — such as Open Database Connectivity (ODBC), MySQL, or Elasticsearch tables — cannot serve as the left table.
Single partition on the left tableFor partitioned tables, Bucket Shuffle Join applies only when the query touches exactly one partition. Use a WHERE clause to enable partition pruning and satisfy this requirement.
Colocate tables perform bestIf the left table is a Colocate table, its per-partition data distribution is fixed and fully known to the FE, giving Bucket Shuffle Join the most precise routing information.