All Products
Search
Document Center

MaxCompute:JOIN

Last Updated:Mar 26, 2026

MaxCompute supports JOIN operations to combine rows from two tables based on a join condition. This topic covers LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN, INNER JOIN, NATURAL JOIN, implicit JOIN, and multiple JOIN operations.

Supported JOIN types

JOIN typeAlso written asBehavior
LEFT OUTER JOINLEFT JOINReturns all rows from the left table. For rows in the left table with no match in the right table, right-table columns contain NULL.
RIGHT OUTER JOINRIGHT JOINReturns all rows from the right table. For rows in the right table with no match in the left table, left-table columns contain NULL.
FULL OUTER JOINFULL JOINReturns all rows from both tables. Unmatched rows from either side have NULL in the columns from the other table.
INNER JOINJOIN (the INNER keyword is optional)Returns only rows where the join condition is satisfied in both tables.
NATURAL JOINAutomatically joins tables on all columns with the same name. Equivalent to an INNER JOIN with an ON clause on every matching column. MaxCompute also supports OUTER NATURAL JOIN. When you use a USING clause, the common columns appear only once in the result.
Implicit JOINComma-separated tables in FROMUses a WHERE clause for the join condition instead of the JOIN keyword. Equivalent to an explicit JOIN with an ON clause.
Multiple JOINChains multiple JOIN operations. Use parentheses () to control evaluation order — parenthesized JOINs are evaluated first.

Limits

  • MaxCompute does not support CROSS JOIN. A CROSS JOIN pairs every row from the left table with every row from the right table without any join condition.

  • JOIN conditions must use equi-joins combined with AND. To use non-equi joins or OR-combined conditions, use MAPJOIN instead.

  • When you apply partition pruning conditions, the location of the condition determines which tables are pruned:

    • Conditions in the WHERE clause: partition pruning applies to both tables.

    • Conditions in the ON clause: partition pruning applies only to the secondary table. The primary table is fully scanned. For details, see Check whether partition pruning is effective.

  • In most cases, the left table is a large table, and the right table is a small table. Avoid chaining multiple consecutive LEFT JOIN operations when the right table contains duplicate values. Doing so can cause data expansion and interrupt your jobs.

Important

When a WHERE clause follows a JOIN clause, MaxCompute evaluates the JOIN first and then filters the result with WHERE. For outer joins, this means a filter on the right-table columns can eliminate rows where those columns are NULL — effectively turning a LEFT JOIN into an INNER JOIN. Place filters in a subquery before joining to preserve outer-join semantics. See WHERE and JOIN: understanding the difference for examples.

Syntax

<table_reference> JOIN <table_factor> [<join_condition>]
| <table_reference> {LEFT OUTER | RIGHT OUTER | FULL OUTER | INNER | NATURAL} JOIN <table_reference> <join_condition>

Parameters:

  • table_reference: The left table. Accepts table_name [alias], table_query [alias], and similar forms.

  • table_factor: The right table or additional join target. Accepts table_name [alias], table_subquery [alias], and similar forms.

  • join_condition: One or more equality expressions combined with AND, in the form ON equality_expression [AND equality_expression].... This parameter is optional for implicit JOIN and NATURAL JOIN.

Sample data

The examples in this topic use the sale_detail and sale_detail_jt tables. Run the following statements to create the tables and load the sample data:

-- Create two partitioned tables.
CREATE TABLE IF NOT EXISTS sale_detail
(
  shop_name     STRING,
  customer_id   STRING,
  total_price   DOUBLE
)
PARTITIONED BY (sale_date STRING, region STRING);

CREATE TABLE IF NOT EXISTS sale_detail_jt
(
  shop_name     STRING,
  customer_id   STRING,
  total_price   DOUBLE
)
PARTITIONED BY (sale_date STRING, region STRING);

-- Add partitions.
ALTER TABLE sale_detail ADD PARTITION (sale_date='2013', region='china') PARTITION (sale_date='2014', region='shanghai');
ALTER TABLE sale_detail_jt ADD PARTITION (sale_date='2013', region='china');

-- Insert sample data.
INSERT INTO sale_detail PARTITION (sale_date='2013', region='china') VALUES ('s1','c1',100.1),('s2','c2',100.2),('s3','c3',100.3);
INSERT INTO sale_detail PARTITION (sale_date='2014', region='shanghai') VALUES ('null','c5',null),('s6','c6',100.4),('s7','c7',100.5);
INSERT INTO sale_detail_jt PARTITION (sale_date='2013', region='china') VALUES ('s1','c1',100.1),('s2','c2',100.2),('s5','c2',100.2);

-- Create a shop table for use in multiple JOIN examples.
CREATE TABLE shop AS SELECT shop_name, customer_id, total_price FROM sale_detail;

After inserting data, the tables contain the following rows:

sale_detail:

+------------+-------------+-------------+------------+------------+
| shop_name  | customer_id | total_price | sale_date  | region     |
+------------+-------------+-------------+------------+------------+
| s1         | c1          | 100.1       | 2013       | china      |
| s2         | c2          | 100.2       | 2013       | china      |
| s3         | c3          | 100.3       | 2013       | china      |
| null       | c5          | NULL        | 2014       | shanghai   |
| s6         | c6          | 100.4       | 2014       | shanghai   |
| s7         | c7          | 100.5       | 2014       | shanghai   |
+------------+-------------+-------------+------------+------------+

sale_detail_jt:

+------------+-------------+-------------+------------+------------+
| shop_name  | customer_id | total_price | sale_date  | region     |
+------------+-------------+-------------+------------+------------+
| s1         | c1          | 100.1       | 2013       | china      |
| s2         | c2          | 100.2       | 2013       | china      |
| s5         | c2          | 100.2       | 2013       | china      |
+------------+-------------+-------------+------------+------------+

Examples

LEFT OUTER JOIN

Returns all rows from the left table (sale_detail_jt). Rows with no match in sale_detail have NULL in the right-table column (bshop).

-- Partitioned tables require full table scan to be enabled.
SET odps.sql.allow.fullscan=true;
-- Both tables have a shop_name column, so use aliases to distinguish them.
SELECT a.shop_name AS ashop, b.shop_name AS bshop
FROM sale_detail_jt a
LEFT OUTER JOIN sale_detail b ON a.shop_name = b.shop_name;

Result:

+------------+------------+
| ashop      | bshop      |
+------------+------------+
| s2         | s2         |
| s1         | s1         |
| s5         | NULL       |
+------------+------------+

s5 exists in sale_detail_jt but has no matching shop_name in sale_detail, so bshop is NULL.

RIGHT OUTER JOIN

Returns all rows from the right table (sale_detail). Rows with no match in sale_detail_jt have NULL in the left-table column (ashop).

SET odps.sql.allow.fullscan=true;
SELECT a.shop_name AS ashop, b.shop_name AS bshop
FROM sale_detail_jt a
RIGHT OUTER JOIN sale_detail b ON a.shop_name = b.shop_name;

Result:

+------------+------------+
| ashop      | bshop      |
+------------+------------+
| NULL       | s3         |
| NULL       | s6         |
| NULL       | null       |
| s2         | s2         |
| NULL       | s7         |
| s1         | s1         |
+------------+------------+

s3, s6, null, and s7 exist only in sale_detail, so ashop is NULL for those rows.

FULL OUTER JOIN

Returns all rows from both tables. Unmatched rows from either side have NULL in the columns from the other table.

SET odps.sql.allow.fullscan=true;
SELECT a.shop_name AS ashop, b.shop_name AS bshop
FROM sale_detail_jt a
FULL OUTER JOIN sale_detail b ON a.shop_name = b.shop_name;

Result:

+------------+------------+
| ashop      | bshop      |
+------------+------------+
| NULL       | s3         |
| NULL       | s6         |
| s2         | s2         |
| NULL       | null       |
| NULL       | s7         |
| s1         | s1         |
| s5         | NULL       |
+------------+------------+

s5 has no match in sale_detail (NULL on the right); s3, s6, null, and s7 have no match in sale_detail_jt (NULL on the left).

INNER JOIN

Returns only rows where shop_name matches in both tables.

SET odps.sql.allow.fullscan=true;
SELECT a.shop_name AS ashop, b.shop_name AS bshop
FROM sale_detail_jt a
INNER JOIN sale_detail b ON a.shop_name = b.shop_name;

Result:

+------------+------------+
| ashop      | bshop      |
+------------+------------+
| s2         | s2         |
| s1         | s1         |
+------------+------------+

Only s1 and s2 appear in both tables. s5 (only in sale_detail_jt) and s3, s6, null, s7 (only in sale_detail) are excluded.

NATURAL JOIN

Joins on all columns with the same name (shop_name, customer_id, total_price, sale_date, region). The two statements below are equivalent:

SET odps.sql.allow.fullscan=true;
-- NATURAL JOIN (short form)
SELECT * FROM sale_detail_jt NATURAL JOIN sale_detail;

-- Equivalent explicit INNER JOIN
SELECT sale_detail_jt.shop_name    AS shop_name,
       sale_detail_jt.customer_id  AS customer_id,
       sale_detail_jt.total_price  AS total_price,
       sale_detail_jt.sale_date    AS sale_date,
       sale_detail_jt.region       AS region
FROM sale_detail_jt
INNER JOIN sale_detail
ON  sale_detail_jt.shop_name    = sale_detail.shop_name
AND sale_detail_jt.customer_id  = sale_detail.customer_id
AND sale_detail_jt.total_price  = sale_detail.total_price
AND sale_detail_jt.sale_date    = sale_detail.sale_date
AND sale_detail_jt.region       = sale_detail.region;

Result:

+------------+-------------+-------------+------------+------------+
| shop_name  | customer_id | total_price | sale_date  | region     |
+------------+-------------+-------------+------------+------------+
| s1         | c1          | 100.1       | 2013       | china      |
| s2         | c2          | 100.2       | 2013       | china      |
+------------+-------------+-------------+------------+------------+

s5 is excluded because its customer_id (c2) and total_price (100.2) match s2 in sale_detail, but shop_name (s5) does not match. All five columns must match for a row to be included.

Implicit JOIN

Uses a WHERE clause instead of the JOIN keyword. The two statements below return the same result:

SET odps.sql.allow.fullscan=true;
-- Implicit JOIN
SELECT * FROM sale_detail_jt, sale_detail
WHERE sale_detail_jt.shop_name = sale_detail.shop_name;

-- Equivalent explicit JOIN
SELECT * FROM sale_detail_jt
JOIN sale_detail ON sale_detail_jt.shop_name = sale_detail.shop_name;

Result:

+------------+-------------+-------------+------------+------------+------------+--------------+--------------+------------+------------+
| shop_name  | customer_id | total_price | sale_date  | region     | shop_name2 | customer_id2 | total_price2 | sale_date2 | region2    |
+------------+-------------+-------------+------------+------------+------------+--------------+--------------+------------+------------+
| s2         | c2          | 100.2       | 2013       | china      | s2         | c2           | 100.2        | 2013       | china      |
| s1         | c1          | 100.1       | 2013       | china      | s1         | c1           | 100.1        | 2013       | china      |
+------------+-------------+-------------+------------+------------+------------+--------------+--------------+------------+------------+

Multiple JOIN operations

Chain multiple JOIN operations in a single query. Use parentheses to control evaluation order.

Without parentheses (left-to-right evaluation):

SET odps.sql.allow.fullscan=true;
SELECT a.*
FROM sale_detail_jt a
FULL OUTER JOIN sale_detail b ON a.shop_name = b.shop_name
FULL OUTER JOIN sale_detail c ON a.shop_name = c.shop_name;

Result:

+------------+-------------+-------------+------------+------------+
| shop_name  | customer_id | total_price | sale_date  | region     |
+------------+-------------+-------------+------------+------------+
| s5         | c2          | 100.2       | 2013       | china      |
| NULL       | NULL        | NULL        | NULL       | NULL       |
| NULL       | NULL        | NULL        | NULL       | NULL       |
| NULL       | NULL        | NULL        | NULL       | NULL       |
| NULL       | NULL        | NULL        | NULL       | NULL       |
| NULL       | NULL        | NULL        | NULL       | NULL       |
| NULL       | NULL        | NULL        | NULL       | NULL       |
| s1         | c1          | 100.1       | 2013       | china      |
| NULL       | NULL        | NULL        | NULL       | NULL       |
| s2         | c2          | 100.2       | 2013       | china      |
| NULL       | NULL        | NULL        | NULL       | NULL       |
+------------+-------------+-------------+------------+------------+

With parentheses (inner JOIN evaluated first):

SET odps.sql.allow.fullscan=true;
-- The JOIN inside parentheses runs first, then the outer JOIN.
SELECT *
FROM shop
JOIN (sale_detail_jt JOIN sale_detail ON sale_detail_jt.shop_name = sale_detail.shop_name)
ON shop.shop_name = sale_detail_jt.shop_name;

Result:

+------------+-------------+-------------+------------+------------+------------+--------------+--------------+------------+------------+------------+--------------+--------------+------------+------------+
| shop_name  | customer_id | total_price | sale_date  | region     | shop_name2 | customer_id2 | total_price2 | sale_date2 | region2    | shop_name3 | customer_id3 | total_price3 | sale_date3 | region3    |
+------------+-------------+-------------+------------+------------+------------+--------------+--------------+------------+------------+------------+--------------+--------------+------------+------------+
| s2         | c2          | 100.2       | 2013       | china      | s2         | c2           | 100.2        | 2013       | china      | s2         | c2           | 100.2        | 2013       | china      |
| s1         | c1          | 100.1       | 2013       | china      | s1         | c1           | 100.1        | 2013       | china      | s1         | c1           | 100.1        | 2013       | china      |
+------------+-------------+-------------+------------+------------+------------+--------------+--------------+------------+------------+------------+--------------+--------------+------------+------------+

WHERE and JOIN: understanding the difference

When filtering LEFT JOIN results, place the filter condition in a subquery rather than the outer WHERE clause. The two examples below look similar but return different results.

Correct usage — filter before joining:

SET odps.sql.allow.fullscan=true;
SELECT a.shop_name,
       a.customer_id,
       a.total_price,
       b.total_price
FROM   (SELECT * FROM sale_detail     WHERE region = 'china') a
LEFT JOIN
       (SELECT * FROM sale_detail_jt  WHERE region = 'china') b
ON     a.shop_name = b.shop_name;

Result:

+------------+-------------+-------------+--------------+
| shop_name  | customer_id | total_price | total_price2 |
+------------+-------------+-------------+--------------+
| s1         | c1          | 100.1       | 100.1        |
| s2         | c2          | 100.2       | 100.2        |
| s3         | c3          | 100.3       | NULL         |
+------------+-------------+-------------+--------------+

All three rows from sale_detail where region = 'china' are returned. s3 has no match in sale_detail_jt, so total_price2 is NULL — the LEFT JOIN preserves the row.

Incorrect usage — filter after joining with WHERE:

SET odps.sql.allow.fullscan=true;
SELECT a.shop_name,
       a.customer_id,
       a.total_price,
       b.total_price
FROM   sale_detail a
LEFT JOIN sale_detail_jt b ON a.shop_name = b.shop_name
WHERE  a.region = 'china' AND b.region = 'china';

Result:

+------------+-------------+-------------+--------------+
| shop_name  | customer_id | total_price | total_price2 |
+------------+-------------+-------------+--------------+
| s1         | c1          | 100.1       | 100.1        |
| s2         | c2          | 100.2       | 100.2        |
+------------+-------------+-------------+--------------+

s3 is missing. Because s3 has no match in sale_detail_jt, b.region is NULL after the LEFT JOIN. The WHERE b.region = 'china' condition then filters out that row, turning the LEFT JOIN into an effective INNER JOIN.