You can use a JOIN operation to join two tables and obtain query results from the joined tables. In MaxCompute, JOIN operations include LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN, and INNER JOIN. MaxCompute supports multiple JOIN operations in an SQL statement, but does not support CROSS JOIN. A CROSS JOIN operation joins two tables without specifying conditions in the ON clause and returns the cartesian product of the input tables.
join_table:
table_reference join table_factor [join_condition]
| table_reference {left outer|right outer|full outer|inner} join table_reference join_condition
table_reference:
table_factor
| join_table
table_factor:
tbl_name [alias]
| table_subquery alias
| ( table_references )
join_condition:
on equality_expression ( and equality_expression )
equality_expression
indicates an equality expression.- If partition pruning conditions are specified in the
WHERE
clause, partition pruning takes effect for both tables. If partition pruning conditions are specified in theON
clause, partition pruning takes effect for the right table but not the left table. A full table scan is run for the left table. For more information, see Reasonableness evaluation of partition pruning.
LEFT OUTER JOIN
: the LEFT JOIN operation. It returns all queried rows from the left table, including those that do not match any rows in the right table.select a.shop_name as ashop, b.shop_name as bshop from shop a left outer join sale_detail b on a.shop_name=b.shop_name; -- Both the shop and sale_detail tables have the shop_name column. Therefore, aliases are assigned to the shop_name columns in the SELECT clause to distinguish between the columns.
Note If the values in some rows of the right table are duplicate, we recommend that you do not useLEFT JOIN
consecutively many times. Otherwise, data bloat may occur during theJOIN
operation and interrupt your jobs.RIGHT OUTER JOIN
: the RIGHT JOIN operation. It returns all queried rows from the right table, including those that do not match any rows in the left table.select a.shop_name as ashop, b.shop_name as bshop from shop a right outer join sale_detail b on a.shop_name=b.shop_name;
FULL OUTER JOIN
: the FULL JOIN operation. It returns all queried rows from both the left and right tables.select a.shop_name as ashop, b.shop_name as bshop from shop a full outer join sale_detail b on a.shop_name=b.shop_name;
In the following example, the left table is test_table_a
and the right table is test_table_b
. The JOIN
clause is used to join tables. The WHERE
clause is used to specify filter conditions. To query rows whose value of the origin
column is equal to that of the id
column and whose data timestamp is greater than 20180101, you can use LEFT JOIN
to return all queried rows in the left table test_table_a
. Normally, if you use LEFT JOIN
, it returns all queried rows in the left table. If you use RIGHT JOIN
, it returns all queried rows in the right table.
SELECT s.id
,s.name
,s.origin
,d.value
FROM test_table_a s
LEFT JOIN test_table_b d
ON s.origin = d.id
WHERE s.ds > "20180101" AND d.ds>"20180101";
If you use the JOIN
clause before the WHERE
clause, the JOIN
operation is performed first. Then, the results obtained from the JOIN
operation are filtered based on the conditions specified by the WHERE
clause. The final results are the intersection of two tables but not all queried
rows in the left table.
SELECT s.id
,s.name
,s.origin
,d.value
FROM (select * from test_table_a where ds > "20180101" ) s
LEFT JOIN (select * from test_table_b where ds > "20180101") d
ON s.origin = d.id;
INNER JOIN
: returns a result table that combines the queried rows from two tables that meet
the conditions specified by the ON clause. The keyword INNER
can be omitted.
select a.shop_name from shop a inner join sale_detail b on a.shop_name=b.shop_name;
select a.shop_name from shop a join sale_detail b on a.shop_name=b.shop_name;
and
to connect these conditions. You can specify a non-equivalent condition or use or
to connect multiple conditions in the ON clause only for MAPJOIN.
select a.* from shop a full outer join sale_detail b on a.shop_name=b.shop_name
full outer join sale_detail c on a.shop_name=c.shop_name;
-- MaxCompute supports multiple JOIN operations in an SQL statement.
select a.* from shop a join sale_detail b on a.shop_name ! = b.shop_name;
-- MaxCompute does not support a non-equivalent condition and returns an error.
JOIN
operation:
SELECT * FROM table1, table2 WHERE table1.id = table2.id;
-- The effect of this statement is equivalent to the following statement:
SELECT * FROM table1 JOIN table2 ON table1.id = table2.id;
SELECT * FROM src NATURAL JOIN src2;
-- Both the src and src2 tables contain the key1 and key2 fields. In this case, the preceding statement is equivalent to the following statement:
SELECT src.key1 as key1, src.key2 as key2, src.a1, src.a2, src2.b1, src2.b2 FROM src INNER JOIN src2 ON src.key1 = src2.key1 AND src.key2 = src2.key2;
SELECT * FROM src JOIN (src2 JOIN src3 on xxx) ON yyy;