All Products
Search
Document Center

MaxCompute:INTERSECT, UNION, EXCEPT, and MINUS

Last Updated:Mar 26, 2026

MaxCompute supports four set operators for combining query result sets: INTERSECT, UNION, EXCEPT, and MINUS. Each operator has two variants that differ in how they handle duplicate rows.

Overview

OperatorReturns
INTERSECTRows present in both datasets
UNIONAll rows from both datasets combined
EXCEPT / MINUSRows in the left dataset that are not in the right dataset

EXCEPT and MINUS are synonyms — they produce identical results.

Duplicate handling: set semantics vs. bag semantics

All set operators work in one of two modes:

  • Set semantics (default): removes duplicate rows from the result, equivalent to applying DISTINCT. Use INTERSECT, UNION, or EXCEPT without the ALL keyword.

  • Bag semantics: preserves duplicate rows. Use INTERSECT ALL, UNION ALL, or EXCEPT ALL.

In short: with ALL, duplicates are kept; without ALL, duplicates are removed.

Limits

  • Set operators can combine a maximum of 256 datasets in a single statement. Exceeding this limit returns an error.

  • Both datasets must have the same number of columns.

Usage notes

  • Results are not guaranteed to be in any specific order unless you add an ORDER BY clause.

  • If the corresponding columns have incompatible data types, MaxCompute performs implicit type conversion before executing the operation. See Data types for conversion rules.

  • Implicit type conversion between STRING and other types is disabled for all set operations. Cast columns explicitly when combining STRING with non-string types.

INTERSECT

Returns rows that appear in both datasets.

Syntax

-- Keep duplicate rows in the result.
<select_statement1> INTERSECT ALL <select_statement2>;

-- Remove duplicate rows from the result. INTERSECT and INTERSECT DISTINCT are equivalent.
<select_statement1> INTERSECT [DISTINCT] <select_statement2>;

Parameters

ParameterRequiredDescription
select_statement1, select_statement2YesSELECT clauses to combine. See SELECT syntax.
DISTINCTNoRemoves duplicate rows from the intersection. Omitting ALL has the same effect.

Examples

Example 1: Return the intersection, keeping duplicate rows.

SELECT * FROM VALUES (1, 2), (1, 2), (3, 4), (5, 6) t(a, b)
INTERSECT ALL
SELECT * FROM VALUES (1, 2), (1, 2), (3, 4), (5, 7) t(a, b);

Result:

+------------+------------+
| a          | b          |
+------------+------------+
| 1          | 2          |
| 1          | 2          |
| 3          | 4          |
+------------+------------+

Example 2: Return the intersection, removing duplicate rows.

SELECT * FROM VALUES (1, 2), (1, 2), (3, 4), (5, 6) t(a, b)
INTERSECT DISTINCT
SELECT * FROM VALUES (1, 2), (1, 2), (3, 4), (5, 7) t(a, b);
-- Equivalent to:
SELECT DISTINCT * FROM
(SELECT * FROM VALUES (1, 2), (1, 2), (3, 4), (5, 6) t(a, b)
INTERSECT ALL
SELECT * FROM VALUES (1, 2), (1, 2), (3, 4), (5, 7) t(a, b)) t;

Result:

+------------+------------+
| a          | b          |
+------------+------------+
| 1          | 2          |
| 3          | 4          |
+------------+------------+

UNION

Returns all rows from both datasets combined.

Syntax

-- Keep duplicate rows in the result.
<select_statement1> UNION ALL <select_statement2>;

-- Remove duplicate rows from the result.
<select_statement1> UNION [DISTINCT] <select_statement2>;

Parameters

ParameterRequiredDescription
select_statement1, select_statement2YesSELECT clauses to combine. See SELECT syntax.
DISTINCTNoRemoves duplicate rows from the union result.

Usage notes

  • When chaining multiple UNION ALL operations, use parentheses to specify evaluation order.

  • The behavior of CLUSTER BY, DISTRIBUTE BY, SORT BY, ORDER BY, and LIMIT clauses following UNION depends on the odps.sql.type.system.odps2 setting:

    • SET odps.sql.type.system.odps2=true; — the clause applies to the results of all UNION operations.

    • SET odps.sql.type.system.odps2=false; — the clause applies only to the last select_statement in the UNION.

Examples

Example 1: Return the union, keeping duplicate rows.

SELECT * FROM VALUES (1, 2), (1, 2), (3, 4) t(a, b)
UNION ALL
SELECT * FROM VALUES (1, 2), (1, 4) t(a, b);

Result:

+------------+------------+
| a          | b          |
+------------+------------+
| 1          | 2          |
| 1          | 2          |
| 3          | 4          |
| 1          | 2          |
| 1          | 4          |
+------------+------------+

Example 2: Return the union, removing duplicate rows.

SELECT * FROM VALUES (1, 2), (1, 2), (3, 4) t(a, b)
UNION DISTINCT
SELECT * FROM VALUES (1, 2), (1, 4) t(a, b);
-- Equivalent to:
SELECT DISTINCT * FROM (
SELECT * FROM VALUES (1, 2), (1, 2), (3, 4) t(a, b)
UNION ALL
SELECT * FROM VALUES (1, 2), (1, 4) t(a, b));

Result:

+------------+------------+
| a          | b          |
+------------+------------+
| 1          | 2          |
| 1          | 4          |
| 3          | 4          |
+------------+------------+

Example 3: Use parentheses to control the evaluation order of UNION ALL operations.

SELECT * FROM VALUES (1, 2), (1, 2), (5, 6) t(a, b)
UNION ALL
(SELECT * FROM VALUES (1, 2), (1, 2), (3, 4) t(a, b)
UNION ALL
SELECT * FROM VALUES (1, 2), (1, 4) t(a, b));

Result:

+------------+------------+
| a          | b          |
+------------+------------+
| 1          | 2          |
| 1          | 2          |
| 5          | 6          |
| 1          | 2          |
| 1          | 2          |
| 3          | 4          |
| 1          | 2          |
| 1          | 4          |
+------------+------------+

Example 4: Use UNION ALL followed by ORDER BY and LIMIT with odps.sql.type.system.odps2=true. The ORDER BY and LIMIT apply to the combined result set.

SET odps.sql.type.system.odps2=true;
SELECT explode(ARRAY(3, 1)) AS (a) UNION ALL SELECT explode(ARRAY(0, 4, 2)) AS (a) ORDER BY a limit 3;

Result:

+------------+
| a          |
+------------+
| 0          |
| 1          |
| 2          |
+------------+

Example 5: Use UNION ALL followed by ORDER BY and LIMIT with odps.sql.type.system.odps2=false. The ORDER BY and LIMIT apply only to the last SELECT statement, not to the combined result. All rows from both queries are returned.

SET odps.sql.type.system.odps2=false;
SELECT explode(ARRAY(3, 1)) AS (a) UNION ALL SELECT explode(ARRAY(0, 4, 2)) AS (a) ORDER BY a limit 3;

Result:

+------------+
| a          |
+------------+
| 3          |
| 1          |
| 0          |
| 2          |
| 4          |
+------------+

EXCEPT and MINUS

Returns rows from the left dataset that do not appear in the right dataset. EXCEPT and MINUS are synonyms.

Syntax

-- Keep duplicate rows in the result.
<select_statement1> EXCEPT ALL <select_statement2>;
<select_statement1> MINUS ALL <select_statement2>;

-- Remove duplicate rows from the result.
<select_statement1> EXCEPT [DISTINCT] <select_statement2>;
<select_statement1> MINUS [DISTINCT] <select_statement2>;

Parameters

ParameterRequiredDescription
select_statement1, select_statement2YesSELECT clauses to combine. See SELECT syntax.
DISTINCTNoRemoves duplicate rows from the result.

Examples

Example 1: Return rows in the left dataset that are not in the right dataset, keeping duplicate rows.

SELECT * FROM VALUES (1, 2), (1, 2), (3, 4), (3, 4), (5, 6), (7, 8) t(a, b)
EXCEPT ALL
SELECT * FROM VALUES (3, 4), (5, 6), (5, 6), (9, 10) t(a, b);
-- Equivalent to:
SELECT * FROM VALUES (1, 2), (1, 2), (3, 4), (3, 4), (5, 6), (7, 8) t(a, b)
MINUS ALL
SELECT * FROM VALUES (3, 4), (5, 6), (5, 6), (9, 10) t(a, b);

Result:

+------------+------------+
| a          | b          |
+------------+------------+
| 1          | 2          |
| 1          | 2          |
| 3          | 4          |
| 7          | 8          |
+------------+------------+

Example 2: Return rows in the left dataset that are not in the right dataset, removing duplicate rows.

SELECT * FROM VALUES (1, 2), (1, 2), (3, 4), (3, 4), (5, 6), (7, 8) t(a, b)
EXCEPT DISTINCT
SELECT * FROM VALUES (3, 4), (5, 6), (5, 6), (9, 10) t(a, b);
-- Equivalent to:
SELECT * FROM VALUES (1, 2), (1, 2), (3, 4), (3, 4), (5, 6), (7, 8) t(a, b)
MINUS DISTINCT
SELECT * FROM VALUES (3, 4), (5, 6), (5, 6), (9, 10) t(a, b);
-- Both are equivalent to:
SELECT DISTINCT * FROM VALUES (1, 2), (1, 2), (3, 4), (3, 4), (5, 6), (7, 8) t(a, b) except all select * from values (3, 4), (5, 6), (5, 6), (9, 10) t(a, b);

Result:

+------------+------------+
| a          | b          |
+------------+------------+
| 1          | 2          |
| 7          | 8          |
+------------+------------+