This topic describes the SQL statements that use the UNION, INTERSECT, and EXCEPT set operators, such as UNION, UNION ALL, UNION DISTINCT, INTERSECT, INTERSECT ALL, INTERSECT DISTINCT, EXCEPT, EXCEPT ALL, and EXCEPT DISTINCT.

The syntax of such statements is as follows:
select_statement UNION ALL select_statement;
select_statement UNION [DISTINCT] select_statement;
select_statement INTERSECT ALL select_statement;
select_statement INTERSECT [DISTINCT] select_statement;
select_statement EXCEPT ALL select_statement;
select_statement EXCEPT [DISTINCT] select_statement;
select_statement MINUS ALL select_statement;
select_statement MINUS [DISTINCT] select_statement;
  • UNION: returns the union of two datasets. It combines the two datasets into one dataset.
  • INTERSECT: returns the intersection of two datasets, that is, data contained in both datasets.
  • EXCEPT: returns distinct values from first dataset that are not contained in the second dataset.
  • MINUS: equivalent to EXCEPT.
Examples
  • Example of a UNION ALL statement
    SELECT * FROM VALUES (1, 2), (1, 2), (3, 4) t(a, b) 
    UNION ALL 
    SELECT * FROM VALUES (1, 2), (1, 4) t(a, b);
    Returned result: The two datasets are combined.
    +------------+------------+
    | a          | b          |
    +------------+------------+
    | 1          | 2          |
    | 1          | 4          |
    | 1          | 2          |
    | 1          | 2          |
    | 3          | 4          |
    +------------+------------+
    If multiple UNION ALL clauses exist, enclose the corresponding UNION ALL clause in parentheses (( )) to influence the order of precedence.
    SELECT * FROM src UNION ALL (SELECT * FROM src2 UNION ALL SELECT * FROM src3);
  • Example of a UNION DISTINCT statement
    SELECT * FROM VALUES (1, 2), (1, 2), (3, 4) t(a, b) 
    UNION 
    SELECT * FROM VALUES (1, 2), (1, 4) t(a, b);
    Returned result: The returned result is equivalent to that returned by running the SELECT DISTINCT * FROM (< result of UNION ALL >) t; statement.
    +------------+------------+
    | a          | b          |
    +------------+------------+
    | 1          | 2          |
    | 1          | 4          |
    | 3          | 4          |
    +------------+------------+
  • Example of an INTERSECT ALL statement
    SELECT * FROM VALUES (1, 2), (1, 2), (3, 4), (5, 6) t(a, b) 
    INTERSECT ALL 
    SELECT * FROM VALUES (1, 2), (1, 2), (3, 4), (5, 7) t(a, b);
    Returned result: Deduplication is skipped in INTERSECT ALL. Hidden serial numbers are behind rows with the same values to distinguish them.
    +------------+------------+
    | a          | b          |
    +------------+------------+
    | 1          | 2          |
    | 1          | 2          |
    | 3          | 4          |
    +------------+------------+
  • Example of an INTERSECT DISTINCT statement
    SELECT * FROM VALUES (1, 2), (1, 2), (3, 4), (5, 6) t(a, b) 
    INTERSECT 
    SELECT * FROM VALUES (1, 2), (1, 2), (3, 4), (5, 7) t(a, b);
    Returned result: The returned result is equivalent to that returned by running the SELECT DISTINCT * FROM (< result of INTERSECT ALL >) t; statement.
    +------------+------------+
    | a          | b          |
    +------------+------------+
    | 1          | 2          |
    | 3          | 4          |
    +------------+------------+
  • Example of an EXCEPT ALL statement
    SELECT * FROM VALUES (1, 2), (1, 2), (3, 4), (3, 4), (5, 6), (7, 8) t(a, b) 
    EXCEPT ALL 
    SELECT * FROM VALUES (3, 4), (5, 6), (5, 6), (9, 10) t(a, b);
    Returned result: Deduplication is skipped in EXCEPT ALL. Hidden serial numbers are behind rows with the same values to distinguish them.
    +------------+------------+
    | a          | b          |
    +------------+------------+
    | 1          | 2          |
    | 1          | 2          |
    | 3          | 4          |
    | 7          | 8          |
    +------------+------------+
  • Example of an EXCEPT DISTINCT statement
    SELECT * FROM VALUES (1, 2), (1, 2), (3, 4), (3, 4), (5, 6), (7, 8) t(a, b) 
    EXCEPT
    SELECT * FROM VALUES (3, 4), (5, 6), (5, 6), (9, 10) t(a, b);
    Returned result: The returned result is equivalent to that returned by running the SELECT DISTINCT * FROM left_branch EXCEPT ALL SELECT DISTINCT * FROM right_branch; statement.
    +------------+------------+
    | a          | b          |
    +------------+------------+
    | 1          | 2          |
    | 7          | 8          |
    +------------+------------+
    Note
    • The sequence of data returned in a set operation may not be ordered.
    • The left and right branches in a set operation must have the same number of columns. In addition, if data types in the left and right branches are not the same, the data types may be implicitly converted. For compatibility reasons, MaxCompute disables the implicit conversion between data of the string type and data of non-string types in set operations.
    • MaxCompute allows up to 256 branches in a set operation. If the number of branches exceeds 256, an error is reported.
    • Assume that a UNION statement is followed by a CLUSTER BY, a DISTRIBUTE BY, a SORT BY, an ORDER BY, or a LIMIT clause. If you use the set odps.sql.type.system.odps2=false; statement, the clause works only on the result of the SELECT statement after the UNION operator. If you use the set odps.sql.type.system.odps2=true; statement, the clause works on the result of the UNION operation. An example is as follows:
      set odps.sql.type.system.odps2=true;
      SELECT explode(array(3, 1)) AS (a) UNION ALL SELECT explode(array(0, 4, 2)) AS (a) ORDER BY a LIMIT 3;
      The returned result is as follows:
      +------+
      | a    |
      +------+
      | 0    |
      | 1    |
      | 2    |
      +------+