This topic describes how to perform the following operations on datasets of query
results in MaxCompute: `INTERSECT`

, `INTERSECT ALL`

, `INTERSECT DISTINCT`

, `UNION`

, `UNION ALL`

, `UNION DISTINCT`

, `EXCEPT`

, `EXCEPT ALL`

, `EXCEPT DISTINCT`

, `MINUS`

, `MINUS ALL`

, and `MINUS DISTINCT`

.

## Description

MaxCompute supports the following operations on datasets:

- INTERSECT: returns the intersection of two datasets. The intersection includes the values that are contained in both datasets.
- UNION: returns the union of two datasets. The union is a dataset that is obtained by combining the two datasets.
- EXCEPT and MINUS: returns distinct values from one of two datasets. These values are not contained in the other dataset.

## Limits

The following limits are imposed on INTERSECT, UNION, EXCEPT, and MINUS:

- MaxCompute allows you to perform INTERSECT, UNION, EXCEPT, or MINUS operations on a maximum of 256 datasets at the same time. If the number of datasets exceeds 256, an error is returned.
- The number of columns in the left and right tables must be the same.

## Usage notes

When you perform INTERSECT, UNION, EXCEPT, or MINUS operations on two datasets, take note of the following items:

- The results of INTERSECT, UNION, EXCEPT, or MINUS operations may not be sorted in a specific order.
- If data types of the datasets are not consistent, MaxCompute implicitly converts the data types before you perform INTERSECT, UNION, EXCEPT, or MINUS operations. For more information about implicit conversions, see Data types. To prevent compatibility issues, MaxCompute disables implicit conversions between data of the STRING type and data of other types for INTERSECT, UNION, EXCEPT, or MINUS operations.

## INTERSECT

- Syntax
`-- Obtain an intersection that contains duplicate values. <select_statement1> intersect all <select_statement2>; -- Obtain an intersection that does not contain duplicate values. The usage of`

`INTERSECT`

is equivalent to that of`INTERSECT DISTINCT`

. <select_statement1> intersect [distinct] <select_statement2>; - Parameters
- select_statement1 and select_statement2: required. These parameters specify the
`SELECT`

clauses. For more information about the syntax of the clauses, see SELECT syntax. - distinct: optional. This parameter is used to remove duplicate values from the intersection of two datasets.

- select_statement1 and select_statement2: required. These parameters specify the
- Examples
- Example 1: Obtain the intersection of two datasets. The intersection contains duplicate
values. Sample statement:
`select * from values (1, 2), (1, 2), (3, 4), (5, 6) t(a, b) intersect all select * from values (1, 2), (1, 2), (3, 4), (5, 7) t(a, b);`

The following result is returned:`+------------+------------+ | a | b | +------------+------------+ | 1 | 2 | | 1 | 2 | | 3 | 4 | +------------+------------+`

- Example 2: Obtain the intersection of two datasets. The intersection does not contain
duplicate values. Sample statements:
`select * from values (1, 2), (1, 2), (3, 4), (5, 6) t(a, b) intersect distinct select * from values (1, 2), (1, 2), (3, 4), (5, 7) t(a, b); -- The preceding statement is equivalent to the following statement: select distinct * from (select * from values (1, 2), (1, 2), (3, 4), (5, 6) t(a, b) intersect all select * from values (1, 2), (1, 2), (3, 4), (5, 7) t(a, b)) t;`

The following result is returned:`+------------+------------+ | a | b | +------------+------------+ | 1 | 2 | | 3 | 4 | +------------+------------+`

- Example 1: Obtain the intersection of two datasets. The intersection contains duplicate
values. Sample statement:

## UNION

- Syntax
`-- Obtain a union that contains duplicate values. <select_statement1> union all <select_statement2>; -- Obtain a union that does not contain duplicate values. <select_statement1> union [distinct] <select_statement2>;`

- Usage notes
- If multiple
`UNION ALL`

operations exist, use parentheses () to specify the priorities of the`UNION ALL`

operations. - If
`UNION`

is followed by a`CLUSTER BY`

,`DISTRIBUTE BY`

,`SORT BY`

,`ORDER BY`

, or`LIMIT`

clause and`odps.sql.type.system.odps2`

is set to false, the clause works only on the last`select_statement`

of`UNION`

. If`odps.sql.type.system.odps2`

is set to true, the clause works on the results of all`UNION`

operations.

- If multiple
- Parameters
- select_statement1 and select_statement2: required. These parameters specify the
`SELECT`

clauses. For more information about the syntax of the clauses, see SELECT syntax. - distinct: optional. This parameter is used to remove duplicate values from the union of two datasets.

- select_statement1 and select_statement2: required. These parameters specify the
- Examples
- Example 1: Obtain the union of two datasets. The union contains duplicate values.
Sample statement:

The following result is returned:`select * from values (1, 2), (1, 2), (3, 4) t(a, b) union all select * from values (1, 2), (1, 4) t(a, b);`

`+------------+------------+ | a | b | +------------+------------+ | 1 | 2 | | 1 | 2 | | 3 | 4 | | 1 | 2 | | 1 | 4 | +------------+------------+`

- Example 2: Obtain the union of two datasets. The union does not contain duplicate
values. Sample statements:

The following result is returned:`select * from values (1, 2), (1, 2), (3, 4) t(a, b) union distinct select * from values (1, 2), (1, 4) t(a, b); -- The preceding statement is equivalent to the following statement: select distinct * from ( select * from values (1, 2), (1, 2), (3, 4) t(a, b) union all select * from values (1, 2), (1, 4) t(a, b));`

`+------------+------------+ | a | b | +------------+------------+ | 1 | 2 | | 1 | 4 | | 3 | 4 | +------------+------------+`

- Example 3: Use parentheses () to specify the priorities of
`UNION ALL`

operations. Sample statement:

The following result is returned:`select * from values (1, 2), (1, 2), (5, 6) t(a, b) union all (select * from values (1, 2), (1, 2), (3, 4) t(a, b) union all select * from values (1, 2), (1, 4) t(a, b));`

`+------------+------------+ | a | b | +------------+------------+ | 1 | 2 | | 1 | 2 | | 5 | 6 | | 1 | 2 | | 1 | 2 | | 3 | 4 | | 1 | 2 | | 1 | 4 | +------------+------------+`

- Example 4: Use
`UNION`

that is followed by a`CLUSTER BY`

,`DISTRIBUTE BY`

,`SORT BY`

,`ORDER BY`

, or`LIMIT`

clause in a SELECT statement and set`odps.sql.type.system.odps2`

to true. Sample statements:

The following result is returned:`set odps.sql.type.system.odps2=true; select explode(array(3, 1)) as (a) union all select explode(array(0, 4, 2)) as (a) order by a limit 3;`

`+------------+ | a | +------------+ | 0 | | 1 | | 2 | +------------+`

- Example 5: Use
`UNION`

that is followed by a`CLUSTER BY`

,`DISTRIBUTE BY`

,`SORT BY`

,`ORDER BY`

, or`LIMIT`

clause in a SELECT statement and set`odps.sql.type.system.odps2`

to false. Sample statements:

The following result is returned:`set odps.sql.type.system.odps2=false; select explode(array(3, 1)) as (a) union all select explode(array(0, 4, 2)) as (a) order by a limit 3;`

`+------------+ | a | +------------+ | 3 | | 1 | | 0 | | 2 | | 4 | +------------+`

- Example 1: Obtain the union of two datasets. The union contains duplicate values.
Sample statement:

## EXCEPT and MINUS

- Syntax
`-- Obtain the supplementary set of two datasets. The supplementary set contains duplicate values. <select_statement1> except all <select_statement2>; <select_statement1> minus all <select_statement2>; -- Obtain the supplementary set of two datasets. The supplementary set does not contain duplicate values. <select_statement1> except [distinct] <select_statement2>; <select_statement1> minus [distinct] <select_statement2>;`

**Note**The usage of`EXCEPT`

is equivalent to that of`MINUS`

. - Parameters
- select_statement1 and select_statement2: required. These parameters specify the
`SELECT`

clauses. For more information about the syntax of the clauses, see SELECT syntax. - distinct: optional. This parameter is used to remove duplicate values from the supplementary set of two datasets.

- select_statement1 and select_statement2: required. These parameters specify the
- Examples
- Example 1: Obtain the supplementary set of two datasets. The supplementary set contains
duplicate values. Sample statements:
`select * from values (1, 2), (1, 2), (3, 4), (3, 4), (5, 6), (7, 8) t(a, b) except all select * from values (3, 4), (5, 6), (5, 6), (9, 10) t(a, b); -- The preceding statement is equivalent to the following statement: select * from values (1, 2), (1, 2), (3, 4), (3, 4), (5, 6), (7, 8) t(a, b) minus all select * from values (3, 4), (5, 6), (5, 6), (9, 10) t(a, b);`

The following result is returned:`+------------+------------+ | a | b | +------------+------------+ | 1 | 2 | | 1 | 2 | | 3 | 4 | | 7 | 8 | +------------+------------+`

- Example 2: Obtain the supplementary set of two datasets. The supplementary set does
not contain duplicate values. Sample statements:

The following result is returned:`select * from values (1, 2), (1, 2), (3, 4), (3, 4), (5, 6), (7, 8) t(a, b) except distinct select * from values (3, 4), (5, 6), (5, 6), (9, 10) t(a, b); -- The preceding statement is equivalent to the following statements: select * from values (1, 2), (1, 2), (3, 4), (3, 4), (5, 6), (7, 8) t(a, b) minus distinct select * from values (3, 4), (5, 6), (5, 6), (9, 10) t(a, b); -- The preceding statements are equivalent to the following statement: select distinct * from values (1, 2), (1, 2), (3, 4), (3, 4), (5, 6), (7, 8) t(a, b) except all select * from values (3, 4), (5, 6), (5, 6), (9, 10) t(a, b);`

`+------------+------------+ | a | b | +------------+------------+ | 1 | 2 | | 7 | 8 | +------------+------------+`

- Example 1: Obtain the supplementary set of two datasets. The supplementary set contains
duplicate values. Sample statements: