A subquery is a SELECT statement nested inside another query. Use subqueries to filter rows based on a derived set of values, check whether matching rows exist in another table, compute a single value from related rows, or use a query result as a temporary table in a FROM clause.
Subquery types
MaxCompute subqueries fall into two independent dimensions:
Correlated vs. uncorrelated
A correlated subquery references one or more columns from the outer query, using the current row's values to filter results.
An uncorrelated subquery has no reference to the outer query. MaxCompute runs it independently and passes the result to the outer query.
Scalar vs. non-scalar
A scalar subquery returns exactly one column from one row. You can use the returned value anywhere a scalar expression is valid.
A non-scalar subquery returns zero or more rows and is typically used with
IN,NOT IN,EXISTS, orNOT EXISTS.
The following subquery types are supported:
| Type | Primary clause | Use case |
|---|---|---|
| Basic subquery | FROM | Use a query result as a derived table |
| IN subquery | WHERE | Keep rows whose column value matches any value returned by a subquery |
| NOT IN subquery | WHERE | Keep rows whose column value matches no value returned by a subquery |
| EXISTS subquery | WHERE | Keep rows for which a correlated subquery returns at least one row |
| NOT EXISTS subquery | WHERE | Keep rows for which a correlated subquery returns no rows |
| Scalar subquery | SELECT, WHERE, HAVING | Use a single aggregated value from a related table as a scalar |
MaxCompute converts SCALAR, IN, NOT IN, EXISTS, and NOT EXISTS subqueries into JOIN operations at execution time. MAPJOIN is an efficient broadcast join algorithm. If the subquery result is a small table, add a MAPJOIN hint to the subquery statement to explicitly request this algorithm.
Sample data
The examples in this topic use the following tables. Run these statements to set up the sample data.
-- Create a partitioned table named sale_detail.
create table if not exists sale_detail
(
shop_name string,
customer_id string,
total_price double
)
partitioned by (sale_date string, region string);
-- Add partitions.
alter table sale_detail add partition (sale_date='2013', region='china') partition (sale_date='2014', region='shanghai');
-- Insert data.
insert into sale_detail partition (sale_date='2013', region='china') values ('s1','c1',100.1),('s2','c2',100.2),('s3','c3',100.3);
insert into sale_detail partition (sale_date='2014', region='shanghai') values ('null','c5',null),('s6','c6',100.4),('s7','c7',100.5);To verify the data:
set odps.sql.allow.fullscan=true;
select * from sale_detail;Result:
+------------+-------------+-------------+------------+------------+
| shop_name | customer_id | total_price | sale_date | region |
+------------+-------------+-------------+------------+------------+
| s1 | c1 | 100.1 | 2013 | china |
| s2 | c2 | 100.2 | 2013 | china |
| s3 | c3 | 100.3 | 2013 | china |
| null | c5 | NULL | 2014 | shanghai |
| s6 | c6 | 100.4 | 2014 | shanghai |
| s7 | c7 | 100.5 | 2014 | shanghai |
+------------+-------------+-------------+------------+------------+Basic subquery
A basic subquery appears in the FROM clause and acts as a derived table for the outer query. You can join it with other tables just like a regular table.
For information about JOIN syntax, see JOIN.
Syntax
select <select_expr> from (<select_statement>) [<sq_alias_name>];You can also use a WITH clause (common table expression) to name the subquery before the main SELECT:
with <sq_alias_name> as (<select_statement>)
select <select_expr> from <sq_alias_name>;Parameters
| Parameter | Required | Description |
|---|---|---|
select_expr | Yes | Columns or partition key columns to return. Accepts the format col1_name, col2_name, ... or a regular expression. |
select_statement | Yes | The subquery SELECT statement. See SELECT syntax. |
sq_alias_name | No | Alias for the subquery result set. Required when you reference the subquery in the outer query. |
Examples
Example 1: Select from a subquery result.
set odps.sql.allow.fullscan=true;
select * from (select shop_name from sale_detail) a;Result:
+------------+
| shop_name |
+------------+
| s1 |
| s2 |
| s3 |
| null |
| s6 |
| s7 |
+------------+Example 2: Join a subquery result with another table.
-- Create a shop table, then join a subquery against it.
create table shop as select shop_name, customer_id, total_price from sale_detail;
select a.shop_name, a.customer_id, a.total_price
from (select * from shop) a
join sale_detail on a.shop_name = sale_detail.shop_name;Result:
+------------+-------------+-------------+
| shop_name | customer_id | total_price |
+------------+-------------+-------------+
| null | c5 | NULL |
| s6 | c6 | 100.4 |
| s7 | c7 | 100.5 |
| s1 | c1 | 100.1 |
| s2 | c2 | 100.2 |
| s3 | c3 | 100.3 |
+------------+-------------+-------------+IN subquery
An IN subquery filters the outer query to rows where a column value matches any value returned by the subquery. Internally, MaxCompute converts this to a LEFT SEMI JOIN. For more information, see Semi join.
MaxCompute supports three syntax variants, including multi-column matching (compatible with PostgreSQL).
Syntax
Syntax 1 — uncorrelated, single column:
select <select_expr1> from <table_name1>
where <select_expr2> in (select <select_expr3> from <table_name2>);
-- Equivalent LEFT SEMI JOIN form:
select <select_expr1> from <table_name1> <alias_name1>
left semi join <table_name2> <alias_name2>
on <alias_name1>.<select_expr2> = <alias_name2>.<select_expr3>;When select_expr2 is a partition key column, MaxCompute does not convert the subquery to LEFT SEMI JOIN. Instead, it runs the subquery as a separate job, then compares results against the partition key values. Partitions whose key values do not appear in the subquery result are skipped, so partition pruning still applies.
Syntax 2 — correlated, single column:
select <select_expr1> from <table_name1>
where <select_expr2> in (
select <select_expr3> from <table_name2>
where <table_name1>.<col_name> = <table_name2>.<col_name>
);The where condition that references both the inner and outer table is a correlated condition. MaxCompute V1.0 does not support expressions that reference source tables from both subqueries and main queries. MaxCompute V2.0 supports such expressions. These conditions become part of the ON clause in the resulting SEMI JOIN.
When an IN subquery cannot be converted to a SEMI JOIN — for example, when it appears outside a WHERE clause, or when the WHERE clause condition cannot be expressed as a JOIN — MaxCompute runs it as a separate job. Correlated conditions are not supported in that path.
Syntax 3 — multi-column match:
select <select_expr1> from <table_name1>
where (<col_name1>, <col_name2>) in (
select <select_expr2>, <select_expr3> from <table_name2>
);Multi-column IN subqueries eliminate the need to split a query into multiple subqueries and save one JOIN operation. The subquery can use:
A simple SELECT statement with multiple columns
Aggregate functions (see Aggregate functions)
Constant value lists
Parameters
| Parameter | Required | Description |
|---|---|---|
select_expr1 | Yes | Columns to return from the main query (col1_name, col2_name, ... or a regular expression). |
table_name1, table_name2 | Yes | The main table and the subquery table. |
select_expr2, select_expr3 | Yes | Columns from table_name1 and table_name2 to compare. |
col_name | Yes | Column name used in a correlated condition. |
Usage notes
NULL values are automatically excluded from the subquery result. A row in the main table is included only if its column value is found in the non-NULL subquery results.
Examples
Example 1 (Syntax 1): Return rows from sale_detail where total_price appears in the shop table.
set odps.sql.allow.fullscan=true;
select * from sale_detail where total_price in (select total_price from shop);Result:
+-----------+-------------+-------------+-----------+----------+
| shop_name | customer_id | total_price | sale_date | region |
+-----------+-------------+-------------+-----------+----------+
| s1 | c1 | 100.1 | 2013 | china |
| s2 | c2 | 100.2 | 2013 | china |
| s3 | c3 | 100.3 | 2013 | china |
| s6 | c6 | 100.4 | 2014 | shanghai |
| s7 | c7 | 100.5 | 2014 | shanghai |
+-----------+-------------+-------------+-----------+----------+Example 2 (Syntax 2): Use a correlated condition to filter by customer_id across tables.
set odps.sql.allow.fullscan=true;
select * from sale_detail
where total_price in (
select total_price from shop
where customer_id = shop.customer_id
);Result:
+-----------+-------------+-------------+-----------+----------+
| shop_name | customer_id | total_price | sale_date | region |
+-----------+-------------+-------------+-----------+----------+
| s1 | c1 | 100.1 | 2013 | china |
| s2 | c2 | 100.2 | 2013 | china |
| s3 | c3 | 100.3 | 2013 | china |
| s6 | c6 | 100.4 | 2014 | shanghai |
| s7 | c7 | 100.5 | 2014 | shanghai |
+-----------+-------------+-------------+-----------+----------+Example 3 (Syntax 3): Multi-column matching with a simple SELECT, aggregate functions, and constants.
-- Set up sample tables for this example.
create table if not exists t1(a bigint, b bigint, c bigint, d bigint, e bigint);
create table if not exists t2(a bigint, b bigint, c bigint, d bigint, e bigint);
insert into table t1 values (1,3,2,1,1),(2,2,1,3,1),(3,1,1,1,1),(2,1,1,0,1),(1,1,1,0,1);
insert into table t2 values (1,3,5,0,1),(2,2,3,1,1),(3,1,1,0,1),(2,1,1,0,1),(1,1,1,0,1);
-- Scenario 1: Multi-column SELECT.
select a, b from t1 where (c, d) in (select a, b from t2 where e = t1.e);Result:
+------------+------------+
| a | b |
+------------+------------+
| 1 | 3 |
| 2 | 2 |
| 3 | 1 |
+------------+------------+-- Scenario 2: Aggregate functions in the subquery.
select a, b from t1 where (c, d) in (select max(a), b from t2 where e = t1.e group by b having max(a) > 0);Result:
+------------+------------+
| a | b |
+------------+------------+
| 2 | 2 |
+------------+------------+-- Scenario 3: Constant value list.
select a, b from t1 where (c, d) in ((1, 3), (1, 1));Result:
+------------+------------+
| a | b |
+------------+------------+
| 2 | 2 |
| 3 | 1 |
+------------+------------+NOT IN subquery
A NOT IN subquery filters the outer query to rows where a column value does not match any value returned by the subquery. Internally, MaxCompute converts this to a LEFT ANTI JOIN. For more information, see Semi join.
NOT IN subquery and LEFT ANTI JOIN handle NULLs differently. If any row in the main table has a NULL value in the filtered column, the entire NOT IN expression evaluates to NULL. The WHERE condition fails, and no rows are returned — even rows with non-NULL values. If you need to handle NULLs explicitly, rewrite the query as a LEFT ANTI JOIN instead.
Syntax
Syntax 1 — uncorrelated, single column:
select <select_expr1> from <table_name1>
where <select_expr2> not in (select <select_expr2> from <table_name2>);
-- Equivalent LEFT ANTI JOIN form:
select <select_expr1> from <table_name1> <alias_name1>
left anti join <table_name2> <alias_name2>
on <alias_name1>.<select_expr1> = <alias_name2>.<select_expr2>;When select_expr2 is a partition key column, MaxCompute does not convert the subquery to LEFT ANTI JOIN. It runs the subquery separately, then skips partitions whose key values do not appear in the result. Partition pruning remains valid.
Syntax 2 — correlated, single column:
select <select_expr1> from <table_name1>
where <select_expr2> not in (
select <select_expr2> from <table_name2>
where <table_name2_colname> = <table_name1>.<colname>
);The where condition referencing both tables is a correlated condition. MaxCompute V1.0 does not support expressions that reference source tables from both subqueries and main queries. MaxCompute V2.0 supports such expressions. These conditions become part of the ON clause in the resulting ANTI JOIN.
When a NOT IN subquery cannot be converted to an ANTI JOIN — for example, when the WHERE clause includes an AND with other conditions — MaxCompute runs it as a separate job. Correlated conditions are not supported in that path.
Syntax 3 — multi-column match:
select <select_expr1> from <table_name1>
where (<col_name1>, <col_name2>) not in (
select <select_expr2>, <select_expr3> from <table_name2>
);Multi-column NOT IN subqueries eliminate the need to split a query into multiple subqueries and save one JOIN operation. The subquery can use a simple SELECT with multiple columns, aggregate functions, or constant value lists.
Parameters
| Parameter | Required | Description |
|---|---|---|
select_expr1 | Yes | Columns to return from the main query. |
table_name1, table_name2 | Yes | The main table and the subquery table. |
select_expr2, select_expr3 | Yes | Columns to compare between the two tables. |
col_name | Yes | Column name used in a correlated condition. |
Usage notes
NULL values are automatically excluded from the subquery result.
If a row in the main table has a NULL value in the filtered column, the NOT IN expression returns NULL and the row is excluded from results regardless of other values. Use LEFT ANTI JOIN if this behavior is not desired.
Examples
Example 1 (Syntax 1): Return rows from shop1 where shop_name does not appear in sale_detail.
-- Create a shop1 table with an extra row not in sale_detail.
create table shop1 as select shop_name, customer_id, total_price from sale_detail;
insert into shop1 values ('s8','c1',100.1);
select * from shop1 where shop_name not in (select shop_name from sale_detail);Result:
+------------+-------------+-------------+
| shop_name | customer_id | total_price |
+------------+-------------+-------------+
| s8 | c1 | 100.1 |
+------------+-------------+-------------+Example 2 (Syntax 2): Correlated NOT IN subquery.
set odps.sql.allow.fullscan=true;
select * from shop1
where shop_name not in (
select shop_name from sale_detail
where customer_id = shop1.customer_id
);Result:
+------------+-------------+-------------+
| shop_name | customer_id | total_price |
+------------+-------------+-------------+
| s8 | c1 | 100.1 |
+------------+-------------+-------------+Example 3: NOT IN subquery that cannot be converted to ANTI JOIN. When the WHERE clause includes an AND with other predicates, MaxCompute cannot convert to ANTI JOIN and runs a separate subquery job.
set odps.sql.allow.fullscan=true;
select * from shop1
where shop_name not in (select shop_name from sale_detail)
and total_price < 100.3;Result:
+------------+-------------+-------------+
| shop_name | customer_id | total_price |
+------------+-------------+-------------+
| s8 | c1 | 100.1 |
+------------+-------------+-------------+Example 4: NULL in the filtered column — no rows returned. This example illustrates the NULL behavior warning above.
-- Create a sale table with a row where shop_name is NULL.
create table if not exists sale
(
shop_name string,
customer_id string,
total_price double
)
partitioned by (sale_date string, region string);
alter table sale add partition (sale_date='2013', region='china');
insert into sale partition (sale_date='2013', region='china')
values ('null','null',null),('s2','c2',100.2),('s3','c3',100.3),('s8','c8',100.8);
set odps.sql.allow.fullscan=true;
select * from sale where shop_name not in (select shop_name from sale_detail);Result: No rows are returned because one row in sale has a NULL shop_name, which makes the entire NOT IN expression evaluate to NULL.
+------------+-------------+-------------+------------+------------+
| shop_name | customer_id | total_price | sale_date | region |
+------------+-------------+-------------+------------+------------+
+------------+-------------+-------------+------------+------------+Example 5 (Syntax 3): Multi-column NOT IN matching.
-- Uses the same t1 and t2 tables from the IN subquery examples.
-- Scenario 1: Multi-column SELECT.
select a, b from t1 where (c, d) not in (select a, b from t2 where e = t1.e);Result:
+------------+------------+
| a | b |
+------------+------------+
| 2 | 1 |
| 1 | 1 |
+------------+------------+-- Scenario 2: Aggregate functions.
select a, b from t1 where (c, d) not in (select max(a), b from t2 where e = t1.e group by b having max(a) > 0);Result:
+------------+------------+
| a | b |
+------------+------------+
| 1 | 3 |
| 3 | 1 |
| 2 | 1 |
| 1 | 1 |
+------------+------------+-- Scenario 3: Constant value list.
select a, b from t1 where (c, d) not in ((1, 3), (1, 1));Result:
+------------+------------+
| a | b |
+------------+------------+
| 1 | 3 |
| 2 | 1 |
| 1 | 1 |
+------------+------------+EXISTS subquery
An EXISTS subquery returns True for an outer query row when the subquery returns at least one matching row. Use it to check whether a related record exists without caring about the actual values returned.
MaxCompute supports only WHERE-clause EXISTS subqueries with correlated conditions. You must convert an EXISTS subquery to LEFT SEMI JOIN before execution.
Syntax
select <select_expr> from <table_name1>
where exists (
select <select_expr> from <table_name2>
where <table_name2_colname> = <table_name1>.<colname>
);Parameters
| Parameter | Required | Description |
|---|---|---|
select_expr | Yes | Columns to return (col1_name, col2_name, ... or a regular expression). |
table_name1, table_name2 | Yes | The main table and the subquery table. |
col_name | Yes | Column used in the correlated condition. |
Usage notes
MaxCompute supports EXISTS subqueries only in the WHERE clause with a correlated condition.
To use an EXISTS subquery, you must convert this clause into LEFT SEMI JOIN.
Example
Return all rows from sale_detail that have a matching customer_id in shop.
set odps.sql.allow.fullscan=true;
select * from sale_detail
where exists (
select * from shop
where customer_id = sale_detail.customer_id
);
-- Equivalent LEFT SEMI JOIN form:
select * from sale_detail a left semi join shop b on a.customer_id = b.customer_id;Result:
+------------+-------------+-------------+------------+------------+
| shop_name | customer_id | total_price | sale_date | region |
+------------+-------------+-------------+------------+------------+
| null | c5 | NULL | 2014 | shanghai |
| s6 | c6 | 100.4 | 2014 | shanghai |
| s7 | c7 | 100.5 | 2014 | shanghai |
| s1 | c1 | 100.1 | 2013 | china |
| s2 | c2 | 100.2 | 2013 | china |
| s3 | c3 | 100.3 | 2013 | china |
+------------+-------------+-------------+------------+------------+NOT EXISTS subquery
A NOT EXISTS subquery returns True for an outer query row when the subquery returns no matching rows. Use it to find rows in the main table that have no related record in another table.
MaxCompute supports only WHERE-clause NOT EXISTS subqueries with correlated conditions. You must convert a NOT EXISTS subquery to LEFT ANTI JOIN before execution.
Syntax
select <select_expr> from <table_name1>
where not exists (
select <select_expr> from <table_name2>
where <table_name2_colname> = <table_name1>.<colname>
);Parameters
| Parameter | Required | Description |
|---|---|---|
select_expr | Yes | Columns to return (col1_name, col2_name, ... or a regular expression). |
table_name1, table_name2 | Yes | The main table and the subquery table. |
col_name | Yes | Column used in the correlated condition. |
Usage notes
MaxCompute supports NOT EXISTS subqueries only in the WHERE clause with a correlated condition.
To use a NOT EXISTS subquery, you must convert this clause into LEFT ANTI JOIN.
Example
Return all rows from sale_detail that have no matching shop_name in shop.
set odps.sql.allow.fullscan=true;
select * from sale_detail
where not exists (
select * from shop
where shop_name = sale_detail.shop_name
);
-- Equivalent LEFT ANTI JOIN form:
select * from sale_detail a left anti join shop b on a.shop_name = b.shop_name;Result: No rows are returned because every shop_name in sale_detail has a match in shop.
+------------+-------------+-------------+------------+------------+
| shop_name | customer_id | total_price | sale_date | region |
+------------+-------------+-------------+------------+------------+
+------------+-------------+-------------+------------+------------+Scalar subquery
A scalar subquery returns exactly one column from one row. You can use the result as a scalar value in a SELECT list, WHERE clause, or HAVING clause — wherever a scalar expression is valid.
MaxCompute can verify at compile time that a subquery returns one row only when:
The SELECT list uses aggregate functions not passed as parameters to a user-defined table-valued function (UDTF).
The aggregate subquery does not include a
GROUP BYclause.
If MaxCompute cannot statically determine at compile time that the subquery returns a single row, it reports a compile error. It does not wait until runtime to discover this.
If the output result of a scalar subquery contains only one row of data and one MAX or MIN operator is nested outside the scalar subquery, the result does not change.
MaxCompute also converts scalar subqueries to JOIN operations where possible. For example, a correlated scalar subquery in a WHERE clause that compares against a scalar value is internally rewritten as a LEFT SEMI JOIN with a HAVING predicate.
Syntax
Syntax 1 — correlated scalar in a WHERE clause (compared against a scalar value):
select <select_expr> from <table_name1>
where (
select count(*) from <table_name2>
where <table_name2_colname> = <table_name1>.<colname>
) <scalar_operator> <scalar_value>;
-- Equivalent JOIN form:
select <table_name1>.<select_expr> from <table_name1>
left semi join (
select <colname>, count(*) from <table_name2>
group by <colname>
having count(*) <scalar_operator> <scalar_value>
) <table_name2> on <table_name1>.<colname> = <table_name2>.<colname>;Syntax 2 — scalar subquery in the SELECT list:
select (<select_statement>) from <table_name>;Parameters
| Parameter | Required | Description |
|---|---|---|
select_expr | Yes | Columns to return from the main query. |
table_name1, table_name2 | Yes | The main table and the subquery table. |
col_name | Yes | Column used in the correlated condition. |
scalar_operator | Yes | Comparison operator: >, <, =, >=, or <=. |
scalar_value | Yes | The scalar threshold value to compare against. |
select_statement | Yes (Syntax 2) | The subquery statement. Must return exactly one row. See SELECT syntax. |
Usage notes
A scalar subquery can reference columns from the main query. If using multi-level nesting, only the outermost main query's columns can be referenced by the scalar subquery.
Multi-column scalar subqueries are supported in the SELECT list and WHERE clause, but only with equality (
=) comparisons. Greater-than or less-than comparisons with multi-column expressions are not supported.
Limitations
The following multi-level nesting is not supported because the inner scalar subquery references columns from t1, which is two levels above:
-- This statement fails: inner subquery cannot reference t1 columns.
select * from t1
where (
select count(*) from t2
where (select count(*) from t3 where t3.a = t1.a) = 2
) = 3;
-- This statement works: the scalar subquery references only t1 (one level up).
select * from t1 where (select count(*) from t2 where t1.a = t2.a) = 3;Examples
Example 1: Filter rows using a correlated count.
set odps.sql.allow.fullscan=true;
select * from shop
where (
select count(*) from sale_detail
where sale_detail.shop_name = shop.shop_name
) >= 1;Result:
+------------+-------------+-------------+
| shop_name | customer_id | total_price |
+------------+-------------+-------------+
| s1 | c1 | 100.1 |
| s2 | c2 | 100.2 |
| s3 | c3 | 100.3 |
| null | c5 | NULL |
| s6 | c6 | 100.4 |
| s7 | c7 | 100.5 |
+------------+-------------+-------------+Example 2: Multi-column scalar subqueries (equality only).
-- Set up sample tables.
create table if not exists ts(a bigint, b bigint, c double);
create table if not exists t(a bigint, b bigint, c double);
insert into table ts values (1,3,4.0),(1,3,3.0);
insert into table t values (1,3,4.0),(1,3,5.0);
-- Scenario 1: Scalar subquery with multiple columns in the SELECT list (equality only).
-- This fails: select (select a, b from t where c > ts.c) as (a, b), a from ts;
select (select a, b from t where c = ts.c) as (a, b), a from ts;Result:
+------------+------------+------------+
| a | b | a2 |
+------------+------------+------------+
| 1 | 3 | 1 |
| NULL | NULL | 1 |
+------------+------------+------------+-- Scenario 2: BOOLEAN expression with multi-column equality in the SELECT list.
-- This fails: select (a,b) > (select a,b from ts where c = t.c) from t;
select (a,b) = (select a,b from ts where c = t.c) from t;Result:
+-------+
| _c0 |
+-------+
| true |
| false |
+-------+-- Scenario 3: Multi-column equality comparison in a WHERE clause.
-- This fails: select * from t where (a,b) > (select a,b from ts where c = t.c);
select * from t where c > 3.0 and (a,b) = (select a,b from ts where c = t.c);Result:
+------------+------------+------------+
| a | b | c |
+------------+------------+------------+
| 1 | 3 | 4.0 |
+------------+------------+------------+select * from t where c > 3.0 or (a,b) = (select a,b from ts where c = t.c);Result:
+------------+------------+------------+
| a | b | c |
+------------+------------+------------+
| 1 | 3 | 4.0 |
| 1 | 3 | 5.0 |
+------------+------------+------------+Example 3 (Syntax 2): Use a scalar subquery in the SELECT list. The subquery must return exactly one row.
set odps.sql.allow.fullscan=true;
select (select * from sale_detail where shop_name='s1') from sale_detail;Result: The single row from the subquery is returned once for each row in the outer sale_detail table.
+------------+-------------+-------------+------------+------------+
| shop_name | customer_id | total_price | sale_date | region |
+------------+-------------+-------------+------------+------------+
| s1 | c1 | 100.1 | 2013 | china |
| s1 | c1 | 100.1 | 2013 | china |
| s1 | c1 | 100.1 | 2013 | china |
| s1 | c1 | 100.1 | 2013 | china |
| s1 | c1 | 100.1 | 2013 | china |
| s1 | c1 | 100.1 | 2013 | china |
+------------+-------------+-------------+------------+------------+Limitations
The following limitations apply across all subquery types:
EXISTS and NOT EXISTS: MaxCompute supports these only in the
WHEREclause with a correlated condition. Non-correlated EXISTS subqueries are not supported.Scalar subqueries: The subquery must be statically determinable as returning one row at compile time. Use an aggregate function without
GROUP BYto satisfy this requirement. If MaxCompute cannot confirm one-row output at compile time, the statement fails with a compile error.Scalar subquery nesting: Only the outermost main query's columns can be referenced inside a nested scalar subquery. Referencing columns from intermediate query levels is not supported.
Multi-column scalar comparisons: Only equality (
=) is supported. Greater-than or less-than comparisons with multi-column subquery expressions are not valid.IN/NOT IN without JOIN conversion: When an IN or NOT IN subquery appears outside a
WHEREclause, or when theWHEREcondition cannot be rewritten as a JOIN condition, MaxCompute runs a separate job. Correlated conditions are not supported in this execution path.
See also
Overusing subqueries or writing inefficient subqueries can slow down queries significantly in a large-scale data environment. Consider these alternatives:
Replace repeated subqueries with temporary tables or materialized views to avoid redundant computation.
Rewrite multiple correlated subqueries as a single JOIN operation.
For more information, see Materialized views and JOIN.