A common table expression (CTE) is a named subquery defined at the top of a DML statement using the WITH keyword. Unlike a derived table or inline subquery, a CTE can be referenced multiple times within the same statement, eliminating duplicate code. MaxCompute supports standard SQL CTEs and extends them with a materialize hint that caches intermediate results to improve performance in complex queries.
MaxCompute supports three forms of CTE:
Non-recursive CTE: A standard named subquery with no self-reference. Use this to simplify queries that reuse the same subquery logic.
Recursive CTE: A self-referencing CTE that iterates until a termination condition is met. Use this to traverse hierarchical data such as organizational charts or bill-of-materials trees.
Materialized CTE: A non-recursive CTE annotated with
/*+ MATERIALIZE */. MaxCompute caches the result in a temporary table so subsequent references read from the cache instead of re-executing the query.
Non-recursive CTE
Syntax
WITH
<cte_name> [(col_name [, col_name] ...)] AS (
<cte_query>
)
[, <cte_name> [(col_name [, col_name] ...)] AS (
<cte_query2>
)
, ...]Parameters
Parameter | Required | Description |
| Yes | The name of the CTE. Must be unique within the |
| No | Output column names for the CTE. If omitted, column names are inherited from the |
| Yes | A |
Example
The following query uses a UNION ALL to combine two JOIN operations. Both joins share the same left-hand subquery, which must be duplicated without a CTE:
INSERT OVERWRITE TABLE srcp PARTITION (p='abc')
SELECT * FROM (
SELECT a.key, b.value
FROM (
SELECT * FROM src WHERE key IS NOT NULL) a
JOIN (
SELECT * FROM src2 WHERE value > 0) b
ON a.key = b.key
) c
UNION ALL
SELECT * FROM (
SELECT a.key, b.value
FROM (
SELECT * FROM src WHERE key IS NOT NULL) a
LEFT OUTER JOIN (
SELECT * FROM src3 WHERE value > 0) b
ON a.key = b.key AND b.key IS NOT NULL
) d;Rewriting with a CTE removes the duplication. The subquery a is defined once and reused by both joins:
WITH
a AS (SELECT * FROM src WHERE key IS NOT NULL),
b AS (SELECT * FROM src2 WHERE value > 0),
c AS (SELECT * FROM src3 WHERE value > 0),
d AS (SELECT a.key, b.value FROM a JOIN b ON a.key = b.key),
e AS (SELECT a.key, c.value FROM a LEFT OUTER JOIN c ON a.key = c.key AND c.key IS NOT NULL)
INSERT OVERWRITE TABLE srcp PARTITION (p='abc')
SELECT * FROM d UNION ALL SELECT * FROM e;Recursive CTE
Syntax
WITH RECURSIVE <cte_name> [(col_name [, col_name] ...)] AS (
<initial_part> UNION ALL <recursive_part>
)
SELECT ... FROM ...;How it works
A recursive CTE evaluates in repeated iterations using a working table:
MaxCompute evaluates
initial_partand writes the result to both the final result set and a working table. Within the CTE,cte_nameis an alias for the working table.While the working table is not empty:
MaxCompute evaluates
recursive_partusing the current working table wherevercte_nameis referenced.The result is appended to the final result set and written to a temporary table.
The working table is replaced by the temporary table.
MaxCompute returns the union of all accumulated results.
Each iteration's cte_name reference reads only the results of the previous iteration, not all accumulated results. Add a termination condition in recursive_part (such as WHERE level < 5) to stop iteration when the working table becomes empty.
Parameters
Parameter | Required | Description |
| Yes | The recursive CTE clause must start with |
| Yes | The name of the CTE. Must be unique within the current |
| No | Output column names. If omitted, column names are inferred from |
| Yes | A |
| Yes | A |
| Yes | Connects |
Limitations
Recursive CTEs cannot appear in
IN,EXISTS, or scalar subqueries.The default maximum number of iterations is 10. Increase the limit by setting
odps.sql.rcte.max.iterate.num(maximum value: 100).Intermediate results are not saved between iterations. If the task fails, execution restarts from the beginning. For long-running recursive computations, either limit the number of iterations or store intermediate results in a temporary table.
Recursive CTEs are not supported in MaxCompute Query Acceleration (MCQA). If the task runs in MCQA mode with
interactive_auto_rerun=true, it falls back to normal mode. Otherwise, the task fails.
Examples
Example 1: Generate a sequence with explicit column names
The following defines a recursive CTE with two output columns, a and b, iterating while a + 1 <= 5:
-- Method 1: Explicitly specify output column names
WITH RECURSIVE cte_name(a, b) AS (
SELECT 1L, 1L -- initial_part: iteration 0
UNION ALL
SELECT a+1, b+1 FROM cte_name WHERE a+1 <= 5 -- recursive_part: references previous iteration
)
SELECT * FROM cte_name ORDER BY a LIMIT 100;
-- Method 2: Infer column names from initial_part
WITH RECURSIVE cte_name AS (
SELECT 1L AS a, 1L AS b
UNION ALL
SELECT a+1, b+1 FROM cte_name WHERE a+1 <= 5
)
SELECT * FROM cte_name ORDER BY a LIMIT 100;Inrecursive_part, include a termination condition such asWHERE a + 1 <= 5to stop iteration when the working table becomes empty and avoid an infinite loop.
Result:
+------------+------------+
| a | b |
+------------+------------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 5 | 5 |
+------------+------------+Example 2: Recursive CTE in a subquery (compile error)
Recursive CTEs are not allowed inside IN, EXISTS, or scalar subqueries. The following query fails to compile:
WITH RECURSIVE cte_name(a, b) AS (
SELECT 1L, 1L
UNION ALL
SELECT a+1, b+1 FROM cte_name WHERE a+1 <= 5)
SELECT x, x IN (SELECT a FROM cte_name) FROM VALUES (1L), (2L) AS t(x);Error:
FAILED: ODPS-0130071:[5,31] Semantic analysis exception - using Recursive-CTE cte_name in scalar/in/exists sub-query is not allowed, please check your query, the query text location is from [line 5, column 13] to [line 5, column 40]Example 3: Traverse an organizational hierarchy
Create an employees table and insert data:
CREATE TABLE employees(name STRING, boss_name STRING);
INSERT INTO TABLE employees VALUES
('zhang_3', null),
('li_4', 'zhang_3'),
('wang_5', 'zhang_3'),
('zhao_6', 'li_4'),
('qian_7', 'wang_5');Define a recursive CTE named company_hierarchy with three output columns: the employee's name, their manager's name, and their level in the hierarchy:
WITH RECURSIVE company_hierarchy(name, boss_name, level) AS (
SELECT name, boss_name, 0L FROM employees WHERE boss_name IS NULL
UNION ALL
SELECT e.name, e.boss_name, h.level + 1
FROM employees e
JOIN company_hierarchy h ON e.boss_name = h.name
)
SELECT * FROM company_hierarchy ORDER BY level, boss_name, name LIMIT 1000;Execution proceeds as follows:
Iteration 0 (
initial_part): Selects employees whereboss_name IS NULL, assigninglevel = 0. Result:('zhang_3', NULL, 0).Iteration 1 (
recursive_part): Joinsemployeeswith the working table (iteration 0). The conditione.boss_name = h.namefinds employees managed byzhang_3. Result:li_4andwang_5atlevel = 1.Iteration 2: Finds employees managed by
li_4orwang_5. Result:zhao_6andqian_7atlevel = 2.Iteration 3: No employees have
zhao_6orqian_7as managers. The working table is empty, and iteration stops.
Result:
+---------+-----------+------------+
| name | boss_name | level |
+---------+-----------+------------+
| zhang_3 | NULL | 0 |
| li_4 | zhang_3 | 1 |
| wang_5 | zhang_3 | 1 |
| zhao_6 | li_4 | 2 |
| qian_7 | wang_5 | 2 |
+---------+-----------+------------+Materialized CTE
By default, MaxCompute expands all CTEs inline when generating an execution plan. If a CTE is referenced multiple times, its query runs multiple times. For example:
WITH v1 AS (SELECT SIN(1.0) AS a)
SELECT a FROM v1 UNION ALL SELECT a FROM v1;This is equivalent to running SIN(1.0) twice:
SELECT a FROM (SELECT SIN(1.0) AS a)
UNION ALL
SELECT a FROM (SELECT SIN(1.0) AS a);In deeply nested CTE chains where each level fans out to multiple references, full expansion produces an exponentially large syntax tree that can exceed memory limits and prevent the execution plan from being generated.
Usage example
Add the /*+ MATERIALIZE */ hint to the top-level SELECT of a non-recursive CTE to cache its result in a temporary table. Subsequent references read from the cache instead of re-running the query:
WITH v1 AS (SELECT /*+ MATERIALIZE */ SIN(1.0) AS a)
SELECT a FROM v1 UNION ALL SELECT a FROM v1;
-- Result:
+------------+
| a |
+------------+
| 0.8414709848078965 |
| 0.8414709848078965 |
+------------+When the hint is effective, the Job Details tab in LogView shows multiple Fuxi Jobs, confirming that the intermediate result was stored.

Limitations
The hint must be placed on the top-level `SELECT` of the CTE body. If the top-level statement is a
UNION, the hint has no effect. Incorrect — the top-level statement isUNION, notSELECT, so the hint is ignored:WITH v1 AS ( SELECT /*+ MATERIALIZE */ SIN(1.0) AS a UNION ALL SELECT /*+ MATERIALIZE */ SIN(1.0) AS a) SELECT a FROM v1 UNION ALL SELECT a FROM v1;In LogView, only one Fuxi Job is submitted, confirming the hint had no effect: Correct — wrap the
UNIONin a subquery and apply the hint to the outerSELECT:WITH v1 AS ( SELECT /*+ MATERIALIZE */ * FROM ( SELECT SIN(1.0) AS a UNION ALL SELECT SIN(1.0) AS a ) ) SELECT a FROM v1 UNION ALL SELECT a FROM v1;
If the CTE uses a non-deterministic function such as
RANDor a non-deterministic Java/Python UDF, materializing the CTE caches a single evaluation of the function. Subsequent references return the cached value, which changes the semantic behavior of queries that expect independent random values per call.Materialized CTEs are not supported in MaxCompute Query Acceleration (MCQA). If the task runs in MCQA mode with
interactive_auto_rerun=true, it falls back to normal mode. Otherwise, the task fails.
Troubleshooting
Recursive CTE runs until it hits the iteration limit
Symptom: The recursive CTE task fails with an error such as:
FAILED: ODPS-0010000:System internal error - recursive-cte: company_hierarchy exceed max iterate number 10Cause 1: Cyclic data
The input data contains a cycle — a record that directly or indirectly references itself. For example, if qian_7 is listed as their own manager:
INSERT INTO TABLE employees VALUES('qian_7', 'qian_7');Re-running the company_hierarchy CTE against this data produces an infinite loop because qian_7 appears in every subsequent iteration with no termination condition that eliminates it.
Resolution: Audit the source data for cycles before running a recursive CTE on hierarchical data. Alternatively, track the visited path in an array column and exclude rows where the current node has already been visited.
Cause 2: Missing or incorrect termination condition
The recursive_part does not include a condition that causes the working table to become empty.
Resolution: Add a termination condition in recursive_part. The condition must evaluate to false for all rows at some iteration, producing an empty working table. For example, WHERE a + 1 <= 5 stops iteration when a reaches 5.
Adjusting the iteration limit
If your data is valid and requires more than 10 iterations, increase the limit with:
SET odps.sql.rcte.max.iterate.num = <value>;The maximum allowed value is 100. For recursive computations that require more iterations or that are long-running, store intermediate results in a temporary table to avoid restarting from the beginning on failure.