A MaxCompute Script node runs multiple SQL statements as a single script using the MaxCompute V2.0 SQL engine. All statements compile into a unified execution plan and run as one job, reducing per-statement overhead compared to submitting statements individually.
Key constraints to know before writing your first script:
A script can contain at most one statement that displays results on screen (such as a standalone
SELECT).A script can contain at most one
CREATE TABLE ASstatement, and it must be the last statement.If any statement fails, the entire script fails.
Script execution model
When you submit a script, MaxCompute compiles all statements into a single execution plan, queues the job once, and runs it in one pass. This differs from running statements individually, where each statement requires its own compilation, queue, and execution cycle.
Script mode supports table variables (prefixed with @) to pass intermediate results between statements. For example:
@a := SELECT 10; -- Assign the constant 10 to @a.
@b := SELECT key, VALUE + (SELECT * FROM @a) FROM t2 WHERE key > 10000;
SELECT * FROM @b;For the full syntax reference and behavior details, see SQL in script mode.
Use cases
Script mode works best for:
Multi-level nested subqueries — Complex queries that would otherwise require rewriting or splitting into multiple nodes.
Multi-step transformations — Pipelines where each step reads from the output of the previous step using table variables.
Script mode is not suitable when input data from different sources arrives at very different times (for example, one source at 01:00 and another at 07:00). A job starts only after all input data is ready, which can cause unnecessary delays.
Prerequisites
Before you begin, make sure you have:
A MaxCompute computing resource attached to the DataWorks workspace
The required tables created in MaxCompute with sample data (see Table creation statements and sample data)
Syntax
A complete MaxCompute script follows this order: SET statements → DDL statements → DML statements. Each section can contain zero or more statements, but you cannot mix statement types across sections. Statements starting with @ declare table variables.
-- SET statements
SET odps.sql.type.system.odps2=true;
[SET odps.stage.reducer.num=***;]
[...]
-- DDL statements
CREATE TABLE table1 xxx;
[CREATE TABLE table2 xxx;]
[...]
-- DML statements
@var1 := SELECT [ALL | DISTINCT] select_expr, select_expr, ...
FROM table3
[WHERE where_condition];
@var2 := SELECT [ALL | DISTINCT] select_expr, select_expr, ...
FROM table4
[WHERE where_condition];
@var3 := SELECT [ALL | DISTINCT] var1.select_expr, var2.select_expr, ...
FROM @var1 JOIN @var2 ON ...;
INSERT OVERWRITE|INTO TABLE [PARTITION (partcol1=val1, partcol2=val2 ...)]
SELECT [ALL | DISTINCT] select_expr, select_expr, ...
FROM @var3;
[@var4 := SELECT [ALL | DISTINCT] var1.select_expr, var.select_expr, ... FROM @var1
UNION ALL | UNION
SELECT [ALL | DISTINCT] var1.select_expr, var.select_expr, ... FROM @var2;
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
AS
SELECT [ALL | DISTINCT] select_expr, select_expr, ...
FROM var4;]Limitations
| Limitation | Consequence |
|---|---|
Script mode supports SET statements, DML statements, and some DDL statements. DDL statements that display results on screen — such as DESC and SHOW — are not supported. | An error occurs if you include unsupported DDL statements. |
A script can contain only one statement that displays results on screen, such as a standalone SELECT. | An error occurs if you include more than one. |
A script can contain at most one CREATE TABLE AS statement, and it must be the last statement in the script. INSERT statements and CREATE TABLE statements must be written separately. | An error occurs if CREATE TABLE AS is not last, or if it is combined with INSERT. |
| If any statement in the script fails, the entire script fails. | The script stops and no further statements are executed. |
| A job starts only after all input data is ready. | If input data from different sources arrives at very different times, execution is delayed until all sources are available. |
| Writing to a table and then reading from the same table within one script is not allowed. | An error occurs. Restructure the script to avoid read-after-write on the same table. |
Read-after-write example
The following script causes an error because it writes to src2 and then reads from it in the same script:
-- Incorrect
INSERT OVERWRITE TABLE src2 SELECT * FROM src WHERE key > 0;
@a := SELECT * FROM src2;
SELECT * FROM @a;Restructure the script to avoid reading from a table you just wrote to:
-- Correct
@a := SELECT * FROM src WHERE key > 0;
INSERT OVERWRITE TABLE src2 SELECT * FROM @a;
SELECT * FROM @a;Create a MaxCompute Script node
For step-by-step instructions on creating the node, see Create a MaxCompute Script node.
Develop a MaxCompute Script node
Basic script development
The following example shows a complete script that reads from three source tables, performs joins and a union, and writes results to two destination tables. Intermediate results are passed between statements using table variables.
CREATE TABLE IF NOT EXISTS dest(key string, VALUE bigint);
CREATE TABLE IF NOT EXISTS dest2(key string, VALUE bigint);
@a := SELECT * FROM src WHERE VALUE > 0;
@b := SELECT * FROM src2 WHERE key IS NOT NULL;
@c := SELECT * FROM src3 WHERE VALUE IS NOT NULL;
@d := SELECT a.key, b.value FROM @a LEFT OUTER JOIN @b ON a.key = b.key AND b.value > 0;
@e := SELECT a.key, c.value FROM @a INNER JOIN @c ON a.key = c.key;
@f := SELECT * FROM @d UNION SELECT * FROM @e UNION SELECT * FROM @a;
INSERT OVERWRITE TABLE dest SELECT * FROM @f;
@g := SELECT e.key, c.value FROM @e JOIN @c ON e.key = c.key;
INSERT OVERWRITE TABLE dest2 SELECT * FROM @g;Advanced script development
IF statements
Script mode supports IF statements for conditional execution. The behavior differs based on whether the condition is a BOOLEAN expression or a BOOLEAN scalar subquery.
BOOLEAN expression — The branch is determined at compile time. The script submits one job.
SET odps.sql.allow.fullscan=true;
SET odps.optimizer.cbo.rule.filter.black=LM;
@date := '${var}';
@row TABLE(key int, VALUE bigint); -- Declare table variable with schema.
IF (cast(@date AS bigint) % 2 == 0) BEGIN
@row := SELECT key, VALUE FROM src1;
END ELSE BEGIN
@row := SELECT key, VALUE FROM src2;
END
INSERT OVERWRITE TABLE dest1 PARTITION(p='${var}') SELECT key, VALUE FROM @row;The variable var in this example must be assigned a value in the node's scheduling parameter settings before running.BOOLEAN scalar subquery — The branch cannot be determined at compile time. The decision is made at runtime, so MaxCompute submits multiple jobs.
@i bigint;
@t TABLE(id bigint, VALUE bigint);
IF ((SELECT count(*) FROM src WHERE a = '5') > 1) BEGIN
@i := 1;
@t := SELECT @i, @i * 2;
END ELSE BEGIN
@i := 2;
@t := SELECT @i, @i * 2;
END
SELECT id, VALUE FROM @t;Embedded user-defined function (UDF) development
Script mode supports embedded user-defined functions (UDFs) by embedding Java or Python code directly in the SQL script. For details, see Embedded UDFs.
Test a MaxCompute Script node
On the node editor page, configure the parameters in the Run Configuration section on the right.
Parameter Description Computing resource Select the attached MaxCompute computing resource. Computing quota Select a computing quota (Quota) that provides CPU and memory for the job. If none is available, click Create Computing Quota in the drop-down list, then configure a quota in the MaxCompute console. Resource group Select the scheduling resource group that passed the connectivity test with the computing resource. For details, see Network connectivity solutions. In the parameter dialog box on the toolbar, select the MaxCompute data source and click Run to execute the script.
What's next
Node scheduling configuration: Set a Scheduling and configure the scheduling properties in the Scheduling Policies section to run the node on a schedule.
Node deployment: Click the
icon to publish the node to the production environment. Nodes run on a schedule only after they are published.
Table creation statements and sample data
Use the following statements to create the source tables and insert sample data used in the examples above:
-- Create tables and insert sample data
CREATE TABLE IF NOT EXISTS src(key string, VALUE BIGINT);
INSERT INTO src VALUES ('1', 11);
INSERT INTO src VALUES ('1', 11);
CREATE TABLE IF NOT EXISTS src2(key string, VALUE BIGINT);
INSERT INTO src2 VALUES ('1', 22);
INSERT INTO src2 VALUES ('2', 22);
CREATE TABLE IF NOT EXISTS src3(key string, VALUE BIGINT);
INSERT INTO src3 VALUES ('1', 33);
INSERT INTO src3 VALUES ('3', 33);