All Products
Search
Document Center

DataWorks:MaxCompute Script node

Last Updated:Mar 26, 2026

A MaxCompute Script node runs multiple SQL statements as a single script using the MaxCompute V2.0 SQL engine. All statements compile into a unified execution plan and run as one job, reducing per-statement overhead compared to submitting statements individually.

Key constraints to know before writing your first script:

  • A script can contain at most one statement that displays results on screen (such as a standalone SELECT).

  • A script can contain at most one CREATE TABLE AS statement, and it must be the last statement.

  • If any statement fails, the entire script fails.

Script execution model

When you submit a script, MaxCompute compiles all statements into a single execution plan, queues the job once, and runs it in one pass. This differs from running statements individually, where each statement requires its own compilation, queue, and execution cycle.

Script mode supports table variables (prefixed with @) to pass intermediate results between statements. For example:

@a := SELECT 10; -- Assign the constant 10 to @a.
@b := SELECT key, VALUE + (SELECT * FROM @a) FROM t2 WHERE key > 10000;
SELECT * FROM @b;

For the full syntax reference and behavior details, see SQL in script mode.

Use cases

Script mode works best for:

  • Multi-level nested subqueries — Complex queries that would otherwise require rewriting or splitting into multiple nodes.

  • Multi-step transformations — Pipelines where each step reads from the output of the previous step using table variables.

Script mode is not suitable when input data from different sources arrives at very different times (for example, one source at 01:00 and another at 07:00). A job starts only after all input data is ready, which can cause unnecessary delays.

Prerequisites

Before you begin, make sure you have:

Syntax

A complete MaxCompute script follows this order: SET statements → DDL statements → DML statements. Each section can contain zero or more statements, but you cannot mix statement types across sections. Statements starting with @ declare table variables.

-- SET statements
SET odps.sql.type.system.odps2=true;
[SET odps.stage.reducer.num=***;]
[...]
-- DDL statements
CREATE TABLE table1 xxx;
[CREATE TABLE table2 xxx;]
[...]
-- DML statements
@var1 := SELECT [ALL | DISTINCT] select_expr, select_expr, ...
    FROM table3
    [WHERE where_condition];
@var2 := SELECT [ALL | DISTINCT] select_expr, select_expr, ...
    FROM table4
    [WHERE where_condition];
@var3 := SELECT [ALL | DISTINCT] var1.select_expr, var2.select_expr, ...
    FROM @var1 JOIN @var2 ON ...;
INSERT OVERWRITE|INTO TABLE [PARTITION (partcol1=val1, partcol2=val2 ...)]
    SELECT [ALL | DISTINCT] select_expr, select_expr, ...
    FROM @var3;
[@var4 := SELECT [ALL | DISTINCT] var1.select_expr, var.select_expr, ... FROM @var1
    UNION ALL | UNION
    SELECT [ALL | DISTINCT] var1.select_expr, var.select_expr, ... FROM @var2;
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
    AS
    SELECT [ALL | DISTINCT] select_expr, select_expr, ...
    FROM var4;]

Limitations

LimitationConsequence
Script mode supports SET statements, DML statements, and some DDL statements. DDL statements that display results on screen — such as DESC and SHOW — are not supported.An error occurs if you include unsupported DDL statements.
A script can contain only one statement that displays results on screen, such as a standalone SELECT.An error occurs if you include more than one.
A script can contain at most one CREATE TABLE AS statement, and it must be the last statement in the script. INSERT statements and CREATE TABLE statements must be written separately.An error occurs if CREATE TABLE AS is not last, or if it is combined with INSERT.
If any statement in the script fails, the entire script fails.The script stops and no further statements are executed.
A job starts only after all input data is ready.If input data from different sources arrives at very different times, execution is delayed until all sources are available.
Writing to a table and then reading from the same table within one script is not allowed.An error occurs. Restructure the script to avoid read-after-write on the same table.

Read-after-write example

The following script causes an error because it writes to src2 and then reads from it in the same script:

-- Incorrect
INSERT OVERWRITE TABLE src2 SELECT * FROM src WHERE key > 0;
@a := SELECT * FROM src2;
SELECT * FROM @a;

Restructure the script to avoid reading from a table you just wrote to:

-- Correct
@a := SELECT * FROM src WHERE key > 0;
INSERT OVERWRITE TABLE src2 SELECT * FROM @a;
SELECT * FROM @a;

Create a MaxCompute Script node

For step-by-step instructions on creating the node, see Create a MaxCompute Script node.

Develop a MaxCompute Script node

Basic script development

The following example shows a complete script that reads from three source tables, performs joins and a union, and writes results to two destination tables. Intermediate results are passed between statements using table variables.

CREATE TABLE IF NOT EXISTS dest(key string, VALUE bigint);
CREATE TABLE IF NOT EXISTS dest2(key string, VALUE bigint);
@a := SELECT * FROM src WHERE VALUE > 0;
@b := SELECT * FROM src2 WHERE key IS NOT NULL;
@c := SELECT * FROM src3 WHERE VALUE IS NOT NULL;
@d := SELECT a.key, b.value FROM @a LEFT OUTER JOIN @b ON a.key = b.key AND b.value > 0;
@e := SELECT a.key, c.value FROM @a INNER JOIN @c ON a.key = c.key;
@f := SELECT * FROM @d UNION SELECT * FROM @e UNION SELECT * FROM @a;
INSERT OVERWRITE TABLE dest SELECT * FROM @f;
@g := SELECT e.key, c.value FROM @e JOIN @c ON e.key = c.key;
INSERT OVERWRITE TABLE dest2 SELECT * FROM @g;

Advanced script development

IF statements

Script mode supports IF statements for conditional execution. The behavior differs based on whether the condition is a BOOLEAN expression or a BOOLEAN scalar subquery.

BOOLEAN expression — The branch is determined at compile time. The script submits one job.

SET odps.sql.allow.fullscan=true;
SET odps.optimizer.cbo.rule.filter.black=LM;
@date := '${var}';
@row TABLE(key int, VALUE bigint); -- Declare table variable with schema.
IF (cast(@date AS bigint) % 2 == 0) BEGIN
  @row := SELECT key, VALUE FROM src1;
END ELSE BEGIN
  @row := SELECT key, VALUE FROM src2;
END
INSERT OVERWRITE TABLE dest1 PARTITION(p='${var}') SELECT key, VALUE FROM @row;
The variable var in this example must be assigned a value in the node's scheduling parameter settings before running.

BOOLEAN scalar subquery — The branch cannot be determined at compile time. The decision is made at runtime, so MaxCompute submits multiple jobs.

@i bigint;
@t TABLE(id bigint, VALUE bigint);
IF ((SELECT count(*) FROM src WHERE a = '5') > 1) BEGIN
  @i := 1;
  @t := SELECT @i, @i * 2;
END ELSE BEGIN
  @i := 2;
  @t := SELECT @i, @i * 2;
END
SELECT id, VALUE FROM @t;

Embedded user-defined function (UDF) development

Script mode supports embedded user-defined functions (UDFs) by embedding Java or Python code directly in the SQL script. For details, see Embedded UDFs.

Test a MaxCompute Script node

  1. On the node editor page, configure the parameters in the Run Configuration section on the right.

    ParameterDescription
    Computing resourceSelect the attached MaxCompute computing resource.
    Computing quotaSelect a computing quota (Quota) that provides CPU and memory for the job. If none is available, click Create Computing Quota in the drop-down list, then configure a quota in the MaxCompute console.
    Resource groupSelect the scheduling resource group that passed the connectivity test with the computing resource. For details, see Network connectivity solutions.
  2. In the parameter dialog box on the toolbar, select the MaxCompute data source and click Run to execute the script.

What's next

  • Node scheduling configuration: Set a Scheduling and configure the scheduling properties in the Scheduling Policies section to run the node on a schedule.

  • Node deployment: Click the image icon to publish the node to the production environment. Nodes run on a schedule only after they are published.

Table creation statements and sample data

Use the following statements to create the source tables and insert sample data used in the examples above:

-- Create tables and insert sample data
CREATE TABLE IF NOT EXISTS src(key string, VALUE BIGINT);
INSERT INTO src VALUES ('1', 11);
INSERT INTO src VALUES ('1', 11);
CREATE TABLE IF NOT EXISTS src2(key string, VALUE BIGINT);
INSERT INTO src2 VALUES ('1', 22);
INSERT INTO src2 VALUES ('2', 22);
CREATE TABLE IF NOT EXISTS src3(key string, VALUE BIGINT);
INSERT INTO src3 VALUES ('1', 33);
INSERT INTO src3 VALUES ('3', 33);