All Products
Search
Document Center

DataWorks:MaxCompute Script node

Last Updated:Apr 23, 2026

DataWorks provides the MaxCompute Script node, which supports development in script mode based on the MaxCompute 2.0 SQL engine. This topic describes how to use the MaxCompute Script node.

Background

The MaxCompute SQL engine supports script mode, which combines multiple SQL statements into a single script that is compiled and executed as one unit. This is useful for complex queries, such as nested subqueries or operations that require multiple steps. By submitting the entire script at once, you generate a unified execution plan, and the job is queued and run only once. This approach uses resources more efficiently. For more information, see SQL script mode. In DataWorks, you can use the MaxCompute Script node to create and schedule script-based tasks.

Use case

Script mode is ideal for rewriting deeply nested subqueries or splitting complex logic into multiple statements.

For example, in script mode, you can assign a constant to a variable and then use a SELECT * FROM variable statement to convert it into a scalar for use in calculations with other columns. The constant can also be stored in a single-row table. See the following example. For the conversion syntax, see Subquery (SUBQUERY).

@a := SELECT 10; -- Assign the constant 10 to @a. Alternatively, assign a value from a single-row table t1, such as SELECT col1 FROM t1.
@b := SELECT key,VALUE+(SELECT * FROM @a) FROM t2 WHERE key >10000; -- Calculate a new value by using the value column from table t2 and the value from @a.
SELECT * FROM @b;
Note

Avoid using a single script with table variables to combine data sources that become ready at significantly different times (for example, one source is ready at 01:00 and another at 07:00). The job will not start until all data is available, which can cause significant delays.

Prerequisites

Syntax

A complete MaxCompute script follows this order: SET statement > DDL statement > DML statement. Each type can have zero or more statements, but you cannot interleave them. Statements that start with an at sign (@) declare variables to pass results between statements. The syntax is as follows:

--SET statements
SET odps.sql.type.system.odps2=true;
[SET odps.stage.reducer.num=***;]
[...]
--DDL statements
CREATE TABLE table1 xxx;
[CREATE TABLE table2 xxx;]
[...]
--DML statements
@var1 := SELECT [ALL | DISTINCT] select_expr, select_expr, ...
    FROM table3
    [WHERE where_condition];
@var2 := SELECT [ALL | DISTINCT] select_expr, select_expr, ...
    FROM table4
    [WHERE where_condition];
@var3 := SELECT [ALL | DISTINCT] var1.select_expr, var2.select_expr, ...
    FROM @var1 JOIN @var2 ON ...;
INSERT OVERWRITE|INTO TABLE [PARTITION (partcol1=val1, partcol2=val2 ...)]
    SELECT [ALL | DISTINCT] select_expr, select_expr, ...
    FROM @var3;
[@var4 := SELECT [ALL | DISTINCT] var1.select_expr, var.select_expr, ... FROM @var1
    UNION ALL | UNION
    SELECT [ALL | DISTINCT] var1.select_expr, var.select_expr, ... FROM @var2;
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
    AS
    SELECT [ALL | DISTINCT] select_expr, select_expr, ...
    FROM var4;]

Limitations

The following limits apply to the MaxCompute Script node:

  • Script mode supports SET statements, DML statements, and some DDL statements. DDL statements that display results to the screen, such as DESC and SHOW, are not supported.

  • A script can contain at most one statement that displays results to the screen, such as a standalone SELECT statement. Otherwise, an error is returned. We recommend that you do not run screen-display SELECT statements in scripts.

  • A script can contain at most one CREATE TABLE AS statement, and it must be the last statement. We recommend that you write table creation statements and INSERT statements separately.

  • In script mode, if any statement fails, all statements in the script fail.

  • In script mode, a job is generated for data processing only after all input data is ready.

  • In script mode, an error is returned if a table is read after being written to.

    Incorrect example:

    INSERT OVERWRITE TABLE src2 SELECT * FROM src WHERE key > 0;
    @a := SELECT * FROM src2;
    SELECT * FROM @a;

    Correct example:

    To avoid read-after-write issues, you can make the following modification.

    @a := SELECT * FROM src WHERE key > 0;
    INSERT OVERWRITE TABLE src2 SELECT * FROM @a;
    SELECT * FROM @a;

Create a MaxCompute Script node

For information about how to create a node, see Create a MaxCompute Script node.

Develop a MaxCompute Script node

On the MaxCompute Script node editing page, you can select the following script examples based on your business needs for development.

Basic script development

In MaxCompute script mode, the SQL compilation process is straightforward. The following simple example shows how to use the MaxCompute Script node.

CREATE TABLE IF NOT EXISTS dest(key string , VALUE bigint) ;
CREATE TABLE IF NOT EXISTS dest2(key string,VALUE bigint ) ;
@a := SELECT * FROM src WHERE VALUE >0;
@b := SELECT * FROM src2 WHERE key IS NOT NULL;
@c := SELECT * FROM src3 WHERE VALUE IS NOT NULL;
@d := SELECT a.key,b.value FROM @a LEFT OUTER JOIN @b ON a.key=b.key AND b.value>0;
@e := SELECT a.key,c.value FROM @a INNER JOIN @c ON a.key=c.key;
@f := SELECT * FROM @d UNION SELECT * FROM @e UNION SELECT * FROM @a;
INSERT OVERWRITE TABLE dest  SELECT * FROM @f;
@g := SELECT e.key,c.value  FROM @e JOIN @c ON e.key=c.key;
INSERT OVERWRITE TABLE dest2 SELECT * FROM @g;
Advanced script development

MaxCompute script mode supports IF statements, which allow a program to automatically select execution logic based on conditions. The IF syntax in MaxCompute is categorized by Condition type: BOOLEAN expressions and BOOLEAN Scalar SubQueries.

  • IF condition is a BOOLEAN expression

    This type of IF ELSE statement determines which branch to execute at compile time. Example:

    --Data processing
    SET odps.sql.allow.fullscan=true;
    SET odps.optimizer.cbo.rule.filter.black=LM; 
    @date := '${var}';
    @row  TABLE(key int,VALUE bigint); --Declare the variable row. Its type is Table and its schema is string. 
    IF ( cast(@date  AS bigint) % 2 == 0 ) BEGIN 
    @row  := SELECT key,VALUE FROM src1; 
    END ELSE BEGIN
    @row  := SELECT key,VALUE FROM src2; 
    END
    INSERT OVERWRITE TABLE dest1 PARTITION(p='${var}')  SELECT key,VALUE FROM @row; 
    Note

    The code defines a variable named var. You must assign a value to the var variable in the scheduling parameter settings.

  • IF condition is a BOOLEAN Scalar SubQuery

    This type of IF ELSE statement cannot determine which branch to execute at compile time. Instead, it is determined at runtime. Therefore, multiple jobs must be submitted. Example:

    @i bigint;
    @t TABLE(id bigint, VALUE bigint);
    IF ((SELECT count(*) FROM src WHERE a = '5') > 1) BEGIN
    @i := 1;
    @t := SELECT @i, @i*2;
    END ELSE
    BEGIN
    @i := 2;
    @t := SELECT @i, @i*2;
    END
    SELECT id, VALUE FROM @t; 
  • Embedded UDF development

    You can also embed Java or Python code in SQL scripts to develop embedded UDFs in MaxCompute script mode. For more information, see Embedded UDF.

Debug a MaxCompute Script node

  1. Configure the relevant parameters in Run Configuration on the right side of the node editing page.

    Parameter

    Description

    Compute resource

    Select the MaxCompute compute resource that you have associated.

    Compute quota

    Select a compute quota that you created to provide the computing resources (CPU and memory) required by your compute jobs.

    If no compute quota is available, click Create Compute Quota in the drop-down list, and then create a compute quota in the MaxCompute console.

    Resource group

    Select a scheduling resource group that has passed the connectivity test with the compute resource. For more information, see Network connectivity.

  2. Select the MaxCompute data source that you created in the parameter dialog on the toolbar, and then click Run to run the MaxCompute Script task.

Next steps

  • Configure schedule settings: If you want the node in the project directory to be periodically scheduled, configure Scheduling Policy and the related scheduling properties in Scheduling Settings on the right side of the node.

  • Deploy a node: If you want to deploy the task to the production environment, click the image icon to initiate the deployment process. Nodes in the project directory are periodically scheduled only after they are deployed to the production environment.

Table creation statements and sample data

Example:

--Create tables
CREATE TABLE IF NOT EXISTS src(key string ,VALUE BIGINT);
INSERT INTO src VALUES ('1',11) ;
INSERT INTO src VALUES ('1',11) ;
CREATE TABLE IF NOT EXISTS  src2(key string ,VALUE BIGINT);
INSERT INTO src2  VALUES ('1',22);
INSERT INTO src2  VALUES ('2',22);
CREATE TABLE IF NOT EXISTS  src3(key string ,VALUE BIGINT);
INSERT INTO src3 VALUES ('1',33);
INSERT INTO src3 VALUES ('3',33);