All Products
Search
Document Center

DataWorks:Develop an ODPS SQL task

Last Updated:Apr 21, 2026

A MaxCompute SQL task uses an SQL-like syntax for distributed processing of large-scale data (TB-level) when real-time results are not required. This topic describes how to develop an SQL task in DataWorks and provides important usage notes.

Introduction

ODPS SQL is used to process and query data in MaxCompute. It supports common SQL operations, such as SELECT, INSERT, UPDATE, and DELETE, as well as syntax and functions that are specific to MaxCompute. With ODPS SQL, you can write SQL-like statements to query and process data without writing complex data processing logic. For more information about the SQL syntax, see SQL Overview.

Limitations

ODPS SQL nodes in DataWorks have the following limitations:

Category

Description

Use of comments

Only single-line comments in the -- comment content format are supported. Multi-line comments in the /* comment content */ format are not supported.

For more information, see MaxCompute SQL comments.

The following limitations on comments also apply:

  • You cannot add a comment after a keyword statement, such as SET or USE.

  • You cannot use a semicolon (;) in a comment.

  • You cannot add a comment after a statement that terminates with a semicolon (;).

SQL submission

You cannot use SET or USE statements alone. You must execute them with other SQL statements.

SQL development

The size of the SQL code cannot exceed 128 KB, and the number of SQL statements cannot exceed 200.

Query result

Only SQL statements that start with SELECT or WITH can return a formatted result set.

A query result is subject to the following limitations:

  • If a query result exceeds 10,000 rows, a maximum of 10,000 rows are displayed.

  • If the size of a query result exceeds 10 MB, the error message Result is too large, exceed the limit size: 10MB is returned.

Note

If you encounter these limitations, you can download the query result for local viewing by using one of the following methods:

Usage notes

  • Ensure that the account used to run the ODPS SQL task has the required permissions on the corresponding MaxCompute project. For more information, see DataWorks On MaxCompute permission control and MaxCompute permissions.

  • MaxCompute SQL tasks require quota resources to run. If a task runs for a long time, go to the MaxCompute console to check the quota consumption and ensure that sufficient resources are available to run the task. For more information, see Computing Resources - Quota Management.

  • When you develop a task on an ODPS SQL node, special parameters, such as an OSS address, must be enclosed in double quotation marks ("). Otherwise, a parsing error occurs and the task fails.

  • In some extreme cases, such as a server power outage or a primary/secondary switchover, DataWorks may not be able to completely terminate related MaxCompute tasks. In this situation, go to the corresponding MaxCompute project to terminate the job.

  • The execution order of keyword statements (SET and USE) varies by DataWorks environment.

    • In DataStudio: All keyword statements (SET and USE) are combined and executed before other SQL statements.

    • In the Scheduling environment: Statements are executed in their written order.

    Assume that the following code is defined in a node.

    SET a=b;
    CREATE TABLE name1(id string);
    SET c=d;
    CREATE TABLE name2(id string);

    The following table describes the execution order in different environments.

    Executed SQL

    DataStudio

    Scheduling environment

    First SQL statement

    SET a=b;
    SET c=d;
    CREATE TABLE name1(id string);
    SET a=b;
    CREATE TABLE name1(id string);

    Second SQL statement

    SET a=b;
    SET c=d;
    CREATE TABLE name2(id string);
    SET c=d;
    CREATE TABLE name2(id string);

Edit code: Simple examples

SQL commands

MaxCompute SQL commands use a syntax similar to standard SQL and support DDL, DML, and DQL statements, as well as specific MaxCompute syntax. For more information about the syntax requirements and usage examples of each SQL command, see SQL overview. The following simple example shows how to develop and run SQL commands.

Note
  • When you use new data types with MaxCompute 2.0 extended functions, you need to add SET odps.sql.type.system.odps2=true; before the SQL statement and submit them together so that the new data types can be used. For more information, see 2.0 Data Types Version.

  • DataWorks provides scheduling parameters to enable dynamic parameter passing in scheduled scenarios. You can define variables in your ODPS SQL node code using the ${variable_name} format and assign values in the Properties > Parameter section. For more information about supported formats for scheduling parameters, see Supported formats for scheduling parameters.

  • Create a table

    You can use the CREATE TABLE statement to create a non-partitioned table, partitioned table, external table, or clustered table. For more information, see CREATE TABLE. Sample SQL statement:

    -- Create a partitioned table named test1.
    CREATE TABLE if NOT EXISTS students
    ( id BIGINT,
      name STRING,
      age BIGINT,
      birth DATE)
    partitioned BY (gender STRING); 
  • Insert data

    You can use the INSERT INTO or INSERT OVERWRITE statement to insert or update data in a destination table. For more information, see Insert or overwrite data (INSERT INTO | INSERT OVERWRITE). Sample SQL statements:

    -- Insert data.
    INSERT INTO students PARTITION(gender='boy') VALUES (1,'John',15,DATE '2008-05-15') ;
    INSERT INTO students PARTITION(gender='boy') VALUES (2,'Jack',17,DATE '2006-07-20') ;
    INSERT INTO students PARTITION(gender='girl') VALUES (3,'Alice',20,DATE '2003-04-20') ;
    INSERT INTO students PARTITION(gender='girl') VALUES (4,'Lily',21,DATE '2002-01-08') ;
    INSERT INTO students PARTITION(gender='boy') VALUES (5,'Bob',17,DATE '2006-09-12') ;
    Important

    Avoid using the INSERT INTO statement to insert data, as it may cause accidental data duplication. We recommend that you use the INSERT OVERWRITE statement instead. For more information, see Insert or overwrite data.

  • Query data

    You can use the SELECT statement to perform operations such as nested queries, grouped queries, and sorting. For more information, see SELECT syntax. Sample SQL statements:

    -- (Optional) Enable full table scan at the project level. This operation requires high-level permissions.
    -- SETPROJECT odps.sql.allow.fullscan=true;  
    
    -- Enable full table scan at the session level. This setting is valid only for the current session.
    SET odps.sql.allow.fullscan=true; 
    -- Query the information about all boys and sort the results by ID in ascending order.
    SELECT * FROM students WHERE gender='boy' ORDER BY id;
    Note

    By default, a RAM user does not have query permissions on production tables. To query production tables, submit a request in Security Center. For information about MaxCompute data permission presets and access control in DataWorks, see MaxCompute data permission control details. For information about how to grant permissions by using commands in MaxCompute, see Manage user permissions by using commands.

SQL functions

MaxCompute supports built-in functions and user-defined functions (UDFs) for data development and analysis. For more information about built-in functions, see Overview of built-in functions. For more information about UDFs, see MaxCompute UDF overview. The following example shows how to use SQL functions.

  • Built-in functions: These functions are pre-installed in MaxCompute and can be called directly. Based on the preceding example, you can use the dateadd function to modify the birth column by a specified unit and interval. Sample command:

    -- Enable full table scan at the session level. This setting is valid only for the current session.
    SET odps.sql.allow.fullscan=true; 
    SELECT id, name, age, birth, dateadd(birth,1,'mm') AS birth_dateadd FROM students;
  • User-defined functions (UDFs): To use a UDF, you must write the function code, upload it as a resource, and register the function. For more information, see Create and use a UDF.

Execution and results

  • Run results appear directly in a spreadsheet format. You can perform operations in DataWorks, open the results in a spreadsheet, or copy and paste the content into a local Excel file. For more information, see Debug tasks.

    Note

    Due to an adjustment in the time zone information for China released by the International Organization for Standardization (ISO), a time discrepancy may appear in the date display when you execute relevant SQL queries in DataWorks. The time difference is 5 minutes and 52 seconds for dates between 1900 and 1928, and 9 seconds for dates before 1900.

  • Runtime Log: Click the Operational Logs tab to view the Logview. For more information, see View job information by using Logview V2.0.

  • Returned results:

    • Query the information about all boys and sort the results by ID in ascending order.

      +------------+------------+------------+------------+------------+
      | id         | name       | age        | birth      | gender     |
      +------------+------------+------------+------------+------------+
      | 1          | John       | 15         | 2008-05-15 | boy        |
      | 2          | Jack       | 17         | 2006-07-20 | boy        |
      | 5          | Bob        | 17         | 2006-09-12 | boy        |
      +------------+------------+------------+------------+------------+
    • Modify the birth column by a specified unit and interval.

      +------------+------------+------------+------------+---------------+
      | id         | name       | age        | birth      | birth_dateadd |
      +------------+------------+------------+------------+---------------+
      | 4          | Lily       | 21         | 2002-01-08 | 2002-02-08    |
      | 3          | Alice      | 20         | 2003-04-20 | 2003-05-20    |
      | 2          | Jack       | 17         | 2006-07-20 | 2006-08-20    |
      | 1          | John       | 15         | 2008-05-15 | 2008-06-15    |
      | 5          | Bob        | 17         | 2006-09-12 | 2006-10-12    |
      +------------+------------+------------+------------+---------------+

    On the query result page, you can sort the results by a specific field and download the results. For more information, see Process query results.

Edit code: Advanced examples

For more advanced examples of ODPS SQL tasks, see the following topics:

FAQ

  • Q: Why does my ODPS SQL task remain in the waiting state for a long time?

    A: ODPS SQL tasks require quota resources to run. If your task remains in the waiting state for a long time, go to the MaxCompute console to check the quota consumption and ensure that sufficient resources are available to run the task. For more information, see Computing Resources - Quota Management.

  • Q: Why does the error message "You have No privilege 'odps:xxxx' on xxxx" appear when I run an ODPS SQL task?

    A: The account used to run the ODPS SQL task must have the required permissions on the corresponding MaxCompute project. For more information, see MaxCompute permissions.