All Products
Search
Document Center

DataWorks:SQL Query (new version)

Last Updated:Mar 10, 2026

Use SQL statements to quickly query and analyze data from sources such as MaxCompute, EMR Hive, and Hologres. This topic describes how to run SQL queries against data sources.

Important

This document applies to the new version of DataAnalysis. For the previous version, see SQL query (Old). You can switch between the new and old versions of DataAnalysis using the navigation bar.

Supported data sources

SQL query supports the following data source types: MaxCompute, Hologres, EMR, CDH, ADB for PostgreSQL, ADB for MySQL, ClickHouse, StarRocks, MySQL, PostgreSQL, Oracle, SQL Server, Doris, and SelectDB.

Note

Only MaxCompute supports both direct connections and data source connections. All other data source types support connections only to data sources that have been added to your workspace.

Data source permissions

Data source scope

  • Data source connection mode: You can select data only from data sources within a workspace where you have appropriate permissions. A workspace administrator must first add you to the workspace as a Data Analyst, Developer, O&M Engineer, or Workspace Administrator.

  • Direct connection mode: You can select only MaxCompute projects that your current Alibaba Cloud account has joined. For more information about MaxCompute project permissions, see MaxCompute users and permissions.

Data source access permissions

You can access data sources using one of two identity modes.

Access identity mode

Description

Supported data sources

Authorization

Executor identity

The identity of the Alibaba Cloud account currently logged on to DataWorks.

MaxCompute and Hologres.

Ask the administrator of the specified MaxCompute project or Hologres instance to grant you member access permissions.

Default access identity of the data source

The access identity configured when the data source was created.

All data sources supported by this feature.

If your current account is not the default access identity for the data source, ask a user with Workspace Administrator permissions to grant permissions to your current Alibaba Cloud account.

Important

If an IP address whitelist is enabled for the MaxCompute project, add the DataAnalysis whitelist to the MaxCompute project's IP address whitelist.

Feature access

Log on to the DataWorks DataAnalysis console, switch to the destination region, and click Go to DataAnalysis.

  • If you see Go to New DataAnalysis in the navigation bar, click it to switch to the new version.

  • If you see Back to Old DataAnalysis in the navigation bar, you are already on the new version page.

Create an SQL query

  1. Hover over Personal Folder > My Files and click image > New SQL File on the right.

    You can also click New Folder to create a custom folder structure for your SQL query files.
  2. On the SQL editor page, write your SQL query statement.

    Important

    You can also generate SQL statements in the following ways:

    • On the Data Catalog page, after you add a data catalog, find the target table, right-click it, and select Generate Query SQL.

    • Copy an SQL query shared with you from the Shared Files folder.

    • After editing the SQL, click the Format button at the top to format the code.

    • SQL query supports Copilot. Click the Copilot icon (image) in the upper-right corner of the navigation bar to use features such as code generation and code correction to help you complete your code. Copilot also supports automatic code completion in the editor to improve your writing efficiency.

    Example SQL

    The following SQL statement queries the public GitHub event data in MaxCompute:

    -- Enable session-level schema syntax.
    SET odps.namespace.schema=true; 
    -- Query 100 rows from the dwd_github_events_odps table.
    SELECT * FROM bigdata_public_dataset.github_events.dwd_github_events_odps WHERE ds='${dt}' LIMIT 100;
    Note
    • When you query a MaxCompute project that has schema syntax enabled, add SET odps.namespace.schema=true; before the query statement. This enables session-level schema syntax and prevents query failures.

    • By default, queries run on the data source specified in the current Run Settings. To query other MaxCompute projects that you have joined, you must explicitly specify the project name in the query statement. For example, if the computing resource in the run settings is MaxCompute A, and the query statement specifies MaxCompute B (SELECT * FROM B.schema_name.table_name WHERE ****), the system uses A as the execution engine to query data in B.

  3. After you write the SQL query code, click Run Settings on the right. Configure the Data Source, Script Parameters, and other settings for the SQL query.

    image

    • Type: Select the target data source type for the SQL query as needed.

    • Computing Resource: Configure the target data source for the SQL query. Only MaxCompute queries support direct connections or data source connections. Other data source types support querying data only from data sources within workspaces where you have permissions. For more information about permissions, see Data source permissions.

    • Script Parameters: If your SQL query contains parameter variables, assign values to them here at runtime.

      Important

      If parsing fails, go to Management > Settings in the lower-left corner. Search for parsing and enable the Enable dataworks Isp code parameter function configuration item.

  4. In the toolbar at the top of the SQL Editor page, click Run. Click the drop-down icon image to the right of the Run button to switch the run mode. Select a mode based on your scenario.

    Run mode

    User scenario

    Trigger condition

    Applicable engine

    Query mode (LIMIT 10000)

    Quickly preview data and validate query logic. Suitable for preliminary data exploration where you only need to see a small sample of results.

    Query result displays ≤ 10,000 rows and ≤ 10 MB

    No limit

    Demand mode (full data)

    Get the complete result set for analysis or export. The system automatically triggers this mode when you need to process and view all data.

    Query result displays > 10,000 rows or > 10 MB

    No limit

    Temporary table mode

    Reuse results in multi-step complex queries. Use the output of one query as the input for the next to improve development and debugging efficiency.

    Query result displays ≤ 10,000 rows and ≤ 10 MB, and is automatically written to a temporary table

    MaxCompute only

  5. After the SQL execution completes, you can view the Operational Log, Results, and the corresponding SQL content on the query results page.

    Click the button in the upper-right corner of the query results to switch the page layout between side-by-side and top-and-bottom views.

Visualize query results

In the toolbar on the left of the query results, click the image button to automatically generate a visual chart from the results.

Note

image

Export and share

Important

To export data to a local file and then import it into another data source, use an offline sync task in Data Integration. This method provides more efficient and stable data migration and synchronization.

On the right side of the SQL query results, click Export. The following export options are supported:

  • Local File: Downloads the query results to your local computer as a CSV file. Key points are as follows:

    Item

    Description

    Download limits

    • Only the MaxCompute and EMR engines are supported. For more information, see Data download limits.

    • If the data protection mechanism is enabled for the MaxCompute project (which prohibits data downloads), downloading data through DataAnalysis will fail.

    Download scope

    You can choose to download Only data displayed in the table or All data.

    • Only data displayed in the table: Downloads only the data displayed on the current page, with a default maximum of 10000 rows.

    • All data: Exports all query results within the export limit.

    Download method

    Supports Download with Approval and Download without Approval.

    • Download with Approval: Allows you to identify risks in data download operations by setting Fraud Detection rules. When downloading data, you must submit a download approval request to ensure data compliance and security.

      Note

      Only DataWorks Enterprise Edition supports setting and enabling Fraud Detection rules.

    • Download without Approval: This is the default method. No permission request is required during the download process.

  • Object Storage Service (OSS): Exports the query results to an Alibaba Cloud Object Storage Service (OSS) bucket in a specified format, such as CSV or Parquet. This option is suitable for archiving large data volumes or integrating with other cloud products.

    The first time you use this feature, grant DataWorks permission to access your OSS resources. In the File Path drop-down list, click the one-click authorization link in the prompt and follow the on-screen instructions to complete the RAM authorization.

    Configuration Item

    Description

    File path

    Click the folder button on the right to select the OSS Bucket and directory where you want to store the result file.

    File name

    The system automatically generates a file name. You can also modify it manually.

    Text Type

    Select the file format for export. Supported formats include csv, text, orc, and parquet.

    Separator

    Specify the delimiter between columns. The default is a comma (,).

    Encoding format

    Select the encoding format for the file, such as UTF-8 or GBK.

    CU

    Configure the number of computing units (CUs) for this export task. The default value is 1 CU.

    Resource group

    Select the Serverless resource group to run this export task. If you do not select one, the data integration resource group set in DataAnalysis > System Administration is used by default.

    After completing the configuration, click OK to start the export task. You can view the export progress, operational log, and configuration details on the task running page. After the task succeeds, go to the OSS console to download the exported object file to your local computer.

  • DingTalk Sheet: You can export results to a DingTalk Sheet.

  • Workbook/Workbook and Share: You can save the results to a workbook for further analysis of query data. You can also share the latest analysis results from the workbook with others.