MaxCompute: DataWorks

Last Updated: Mar 28, 2025

DataWorks serves as a unified end-to-end big data development and governance platform based on compute engines such as MaxCompute. This topic describes how to use MaxCompute in DataWorks.

Background information

DataWorks allows you to associate compute engines with a DataWorks workspace. After you associate a compute engine with a workspace as a compute engine instance, you can create nodes of that compute engine type in the DataWorks console and have the system schedule the nodes periodically. You can use one of the following methods to connect DataWorks to MaxCompute:

  • Use the SQL query feature of DataAnalysis

    You can use this feature to perform operations such as editing MaxCompute SQL statements, querying data, analyzing data by using workbooks, and sharing and downloading data online. For more information about the SQL query feature, see SQL query.

  • Use ODPS nodes in DataStudio

    DataWorks encapsulates different types of compute engine tasks into different types of nodes to define data development tasks. You can use resources, functions, and related logic processing nodes to develop more complex tasks. ODPS nodes include ODPS SQL nodes, ODPS Spark nodes, PyODPS 2 nodes, PyODPS 3 nodes, ODPS Script nodes, and ODPS MR nodes.
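To illustrate what the body of a PyODPS 3 node might look like, here is a minimal sketch. It assumes that the node receives a PyODPS entry object (DataWorks exposes one as the global `o` in PyODPS nodes) and relies on the PyODPS calls `execute_sql` and `open_reader`. The helper name `count_rows` and the table name `my_table` are hypothetical.

```python
def count_rows(o, table_name):
    """Return the row count of a MaxCompute table.

    `o` is assumed to be a PyODPS entry object, such as the global `o`
    that DataWorks provides inside PyODPS nodes.
    """
    # execute_sql submits the statement and blocks until the job finishes.
    instance = o.execute_sql(f"SELECT COUNT(*) FROM {table_name}")
    # open_reader exposes the result set as an iterable of records.
    with instance.open_reader() as reader:
        for record in reader:
            return record[0]
    return None


# Inside a PyODPS node (hypothetical table name):
# print(count_rows(o, "my_table"))
```

Passing the entry object as a parameter keeps the helper easy to exercise outside DataWorks with a stand-in object. The table name is interpolated into the SQL text as-is, so it should come from a trusted source.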

Scenarios

Use scenarios of DataAnalysis

You can use the SQL query feature of DataAnalysis to query data and then analyze the query results with Web Excel in analysis mode. To reduce the frequency at which data is transferred and ensure data security, you can also download the query results to your on-premises machine for analysis.

Use scenarios of ODPS nodes

If you want to run a MaxCompute job periodically, you can use Data Studio in the DataWorks console to develop an auto-triggered node for the job and configure settings such as time properties and scheduling dependencies for the node. Then, you can commit the node to DataWorks Operation Center for periodic scheduling.
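As a rough sketch of how a periodically scheduled node might select the data for each run, the following example assumes the common convention that a daily run processes the previous day's data, identified by a date in yyyymmdd format. The helper names, the table name `my_table`, and the partition column `ds` are hypothetical, and the scheduling parameter name `bizdate` is an assumption about how the node is configured.

```python
from datetime import datetime, timedelta


def data_timestamp(scheduled_time):
    """Return the day before the scheduled run, formatted as yyyymmdd."""
    return (scheduled_time - timedelta(days=1)).strftime("%Y%m%d")


def build_partition_query(table_name, bizdate):
    """Build a query that reads one daily partition (column `ds` is assumed)."""
    # Accept only eight digits that also parse as a calendar date.
    if len(bizdate) != 8 or not bizdate.isdigit():
        raise ValueError(f"expected yyyymmdd, got {bizdate!r}")
    datetime.strptime(bizdate, "%Y%m%d")
    return f"SELECT * FROM {table_name} WHERE ds = '{bizdate}'"


# Inside a PyODPS node, assuming a scheduling parameter named `bizdate`
# is configured for the node and passed in through the `args` dictionary:
# sql = build_partition_query("my_table", args["bizdate"])
```

Keeping the date logic in small functions makes it possible to check the partition selection locally before the node is committed for scheduling.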

Procedure

  1. Create a DataWorks workspace.

  2. Associate MaxCompute computing resources with the workspace or create a MaxCompute data source.

    The subsequent operations depend on whether Participate in Public Preview of Data Studio was turned on when the workspace was created.

    Find the desired workspace on the Workspaces page in the DataWorks console and perform one of the following operations to check whether Participate in Public Preview of Data Studio is turned on:

    • Participate in Public Preview of Data Studio not turned on

      Choose Shortcuts > Data Development in the Actions column.

      The old-version DataStudio page appears, as shown in the following figure.

      image

      For more information about old-version DataStudio, see Overview.

    • Participate in Public Preview of Data Studio turned on

      Choose Shortcuts > DataStudio (new version) in the Actions column.

      The new-version Data Studio page appears, as shown in the following figure.

      image

      For more information about new-version Data Studio, see Data Studio (new version).

  3. Use MaxCompute in DataWorks.

    • Use DataAnalysis

      You can use one of the following methods to go to the SQL Query page in DataAnalysis:

      • In the left-side navigation pane of the MaxCompute console, click Data Analytics. On the DataAnalysis page in the DataWorks console, click SQL Query. The SQL Query page appears.

      • In the Shortcuts section on the homepage of DataAnalysis, click SQL Query. The SQL Query page appears.

      • In the left-side navigation pane of the DataAnalysis page, click SQL Query to go to the SQL Query page.

      For more information about how to perform operations such as creating SQL queries and executing query statements, see SQL query.

    • Use ODPS nodes

      For information about how to create an ODPS node, see Overview.