DataWorks connects MaxCompute to a managed workspace, giving you two ways to run SQL and build data pipelines:
- DataAnalysis (SQL query): Write and run SQL interactively, analyze results with Web Excel, and share or download data online.
- DataStudio (ODPS nodes): Build scheduled pipelines using typed nodes (ODPS SQL, Spark, PyODPS, and more), then commit them to Operation Center for periodic scheduling.
Choose your path
| Path | Best for |
|---|---|
| DataAnalysis SQL query | Ad-hoc queries, result analysis, and quick data sharing |
| DataStudio ODPS nodes | Scheduled pipelines with dependencies, complex transformations, and multi-node workflows |
Prerequisites
Before you begin, ensure that you have:
- A DataWorks workspace. If you don't have one, create a DataWorks workspace.
- MaxCompute computing resources associated with the workspace, or a MaxCompute data source bound to DataStudio. The setup step depends on whether you enabled Participate in Public Preview of Data Studio when you created the workspace.
Associate MaxCompute with a DataWorks workspace
The setup differs based on whether you enabled Participate in Public Preview of Data Studio at creation time.
To check which version your workspace uses, go to the Workspaces page, find your workspace, and look at its Actions column:
| If the Actions column shows... | Your workspace uses... |
|---|---|
| Shortcuts > Data Development | DataStudio (old version) |
| Shortcuts > DataStudio (new version) | Data Studio (new version) |
If Participate in Public Preview of Data Studio is turned on:
Associate MaxCompute computing resources directly with the workspace. For details, see Associate a computing resource.
If Participate in Public Preview of Data Studio is not turned on:
Create a MaxCompute data source and bind it to DataStudio (old version). For details, see Add a data source or register a cluster.
If you create a MaxCompute data source but don't bind it to DataStudio (old version), only data synchronization is available. Data development, task scheduling, and data analysis require the data source to be bound.
Use DataAnalysis for SQL queries
DataAnalysis lets you run MaxCompute SQL interactively, analyze results with Web Excel, and download or share data, all from the browser.
To open the SQL Query page, use any of these entry points:
- In the MaxCompute console, click Data Analytics in the left navigation pane. On the DataAnalysis page, click SQL Query.
- On the DataAnalysis homepage, click SQL Query in the Shortcuts section.
- In the left navigation pane of the DataAnalysis page, click SQL Query.
For more information about creating and running SQL queries, see SQL query (Legacy).
Use DataStudio for scheduled pipelines
DataStudio wraps MaxCompute jobs into typed nodes that you schedule and manage through Operation Center. Supported ODPS node types include ODPS SQL, Spark, and PyODPS, among others.
For each node type, configure time properties and scheduling dependencies, then commit the node to Operation Center for periodic scheduling.
To create an ODPS node, see Overview.
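As an illustration of what a scheduled node might run, the following is a minimal sketch of logic you could place in a PyODPS node. It assumes a hypothetical table `sales_orders` partitioned by a hypothetical column `ds` in `yyyymmdd` format; neither name comes from this page. The helper accepts any object that exposes PyODPS-style `execute_sql()` and `open_reader()` methods, so it does not import the SDK itself.

```python
from datetime import datetime


def build_daily_count_sql(table: str, bizdate: str) -> str:
    """Build a MaxCompute SQL statement that counts rows in one daily partition.

    The partition column name `ds` is an assumption made for illustration.
    """
    # Accept only simple identifiers so the table name can be interpolated safely.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    # Require exactly eight digits, then check that they form a real calendar date.
    if len(bizdate) != 8 or not bizdate.isdigit():
        raise ValueError(f"bizdate must be yyyymmdd, got {bizdate!r}")
    datetime.strptime(bizdate, "%Y%m%d")
    return f"SELECT COUNT(*) AS row_cnt FROM {table} WHERE ds = '{bizdate}'"


def count_partition_rows(entry, table: str, bizdate: str) -> int:
    """Run the count query through a PyODPS-style entry object and return the result."""
    # execute_sql blocks until the job finishes and returns an instance object.
    instance = entry.execute_sql(build_daily_count_sql(table, bizdate))
    with instance.open_reader() as reader:
        for record in reader:
            # The query returns one row with one column: the row count.
            return record[0]
    return 0
```

Inside a PyODPS node, a call such as `count_partition_rows(o, "sales_orders", args["bizdate"])` could tie the query to the scheduled run. This assumes the node exposes the MaxCompute entry object as `o` and scheduling parameters through `args`, and that a `bizdate` parameter has been configured for the node; verify both against your workspace before relying on them.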
What's next
- SQL query (Legacy): Full reference for the DataAnalysis SQL query feature
- Data Development (Data Studio) (New): Guide to the new-version Data Studio
- Overview (old-version DataStudio): Guide to the old-version DataStudio