DataWorks provides a visualization interface for creating and managing MaxCompute tables — no DDL required. Configure table schema, physical properties, and lifecycle through the interface, then publish to development and production environments in a single workflow.
Usage notes
-
Field deletion is not supported. Fields cannot be deleted from a table after it is created. Plan your schema carefully before committing.
-
Physical property changes require SQL. To modify physical properties of an existing table, run MaxCompute SQL commands directly. See Table operations. If a Resource Access Management (RAM) user encounters errors when running these commands, see Details of MaxCompute data permission control.
-
Metadata updates may lag. After creating, updating, or deleting a table, the change may not appear immediately in search results. If a table is missing or still visible after deletion, go to DataMap and refresh the table metadata to sync the changes.
Prerequisites
Before you begin, ensure that you have:
-
A MaxCompute computing resource bound to your workspace
Open DataStudio
-
Log on to the DataWorks console.
-
In the top navigation bar, select the region.
-
In the left-side navigation pane, choose Data Development and O&M > Data Development.
-
Select the workspace from the drop-down list and click Go to Data Development.
Create a MaxCompute table
Choose an entry point
DataWorks supports two entry points for creating a table:
-
Table Management — In the left navigation pane of DataStudio, select Table Management and click the
icon. If Table Management is not visible, add it via Customize the interface. -
A specific business process — Right-click the business process and choose Create Table > MaxCompute > Table. If the option is missing, verify that a MaxCompute computing resource is bound to the workspace. If no business process exists, create one first. See Business processes.
Configure the table
After creating a table, the table editing page opens. Configure the schema in DDL mode or visualization mode.
DDL mode
Click the
icon and enter a standard MaxCompute table creation statement. DataWorks parses the statement and populates the visualization fields automatically.
The table name is set before you enter the editing page — do not change it in the DDL command, or an error will occur. DDL mode defines physical properties only; configure business properties (Display Name, Theme) in the visualization interface after the schema is populated.
Visualization mode
Follow the steps below to configure each section of the table.
Step 1: Configure general properties
| Property | Description |
|---|---|
| Display name | The display name of the table in DataWorks. |
| Theme | Organizes the table into level-1 and level-2 folders for classification. Themes appear on the Table Management page for quick filtering. To create a theme, see Define table subjects. |
Step 2: Design the physical model
| Property | Description |
|---|---|
| Level | The data warehouse layer the table belongs to. Common layers: ODS (operational data store), DIM (dimension), DWD (data warehouse detail), DWS (data warehouse summary), and ADS (application data service). Assign the table to a layer that matches its role in your pipeline. For custom layers, see Layer management. |
| Category | The business classification of the table, such as basic business layer or advanced business layer. For custom categories, see Category management. |
| TTL | The table lifecycle — how long data is retained before automatic deletion. See Lifecycle and Lifecycle actions. |
| Partition type | Defines whether the table is a partitioned table or a non-partitioned table. See Partition. |
| Table type | Whether the table is an internal or external table. When you query data in an internal table, real data is imported, which results in faster queries. When you query data in an external table, you can query the data without importing it, which saves memory. See Tables. |
Step 3: Define the schema
| Property | Description |
|---|---|
| Field type | The data type of the field. Only MaxCompute-supported types are available. See Data type version guide. |
| Field security level | The sensitivity label for the field. This setting is visible only when label-based access control is enabled for the MaxCompute project. See Label-based access control. |
| Primary key field | A business-level primary key for management purposes. MaxCompute tables do not enforce physical primary keys. |
Commit and publish the table
After defining the schema, commit the table to one or both environments.
The commit workflow differs by workspace mode. In basic mode, commit directly to the production environment. In standard mode, only users with the Workspace Manager or O&M role can publish to production. For role assignments, see Add workspace members and manage their roles and permissions.
| Operation | What it does | What you can do after |
|---|---|---|
| Commit to development environment | Creates the table in the MaxCompute project for the development environment. | Run desc tablename on an ODPS SQL node to verify the schema. View the table in the MaxCompute development engine — see Bind a MaxCompute computing resource. |
| Load from development environment | Loads the committed development table info onto the current page, overwriting the table information on the current page. | Available only after committing to the development environment. |
| Commit to production environment | Creates the table in the MaxCompute project for the production environment. | Run desc projectname.tablename on an ODPS SQL node to verify the schema. View the table in the MaxCompute production engine — see Bind a MaxCompute computing resource. |
| Load from production environment | Loads the committed production table info onto the current page, overwriting the table information on the current page. | Available only after committing to the production environment. |
Write and export data
Write data
Three methods are available:
-
Data integration task — Use a data integration task to sync data from external sources.
-
MaxCompute node task — Use a MaxCompute node task to insert data with SQL.
-
Upload data — Use the Upload data feature to import local files directly.
Export data
After querying data in DataStudio, click Download on the query result page to download records locally.
By default, downloads are limited to 10,000 records per operation. Tenant administrators and tenant security administrators can change this limit on the Data query and analysis control page in Security Center.
For exports exceeding 10,000 records, use the Tunnel command on the MaxCompute client.
Query data in a MaxCompute table
Use an ODPS SQL node or the ad hoc query feature to run SQL statements against a MaxCompute table.
Default permissions
The following applies to standard mode workspaces. Basic mode workspaces do not support fine-grained access control or data isolation between environments.
When a RAM user is added to a workspace, the preset permissions are:
| Environment | Default access |
|---|---|
| Development | The RAM user's workspace role maps to a MaxCompute engine role in the development environment, granting access to the development project automatically. |
| Production | No access by default. Request permissions through Security Center. DataWorks provides a default approval flow; administrators can define custom flows. |
For a complete reference, see Details of MaxCompute data permission control.
Data access behavior
MaxCompute supports cross-project table queries. Specify the project name in the SQL statement to access tables across projects.
In standard mode, tasks in the development environment run under the personal identity of the task executor by default. Tasks in the production environment run under the scheduling access identity (an Alibaba Cloud account). See Add a MaxCompute computing resource for details on execution accounts and environment bindings. To view binding information and execution accounts for each environment, see Resource Management.
| SQL pattern | Development environment | Production environment |
|---|---|---|
select col1 from projectname_dev.tablename; |
Runs as the task executor's personal account. | Runs as the scheduling access identity. |
select col1 from projectname.tablename; |
Runs as the task executor's personal account. Personal accounts do not have production access by default — request permissions via Security Center. | Runs as the scheduling access identity. |
select col1 from tablename; |
Runs as the task executor's personal account, accessing the development engine. | Runs as the scheduling access identity, accessing the production engine. |
View MaxCompute data assets
View all production tables
In the Tenant Tables pane of DataStudio, you can view all tables that are in the production environment of your Alibaba Cloud account in the current region.
View table metadata
Go to Data Map to view table details and metadata.
View table lineage
On the table details page in Data Map, view the upstream and downstream lineage of the table. To find which tasks in the workspace reference a specific table, use the code search feature.
Manage tables in batches
Go to the My Data page in Data Map to perform the following batch operations on tables you own:
-
Delete tables — Due to production data security controls, you cannot run deletion commands directly in the production environment. To delete a table owned by another user, request permissions via Security Center.
-
Change table owners — Transfer ownership of multiple tables in one operation.
-
Modify table lifecycles — Update the lifecycle for multiple tables at once. To modify the lifecycle of a single table, use Table Management. See Create a MaxCompute table.