Create and manage MaxCompute tables by using the visualization method - DataWorks

Usage notes

Field deletion is not supported. Fields cannot be deleted from a table after it is created. Plan your schema carefully before committing.
Physical property changes require SQL. To modify physical properties of an existing table, run MaxCompute SQL commands directly. See Table operations. If a Resource Access Management (RAM) user encounters errors when running these commands, see Details of MaxCompute data permission control.
Metadata updates may lag. After creating, updating, or deleting a table, the change may not appear immediately in search results. If a table is missing or still visible after deletion, go to DataMap and refresh the table metadata to sync the changes.

Prerequisites

Before you begin, ensure that you have:

A MaxCompute computing resource bound to your workspace

Open DataStudio

Log on to the DataWorks console.
In the top navigation bar, select the region.
In the left-side navigation pane, choose Data Development and O&M > Data Development.
Select the workspace from the drop-down list and click Go to Data Development.

Create a MaxCompute table

Choose an entry point

DataWorks supports two entry points for creating a table:

Table Management — In the left navigation pane of DataStudio, select Table Management and click the icon. If Table Management is not visible, add it via Customize the interface.
A specific business process — Right-click the business process and choose Create Table > MaxCompute > Table. If the option is missing, verify that a MaxCompute computing resource is bound to the workspace. If no business process exists, create one first. See Business processes.

Configure the table

After creating a table, the table editing page opens. Configure the schema in DDL mode or visualization mode.

DDL mode

Click the ddl icon and enter a standard MaxCompute table creation statement. DataWorks parses the statement and populates the visualization fields automatically.

Important

The table name is set before you enter the editing page — do not change it in the DDL command, or an error will occur. DDL mode defines physical properties only; configure business properties (Display Name, Theme) in the visualization interface after the schema is populated.

Visualization mode

Follow the steps below to configure each section of the table.

Step 1: Configure general properties

Property	Description
Display name	The display name of the table in DataWorks.
Theme	Organizes the table into level-1 and level-2 folders for classification. Themes appear on the Table Management page for quick filtering. To create a theme, see Define table subjects.

Step 2: Design the physical model

Property	Description
Level	The data warehouse layer the table belongs to. Common layers: ODS (operational data store), DIM (dimension), DWD (data warehouse detail), DWS (data warehouse summary), and ADS (application data service). Assign the table to a layer that matches its role in your pipeline. For custom layers, see Layer management.
Category	The business classification of the table, such as basic business layer or advanced business layer. For custom categories, see Category management.
TTL	The table lifecycle — how long data is retained before automatic deletion. See Lifecycle and Lifecycle actions.
Partition type	Defines whether the table is a partitioned table or a non-partitioned table. See Partition.
Table type	Whether the table is an internal or external table. When you query data in an internal table, real data is imported, which results in faster queries. When you query data in an external table, you can query the data without importing it, which saves memory. See Tables.

Step 3: Define the schema

Property	Description
Field type	The data type of the field. Only MaxCompute-supported types are available. See Data type version guide.
Field security level	The sensitivity label for the field. This setting is visible only when label-based access control is enabled for the MaxCompute project. See Label-based access control.
Primary key field	A business-level primary key for management purposes. MaxCompute tables do not enforce physical primary keys.

Commit and publish the table

After defining the schema, commit the table to one or both environments.

The commit workflow differs by workspace mode. In basic mode, commit directly to the production environment. In standard mode, only users with the Workspace Manager or O&M role can publish to production. For role assignments, see Add workspace members and manage their roles and permissions.

Operation	What it does	What you can do after
Commit to development environment	Creates the table in the MaxCompute project for the development environment.	Run `desc tablename` on an ODPS SQL node to verify the schema. View the table in the MaxCompute development engine — see Bind a MaxCompute computing resource.
Load from development environment	Loads the committed development table info onto the current page, overwriting the table information on the current page.	Available only after committing to the development environment.
Commit to production environment	Creates the table in the MaxCompute project for the production environment.	Run `desc projectname.tablename` on an ODPS SQL node to verify the schema. View the table in the MaxCompute production engine — see Bind a MaxCompute computing resource.
Load from production environment	Loads the committed production table info onto the current page, overwriting the table information on the current page.	Available only after committing to the production environment.

Write and export data

Write data

Three methods are available:

Data integration task — Use a data integration task to sync data from external sources.
MaxCompute node task — Use a MaxCompute node task to insert data with SQL.
Upload data — Use the Upload data feature to import local files directly.

Export data

After querying data in DataStudio, click Download on the query result page to download records locally.

By default, downloads are limited to 10,000 records per operation. Tenant administrators and tenant security administrators can change this limit on the Data query and analysis control page in Security Center.

For exports exceeding 10,000 records, use the Tunnel command on the MaxCompute client.

Query data in a MaxCompute table

Use an ODPS SQL node or the ad hoc query feature to run SQL statements against a MaxCompute table.

Default permissions

The following applies to standard mode workspaces. Basic mode workspaces do not support fine-grained access control or data isolation between environments.

When a RAM user is added to a workspace, the preset permissions are:

Environment	Default access
Development	The RAM user's workspace role maps to a MaxCompute engine role in the development environment, granting access to the development project automatically.
Production	No access by default. Request permissions through Security Center. DataWorks provides a default approval flow; administrators can define custom flows.

For a complete reference, see Details of MaxCompute data permission control.

Data access behavior

MaxCompute supports cross-project table queries. Specify the project name in the SQL statement to access tables across projects.

In standard mode, tasks in the development environment run under the personal identity of the task executor by default. Tasks in the production environment run under the scheduling access identity (an Alibaba Cloud account). See Add a MaxCompute computing resource for details on execution accounts and environment bindings. To view binding information and execution accounts for each environment, see Resource Management.

SQL pattern	Development environment	Production environment
`select col1 from projectname_dev.tablename;`	Runs as the task executor's personal account.	Runs as the scheduling access identity.
`select col1 from projectname.tablename;`	Runs as the task executor's personal account. Personal accounts do not have production access by default — request permissions via Security Center.	Runs as the scheduling access identity.
`select col1 from tablename;`	Runs as the task executor's personal account, accessing the development engine.	Runs as the scheduling access identity, accessing the production engine.

View MaxCompute data assets

View all production tables

In the Tenant Tables pane of DataStudio, you can view all tables that are in the production environment of your Alibaba Cloud account in the current region.

View table metadata

Go to Data Map to view table details and metadata.

View table lineage

On the table details page in Data Map, view the upstream and downstream lineage of the table. To find which tasks in the workspace reference a specific table, use the code search feature.

Manage tables in batches

Go to the My Data page in Data Map to perform the following batch operations on tables you own:

Delete tables — Due to production data security controls, you cannot run deletion commands directly in the production environment. To delete a table owned by another user, request permissions via Security Center.
Change table owners — Transfer ownership of multiple tables in one operation.
Modify table lifecycles — Update the lifecycle for multiple tables at once. To modify the lifecycle of a single table, use Table Management. See Create a MaxCompute table.