To develop MaxCompute tasks in DataWorks, associate your MaxCompute project with a DataWorks workspace as a compute resource for data synchronization, development, and analysis.
Limitations
-
Region limits: China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Chengdu), China (Hong Kong), Japan (Tokyo), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Germany (Frankfurt), UK (London), US (Silicon Valley), and US (Virginia).
-
The MaxCompute project and DataWorks workspace must be in the same region and under the same Alibaba Cloud account.
-
Permissions:
| Product | Operator | Permissions |
| --- | --- | --- |
| DataWorks | Alibaba Cloud account | No additional permissions are required. |
| DataWorks | RAM user/RAM role | Only workspace members with the O&M or Workspace Administrator role, or members with the AliyunDataWorksFullAccess permission, can create compute resources. You can grant permissions to workspace members. |
| MaxCompute | RAM user/RAM role | To bind a compute resource: You must have the odps:ListProjects permission for MaxCompute and the Super_Administrator role for the target MaxCompute project. Default access identity requirements: This identity must have the admin or super_administrator role for the MaxCompute project. After association, the platform grants this identity the Role_Project_Scheduler role for the production project. The default access identity for the production environment owns all production data. Other accounts must request access through Security Center. |
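The odps:ListProjects requirement above can be expressed as a RAM policy document. The following is a minimal sketch, assuming the usual RAM policy JSON layout (Version, Statement, Effect, Action, Resource). The function name and the wildcard Resource value are illustrative assumptions, not taken from this page. The Super_Administrator role is a MaxCompute project role and is not covered by this sketch.

```python
import json


def build_list_projects_policy() -> dict:
    """Return a RAM policy document that allows the odps:ListProjects action."""
    return {
        "Version": "1",
        "Statement": [
            {
                "Effect": "Allow",
                # Action name taken from the permissions table above.
                "Action": ["odps:ListProjects"],
                # Assumption: a wildcard resource is used for simplicity;
                # narrow it to match your own security requirements.
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be reviewed before use.
    print(json.dumps(build_list_projects_policy(), indent=2))
```

Review the printed document against your organization's access rules before attaching it to a RAM user or RAM role.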
Prerequisites
-
MaxCompute is activated and a project is created in the same region as your DataWorks workspace.
-
A DataWorks workspace is created, and your RAM user has the Workspace Administrator role.
Note: DataWorks supports Simple mode and Standard mode. Understand the differences between Simple mode and Standard mode before creating a workspace.
-
A resource group is associated with the workspace and network connectivity is verified.
-
If you use a serverless resource group, ensure network connectivity between the MaxCompute compute resource and the serverless resource group.
-
If you use a legacy exclusive resource group, ensure network connectivity between the MaxCompute compute resource and the exclusive resource group for scheduling.
New Data Studio: Associate a compute resource
Associate a MaxCompute compute resource with a workspace that has Use Data Studio (New Version) enabled.
Go to the compute resource page
-
Log on to the DataWorks console and switch to the target region. In the left navigation bar, click . From the drop-down list, select the corresponding workspace and click Go to Management Center.
-
In the left-side navigation pane, click Computing Resources to open the Compute resource page.
Associate the MaxCompute compute resource
On the Compute resource page, associate a MaxCompute compute resource.
-
Select the compute resource type.
-
Click Associate Computing Resources, and the Associate Computing Resources page opens.
-
On the Associate Computing Resources page, select MaxCompute as the compute resource type to go to the Associate MaxCompute Computing Resource configuration page.
-
Configure the MaxCompute compute resource.
On the Associate MaxCompute Computing Resource configuration page, configure the following parameters.
Parameter
Description
MaxCompute Project
Select the MaxCompute project to associate. You can create an internal MaxCompute project or create an external MaxCompute project, then select the new project.
Note:
-
In Standard mode, select separate MaxCompute projects for the production and development environments.
-
If the target project is not listed, grant the Super_Administrator role for the project to the current account.
Default Access Identity
The identity used to access the MaxCompute project from this workspace.
-
Development environment: Only the Executor identity is supported.
-
Production environment: The Alibaba Cloud Account, Alibaba Cloud RAM Sub-account, and RAM role identities are supported.
Note:
-
Only an Alibaba Cloud account or a principal with AdministratorAccess can select all access identities.
-
The default access identity for the production environment owns all production data. Other accounts must request access through Security Center. For details, see MaxCompute data access control and Approval Center.
Endpoint
The endpoint for accessing the MaxCompute project, including the service endpoint and Tunnel endpoint for data transfers.
-
Auto Fit: DataWorks automatically adapts to your configurations. Recommended.
-
Custom Configuration: Manually configure the MaxCompute and Tunnel endpoints, which vary by region.
Computing Resource Instance Name
A unique name for the compute resource, used at runtime to select it for tasks.
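The constraints in the parameter table can be checked before you open the console. The following is a minimal sketch, assuming a hypothetical `Association` record that you fill in by hand. None of the class, field, or function names come from DataWorks, and the region and project values in the usage example are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional

# Identity labels as they appear in the parameter table above.
PROD_IDENTITIES = {"Alibaba Cloud Account", "Alibaba Cloud RAM Sub-account", "RAM role"}
DEV_IDENTITY = "Executor"


@dataclass
class Association:
    """Hypothetical record of the values you plan to enter in the console."""
    workspace_mode: str               # "simple" or "standard"
    workspace_region: str
    prod_project: str
    prod_project_region: str
    prod_identity: str
    dev_project: Optional[str] = None
    dev_identity: Optional[str] = None


def validate(a: Association) -> List[str]:
    """Return a list of problems; an empty list means no rule was violated."""
    problems: List[str] = []
    if a.prod_project_region != a.workspace_region:
        problems.append("project and workspace must be in the same region")
    if a.prod_identity not in PROD_IDENTITIES:
        problems.append("unsupported production access identity")
    if a.workspace_mode == "standard":
        if not a.dev_project:
            problems.append("Standard mode needs a development project")
        elif a.dev_project == a.prod_project:
            problems.append("Standard mode needs separate development and production projects")
        if a.dev_identity != DEV_IDENTITY:
            problems.append("development environment supports only the Executor identity")
    return problems


if __name__ == "__main__":
    plan = Association("standard", "cn-hangzhou", "sales_prod", "cn-hangzhou",
                       "RAM role", "sales_dev", "Executor")
    print(validate(plan))
```

The check only mirrors the rules stated in the table. It does not contact DataWorks or MaxCompute, so it cannot confirm that a project exists or that your account holds the required role.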
-
Test the connectivity.
Select the resource group for running MaxCompute tasks and click Test Connectivity to verify access. For details, see Network connection solutions.
Note: If no resource groups are available, add and associate a serverless resource group first, then test connectivity on the Computing Resources page.
-
Click Confirm to complete the association.
Note:
-
After association, a MaxCompute data source with the same name is automatically created on the Data Sources page.
-
After association, the platform authorizes the access identity by adding it to the MaxCompute project with the required permissions. A connectivity test may fail with a permission error until authorization completes. If this happens, save the resource and retry after a few moments.
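If you script a connectivity check of your own, the "retry after a few moments" advice can be sketched as a bounded retry loop. This is an illustrative helper only: the function name, the default attempt count, and the default delay are assumptions, and the `test` callable stands in for whatever check you run.

```python
import time
from typing import Callable


def retry_until_authorized(
    test: Callable[[], bool],
    attempts: int = 5,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `test` up to `attempts` times, pausing between failed tries.

    Returns True as soon as `test` succeeds, or False if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        if test():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < attempts:
            sleep(delay_seconds)
    return False


if __name__ == "__main__":
    # Stand-in check that succeeds on the third call.
    state = {"calls": 0}

    def fake_check() -> bool:
        state["calls"] += 1
        return state["calls"] >= 3

    print(retry_until_authorized(fake_check, delay_seconds=0.1))
```

A fixed delay keeps the sketch short. If authorization takes longer in your environment, raise the attempt count or the delay rather than retrying in a tight loop.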
Legacy Data Studio: Associate a compute resource
Associate a MaxCompute compute resource with a workspace that does not have Use Data Studio (New Version) enabled.
Go to the compute resource page
Log on to the DataWorks console. In the target region, click in the left-side navigation pane. Select a workspace from the drop-down list and click Go to Data Development.
-
In the left-side navigation pane, click the icon to open the Computing Resources page.
Associate the MaxCompute compute resource
On the compute resource page, associate a MaxCompute compute resource.
-
Select the compute resource type.
-
Click Create Computing Resource to open the Create Computing Resource page.
-
On the Create Computing Resource page, select MaxCompute as the compute resource type to go to the Create Computing Resource configuration page.
-
Configure the MaxCompute compute resource.
On the Create Computing Resource configuration page, configure the following parameters.
Parameter
Description
Authentication Method
New compute resources support only Alibaba Cloud account or RAM role authentication.
Alibaba Cloud Account
Only MaxCompute projects under the Current Alibaba Cloud Account can be associated with this workspace.
MaxCompute Project Name
Select the MaxCompute project to associate. If none is available, create a MaxCompute project.
Note:
-
In Standard mode, select separate MaxCompute projects for the production and development environments.
-
If the target project is not listed, grant the Super_Administrator role for the project to the current account.
Region
The MaxCompute project region. Must match the workspace region.
Default Access Identity
The identity used to access the compute resource from this workspace.
-
Development environment: Only the Executor identity is supported.
-
Production environment: Alibaba Cloud accounts, RAM users, and RAM roles are supported.
Note:
-
Only an Alibaba Cloud account or a principal with AdministratorAccess can select all access identities.
-
The default access identity for the production environment owns all production data. Other accounts must request access through Security Center. For details, see MaxCompute data access control and Approval Center.
Endpoint
The endpoint for accessing the MaxCompute project, including the service endpoint and Tunnel endpoint for data transfers.
-
Auto Fit: DataWorks automatically adapts to your configurations. Recommended.
-
Custom Configuration: Manually configure the MaxCompute and Tunnel endpoints, which vary by region.
-
Test the connectivity.
Select the resource group for running MaxCompute tasks and click Test Connectivity to verify access. For details, see Network connection solutions.
Note: If no resource groups are available, add and associate a serverless resource group first, then test connectivity on the Computing Resources page.
-
Click Create and Associate Computing Resource with DataStudio to complete the association.
Note:
-
After association, a MaxCompute data source with the same name is automatically created on the Data Sources page.
-
After association, the platform authorizes the access identity by adding it to the MaxCompute project with the required permissions. A connectivity test may fail with a permission error until authorization completes. If this happens, save the resource and retry after a few moments.
What to do next
After association, DataWorks automatically creates a MaxCompute data source in the workspace. Use it in Data Integration and in a database node (New Data Studio) or database node (Legacy Data Studio).
FAQ
-
Issue: When running a task on a MaxCompute compute resource, the following error occurs: connect timed out, the possible reason is that the endpoint `http://service.odps.aliyun.com/api` is wrong, please check your endpoint
Solution: Verify the endpoint configuration. Use the VPC endpoint for the resource's region.
-
Issue: When testing connectivity, the following error occurs: You have NO privilege 'odps:Read' on {acs:odps:*:projects/xxx}
Solution: Check the MaxCompute project status. If the project is frozen or suspended, restore it in the MaxCompute console.
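For the connect timed out case, a quick way to tell a network problem from a configuration problem is to check whether the endpoint host accepts a TCP connection at all. The following is a minimal sketch using only the Python standard library. The function name is an assumption, and the URL in the usage example is the one quoted in the error message, not a recommended endpoint.

```python
import socket
from urllib.parse import urlparse


def endpoint_reachable(endpoint: str, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the endpoint's host and port succeeds."""
    parsed = urlparse(endpoint)
    host = parsed.hostname
    if host is None:
        raise ValueError(f"endpoint has no host: {endpoint!r}")
    # Fall back to the default port for the URL scheme when none is given.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Endpoint copied from the error message in the first FAQ entry.
    print(endpoint_reachable("http://service.odps.aliyun.com/api"))
```

Run the check from a machine on the same network as the resource group, otherwise the result says little about what the task sees. A successful connection only shows that the host is reachable. It does not confirm that the endpoint is the right one for the project or that the access identity has permission.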