To use DataWorks for developing and managing MaxCompute tasks, you must add your MaxCompute project as a DataWorks computing resource. After you add the project, you can use this computing resource in various DataWorks modules to connect to the MaxCompute project. This lets you perform operations such as data synchronization, data development, and data analysis.
Limits
Region restrictions: This feature is available only in the following regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Chengdu), China (Hong Kong), Japan (Tokyo), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Germany (Frankfurt), UK (London), US (Silicon Valley), and US (Virginia).
You can add a MaxCompute project as a computing resource only if the project and the DataWorks workspace are in the same region and belong to the same Alibaba Cloud account.
Permission requirements:
Product
Operator
Required permissions
DataWorks
Alibaba Cloud account
No extra permissions are required.
RAM user/RAM role
Only workspace members with the O&M and Workspace Administrator roles, or with the AliyunDataWorksFullAccess permission, can create computing resources. You can grant permissions to members of the workspace.
MaxCompute
RAM user/RAM role
When you attach computing resources: You must have the odps:ListProjects permission for MaxCompute, and the Super_Administrator permission for the target MaxCompute project.
When used as the default access identity: You must have the Admin or Super_Administrator permission for the MaxCompute project. After the computing resource is attached, the account or role is added to the MaxCompute production project with the Role_Project_Scheduler role.
All production data in the current workspace is assigned to the default access identity of the production environment that you specify when you create a computing resource. If other accounts need to operate on or access production tables, they must request the required permissions in the Security Center.
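The odps:ListProjects requirement for a RAM user or RAM role can be expressed as a custom RAM policy. The following is a minimal sketch that builds such a policy document in Python and prints it as JSON. The helper name `build_list_projects_policy` and the `Resource` scope of `*` are assumptions made for illustration; narrow the resource scope to fit your own security requirements. The Super_Administrator permission is granted inside the MaxCompute project and is not covered by this policy.

```python
import json


def build_list_projects_policy() -> dict:
    """Return a RAM policy document that allows odps:ListProjects.

    Assumption: Resource "*" is used for brevity; narrow it if needed.
    """
    return {
        "Version": "1",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["odps:ListProjects"],
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy so it can be pasted into a custom RAM policy editor.
    print(json.dumps(build_list_projects_policy(), indent=2))
```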
Prerequisites
You have activated the MaxCompute product in the same region as DataWorks and created a MaxCompute project.
You have created a workspace in DataWorks, and your RAM user has been added to the workspace and assigned the workspace administrator role.
Note: DataWorks provides two types of workspaces: basic mode and standard mode. When you create a workspace, review the differences between basic mode and standard mode.
You have attached a resource group to the workspace and ensured network connectivity.
If you use a Serverless resource group, you only need to ensure that the network connectivity between the MaxCompute computing resource and the Serverless resource group is working correctly.
If you use an older version of exclusive resource groups, ensure that the MaxCompute computing resource can connect to the exclusive resource group for scheduling for the relevant scenario.
DataStudio: Add a MaxCompute computing resource
Add a MaxCompute computing resource to a workspace that uses DataStudio.
Go to the computing resources page
Log on to the DataWorks console. Switch to the target region. In the navigation pane on the left, click . In the drop-down list, select the target workspace and click Go To Management Center.
In the navigation pane on the left, click Computing Resources to go to the computing resources list page.
Add the MaxCompute computing resource
On the computing resource list page, configure and attach MaxCompute computing resources.
Select the computing resource type.
Click Add Computing Resource to go to the Add Computing Resource page.
On the Add Computing Resource page, select MaxCompute to go to the Add MaxCompute Computing Resource configuration page.
Configure the MaxCompute computing resource.
On the Add MaxCompute Computing Resource configuration page, configure the parameters as described in the following table.
Parameter
Description
MaxCompute Project
Select the MaxCompute project to attach. You can create an internal MaxCompute project or create an external MaxCompute project. After the project is created, select the newly created project.
Note: For a standard mode workspace, you must select and attach different MaxCompute projects for the production and development environments.
For billing details, see MaxCompute billable items and billing methods.
If you cannot select the target MaxCompute project, grant the Super_Administrator permission to the current logon account.
Default Access Identity
Defines the identity used to access the MaxCompute project in the current workspace.
Development environment: Only the Executor identity is supported.
Production environment: The Alibaba Cloud Account, Alibaba Cloud RAM User, and Alibaba Cloud RAM Role identities are supported.
Note: Only Alibaba Cloud accounts and users or roles with the AdministratorAccess permission can select all access identities.
All production data in the current workspace is owned by the default access identity of the production environment, which is specified when the computing resource is created. If other accounts need to operate on and access production tables, they must request the relevant permissions in the Security Center. For more information, see MaxCompute data access permission control and Approval center overview.
Endpoint
Specifies the endpoint that DataWorks uses to access the MaxCompute project through this computing resource. This includes the endpoint for accessing the MaxCompute service and the Tunnel service endpoint for uploading and downloading local or cloud computing resource data. The following configuration methods are supported:
Automatic Adaptation: DataWorks automatically adapts the endpoint based on the actual situation. We recommend that you select this option.
Custom Configuration: You must manually configure the MaxCompute Endpoint and Tunnel Endpoint. The Endpoint varies by region.
Computing Resource Instance Name
Used to identify the computing resource. When a task runs, the computing resource instance name is used to select the computing resource for the task.
Test connectivity.
In the Connection Configuration section, select the resource group that DataWorks uses to run MaxCompute tasks and click Test Connectivity to verify that the resource group can access your MaxCompute project. For more information, see Network Connectivity Overview.
Note: If no active resource groups are available, you can add and attach a Serverless resource group to the workspace and then test the connectivity to the computing resource from the Computing Resource section of the workspace.
Click Complete Creation to finish configuring the MaxCompute computing resource.
Note: After the computing resource is added, the system automatically creates a MaxCompute data source with the same name in the Data Source section of the current workspace.
After the computing resource is successfully added, the platform grants the required permissions to the access identity. This means that the access identity account is added to the MaxCompute project and is mapped to the corresponding MaxCompute permissions. Before the authorization is complete, the connectivity test may fail due to a permission error. In this case, save the computing resource and wait for a few moments.
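The constraints stated in the limits and in the parameter table above can be collected into a quick pre-check before you open the console. The following is a sketch only: `check_plan` and its parameter names are hypothetical and are not part of any DataWorks API. It simply encodes the rules described in this topic (same region, same account, different projects per environment in standard mode, and the supported access identities).

```python
from typing import List, Optional

# Identity names taken from the Default Access Identity parameter above.
DEV_IDENTITIES = {"Executor"}
PROD_IDENTITIES = {
    "Alibaba Cloud Account",
    "Alibaba Cloud RAM User",
    "Alibaba Cloud RAM Role",
}


def check_plan(
    *,
    workspace_region: str,
    project_region: str,
    same_account: bool,
    standard_mode: bool,
    prod_project: str,
    dev_project: Optional[str],
    prod_identity: str,
    dev_identity: str = "Executor",
) -> List[str]:
    """Return a list of problems; an empty list means the plan looks consistent."""
    problems: List[str] = []
    if project_region != workspace_region:
        problems.append("project and workspace must be in the same region")
    if not same_account:
        problems.append("project and workspace must belong to the same account")
    if standard_mode:
        # Standard mode needs separate development and production projects.
        if not dev_project or dev_project == prod_project:
            problems.append("standard mode needs different dev and prod projects")
        if dev_identity not in DEV_IDENTITIES:
            problems.append("development supports only the Executor identity")
    if prod_identity not in PROD_IDENTITIES:
        problems.append("unsupported production access identity")
    return problems
```

Calling `check_plan(...)` with the values you intend to enter returns an empty list when none of these rules is violated. It does not check permissions or network connectivity, which are still verified by the console.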
Legacy Data Development: Add a MaxCompute computing resource
You can attach a MaxCompute computing resource to a workspace that is not set to Use The New Version Of Data Development (DataStudio).
Go to the computing resources page
Go to the DataStudio page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose . On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.
In the navigation pane on the left, click the icon to go to the Computing Resources list page.
Add the MaxCompute computing resource
On the computing resources list page, configure the parameters to add a MaxCompute computing resource.
Select the computing resource type.
Click Create Computing Resource to go to the Create Computing Resource page.
On the Create Computing Resource page, select MaxCompute to go to the Create Computing Resource configuration page.
Configure the MaxCompute computing resource.
On the Create Computing Resource configuration page, configure the parameters as described in the following table.
Parameter
Description
Authentication Method
New computing resources can be authenticated only using an Alibaba Cloud account or a RAM role.
Alibaba Cloud Account
Only a MaxCompute project that belongs to the current Alibaba Cloud account can be added as a computing resource for the current workspace.
MaxCompute Project Name
Select the MaxCompute project to attach. If a target project is not available, create a MaxCompute project.
Note: For a standard mode workspace, you must select and attach different MaxCompute projects for the production and development environments.
For billing details, see MaxCompute billable items and billing methods.
If you cannot select the target MaxCompute project, grant the Super_Administrator permission to the current logon account.
Region
Select the region where the MaxCompute project is located. If the selected MaxCompute project is not in the same region as the current workspace, you cannot add the project as a computing resource.
Default Access Identity
Defines the identity used to access the computing resource in the current workspace.
Development environment: Only the Executor identity is supported.
Production environment: The Alibaba Cloud account, Alibaba Cloud RAM user, and Alibaba Cloud RAM role identities are supported.
Note: Only Alibaba Cloud accounts and users or roles with the AdministratorAccess permission can select all access identities.
All production data in the current workspace is owned by the default access identity of the production environment, which is specified when the computing resource is created. If other accounts need to operate on and access production tables, they must request the relevant permissions in the Security Center. For more information, see MaxCompute data access permission control and Approval center overview.
Endpoint
Specifies the endpoint that DataWorks uses to access the MaxCompute project through this computing resource. This includes the endpoint for accessing the MaxCompute service and the Tunnel service endpoint for uploading and downloading local or cloud computing resource data. The following configuration methods are supported:
Automatic Adaptation: DataWorks automatically adapts the endpoint based on the actual situation. We recommend that you select this option.
Custom Configuration: You need to manually configure the MaxCompute Endpoint and the Tunnel Endpoint. The Endpoint is different for each region.
Test connectivity.
In the Connection Configuration section, select the resource group that DataWorks uses to run MaxCompute tasks and click Test Connectivity to verify that the resource group can access your MaxCompute project. For more information, see Network Connectivity Overview.
Note: If no active resource groups are available, you can add and attach a Serverless resource group to the workspace and then test the connectivity to the computing resource from the Computing Resource section of the workspace.
Click Create Computing Resource And Add To Data Development to finish configuring the MaxCompute computing resource.
Note: After the computing resource is added, the system automatically creates a MaxCompute data source with the same name in the Data Source section of the current workspace.
After the computing resource is successfully added, the platform grants the required permissions to the access identity. This means that the access identity account is added to the MaxCompute project and is mapped to the corresponding MaxCompute permissions. Before the authorization is complete, the connectivity test may fail due to a permission error. In this case, save the computing resource and wait for a few moments.
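If you choose Custom Configuration for the Endpoint parameter, you need both a MaxCompute Endpoint and a Tunnel Endpoint for your region. The following sketch shows one way to assemble candidate values from a region ID. The host name pattern and the helper name `maxcompute_endpoints` are assumptions for illustration and are not taken from this topic; always confirm the exact values against the official MaxCompute endpoint list before you enter them.

```python
def maxcompute_endpoints(region_id: str, use_vpc: bool = False) -> dict:
    """Build candidate MaxCompute and Tunnel endpoints for a region.

    Assumption: the host name pattern below follows the published endpoint
    list; verify the result for your region before using it.
    """
    if not region_id or " " in region_id:
        raise ValueError(f"invalid region ID: {region_id!r}")
    # Assumed naming: VPC endpoints use the aliyun-inc.com domain.
    domain = "maxcompute.aliyun-inc.com" if use_vpc else "maxcompute.aliyun.com"
    return {
        "maxcompute_endpoint": f"http://service.{region_id}.{domain}/api",
        "tunnel_endpoint": f"http://dt.{region_id}.{domain}",
    }


if __name__ == "__main__":
    for name, url in maxcompute_endpoints("cn-hangzhou", use_vpc=True).items():
        print(f"{name}: {url}")
```

Automatic Adaptation remains the recommended option; a helper like this is useful only when you have a specific reason to pin the endpoints yourself.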
What to do next
After you add a MaxCompute computing resource, a MaxCompute data source is automatically created for the workspace. You can use this data source in Data Integration, and in Database Node (New Data Development) or Database Node (Legacy Data Development).
FAQ
Problem: When a MaxCompute computing resource is scheduled, the following error message is reported:
connect timed out, the possible reason is that the endpoint `http://service.odps.aliyun.com/api` is wrong, please check your endpoint
Solution: Check the Endpoint configuration of the MaxCompute computing resource. Enter the VPC Endpoint for the region where the resource is located.
Problem: When you test the connectivity to the computing resource, the following error message is reported:
You have NO privilege 'odps:Read' on {acs:odps:*:projects/xxx}
Solution: Check whether the status of your MaxCompute project is Normal. If the project is in the Suspended (frozen) state, use the MaxCompute console to Recover the project.
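When you troubleshoot the connect timed out error, it can help to confirm whether an endpoint accepts TCP connections at all. The following is a minimal sketch that uses only the Python standard library; the helper names `endpoint_host_port` and `can_connect` are hypothetical. The result reflects only the network of the machine where the script runs, so it is not a substitute for Test Connectivity on the resource group.

```python
import socket
from urllib.parse import urlparse


def endpoint_host_port(endpoint: str) -> tuple:
    """Extract (host, port) from an http(s) endpoint URL."""
    parsed = urlparse(endpoint)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"not an http(s) endpoint: {endpoint!r}")
    default_port = 443 if parsed.scheme == "https" else 80
    return parsed.hostname, parsed.port or default_port


def can_connect(endpoint: str, timeout_seconds: float = 5.0) -> bool:
    """Return True if a TCP connection to the endpoint host succeeds."""
    host, port = endpoint_host_port(endpoint)
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS failures.
        return False


if __name__ == "__main__":
    endpoint = "http://service.odps.aliyun.com/api"  # endpoint from the error message
    print(endpoint, "reachable:", can_connect(endpoint))
```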