To run AnalyticDB for Spark tasks in DataWorks, bind an AnalyticDB for MySQL cluster as an AnalyticDB for Spark computing resource. Once bound, the computing resource is available for task development in Data Studio.
Prerequisites
Before you begin, make sure you have:
- An AnalyticDB for MySQL cluster with an interactive resource group of the Spark engine type. The cluster must be in the same region as your DataWorks workspace. If they are in different regions, the binding will fail.
- A DataWorks workspace set to use Data Studio (New Version). Your RAM user must be a workspace member with the Workspace Administrator role.
- A resource group associated with the workspace, in the same virtual private cloud (VPC) as the AnalyticDB for MySQL cluster. The resource group's IP address must be added to the cluster's whitelist.
  - Serverless resource group: Verify that the AnalyticDB for Spark computing resource can connect to the Serverless resource group.
  - Exclusive resource group: Verify that the computing resource can connect to the exclusive resource group for scheduling.
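The whitelist and connectivity prerequisites can be sanity-checked before you open the console. The following is a minimal sketch using only the Python standard library, not a DataWorks or AnalyticDB API. The function names, the sample addresses, and the default port 3306 (the usual MySQL-protocol port) are assumptions for illustration. The reachability check is only meaningful when run from a host inside the same VPC as the cluster, and it does not replace the Test Network Connectivity button in the console.

```python
import ipaddress
import socket


def is_whitelisted(ip: str, whitelist: list[str]) -> bool:
    """Return True if `ip` is covered by any whitelist entry.

    Entries may be single addresses ("192.168.1.10") or CIDR blocks
    ("10.0.0.0/24"). The entry format is an assumption for this sketch.
    """
    addr = ipaddress.ip_address(ip)
    # strict=False also accepts entries written with host bits set,
    # such as "10.0.0.5/24".
    return any(
        addr in ipaddress.ip_network(entry, strict=False) for entry in whitelist
    )


def can_reach(host: str, port: int = 3306, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    The default port is an assumption; use the port shown for your cluster endpoint.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `is_whitelisted("10.0.0.17", ["10.0.0.0/24"])` returns `True`, while an address outside every listed entry returns `False` and indicates that the whitelist needs updating before binding.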
Limitations
Supported regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Hong Kong), Japan (Tokyo), Singapore, Malaysia (Kuala Lumpur), and Indonesia (Jakarta).
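The region rules (a supported region, and the same region for the cluster and the workspace) can be expressed as a small pre-check. This is a sketch under assumptions: the region IDs below are the commonly used Alibaba Cloud identifiers for the listed display names and should be verified against the official region list, and `check_regions` is a hypothetical helper, not part of any SDK.

```python
# Assumed mapping from region ID to the display names listed above.
SUPPORTED_REGIONS = {
    "cn-hangzhou": "China (Hangzhou)",
    "cn-shanghai": "China (Shanghai)",
    "cn-beijing": "China (Beijing)",
    "cn-shenzhen": "China (Shenzhen)",
    "cn-hongkong": "China (Hong Kong)",
    "ap-northeast-1": "Japan (Tokyo)",
    "ap-southeast-1": "Singapore",
    "ap-southeast-3": "Malaysia (Kuala Lumpur)",
    "ap-southeast-5": "Indonesia (Jakarta)",
}


def check_regions(workspace_region: str, cluster_region: str) -> list[str]:
    """Return a list of problems; an empty list means the region rules are met."""
    problems = []
    if workspace_region not in SUPPORTED_REGIONS:
        problems.append(f"{workspace_region} is not a supported region")
    if cluster_region != workspace_region:
        problems.append(
            f"cluster region {cluster_region} differs from "
            f"workspace region {workspace_region}"
        )
    return problems
```

Returning a list of problems rather than a single boolean lets a script report every violated rule at once.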
Required permissions:
| User | Required permissions |
|---|---|
| Alibaba Cloud account | No additional authorization required |
| RAM user/RAM role | DataWorks: The O&M or Workspace Administrator role, or the AliyunDataWorksFullAccess permission. See Grant workspace administrator permissions. AnalyticDB for MySQL: The AliyunADBFullAccess policy, required to create a database in the cluster during this process. |
Bind an AnalyticDB for Spark computing resource
- Log on to the DataWorks console. In the top navigation bar, switch to the target region. In the left navigation pane, choose More > Management Center, then select your workspace and click Go To Management Center.
- In the left navigation pane, click Computing Resources.
- On the Computing Resources page, click Associate Computing Resource.
- On the Associate Computing Resource page, set the computing resource type to AnalyticDB for Spark. The Associate AnalyticDB For Spark Computing Resource configuration page opens.
- Configure the computing resource parameters.

  | Parameter | Description | Example |
  |---|---|---|
  | Configuration Mode | Only Alibaba Cloud Instance Pattern is supported. | Alibaba Cloud Instance Pattern |
  | Alibaba Cloud Account | Only Current Alibaba Cloud Account is supported. | Current Alibaba Cloud Account |
  | Instance | Select the AnalyticDB for MySQL cluster to bind. To create a new cluster, click New in the drop-down menu. The cluster must have an interactive resource group with the engine type set to Spark. | my-adb-cluster |
  | Database Name | The name of the database in the AnalyticDB for MySQL cluster. | spark_db |
  | Computing Resource Instance Name | A custom name for this computing resource. At runtime, you can select the computing resource for a task based on this name. | adb-spark-prod |

- In the Connection Configuration section, select the resource group that DataWorks uses to run AnalyticDB for Spark tasks, then click Test Network Connectivity to verify that the resource group can reach the cluster. For troubleshooting, see Network connectivity solutions.
- Click OK.
The computing resource is now available. DataWorks also automatically syncs a new AnalyticDB for Spark data source with the same name to the Data Sources section of your workspace.
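If you record binding settings alongside other workspace configuration, the parameters from the procedure can be captured in a small structure and checked against the constraints described above. The sketch below is illustrative only: `SparkBinding` and its field names are hypothetical and do not correspond to a DataWorks API, and the sample values are the examples from the parameter table.

```python
from dataclasses import dataclass

# The only values the procedure describes as supported.
SUPPORTED_MODE = "Alibaba Cloud Instance Pattern"
SUPPORTED_ACCOUNT = "Current Alibaba Cloud Account"


@dataclass(frozen=True)
class SparkBinding:
    """Hypothetical record of the values entered on the binding page."""

    instance: str        # AnalyticDB for MySQL cluster to bind
    database_name: str   # database in the cluster
    resource_name: str   # Computing Resource Instance Name
    configuration_mode: str = SUPPORTED_MODE
    account: str = SUPPORTED_ACCOUNT

    def problems(self) -> list[str]:
        """Return a list of constraint violations; empty means the record is consistent."""
        issues = []
        if self.configuration_mode != SUPPORTED_MODE:
            issues.append(f"Configuration Mode must be {SUPPORTED_MODE!r}")
        if self.account != SUPPORTED_ACCOUNT:
            issues.append(f"Alibaba Cloud Account must be {SUPPORTED_ACCOUNT!r}")
        required = (
            ("Instance", self.instance),
            ("Database Name", self.database_name),
            ("Computing Resource Instance Name", self.resource_name),
        )
        for label, value in required:
            if not value.strip():
                issues.append(f"{label} must not be empty")
        return issues


binding = SparkBinding(
    instance="my-adb-cluster",
    database_name="spark_db",
    resource_name="adb-spark-prod",
)
print(binding.problems())
```

The check covers only what the parameter table states. It cannot confirm that the cluster exists or that it has an interactive resource group of the Spark engine type; the console validates those when you bind.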
What's next
In Data Studio, use the computing resource you just bound to develop ADB Spark node tasks and ADB Spark SQL node tasks.