This topic describes how to grant permissions to a Resource Access Management (RAM) user so that the RAM user can submit Spark jobs in Data Lake Analytics (DLA).
Prerequisites
- A RAM user is created. For more information, see Create a RAM user.
- A DLA sub-account is created. For more information, see Manage DLA accounts.
Background information
Access permissions of the serverless Spark engine are categorized into the following types:
- Access permissions on the DLA console and API operations: control whether a RAM user can log on to the DLA console and whether the RAM user can call API operations to manage Spark jobs. For more information, see Step 1.
- Access permissions on DLA tables: control whether a RAM user can access DLA tables. By default, access permissions on DLA tables are managed by DLA sub-accounts, whereas permissions to submit Spark jobs are managed by RAM users. To access a DLA table as a RAM user, you must bind a DLA sub-account to the RAM user. For more information, see Step 2.
- Access permissions on the resources on which Spark jobs depend: control whether a RAM user can access the resources that Spark jobs require, including the JAR packages on which the jobs depend and data sources other than DLA tables, such as Object Storage Service (OSS) directories. For more information, see Step 3. A sample OSS policy sketch is provided after this list.
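The following RAM policy is a minimal sketch of the third type of permission, which is typically attached to the role that your Spark jobs assume (by default, AliyunDLASparkProcessingDataRole). It is not the exact policy that DLA requires: the bucket name examplebucket and the path spark-jars/ are placeholders, and the set of OSS actions that your jobs actually need may be broader.
{
    "Version": "1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "oss:GetObject",
                "oss:ListObjects"
            ],
            "Resource": [
                "acs:oss:*:*:examplebucket",
                "acs:oss:*:*:examplebucket/spark-jars/*"
            ]
        }
    ]
}
Replace the placeholder bucket and path with the OSS locations where your JAR packages and data are stored before you attach the policy.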
Procedure
Note Make sure that you perform all of the preceding steps. Otherwise, a permission error is returned when you submit a Spark job.
Verify the permissions of the RAM user
After you perform all the preceding steps, you can log on to the DLA console as the RAM user. In the left-side navigation pane, go to the Spark job submission page and submit a test job to check whether the permissions of the RAM user are correctly configured.
For more information, see Create and run Spark jobs and Configure a Spark job. Sample job configuration:
{
    "name": "SparkPi",
    "file": "local:///tmp/spark-examples.jar",
    "className": "org.apache.spark.examples.SparkPi",
    "args": [
        "100"
    ],
    "conf": {
        "spark.driver.resourceSpec": "medium",
        "spark.executor.instances": 1,
        "spark.executor.resourceSpec": "medium"
    }
}
Note If you do not specify spark.dla.roleArn in conf, the system automatically uses the Alibaba Cloud Resource Name (ARN) of the AliyunDLASparkProcessingDataRole role. You can also manually specify spark.dla.roleArn. For more information, see Grant permissions to a RAM user (detailed version).
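The following configuration is a sketch that shows where to place spark.dla.roleArn if you specify it manually. The account ID 123456789 and the role name are placeholders; replace them with your own Alibaba Cloud account ID and, if you use a custom role instead of AliyunDLASparkProcessingDataRole, with that role's name.
{
    "name": "SparkPi",
    "file": "local:///tmp/spark-examples.jar",
    "className": "org.apache.spark.examples.SparkPi",
    "args": [
        "100"
    ],
    "conf": {
        "spark.driver.resourceSpec": "medium",
        "spark.executor.instances": 1,
        "spark.executor.resourceSpec": "medium",
        "spark.dla.roleArn": "acs:ram::123456789:role/aliyundlasparkprocessingdatarole"
    }
}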