This topic describes how to grant permissions to a Resource Access Management (RAM) user so that the RAM user can submit Spark jobs in Data Lake Analytics (DLA).

Prerequisites

Background information

Access permissions of the serverless Spark engine are categorized into the following types:
  • Access permissions on the DLA console and API operations: control whether a RAM user can log on to the DLA console and whether the RAM user can call API operations to manage Spark jobs. For more information, see Step 1.
  • Access permissions on DLA tables: control whether a RAM user can access DLA tables. By default, access to DLA tables is controlled through DLA sub-accounts, whereas permission to submit Spark jobs is controlled through RAM users. To access a table as a RAM user, you must bind a DLA sub-account to the RAM user. For more information, see Step 2.
  • Access permissions on resources on which Spark jobs depend: control whether a RAM user can access resources, including the JAR packages on which Spark jobs depend and data sources except for DLA tables, such as Object Storage Service (OSS) directories. For more information, see Step 3.

Procedure

  1. Log on to the RAM console and grant the RAM user permissions to access DLA. For more information, see Grant permissions to a RAM user.
    RAM provides three system policies for DLA. In the Add Permissions panel of the RAM console, select Alibaba Cloud Account for Authorized Scope, click System Policy in the Select Policy section, and then enter DLA in the search box to find the DLA-related policies.
    The following table describes the DLA-related policies.
    | Policy | Description |
    | --- | --- |
    | AliyunDLAFullAccess | Provides administrator-level permissions on DLA. After you attach this policy to a RAM user, the RAM user has all permissions on DLA, such as creating or deleting clusters and submitting jobs. In addition, the RAM user can use the permissions of the role that is assigned to a DLA sub-account. |
    | AliyunDLAReadOnlyAccess | Provides visitor-level permissions on DLA. After you attach this policy to a RAM user, the RAM user has read-only permissions on DLA, such as viewing the status of clusters and jobs. The RAM user cannot change the status of clusters or submit jobs. |
    | AliyunDLADeveloperAccess | Provides developer-level permissions on DLA. After you attach this policy to a RAM user, the RAM user can view the status of clusters and jobs, and submit and run jobs. In addition, the RAM user can use the permissions of the role that is assigned to a DLA sub-account. The RAM user cannot create or delete clusters. |
  2. Bind a DLA sub-account to the RAM user. For more information, see Bind a RAM user with a DLA sub-account.
  3. On the Cloud Resource Access Authorization page of the RAM console, grant the RAM user permissions to access DLA resources.
    After you perform this step, the system automatically creates the AliyunDLASparkProcessingDataRole role that allows the RAM user to read data from and write data to all OSS buckets within your Alibaba Cloud account.
Note: Make sure that you perform all the preceding steps. Otherwise, a permission error is returned when you submit a job.

Verify the permissions of the RAM user

After you perform all the preceding steps, log on to the DLA console as the RAM user. In the left-side navigation pane, choose Serverless Spark > Submit job, and then submit a job to verify that the permissions of the RAM user are correctly configured. For more information, see Create and run Spark jobs and Configure a Spark job. Sample job configuration:
{
    "name": "SparkPi",
    "file": "local:///tmp/spark-examples.jar",
    "className": "org.apache.spark.examples.SparkPi",
    "args": [
        "100"
    ],
    "conf": {
        "spark.driver.resourceSpec": "medium",
        "spark.executor.instances": 1,
        "spark.executor.resourceSpec": "medium"
    }
}
Note: If you do not specify spark.dla.roleArn in conf, the system automatically uses the Alibaba Cloud Resource Name (ARN) of the AliyunDLASparkProcessingDataRole role. You can also specify spark.dla.roleArn manually. For more information, see Grant permissions to a RAM user (detailed version).
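
For illustration, the following sketch shows the same SparkPi job with spark.dla.roleArn set explicitly in conf. It assumes the standard RAM role ARN format, acs:ram::<account-id>:role/<role-name>. The values <your-account-id> and <your-role-name> are placeholders, not real identifiers. Copy the exact ARN from the details page of the role in the RAM console rather than composing it by hand.

```json
{
    "name": "SparkPi",
    "file": "local:///tmp/spark-examples.jar",
    "className": "org.apache.spark.examples.SparkPi",
    "args": [
        "100"
    ],
    "conf": {
        "spark.driver.resourceSpec": "medium",
        "spark.executor.instances": 1,
        "spark.executor.resourceSpec": "medium",
        "spark.dla.roleArn": "acs:ram::<your-account-id>:role/<your-role-name>"
    }
}
```

Specifying the role explicitly is mainly useful when the job should run with a role other than AliyunDLASparkProcessingDataRole, for example one that is scoped to fewer OSS buckets. If the default role is sufficient, you can omit the parameter.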