A workspace is the basic unit of management in EMR Serverless Spark and is used to manage jobs, members, roles, and permissions. All job development is performed in a workspace, so you must create a workspace before you can start developing jobs. This topic describes how to quickly create a workspace on the EMR Serverless Spark page.
Prerequisites
You have registered an Alibaba Cloud account and completed real-name verification.
The account that you use to create a workspace has the required permissions.
If you use an Alibaba Cloud account to create the workspace, see Assign roles to an Alibaba Cloud account for more information.
If you use a Resource Access Management (RAM) user or a RAM role to create a workspace, make sure that the AliyunEMRServerlessSparkFullAccess, AliyunOSSFullAccess, and AliyunDLFFullAccess policies are attached to the RAM user or RAM role. Then, add the RAM user or RAM role on the Access Control page of EMR Serverless Spark and grant it the administrator role. For more information, see Grant permissions to a RAM user and Manage users and roles. A scripted example of attaching the policies is provided after this list.
Data Lake Formation (DLF) is activated. For more information, see Quick Start. For a list of supported regions, see Regions and endpoints.
Object Storage Service (OSS) is activated and a bucket is created. For more information, see Activate OSS and Create a bucket. A scripted example of creating a bucket is also provided after this list.
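If you prefer to attach the three policies by using a script instead of the RAM console, the following minimal sketch shows one way to do so. It assumes the aliyun-python-sdk-ram package, an AccessKey pair with RAM administration permissions, and a hypothetical RAM user named spark-developer. Adding the user on the Access Control page of EMR Serverless Spark and granting the administrator role are still performed in the console.

```python
# Hedged sketch: attach the required system policies to a RAM user.
# The AccessKey pair, region ID, and RAM user name are placeholders.
from aliyunsdkcore.client import AcsClient
from aliyunsdkram.request.v20150501.AttachPolicyToUserRequest import AttachPolicyToUserRequest

client = AcsClient("<access_key_id>", "<access_key_secret>", "cn-hangzhou")

# The three system policies named in the prerequisites.
policies = [
    "AliyunEMRServerlessSparkFullAccess",
    "AliyunOSSFullAccess",
    "AliyunDLFFullAccess",
]

for policy_name in policies:
    request = AttachPolicyToUserRequest()
    request.set_PolicyType("System")          # all three are system policies
    request.set_PolicyName(policy_name)
    request.set_UserName("spark-developer")   # placeholder RAM user name
    client.do_action_with_exception(request)
```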
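The following minimal sketch creates a standard OSS bucket with the oss2 SDK. The AccessKey pair, endpoint, and bucket name are placeholders. Enabling OSS-HDFS for a bucket is a separate step that is not covered by this sketch.

```python
# Hedged sketch: create a standard OSS bucket with the oss2 SDK.
# The AccessKey pair, endpoint, and bucket name are placeholders.
import oss2

auth = oss2.Auth("<access_key_id>", "<access_key_secret>")
endpoint = "https://oss-cn-hangzhou.aliyuncs.com"  # endpoint of the region that you plan to use

bucket = oss2.Bucket(auth, endpoint, "emr-serverless-spark-demo")  # placeholder bucket name
bucket.create_bucket(oss2.BUCKET_ACL_PRIVATE)  # create the bucket with a private ACL
```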
Precautions
The runtime environment of the code is managed and configured by the owner of the environment.
Procedure
Go to the EMR Serverless Spark page.
Log on to the EMR console.
In the left-side navigation pane, choose EMR Serverless > Spark.
In the top navigation bar, select the required region.
Important: You cannot change the region of a workspace after it is created.
Click Create Workspace.
On the EMR Serverless Spark page, configure the parameters.
The following list describes the parameters that you must configure. An example value is provided for each parameter.
Region
We recommend that you select the region where your data is stored.
Example: China (Hangzhou)
Billing Method
The Subscription and Pay-as-you-go billing methods are supported.
Example: Pay-as-you-go
Workspace Name
The name must be 1 to 64 characters in length and can contain only Chinese characters, letters, digits, hyphens (-), and underscores (_).
Note: The names of workspaces within the same Alibaba Cloud account must be unique. If you enter the name of an existing workspace, the system prompts you to enter a different name.
Example: emr-serverless-spark
Maximum Quota
The maximum number of compute units (CUs) that can be concurrently used to process jobs in the workspace.
Example: 1000
Workspace Directory
The path that is used to store data files, such as task logs, running events, and resources.
We recommend that you select a bucket for which OSS-HDFS is enabled. This provides compatibility with native Hadoop Distributed File System (HDFS) interfaces. If your application scenario does not involve HDFS, you can select a standard OSS bucket. A sketch that checks whether the planned directory is accessible is provided after this procedure.
Example: emr-oss-hdfs
DLF for Metadata Storage
Used to store and manage your metadata.
After you activate DLF, the system selects a default data catalog for you. The default data catalog is named after your UID. If you want to use different data catalogs for different clusters, you can create a data catalog.
Click Create Catalog. In the dialog box that appears, enter a Catalog ID and click OK.
From the drop-down list, select the data catalog that you created.
Example: emr-dlf
Execution Role
The name of the role that EMR Serverless Spark uses to run jobs. The default role is AliyunEMRSparkJobRunDefaultRole.
EMR Serverless Spark uses this role to access your resources in other cloud services, such as OSS and DLF. If you want to control the permissions of the execution role, you can use a custom execution role. For more information, see Execution role. A sketch that creates a custom role by using a script is provided after this procedure.
Example: AliyunEMRSparkJobRunDefaultRole
(Optional) Advanced Settings
Tags: Tags are used to identify cloud resources. You can use tags to classify, search for, and aggregate cloud resources that have the same characteristics from different dimensions, which improves the efficiency of resource management. You can also use tags for cost allocation and fine-grained management of pay-as-you-go resources. Each tag consists of a custom tag key and a tag value, and you can attach a maximum of 20 tags to each workspace.
You can attach tags when you create a workspace, or add and modify tags on the workspace list page at any time after the workspace is created.
For more information about tags, see What is a tag?.
Example: Enter a custom tag key and tag value.
Click Create Workspace.
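Before you complete the creation, you can optionally confirm that the bucket that you plan to use as the workspace directory exists and is writable. The following sketch uses the oss2 SDK and assumes the emr-oss-hdfs bucket name from the example above; the AccessKey pair and endpoint are placeholders.

```python
# Hedged sketch: check that the planned workspace directory bucket exists and is writable.
import oss2

auth = oss2.Auth("<access_key_id>", "<access_key_secret>")
endpoint = "https://oss-cn-hangzhou.aliyuncs.com"
bucket = oss2.Bucket(auth, endpoint, "emr-oss-hdfs")  # bucket from the Workspace Directory example

print(bucket.get_bucket_location().location)       # raises an error if the bucket does not exist
bucket.put_object("workspace-dir-check.tmp", b"")   # raises an error if you cannot write
bucket.delete_object("workspace-dir-check.tmp")     # clean up the temporary object
```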
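If you decide to use a custom execution role instead of AliyunEMRSparkJobRunDefaultRole, the role can also be created by using a script. The following sketch assumes the aliyun-python-sdk-ram package; the role name is hypothetical, and the service principal in the trust policy is left as a placeholder because the exact trust relationship and the permissions to grant are described in the Execution role topic.

```python
# Hedged sketch: create a custom execution role. The role name and the service
# principal in the trust policy are placeholders; see the Execution role topic
# for the exact trust relationship and the policies to attach afterwards.
import json

from aliyunsdkcore.client import AcsClient
from aliyunsdkram.request.v20150501.CreateRoleRequest import CreateRoleRequest

client = AcsClient("<access_key_id>", "<access_key_secret>", "cn-hangzhou")

assume_role_policy = {
    "Version": "1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            # Placeholder: replace with the service principal documented in the Execution role topic.
            "Principal": {"Service": ["<serverless-spark-service-principal>"]},
        }
    ],
}

request = CreateRoleRequest()
request.set_RoleName("MyCustomSparkJobRunRole")  # hypothetical role name
request.set_AssumeRolePolicyDocument(json.dumps(assume_role_policy))
client.do_action_with_exception(request)
```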
References
After you create a workspace, you can start developing jobs, such as SparkSQL jobs. For more information, see Quick start for SparkSQL development.
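For orientation, the following minimal PySpark sketch shows the kind of first job that you might develop after the workspace is created. The application name, view name, and sample data are placeholders; the SparkSQL quick start linked above remains the authoritative walkthrough.

```python
# Hedged sketch: a minimal first job that registers a temporary view and queries it with Spark SQL.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("first-serverless-spark-job").getOrCreate()

# Placeholder sample data.
df = spark.createDataFrame(
    [("2024-01-01", 3), ("2024-01-02", 5)],
    ["dt", "pv"],
)
df.createOrReplaceTempView("daily_pv")

# Query the temporary view with Spark SQL.
spark.sql("SELECT dt, pv FROM daily_pv WHERE pv > 3").show()

spark.stop()
```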