A workspace is the basic unit in EMR Serverless Spark for managing tasks and members, and for assigning roles and permissions. All configurations, tasks, and workflows belong to a specific workspace. A workspace administrator can add members to the workspace and assign them roles, such as workspace administrator, data analytics, data development, and guest, so that members with different roles can collaborate. This topic describes the basic operations that you can perform on a workspace.
Prerequisites
You have registered an Alibaba Cloud account and completed real-name verification.
The account that you use to create a workspace has the required permissions.
If you use an Alibaba Cloud account to create the workspace, see Assign roles to an Alibaba Cloud account for more information.
If you use a Resource Access Management (RAM) user or a RAM role to create a workspace, make sure that the AliyunEMRServerlessSparkFullAccess, AliyunOSSFullAccess, and AliyunDLFFullAccess access policies are attached to the RAM user or RAM role. Then, add the RAM user or RAM role on the Access Control page of EMR Serverless Spark and grant it the administrator role. For more information, see Grant permissions to a RAM user and Manage users and roles.
Data Lake Formation (DLF) is activated. For more information, see Quick Start. For a list of supported regions, see Regions and endpoints.
Object Storage Service (OSS) is activated and a bucket is created. For more information, see Activate OSS and Create a bucket.
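The RAM prerequisite above can be checked with a small script before you open the console. The following is a minimal sketch, not an official tool: the three policy names come from this topic, while the function name missing_policies and the sample input are hypothetical. The sketch assumes that you have already obtained the list of attached policy names by some other means, for example by copying them from the RAM console.

```python
# Policy names are taken from the prerequisites in this topic.
REQUIRED_POLICIES = frozenset({
    "AliyunEMRServerlessSparkFullAccess",
    "AliyunOSSFullAccess",
    "AliyunDLFFullAccess",
})


def missing_policies(attached):
    """Return the required policy names that are absent from `attached`, sorted."""
    return sorted(REQUIRED_POLICIES - set(attached))


if __name__ == "__main__":
    # Hypothetical input: policy names copied by hand from the RAM console.
    attached = ["AliyunOSSFullAccess", "AliyunDLFFullAccess"]
    for name in missing_policies(attached):
        print(f"Missing policy: {name}")
```

An empty result only means that the three policies are attached. The RAM user or RAM role must still be added on the Access Control page and granted the administrator role.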
Create a workspace
Go to the EMR Serverless Spark workspace page.
Log on to the E-MapReduce console.
In the navigation pane on the left, open the Spark page.
On the Spark page, click Create Workspace.
In the Create Workspace dialog box, configure the parameters.
Region: The region where the data center is located. We recommend that you select the region where your data resides. You cannot change the region after the workspace is created.
Billing Method: The Subscription and Pay-as-you-go billing methods are supported.
Workspace Name: The name must be 1 to 64 characters in length and can contain only Chinese characters, letters, digits, hyphens (-), and underscores (_).
Note: The names of workspaces within the same Alibaba Cloud account must be unique. If you enter the name of an existing workspace, the system prompts you to enter a different name.
CU Quota: The maximum number of compute units (CUs) that can be concurrently used to process jobs in the workspace.
Workspace Directory: The path of the OSS bucket that is used to store data files, such as job logs, running events, and resources. To view quasi-real-time incremental logs during O&M, we recommend that you use a bucket for which the HDFS service is enabled.
DLF for Metadata Storage: The DLF data catalog that is used to store and manage metadata. Select the ID of the data catalog that you want to associate with the workspace. To create a data catalog instead, perform the following steps:
Click Create Catalog. In the dialog box that appears, enter a Catalog ID and then click OK.
From the drop-down list, select the data catalog that you created.
Note: After you create a workspace, you can add an existing DLF data catalog to the workspace. For more information, see Data Catalog.
Execution Role: The name of the role that EMR Serverless Spark uses to run jobs. The default role is AliyunEMRSparkJobRunDefaultRole. EMR Serverless Spark uses this role to access your resources in other cloud products, such as OSS and DLF. If you want to control the permissions of the execution role, you can use a custom execution role. For more information, see Execution role.
Advanced Settings
Tags: Tags identify cloud resources. You can use tags to classify, search for, and aggregate resources that share the same characteristics, and for cost allocation and fine-grained management of pay-as-you-go resources. Each tag consists of a custom tag key and tag value. You can attach a maximum of 20 tags to each workspace.
You can attach tags when you create a workspace, or add or modify tags on the workspace list page at any time after the workspace is created.
Click Create Workspace.
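The naming and tagging limits described for the Create Workspace dialog box can be checked locally before you submit the form. The following is a minimal sketch under stated assumptions: "Chinese characters" is approximated by the CJK Unified Ideographs range U+4E00 to U+9FFF, which may be narrower or wider than what the console accepts, and the function name validate_workspace_request and its parameters are hypothetical. The console remains the authority on what is valid.

```python
import re

# 1 to 64 characters: Chinese characters, letters, digits, hyphens, underscores.
# Assumption: "Chinese characters" is approximated by U+4E00 to U+9FFF.
NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,64}")

# This topic states a maximum of 20 tags per workspace.
MAX_TAGS = 20


def validate_workspace_request(name, tags, existing_names=()):
    """Return a list of human-readable problems; an empty list means no problem was found."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("name must be 1 to 64 allowed characters")
    if name in existing_names:
        problems.append("name is already used in this Alibaba Cloud account")
    if len(tags) > MAX_TAGS:
        problems.append(f"at most {MAX_TAGS} tags are allowed, got {len(tags)}")
    return problems


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(validate_workspace_request("spark-dev_01", {"team": "data"}, ["spark-prod"]))
    print(validate_workspace_request("bad name!", {}))
```

The function collects every problem instead of stopping at the first one, so a single run reports everything that needs to be corrected in the dialog box.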
Delete a workspace
Before you delete a workspace, make sure that no jobs are running in it. If a job is running in the workspace, the system reports an error and prompts you to stop the job before you can delete the workspace.
After a workspace is deleted, its resources, including jobs and data, are released and cannot be restored. Therefore, you must back up the job scripts before you delete the workspace to prevent data loss.
Data associated with the workspace that is stored in OSS or HDFS, such as logs, is not deleted when the workspace is deleted.
The procedure to delete a workspace varies based on the billing method:
Subscription: You must first unsubscribe from the subscription quota order. After you unsubscribe, the corresponding workspace is deleted. For more information, see Unsubscription policy.
Pay-as-you-go:
On the Spark page, find the desired workspace and click Delete in the Actions column.
In the dialog box that appears, enter the name of the workspace to confirm the operation, and then click OK.
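The deletion rules in this section reduce to a short decision: running jobs block deletion, and the remaining step depends on the billing method. The following is a minimal sketch of that decision only. It does not call any Alibaba Cloud API, and the function name next_deletion_step and its parameters are hypothetical.

```python
def next_deletion_step(billing_method, running_jobs):
    """Return a short description of the next step for deleting a workspace."""
    if running_jobs < 0:
        raise ValueError("running_jobs must not be negative")
    if running_jobs > 0:
        # The system reports an error if jobs are still running.
        return "Stop the running jobs first"
    if billing_method == "Subscription":
        return "Unsubscribe from the subscription quota order"
    if billing_method == "Pay-as-you-go":
        return "Click Delete in the Actions column and confirm with the workspace name"
    raise ValueError(f"unknown billing method: {billing_method!r}")


if __name__ == "__main__":
    print(next_deletion_step("Pay-as-you-go", running_jobs=2))
    print(next_deletion_step("Subscription", running_jobs=0))
```

Back up the job scripts before you act on the result, because the jobs in a deleted workspace cannot be restored.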
Adjust subscription quota
If you created a subscription workspace and want to adjust its compute unit (CU) quota to meet your business needs, you can upgrade or downgrade the quota.
On the Spark page, find the target workspace and choose Upgrade Subscription Quota or Downgrade Subscription Quota in the Actions column.
On the EMR Serverless Spark Reserved Resources | Upgrade/Downgrade page, adjust the CU Quota parameter. The system automatically calculates and displays the price difference.
After you confirm the changes, click Buy Now.
Modify pay-as-you-go quota
On the Spark page, find the target workspace and choose Modify Pay-as-you-go Quota in the Actions column.
In the dialog box that appears, adjust the Maximum Pay-as-you-go Quota (CUs) parameter and click OK.
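Which entry to choose in the Actions column depends on the billing method and on whether the quota goes up or down. The following is a minimal sketch of that choice. The returned strings are the menu names used in this topic, while the function name quota_action and its parameters are hypothetical. The sketch does not compute any price, because the price difference is calculated by the system.

```python
def quota_action(billing_method, current_cus, target_cus):
    """Return the Actions-column entry to use, or None if the quota is unchanged."""
    if target_cus <= 0:
        raise ValueError("target_cus must be a positive number of CUs")
    if target_cus == current_cus:
        return None  # Nothing to change.
    if billing_method == "Pay-as-you-go":
        return "Modify Pay-as-you-go Quota"
    if billing_method == "Subscription":
        if target_cus > current_cus:
            return "Upgrade Subscription Quota"
        return "Downgrade Subscription Quota"
    raise ValueError(f"unknown billing method: {billing_method!r}")


if __name__ == "__main__":
    # Hypothetical quotas for illustration only.
    print(quota_action("Subscription", current_cus=100, target_cus=200))
    print(quota_action("Pay-as-you-go", current_cus=100, target_cus=50))
```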
References
To add RAM users to a workspace for collaborative development, you can use the Resource Access Management feature to import users and assign them roles and permissions. For more information, see Manage users and roles.
If you require resource isolation and management, you can add queues. For more information, see Manage resource queues.