
E-MapReduce: Create a workspace

Last Updated: Dec 05, 2025

A workspace is the basic unit in EMR Serverless Spark and is used to manage jobs, members, roles, and permissions. All job development takes place in a workspace, so you must create one before you can start developing jobs. This topic describes how to quickly create a workspace on the EMR Serverless Spark page.

Prerequisites

  • You have registered an Alibaba Cloud account and completed real-name verification.

  • The account that you use to create a workspace has the required permissions.

    • If you use an Alibaba Cloud account to create the workspace, see Assign roles to an Alibaba Cloud account for more information.

    • If you use a Resource Access Management (RAM) user or a RAM role to create a workspace, make sure that the AliyunEMRServerlessSparkFullAccess, AliyunOSSFullAccess, and AliyunDLFFullAccess policies are attached to the RAM user or RAM role. Then, add the RAM user or RAM role on the Access Control page of EMR Serverless Spark and grant it the administrator role. For more information, see Grant permissions to a RAM user and Manage users and roles. A policy-attachment sketch follows this list.

  • Data Lake Formation (DLF) is activated. For more information, see Quick Start. For a list of supported regions, see Regions and endpoints.

  • Object Storage Service (OSS) is activated and a bucket is created. For more information, see Activate OSS and Create a bucket. A bucket-creation sketch also follows this list.
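If you manage RAM programmatically, the policy attachment described above can be scripted. The following is a minimal sketch that uses the Alibaba Cloud Python SDK (aliyun-python-sdk-core and aliyun-python-sdk-ram); the credentials, region ID, and RAM user name are placeholders, and you should verify the policy names against your account.

```python
# Sketch: attach the three system policies from the prerequisites
# to a RAM user. Credentials, region ID, and user name are placeholders.
from aliyunsdkcore.client import AcsClient
from aliyunsdkram.request.v20150501.AttachPolicyToUserRequest import (
    AttachPolicyToUserRequest,
)

client = AcsClient('<ACCESS_KEY_ID>', '<ACCESS_KEY_SECRET>', 'cn-hangzhou')

for policy_name in (
    'AliyunEMRServerlessSparkFullAccess',
    'AliyunOSSFullAccess',
    'AliyunDLFFullAccess',
):
    request = AttachPolicyToUserRequest()
    request.set_PolicyType('System')          # system-defined, not custom, policy
    request.set_PolicyName(policy_name)
    request.set_UserName('<RAM_USER_NAME>')   # placeholder RAM user
    client.do_action_with_exception(request)
```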
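Similarly, the bucket from the last prerequisite can be created with the oss2 Python SDK. This is a minimal sketch; the endpoint and bucket name are placeholders, and the endpoint should match the region you plan to select for the workspace.

```python
# Sketch: create a private OSS bucket for the workspace directory.
# Endpoint and bucket name are placeholders.
import oss2

auth = oss2.Auth('<ACCESS_KEY_ID>', '<ACCESS_KEY_SECRET>')
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'emr-oss-hdfs')
bucket.create_bucket(oss2.BUCKET_ACL_PRIVATE)  # skip if the bucket already exists
```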

Precautions

The runtime environment in which your code runs is managed and configured by the owner of that environment.

Procedure

  1. Go to the EMR Serverless Spark page.

    1. Log on to the EMR console.

    2. In the left navigation pane, choose EMR Serverless > Spark.

    3. In the top navigation bar, select the required region.

      Important

      You cannot change the region of a workspace after it is created.

  2. Click Create Workspace.

  3. On the EMR Serverless Spark page, configure the parameters.

    Parameter: Region
    Description: We recommend that you select the region where your data is stored.
    Example: China (Hangzhou)

    Parameter: Billing Method
    Description: The subscription and pay-as-you-go billing methods are supported.
    Example: Pay-as-you-go

    Parameter: Workspace Name
    Description: The name must be 1 to 64 characters in length and can contain only Chinese characters, letters, digits, hyphens (-), and underscores (_). A validation sketch follows this procedure.
    Note: The names of workspaces within the same Alibaba Cloud account must be unique. If you enter the name of an existing workspace, the system prompts you to enter a different name.
    Example: emr-serverless-spark

    Parameter: Maximum Quota
    Description: The maximum number of compute units (CUs) that can be concurrently used to process jobs in the workspace.
    Example: 1000

    Parameter: Workspace Directory
    Description: The path that is used to store data files, such as task logs, runtime events, and resources. We recommend that you select a bucket for which OSS-HDFS is enabled, which provides compatibility with native Hadoop Distributed File System (HDFS) interfaces. If your scenario does not involve HDFS, you can select a standard OSS bucket.
    Example: emr-oss-hdfs

    Parameter: DLF for Metadata Storage
    Description: Used to store and manage your metadata. After you activate DLF, the system selects a default data catalog for you, named after your UID. If you want to use different data catalogs for different clusters, you can create a data catalog:
      1. Click Create Catalog. In the dialog box that appears, enter a Catalog ID and click OK.
      2. From the drop-down list, select the data catalog that you created.
    Example: emr-dlf

    Parameter: Execution Role
    Description: The name of the role that EMR Serverless Spark uses to run jobs. The default role is AliyunEMRSparkJobRunDefaultRole. EMR Serverless Spark assumes this role to access your resources in other cloud services, such as OSS and DLF. If you want to control the permissions of the execution role, you can use a custom execution role. For more information, see Execution role.
    Example: AliyunEMRSparkJobRunDefaultRole

    Parameter: (Optional) Advanced Settings
    Description: Tags. Tags identify cloud resources. You can use them to classify, search for, and aggregate cloud resources that share characteristics across different dimensions, which improves the efficiency of resource management. Each tag consists of a custom key and value, and you can attach a maximum of 20 tags to each workspace. You can attach tags when you create a workspace, or add and modify them on the workspace list page at any time after the workspace is created. You can also use tags for cost allocation and fine-grained management of pay-as-you-go resources. For more information about tags, see What is a tag?.
    Example: Enter a custom tag key and tag value

  4. Click Create Workspace.
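The workspace name constraint from step 3 can be checked locally before you submit the form. This is a minimal sketch that assumes "Chinese characters" means the CJK Unified Ideographs block (U+4E00 to U+9FFF).

```python
import re

# One CJK ideograph, ASCII letter, digit, hyphen, or underscore per position;
# 1 to 64 characters in total. The \u4e00-\u9fff range (CJK Unified
# Ideographs) is an assumption about what "Chinese characters" covers.
WORKSPACE_NAME = re.compile(r'[\u4e00-\u9fffA-Za-z0-9_-]{1,64}')

def is_valid_workspace_name(name: str) -> bool:
    return WORKSPACE_NAME.fullmatch(name) is not None

assert is_valid_workspace_name('emr-serverless-spark')
assert not is_valid_workspace_name('')           # too short
assert not is_valid_workspace_name('a' * 65)     # too long
assert not is_valid_workspace_name('spark 01')   # spaces are not allowed
```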
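If you prefer to script workspace creation instead of using the console, the generic CommonRequest interface of aliyun-python-sdk-core can call an OpenAPI action by name. Note that the endpoint, API version, action name, and parameter names below are hypothetical placeholders that this topic does not confirm; consult the EMR Serverless Spark API reference before use.

```python
# Hypothetical sketch only: the endpoint, version, action name, and query
# parameters are unverified placeholders; check the EMR Serverless Spark
# API reference. CommonRequest itself is the standard generic-call
# interface of aliyun-python-sdk-core.
from aliyunsdkcore.client import AcsClient
from aliyunsdkcore.request import CommonRequest

client = AcsClient('<ACCESS_KEY_ID>', '<ACCESS_KEY_SECRET>', 'cn-hangzhou')

request = CommonRequest()
request.set_accept_format('json')
request.set_method('POST')
request.set_domain('emr-serverless-spark.cn-hangzhou.aliyuncs.com')  # assumed endpoint
request.set_version('2023-08-08')             # assumed API version
request.set_action_name('CreateWorkspace')    # assumed action name
request.add_query_param('Name', 'emr-serverless-spark')
request.add_query_param('MaxCu', '1000')      # hypothetical parameter name
print(client.do_action_with_exception(request))
```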

References

After you create a workspace, you can start developing jobs, such as SparkSQL jobs. For more information, see Quick start for SparkSQL development.