Create a virtual cluster

Last Updated: Jun 10, 2020

Before you use the Serverless Spark feature, you must create a virtual cluster to execute Spark jobs.

Precautions

  • The Serverless Spark feature is available only in the China (Hong Kong), Singapore, and US (Silicon Valley) regions.

  • You can use Serverless Spark to access only OSS data.

  • The Serverless Spark feature supports only three compute unit specifications: small, medium, and large.

  • A maximum of 10 virtual clusters can be created under an Alibaba Cloud account.

Procedure

  1. Log on to the Data Lake Analytics console.

  2. In the top navigation bar, select the region where Data Lake Analytics is deployed.

    The Serverless Spark feature is available only in the China (Hong Kong), Singapore, and US (Silicon Valley) regions.

  3. In the left-side navigation pane, choose Serverless Spark > Virtual Cluster management.

  4. On the Virtual Cluster management page, click New Virtual Cluster.

  5. In the Create a virtual cluster pane, specify the parameters as required.

    Parameter descriptions:

    • Name: The name of the virtual cluster, which must be unique under an Alibaba Cloud account. The name can contain letters, digits, and underscores (_) and must start with a letter.

    • Resource upper limit: The maximum number of CPUs and the maximum amount of memory that the Spark jobs of the virtual cluster can use. You can select cluster specifications from the drop-down list, or click Custom to enter the upper limit. If the total resources used by a single Spark job exceed this upper limit, the system rejects the job.

    • Version: The version number of the Serverless Spark engine.

    • Version Description: The description of the Serverless Spark engine version.

    You can click Show to set the default parameters for a Spark job (see the configuration sketch after this procedure):

    • executor default resource: The default resource specification of the executors in a Spark job, which corresponds to spark.executor.resourceSpec on the command line.

    • executor default quantity: The default number of executors in a Spark job, which corresponds to spark.executor.instances on the command line.

    • driver default resources: The default resource specification of the driver in a Spark job, which corresponds to spark.driver.resourceSpec on the command line.
  6. Click OK.

    After about one second, the status of the virtual cluster becomes RUNNING, which indicates that the cluster is created.
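The defaults you configure in step 5 apply only when a job does not set the corresponding spark.* parameters itself. As a rough sketch (the job name, main file path, and OSS bucket below are placeholders, and the exact submission format may vary by engine version), a job configuration that overrides these defaults might look like the following:

    import json

    # Hypothetical job configuration, for illustration only: the job name, main
    # file path, and OSS bucket below are placeholders. The three "spark.*" keys
    # are the command-line counterparts of the virtual cluster defaults described
    # in step 5; values set here override the cluster defaults for this job.
    job_config = {
        "name": "example-spark-job",
        "file": "oss://your-bucket/path/to/your_job.py",
        "conf": {
            "spark.driver.resourceSpec": "medium",    # "driver default resources"
            "spark.executor.resourceSpec": "medium",  # "executor default resource"
            "spark.executor.instances": 2,            # "executor default quantity"
        },
    }

    # Render the configuration as JSON so it can be reviewed or submitted
    # through whatever interface you use to create the Spark job.
    print(json.dumps(job_config, indent=2))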

What to do next

Create a Spark job and use the resources in the virtual cluster to execute the job. For more information, see Create and execute a Spark job.
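
As a starting point, the job body itself might be a small PySpark script such as the sketch below; the OSS bucket and object path are placeholders, and the script assumes the virtual cluster has been granted access to that bucket:

    from pyspark.sql import SparkSession

    # Minimal sketch of a job body, assuming the data lives in an OSS bucket the
    # virtual cluster is allowed to read; the bucket and object path are
    # placeholders. Serverless Spark can access only OSS data, so the input path
    # uses the oss:// scheme.
    spark = SparkSession.builder.appName("example-read-oss").getOrCreate()

    df = spark.read.csv("oss://your-bucket/path/to/data.csv", header=True)
    df.show(10)  # preview the first 10 rows

    spark.stop()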