E-MapReduce: Use Elastic Container Instance to elastically schedule Spark jobs

Last Updated: May 17, 2023

Alibaba Cloud E-MapReduce (EMR) allows you to elastically schedule Spark jobs by using Elastic Container Instance. This way, you can create pods on demand without being limited by the computing capacity of your Container Service for Kubernetes (ACK) cluster, which effectively reduces computing costs. This topic describes how to use Elastic Container Instance to elastically schedule Spark jobs.

Background information

To use more advanced features of Elastic Container Instance, you can add more pod annotations to configure parameters as required. For more information, see Pod annotations.
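
For example, you can add annotations to the executor spec of a Spark job to control how Elastic Container Instance provisions the pods. The following snippet is a minimal sketch; the instance specifications in k8s.aliyun.com/eci-use-specs are illustrative assumptions, not recommendations.

  executor:
    annotations:
      # Specify candidate ECS instance types for pods that run on Elastic Container Instance.
      k8s.aliyun.com/eci-use-specs: "ecs.c6.xlarge,ecs.c6.2xlarge"
      # Enable the image cache feature to accelerate pod startup.
      k8s.aliyun.com/eci-image-cache: "true"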

Prerequisites

Procedure

  1. Install virtual nodes required by Elastic Container Instance in an ACK cluster. For more information, see Deploy ack-virtual-node in ACK clusters.
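
    After the installation is complete, you can verify that a virtual node is available in the cluster. The following command is a minimal check; the type=virtual-kubelet label is typical of ack-virtual-node deployments and may differ in your environment:

      # List the virtual nodes that Elastic Container Instance uses to run pods.
      kubectl get nodes -l type=virtual-kubelet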

  2. When you submit a Spark job to a Spark cluster that is created on the EMR on ACK page, you can configure a pod label, configure a pod annotation, or add Spark configuration items to schedule the job to Elastic Container Instance.

    For more information, see Submit a Spark job.

    Note

    In this topic, Spark 3.1.1 for EMR V5.2.1 is used. If you use another Spark version, modify the sparkVersion and mainApplicationFile parameters accordingly. For more information about the parameters in this topic, see spark-on-k8s-operator on GitHub.

    • Method 1: Configure a pod label

      Set the alibabacloud.com/eci label to true to schedule specific pods to Elastic Container Instance. Sample code:

      apiVersion: "sparkoperator.k8s.io/v1beta2"
      kind: SparkApplication
      metadata:
        name: spark-pi-eci
      spec:
        type: Scala
        sparkVersion: 3.1.1
        mainClass: org.apache.spark.examples.SparkPi
        mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar"
        arguments:
          - "1000000"
        driver:
          cores: 2
          coreLimit: 2000m
          memory: 4g
        executor:
          cores: 4
          coreLimit: 4000m
          memory: 8g
          instances: 10
          # Configure a pod label to allow all executors to use Elastic Container Instance. 
          labels:
            alibabacloud.com/eci: "true"
          # Optional. Enable the image cache feature of Elastic Container Instance to improve performance.
          annotations:
            k8s.aliyun.com/eci-image-cache: "true" 
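
      After the job is submitted, you can check which nodes the executor pods run on. The following command is a minimal check; replace <namespace> with the namespace of your Spark cluster:

        # Pods that are scheduled to Elastic Container Instance show a virtual node in the NODE column.
        kubectl get pods -n <namespace> -o wide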
    • Method 2: Configure a pod annotation

      Set the alibabacloud.com/burst-resource annotation to eci to schedule specific pods to Elastic Container Instance. Valid values of the alibabacloud.com/burst-resource annotation:

      • eci: Elastic Container Instance is used when resources on the regular nodes of a cluster are insufficient.

      • eci_only: Only Elastic Container Instance is used.

      Sample code:

      apiVersion: "sparkoperator.k8s.io/v1beta2"
      kind: SparkApplication
      metadata:
        name: spark-pi-eci
      spec:
        type: Scala
        sparkVersion: 3.1.1
        mainClass: org.apache.spark.examples.SparkPi
        mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar"
        arguments:
          - "1000000"
        driver:
          cores: 2
          coreLimit: 2000m
          memory: 4g
        executor:
          cores: 4
          coreLimit: 4000m
          memory: 8g
          instances: 10
          # Configure a pod annotation to allow executors to use Elastic Container Instance when resources on regular nodes of the cluster are insufficient. 
          annotations:
            alibabacloud.com/burst-resource: "eci"
            # Optional. Enable the image cache feature of Elastic Container Instance to improve performance.
            k8s.aliyun.com/eci-image-cache: "true" 
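
      To submit either of the preceding manifests, save the content to a file and apply it to the cluster. The file name and namespace in the following command are illustrative assumptions:

        # Submit the SparkApplication to the namespace that the Spark operator watches.
        kubectl apply -f spark-pi-eci.yaml -n <namespace>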
    • Method 3: Add Spark configuration items

      You can add Spark configuration items that set the pod annotations so that Elastic Container Instance is used to schedule Spark jobs. Valid values of the annotations are the same as those in Method 2. A spark-defaults.conf sketch of the configuration items is provided after the following steps.

      1. Go to the spark-defaults.conf tab.

        1. Log on to the EMR on ACK console.

        2. On the EMR on ACK page, find the cluster that you want to manage and click Configure in the Actions column.

        3. On the Configure tab, click the spark-defaults.conf tab.

      2. Enable Elastic Container Instance for the Spark cluster.

        1. On the spark-defaults.conf tab, click Add Configuration Item.

        2. In the Add Configuration Item dialog box, add the configuration items described in the following table.

          The following configuration items are available:

          spark.kubernetes.driver.annotation.alibabacloud.com/burst-resource
            Specifies whether Spark drivers use Elastic Container Instance. Valid values: eci and eci_only.

          spark.kubernetes.driver.annotation.k8s.aliyun.com/eci-image-cache
            Specifies whether Spark drivers use the image cache feature of Elastic Container Instance. We recommend that you set this configuration item to true.

          spark.kubernetes.executor.annotation.alibabacloud.com/burst-resource
            Specifies whether Spark executors use Elastic Container Instance. Valid values: eci and eci_only.

          spark.kubernetes.executor.annotation.k8s.aliyun.com/eci-image-cache
            Specifies whether Spark executors use the image cache feature of Elastic Container Instance. We recommend that you set this configuration item to true.

        3. Click OK.

        4. In the Save dialog box, configure the Execution Reason parameter and then click Save.

      3. Deploy the configuration items.

        1. In the lower part of the Configure tab, click Deploy Client Configuration.

        2. In the dialog box that appears, configure the Execution Reason parameter and click OK.

        3. In the Confirm message, click OK.
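
      After the client configuration is deployed, the configuration items take effect for Spark jobs that you subsequently submit. For reference, the following lines are a minimal sketch of how the items might appear in spark-defaults.conf; the eci values are example choices, not requirements:

        # Schedule drivers and executors to Elastic Container Instance when regular nodes lack resources.
        spark.kubernetes.driver.annotation.alibabacloud.com/burst-resource    eci
        spark.kubernetes.executor.annotation.alibabacloud.com/burst-resource  eci
        # Enable the image cache feature to accelerate pod startup.
        spark.kubernetes.driver.annotation.k8s.aliyun.com/eci-image-cache     true
        spark.kubernetes.executor.annotation.k8s.aliyun.com/eci-image-cache   true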

  3. Optional: If the job needs to read data from or write data to Object Storage Service (OSS) buckets or use metadata in Data Lake Formation (DLF), you must grant the related permissions to Elastic Container Instance. You can grant the permissions by using one of the following methods:

    • Method 1: Assign a RAM role to Elastic Container Instance to implement password-free access

      1. Log on to the RAM console. Create a RAM role whose trusted entity is Alibaba Cloud Service. For more information, see Create a normal service role.

        Note

        Select Elastic Compute Service as the trusted service.

      2. Attach the AliyunOSSFullAccess and AliyunDLFFullAccess policies to the RAM role.

        For more information, see Grant permissions to a RAM role.

      3. Add a pod annotation for the Spark job to use the RAM role.

        annotations:
          k8s.aliyun.com/eci-ram-role-name: <Name of the RAM role>
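
        In a SparkApplication manifest, you can add the annotation under both the driver and executor specs so that all pods of the job can assume the RAM role. A minimal sketch; the role name is a placeholder:

          driver:
            annotations:
              # Allow driver pods that run on Elastic Container Instance to assume the RAM role.
              k8s.aliyun.com/eci-ram-role-name: <Name of the RAM role>
          executor:
            annotations:
              # Allow executor pods that run on Elastic Container Instance to assume the RAM role.
              k8s.aliyun.com/eci-ram-role-name: <Name of the RAM role>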
    • Method 2: Configure the AccessKey pair for accessing OSS or DLF

      • If the job needs to read data from or write data to OSS buckets, configure an AccessKey pair in hadoopConf. Sample code:

        hadoopConf:
          fs.jfs.cache.oss.accessKeyId: <yourAccessKeyId>
          fs.jfs.cache.oss.accessKeySecret: <yourAccessKeySecret>
      • If the job uses DLF, configure an AccessKey pair in hadoopConf. Sample code:

        hadoopConf:
          dlf.catalog.accessKeyId: <yourAccessKeyId>
          dlf.catalog.accessKeySecret: <yourAccessKeySecret>
          dlf.catalog.akMode: "MANUAL"
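
      In a SparkApplication manifest, hadoopConf is a field under spec. The following minimal sketch combines both settings; the AccessKey values are placeholders:

        spec:
          hadoopConf:
            # Credentials for reading data from and writing data to OSS buckets.
            fs.jfs.cache.oss.accessKeyId: <yourAccessKeyId>
            fs.jfs.cache.oss.accessKeySecret: <yourAccessKeySecret>
            # Credentials for accessing metadata in DLF. MANUAL indicates that the AccessKey pair is specified manually.
            dlf.catalog.accessKeyId: <yourAccessKeyId>
            dlf.catalog.accessKeySecret: <yourAccessKeySecret>
            dlf.catalog.akMode: "MANUAL"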