Alibaba Cloud Container Compute Service (ACS) provides serverless computing capabilities. For big data computing jobs, you can use pods whose computing power quality of service (QoS) class is BestEffort to meet elastic computing requirements and reduce computing costs. This topic describes how to run Spark applications by using the BestEffort pods provided by ACS.
Background information
Apache Spark and Spark Operator
Apache Spark provides powerful capabilities in data science and machine learning scenarios and can be used to handle various complex data processing and analysis tasks. Spark provides efficient solutions for both offline batch processing and real-time stream processing. Spark Operator runs in Kubernetes and uses custom resources to manage Spark applications. It allows you to define Spark applications in YAML files, which simplifies deployment and increases efficiency in cloud-native environments.
BestEffort pods
You can use Spark Operator to manage and schedule Spark applications in Kubernetes, which significantly improves the efficiency of data processing and analysis. ACS supports the creation of pods whose computing power QoS class is BestEffort (BestEffort pods). BestEffort pods provide an economical and efficient option for short-running jobs and for stateless applications that have high scalability and fault tolerance. This helps reduce computing costs while ensuring that jobs run efficiently.
If you want to use Apache Spark and Spark Operator in a production environment, we recommend that you configure them based on the official recommendations from Spark.
Prerequisites
ACS is activated. For more information, see Step 1: Activate ACS.
An ACS cluster is created and CoreDNS is installed. For more information, see Create an ACS cluster.
The ack-spark-operator 3.0 component is installed by using Helm. For more information, see Use Helm to manage applications in ACS.
A kubectl client is connected to the ACS cluster. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster and Manage Kubernetes clusters with kubectl on CloudShell.
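Before you start the procedure, you can optionally run a few checks to confirm that the prerequisites are met. The following commands are a minimal sketch; the grep pattern and the CRD name are assumptions based on a typical Spark Operator installation and may differ in your environment.

```bash
# Confirm that kubectl is connected to the ACS cluster.
kubectl cluster-info

# Confirm that the Spark Operator component is running.
# The namespace and pod name depend on how ack-spark-operator was installed.
kubectl get pods -A | grep -i spark-operator

# Confirm that the SparkApplication CRD is installed.
# The CRD name is assumed from the sparkoperator.k8s.io/v1beta2 API used later in this topic.
kubectl get crd sparkapplications.sparkoperator.k8s.io
```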
Procedure
Step 1: Configure permissions for the Spark application
Create a file named spark-sa.yaml and copy the following content to the file. The file creates a namespace named spark-demo and grants the ServiceAccount named spark the permissions to modify cluster resources.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: spark-demo
---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: spark-demo
  name: spark
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: spark-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: edit
subjects:
  - kind: ServiceAccount
    name: spark
    namespace: spark-demo
```

Run the following command to create the resources:
```bash
kubectl apply -f spark-sa.yaml
```
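Optionally, you can confirm that the namespace, ServiceAccount, and ClusterRoleBinding defined in spark-sa.yaml were created. The following commands are a sketch that uses the resource names from the YAML file above.

```bash
# Verify the resources created by spark-sa.yaml.
kubectl get namespace spark-demo
kubectl get serviceaccount spark -n spark-demo
kubectl get clusterrolebinding spark-role-binding
```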
Step 2: Use a BestEffort pod to run a Spark application
Create a file named spark-pi.yaml and copy the following content to the file. The alibabacloud.com/compute-qos: best-effort label is specified in the .spec.executor.labels parameter. This indicates that the executors of the computing job in the Spark application use BestEffort pods.

```yaml
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  namespace: spark-demo
  name: spark-pi
spec:
  type: Scala
  mode: cluster
  image: "registry.cn-hangzhou.aliyuncs.com/koordinator-sh/spark-test:v3.4.1-0.1"
  imagePullPolicy: IfNotPresent
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.12-3.4.1.jar"
  sparkVersion: "3.4.1"
  restartPolicy:
    type: Never
  driver:
    cores: 1
    coreLimit: "1"
    memory: "512m"
    labels:
      version: 3.4.1
    serviceAccount: spark
  executor:
    cores: 1
    coreLimit: "1"
    instances: 1
    memory: "512m"
    deleteOnTermination: false
    labels:
      version: 3.4.1
      alibabacloud.com/compute-qos: best-effort # best-effort indicates that the executor uses BestEffort pods.
```

Run the following command to deploy the Spark application:
```bash
kubectl apply -f spark-pi.yaml
```
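After the deployment, Spark Operator creates the driver pod and then the executor pods based on the SparkApplication resource. As a sketch, you can watch the SparkApplication object to follow its state transitions; the exact columns shown depend on the installed CRD version.

```bash
# Watch the SparkApplication object managed by Spark Operator. Press Ctrl+C to stop watching.
kubectl get sparkapplication spark-pi -n spark-demo -w

# View detailed status and events of the SparkApplication.
kubectl describe sparkapplication spark-pi -n spark-demo
```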
Step 3: Check the status of the Spark application
Run the following command to query the pods that run the Spark application:
```bash
kubectl get pod -n spark-demo -o wide
```

Expected output:

```
NAME                               READY   STATUS    RESTARTS   AGE   IP              NODE                           NOMINATED NODE   READINESS GATES
spark-pi-xxxxx591db3xxxxx-exec-1   1/1     Running   0          15s   192.168.x.xxx   virtual-kubelet-cn-xxxxxxx-x   <none>           <none>
spark-pi-driver                    1/1     Running   0          39s   192.168.x.xxx   virtual-kubelet-cn-xxxxxxx-x   <none>           <none>
```

The output shows that the pods are in the Running state. This indicates that the Spark application runs as expected in the ACS cluster.
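If you prefer to block until the driver pod finishes instead of polling its status, you can use kubectl wait. This is a sketch that assumes a kubectl version that supports the --for=jsonpath condition (v1.23 or later).

```bash
# Wait until the driver pod reaches the Succeeded phase, or time out after 10 minutes.
kubectl wait pod/spark-pi-driver -n spark-demo \
  --for=jsonpath='{.status.phase}'=Succeeded --timeout=10m
```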
Verify the result
After the status of the Spark application changes to Completed, run the following command to view the running result of the Spark application:
```bash
kubectl logs -n spark-demo spark-pi-driver
```

Expected output:

```
......
24/09/10 07:21:30 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 1.346414 s
Pi is roughly 3.1402757013785068
......
```

The output shows that the Spark computing job took 1.346414 s to execute and that the value of Pi is 3.1402757013785068.

Important: The execution time and Pi value provided in this step are reference values. The actual values depend on your operating environment.
Run the following command to check the computing power QoS class of the pods by using the alibabacloud.com/compute-qos label:

```bash
kubectl get pod -n spark-demo -L alibabacloud.com/compute-qos -o wide
```

Expected output:

```
NAME                               READY   STATUS      RESTARTS   AGE     IP              NODE                           NOMINATED NODE   READINESS GATES   COMPUTE-QOS
spark-pi-xxxxx591db3xxxxx-exec-1   0/1     Completed   0          6m35s   192.168.x.xxx   virtual-kubelet-cn-xxxxxxx-x   <none>           <none>            best-effort
spark-pi-driver                    0/1     Completed   0          8m11s   192.168.x.xxx   virtual-kubelet-cn-xxxxxxx-x   <none>           <none>            default
```

The COMPUTE-QOS column in the output indicates that the computing power QoS class of the executor pod that runs the Spark computing job is best-effort, while the driver pod uses the default class.
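If your namespace contains many pods, you can also filter for the pods that use the BestEffort computing power QoS class by using the same label as a selector. This is a minimal sketch based on the label shown above.

```bash
# List only the pods whose computing power QoS class is best-effort.
kubectl get pod -n spark-demo -l alibabacloud.com/compute-qos=best-effort
```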