Container Service for Kubernetes: Use idle resources to schedule and distribute Spark jobs in multiple clusters

Last Updated: Apr 11, 2025

If you have multiple Container Service for Kubernetes (ACK) clusters that run online services and you want to use their idle resources to run Spark jobs without interrupting the online services, you can use the multi-cluster Spark job scheduling and distribution capability provided by Distributed Cloud Container Platform for Kubernetes (ACK One). This topic describes how to use an ACK One Fleet instance together with the ACK Koordinator component to schedule and distribute a Spark job across multiple clusters by using the idle resources in the clusters that are associated with the Fleet instance. You can configure job priorities and the colocation feature to prevent the Spark job from affecting the online services.

Background information

The following features are required when you use idle resources to schedule and distribute a Spark job in multiple clusters:

  • Multi-cluster Spark job scheduling and distribution provided by ACK One Fleet instances, including idle resource-aware scheduling.

  • Support for Koordinator colocation in ACK Spark Operator.

  • Single-cluster colocation provided by ACK Koordinator.

Procedure:

  1. Associate multiple ACK clusters with an ACK Fleet instance and deploy ACK Koordinator and ACK Spark Operator in each associated cluster.

  2. Create SparkApplication and PropagationPolicy for the Fleet instance.

  3. The multi-cluster scheduling component (Global Scheduler) of the Fleet instance matches the resource requests of the Spark job against the remaining resources of each associated sub-cluster.

    For sub-clusters whose Kubernetes version is 1.28 or later, the Fleet instance supports resource preoccupation to improve the success rate of Spark job scheduling.

  4. After scheduling is complete, the Fleet instance distributes the SparkApplication to the associated clusters.

  5. In the associated clusters, ACK Spark Operator runs the driver and executors of the Spark job. At the same time, the Fleet instance watches the running status of the Spark job in the sub-clusters. If the driver cannot run due to insufficient resources, the Fleet instance reclaims the SparkApplication after a specific period of time and reschedules it to other associated clusters that have sufficient resources.

Prerequisites

  • An ACK One Fleet instance is created, and multiple ACK clusters are associated with the Fleet instance.

  • The kubeconfig file of the Fleet instance is obtained so that you can connect to the Fleet instance, and the AMC command-line tool (kubectl amc) is available for querying resources in the associated clusters.

Step 1: Deploy ack-koordinator in each associated cluster

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage. In the left-side navigation pane, choose Configurations > ConfigMaps.

  3. On the ConfigMaps page, click Create from YAML. Copy the following YAML template to the Template code editor and then click Create. For more information, see Get started with colocation.

    # Example of the ack-slo-config ConfigMap. 
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ack-slo-config
      namespace: kube-system
    data:
      colocation-config: |-
        {
          "enable": true
        }
      resource-qos-config: |-
        {
          "clusterStrategy": {
            "lsClass": {
              "cpuQOS": {
                "enable": true
              },
              "memoryQOS": {
                "enable": true
              },
              "resctrlQOS": {
                "enable": true
              }
            },
            "beClass": {
              "cpuQOS": {
                "enable": true
              },
              "memoryQOS": {
                "enable": true
              },
              "resctrlQOS": {
                "enable": true
              }
            }
          }
        }
      resource-threshold-config: |-
        {
          "clusterStrategy": {
            "enable": true
          }
        }
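
    After the ConfigMap is created, you can check that colocation is enabled in the associated cluster. A minimal verification, assuming that you use the kubeconfig file of the associated cluster:

    # Confirm that the ack-slo-config ConfigMap exists and that colocation is enabled in its data.
    kubectl get configmap ack-slo-config -n kube-system -o yaml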

Step 2: (Optional) Create a namespace on the Fleet instance and distribute the namespace to the associated clusters

Before you install ack-spark-operator in an associated cluster, make sure that the cluster has a namespace dedicated to Spark jobs. If such a namespace does not exist, ack-spark-operator cannot be installed as expected. You can create a namespace on the Fleet instance and then create a ClusterPropagationPolicy to distribute the namespace to each associated cluster. In this example, a namespace named spark is created and distributed to each associated cluster.

  1. Use the kubeconfig file of the Fleet instance to connect to the Fleet instance and run the following command to create a namespace named spark:

    kubectl create ns spark
  2. Create a ClusterPropagationPolicy to distribute the namespace to the associated clusters that match specific rules. If you want to distribute the namespace to all associated clusters, leave the clusterAffinity parameter empty.

    apiVersion: policy.one.alibabacloud.com/v1alpha1
    kind: ClusterPropagationPolicy
    metadata:
      name: ns-policy
    spec:
      resourceSelectors:
      - apiVersion: v1
        kind: Namespace
        name: spark
      placement:
        clusterAffinity:
          clusterNames:
          - ${cluster1-id} # The ID of an associated cluster. 
          - ${cluster2-id} # The ID of an associated cluster. 
        replicaScheduling:
          replicaSchedulingType: Duplicated
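
    You can then create the policy on the Fleet instance. A minimal sketch, assuming that the preceding YAML is saved as ns-policy.yaml (a hypothetical file name):

    # On the Fleet instance: create the ClusterPropagationPolicy.
    kubectl apply -f ns-policy.yaml
    # Confirm that the policy exists on the Fleet instance.
    kubectl get clusterpropagationpolicy ns-policy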

Step 3: Install ack-spark-operator in the associated clusters

Install ack-spark-operator 2.1.2 or later in each associated cluster in which you want to run Spark jobs.

  1. Log on to the ACK console. In the left-side navigation pane, choose Marketplace > Marketplace.

  2. On the Marketplace page, click the App Catalog tab. Find and click ack-spark-operator.

  3. On the ack-spark-operator page, click Deploy.

  4. In the Deploy panel, select a cluster and namespace, and then click Next.

  5. In the Parameters step, select 2.1.2 or a later version from the Chart Version drop-down list, add the spark namespace to the spark.jobNamespaces parameter in the Parameters code editor, and then click OK.

    Important

    You must specify the namespace of the SparkApplication you want to create in the spark.jobNamespaces parameter.

    The following list describes some of the parameters. You can find the parameter configurations in the Parameters section on the ack-spark-operator page.

    • controller.replicas: the number of controller replicas. Default value: 1.

    • webhook.replicas: the number of webhook replicas. Default value: 1.

    • spark.jobNamespaces: the namespaces in which Spark jobs can run. If this parameter is left empty or set to [""], Spark jobs can run in all namespaces. Separate multiple namespaces with commas (,), for example, ["ns1","ns2","ns3"]. Default value: ["default"].

    • spark.serviceAccount.name: the name of the ServiceAccount used to submit Spark jobs. A ServiceAccount named spark-operator-spark and the corresponding role-based access control (RBAC) resources are automatically created in each namespace specified by spark.jobNamespaces. You can specify a custom name for the ServiceAccount and then use that name when you submit a Spark job. Default value: spark-operator-spark.
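
    For reference, the following excerpt is a minimal sketch of the values that you might set in the Parameters code editor. It assumes the parameter structure shown in the preceding list and only adds the spark namespace to spark.jobNamespaces; all other parameters keep their default values.

    # Excerpt of the chart parameters. Only spark.jobNamespaces is changed.
    spark:
      jobNamespaces:
        - default
        - spark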

Step 4: Create a PriorityClass on the Fleet instance and distribute the PriorityClass to the associated clusters

To ensure that the submitted Spark job does not occupy the resources used by the online service or affect the online service, we recommend that you assign the Spark job a lower priority than the online service.

    1. Use the kubeconfig file of the Fleet instance to connect to the Fleet instance and create a low-priority PriorityClass. Set the value field to a negative number.

      apiVersion: scheduling.k8s.io/v1
      kind: PriorityClass
      metadata:
        name: low-priority
      value: -1000
      globalDefault: false
      description: "Low priority for Spark applications"
    2. Create a ClusterPropagationPolicy on the Fleet instance to distribute the PriorityClass to the specified clusters. If you want to distribute the PriorityClass to all associated clusters, delete the clusterAffinity parameter.

      apiVersion: policy.one.alibabacloud.com/v1alpha1
      kind: ClusterPropagationPolicy
      metadata:
        name: priority-policy
      spec:
        preserveResourcesOnDeletion: false
        resourceSelectors:
        - apiVersion: scheduling.k8s.io/v1
          kind: PriorityClass
        placement:
          clusterAffinity:
            clusterNames:
            - ${cluster1-id} # The ID of your cluster. 
            - ${cluster2-id} # The ID of your cluster. 
      #      labelSelector:
      #        matchLabels:
      #          key: value
          replicaScheduling:
            replicaSchedulingType: Duplicated
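
      You can then create both objects on the Fleet instance. A minimal sketch, assuming that the preceding YAML templates are saved as low-priority.yaml and priority-policy.yaml (hypothetical file names):

      # On the Fleet instance: create the PriorityClass and the ClusterPropagationPolicy.
      kubectl apply -f low-priority.yaml
      kubectl apply -f priority-policy.yaml
      # Confirm that the PriorityClass exists on the Fleet instance.
      kubectl get priorityclass low-priority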

Step 5: Submit a SparkApplication in a colocation architecture on the Fleet instance

  1. Create a PropagationPolicy by using the following YAML template. The PropagationPolicy is used to distribute all SparkApplications that use the sparkoperator.k8s.io/v1beta2 API version to the associated clusters that match specific rules. If you want to distribute the SparkApplications to all associated clusters, leave the clusterAffinity parameter empty.

    apiVersion: policy.one.alibabacloud.com/v1alpha1
    kind: PropagationPolicy
    metadata:
      name: sparkapp-policy 
      namespace: spark
    spec:
      preserveResourcesOnDeletion: false
      propagateDeps: true
      placement:
        clusterAffinity:
          clusterNames:
          - ${cluster1-id} # The ID of an associated cluster. 
          - ${cluster2-id} # The ID of an associated cluster. 
    #      labelSelector:
    #        matchLabels:
    #          key: value
        replicaScheduling:
          replicaSchedulingType: Divided
          customSchedulingType: Gang
      resourceSelectors:
        - apiVersion: sparkoperator.k8s.io/v1beta2
          kind: SparkApplication
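
    You can then create the PropagationPolicy on the Fleet instance. A minimal sketch, assuming that the preceding YAML is saved as sparkapp-policy.yaml (a hypothetical file name):

    # On the Fleet instance: create the PropagationPolicy in the spark namespace.
    kubectl apply -f sparkapp-policy.yaml
    # Confirm that the policy exists on the Fleet instance.
    kubectl get propagationpolicy sparkapp-policy -nspark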
  2. Create a Spark job on the Fleet instance. Add the sparkoperator.k8s.io/koordinator-colocation: "true" annotation to the pod templates of the SparkApplication so that idle resources are used to schedule the driver pod and the executor pods. The following SparkApplication template adds the annotation to both the driver and the executor and references the low-priority PriorityClass created in Step 4.

    apiVersion: sparkoperator.k8s.io/v1beta2
    kind: SparkApplication
    metadata:
      name: spark-pi
      namespace: spark
    spec:
      arguments:
      - "50000"
      driver:
        coreLimit: 1000m
        cores: 1
        memory: 512m
        priorityClassName: low-priority
        template:
          metadata:
            annotations:
              sparkoperator.k8s.io/koordinator-colocation: "true"
          spec:
            containers:
            - name: spark-kubernetes-driver
            serviceAccount: spark-operator-spark
      executor:
        coreLimit: 1000m
        cores: 1
        instances: 1
        memory: 1g
        priorityClassName: low-priority
        template:
          metadata:
            annotations:
              sparkoperator.k8s.io/koordinator-colocation: "true"
          spec:
            containers:
            - name: spark-kubernetes-executor
      image: registry-cn-hangzhou.ack.aliyuncs.com/ack-demo/spark:3.5.4
      mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.5.4.jar
      mainClass: org.apache.spark.examples.SparkPi
      mode: cluster
      restartPolicy:
        type: Never
      sparkVersion: 3.5.4
      type: Scala
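
    Submit the job by creating the SparkApplication on the Fleet instance. A minimal sketch, assuming that the preceding YAML is saved as spark-pi.yaml (a hypothetical file name):

    # On the Fleet instance: submit the SparkApplication.
    kubectl apply -f spark-pi.yaml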

Step 6: Check the status of the Spark job

  1. Run the following command on the Fleet instance to view the status of the Spark job:

    kubectl get sparkapp -nspark

    Expected output:

    NAME       STATUS    ATTEMPTS   START                  FINISH       AGE
    spark-pi   RUNNING   1          2025-03-05T11:19:43Z   <no value>   48s
  2. Run the following command on the Fleet instance to query the associated cluster to which the Spark job is scheduled:

    kubectl describe sparkapp spark-pi -nspark

    Expected output:

    Normal   ScheduleBindingSucceed  2m29s                  default-scheduler                   Binding has been scheduled successfully. Result: {c6xxxxx:0,[{driver 1} {executor 1}]}
  3. Run the following command on the Fleet instance to query the status of resource distribution:

    kubectl get rb spark-pi-sparkapplication -nspark

    Expected output:

    NAME                        SCHEDULED   FULLYAPPLIED   OVERRIDDEN   ALLAVAILABLE   AGE
    spark-pi-sparkapplication   True        True           True         True     
  4. Run the following command on the Fleet instance to check the status of the Spark job in the associated cluster:

    kubectl amc get sparkapp -M -nspark

    Expected output:

    NAME       CLUSTER     STATUS      ATTEMPTS   START                  FINISH                 AGE   ADOPTION
    spark-pi   c6xxxxxxx   COMPLETED   1          2025-02-24T12:10:34Z   2025-02-24T12:11:20Z   61s   Y
  5. Run the following command on the Fleet instance to query the status of the pods:

    kubectl amc get pod -M -nspark    

    Expected output:

    NAME                               CLUSTER     READY   STATUS    RESTARTS   AGE
    spark-pi-3c0565956608ad6d-exec-1   c6xxxxxxx   1/1     Running   0          2m35s
    spark-pi-driver                    c6xxxxxxx   1/1     Running   0          2m50s
  6. Run the following command on the Fleet instance to view the details of the Spark job in the associated cluster:

    kubectl amc get sparkapp spark-pi -m ${member clusterid} -oyaml -nspark
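
    If you want to inspect the output of the Spark job, you can view the log of the driver pod in the associated cluster. A minimal sketch, assuming that you use the kubeconfig file of the associated cluster to which the job was scheduled:

    # In the associated cluster: view the log of the driver pod.
    kubectl logs spark-pi-driver -nspark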