Container Service for Kubernetes: Multi-cluster scheduling and distribution of Spark jobs

Last Updated: Mar 26, 2026

Use Fleet instances of ACK One (Distributed Cloud Container Platform for Kubernetes) to schedule and distribute Apache Spark jobs across multiple clusters. This lets you run batch Spark workloads on idle cluster capacity without competing with online service workloads for resources.

How it works

  1. Install the ack-spark-operator component on each sub-cluster where Spark jobs will run.

  2. Create a SparkApplication and a PropagationPolicy on the Fleet instance.

  3. The Global Scheduler (the multi-cluster scheduling component of the Fleet instance) compares the Spark job's resource requests against the remaining capacity of each associated sub-cluster and selects the best target.

    For sub-clusters running Kubernetes 1.28 or later, the Fleet instance supports resource preoccupation to improve scheduling success rates.

  4. The Fleet instance distributes the SparkApplication to the selected sub-cluster.

  5. ACK Spark Operator in the sub-cluster starts the Spark driver and executor Pods. The Fleet instance watches the job's running status. If the driver fails to start due to insufficient resources, the Fleet instance reclaims the SparkApplication after a timeout and reschedules it to another sub-cluster with sufficient capacity.

Prerequisites

Before you begin, ensure that you have:

  - An ACK One Fleet instance, with the sub-clusters that will run Spark jobs associated with it.
  - The kubeconfig file of the Fleet instance, which the following steps use to run kubectl commands against the Fleet instance.
  - The AMC command-line plugin (kubectl amc) installed, which the verification and troubleshooting steps use to query resources in the sub-clusters.

Step 1: Install ack-spark-operator on sub-clusters

Install the ack-spark-operator component on each sub-cluster where you want to run Spark jobs.

  1. Log on to the ACK console. In the left-side navigation pane, choose Marketplace > Marketplace.

  2. On the Marketplace page, click the App Catalog tab, then find and click ack-spark-operator.

  3. On the ack-spark-operator page, click Deploy.

  4. In the Deploy panel, select a cluster and namespace, then click Next.

  5. In the Parameters step, configure the parameters and click OK.

The following table describes the key parameters. You can find all parameter configurations in the Parameters section on the ack-spark-operator page.

| Parameter | Description | Default |
| --- | --- | --- |
| controller.replicas | Number of controller replicas. | 1 |
| webhook.replicas | Number of webhook replicas. | 1 |
| spark.jobNamespaces | Namespaces where Spark jobs can run. Set to [""] for all namespaces, or list specific namespaces separated by commas, for example, ["ns1","ns2","ns3"]. | ["default"] |
| spark.serviceAccount.name | Name of the ServiceAccount automatically created in each namespace listed in spark.jobNamespaces. The operator also creates the corresponding role-based access control (RBAC) resources. Specify a custom name here if you plan to reference it when submitting Spark jobs. | spark-operator-spark |
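
The parameters above are the component's configuration values. As a sketch, overriding only the keys from the table in the Parameters step might look like the following; the additional spark-jobs namespace is a hypothetical example, and all other values keep their defaults.

controller:
  replicas: 1
webhook:
  replicas: 1
spark:
  jobNamespaces:
  - default
  - spark-jobs                     # Hypothetical extra namespace; list only the namespaces you plan to use.
  serviceAccount:
    name: spark-operator-spark     # Reference this name in the driver spec when you submit Spark jobs.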

Step 2: Create a PriorityClass and distribute it to sub-clusters

Assign a low priority to Spark jobs so that they do not compete with online service workloads for resources. Pods that do not specify a PriorityClass default to priority 0, so a PriorityClass with a negative value keeps Spark Pods below them.

  1. Using the kubeconfig file of the Fleet instance, create a low-priority PriorityClass with a negative value:

    apiVersion: scheduling.k8s.io/v1
    kind: PriorityClass
    metadata:
      name: low-priority
    value: -1000
    globalDefault: false
    description: "Low priority for Spark applications"
  2. Create a ClusterPropagationPolicy on the Fleet instance to distribute the PriorityClass to the target sub-clusters. To distribute to all associated clusters, remove the clusterAffinity field.

    apiVersion: policy.one.alibabacloud.com/v1alpha1
    kind: ClusterPropagationPolicy
    metadata:
      name: priority-policy
    spec:
      preserveResourcesOnDeletion: false
      resourceSelectors:
      - apiVersion: scheduling.k8s.io/v1
        kind: PriorityClass
      placement:
        clusterAffinity:
          clusterNames:
          - ${cluster1-id} # The ID of your cluster.
          - ${cluster2-id} # The ID of your cluster.
    #      labelSelector:
    #        matchLabels:
    #          key: value
        replicaScheduling:
          replicaSchedulingType: Duplicated
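
If you saved the two manifests above as files, you can apply them using the kubeconfig file of the Fleet instance and confirm that the PriorityClass was created. The file names and kubeconfig path below are illustrative.

# Run against the Fleet instance (kubeconfig path is illustrative).
kubectl --kubeconfig ~/.kube/fleet-kubeconfig apply -f priorityclass.yaml
kubectl --kubeconfig ~/.kube/fleet-kubeconfig apply -f priority-policy.yaml

# Confirm that the PriorityClass exists on the Fleet instance.
kubectl --kubeconfig ~/.kube/fleet-kubeconfig get priorityclass low-priority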

Step 3: Submit a Spark job on the Fleet instance

(Optional) Create and distribute namespaces to sub-clusters

If the namespace where the SparkApplication will run does not already exist on the Fleet instance, create it first. The namespace must also be listed in the spark.jobNamespaces parameter configured in Step 1.

  1. Create the namespace on the Fleet instance:

    kubectl create ns xxx
  2. Create a ClusterPropagationPolicy to distribute the namespace to each sub-cluster:

    apiVersion: policy.one.alibabacloud.com/v1alpha1
    kind: ClusterPropagationPolicy
    metadata:
      name: ns-policy
    spec:
      resourceSelectors:
      - apiVersion: v1
        kind: Namespace
        name: xxx
      placement:
        clusterAffinity:
          clusterNames:
          - ${cluster1-id} # The ID of your cluster.
          - ${cluster2-id} # The ID of your cluster.
        replicaScheduling:
          replicaSchedulingType: Duplicated
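
To confirm that the namespace was distributed, you can query it on the Fleet instance and in the sub-clusters. A sketch, using the placeholder namespace name from the commands above and the kubectl amc plugin that the verification steps in Step 4 rely on:

# On the Fleet instance, the namespace should exist.
kubectl get ns xxx

# Check that it reached the associated sub-clusters.
kubectl amc get ns xxx -M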

Create a PropagationPolicy for SparkApplication

Create a PropagationPolicy on the Fleet instance. This policy tells the Global Scheduler to distribute all SparkApplication resources of the sparkoperator.k8s.io/v1beta2 API version to the target sub-clusters using Gang scheduling.

apiVersion: policy.one.alibabacloud.com/v1alpha1
kind: PropagationPolicy
metadata:
  name: sparkapp-policy
  namespace: default
spec:
  preserveResourcesOnDeletion: false
  propagateDeps: true   # Automatically propagates dependent resources (such as the ServiceAccount referenced in the driver spec) to the target sub-clusters.
  placement:
    clusterAffinity:
      clusterNames:
      - ${cluster1-id} # The ID of your cluster.
      - ${cluster2-id} # The ID of your cluster.
#      labelSelector:
#        matchLabels:
#          key: value
    replicaScheduling:
      replicaSchedulingType: Divided
      customSchedulingType: Gang
  resourceSelectors:
    - apiVersion: sparkoperator.k8s.io/v1beta2
      kind: SparkApplication

Submit a SparkApplication

Create a SparkApplication on the Fleet instance. Set priorityClassName to the PriorityClass created in Step 2 for both the driver and executor.

The Global Scheduler uses the memory and cores values in the driver and executor specs as Kubernetes resource requests. It compares the total requested resources against each sub-cluster's remaining capacity to select the target cluster. Size these values to reflect the actual resources your job needs — undersized values may cause scheduling to succeed but the job to fail at runtime; oversized values may prevent scheduling if no cluster has sufficient capacity.

After you create the SparkApplication, the PropagationPolicy from the previous step distributes it to the selected sub-cluster.

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: default     # Make sure this namespace is listed in the spark.jobNamespaces parameter.
spec:
  type: Scala
  mode: cluster
  image: registry-cn-hangzhou.ack.aliyuncs.com/ack-demo/spark:3.5.4
  imagePullPolicy: IfNotPresent
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.5.4.jar
  arguments:
  - "1000"
  sparkVersion: 3.5.4
  driver:
    cores: 1
    memory: 512m
    priorityClassName: low-priority
    serviceAccount: spark-operator-spark   # Replace with the custom name you specified in Step 1.
  executor:
    instances: 1
    cores: 1
    memory: 512m
    priorityClassName: low-priority
  restartPolicy:
    type: Never
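
To submit the job, apply the manifest on the Fleet instance. A sketch, assuming you saved the manifest as spark-pi.yaml; the file name and kubeconfig path are illustrative.

# Create the SparkApplication on the Fleet instance. The PropagationPolicy
# created earlier selects a sub-cluster and distributes the job to it.
kubectl --kubeconfig ~/.kube/fleet-kubeconfig apply -f spark-pi.yaml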

Step 4: Verify the Spark job

Check job status and scheduling result

  1. Run the following command on the Fleet instance to view the Spark job status:

    kubectl get sparkapp

    Expected output:

    NAME       STATUS    ATTEMPTS   START                  FINISH       AGE
    spark-pi   RUNNING   1          2025-02-24T12:10:34Z   <no value>   11s
  2. Run the following command to confirm which sub-cluster the job was scheduled to:

    kubectl describe sparkapp spark-pi

    Expected output:

    Normal   ScheduleBindingSucceed  2m29s                  default-scheduler                   Binding has been scheduled successfully. Result: {c6xxxxx:0,[{driver 1} {executor 1}]}
  3. Run the following command to view the job status in the associated sub-cluster:

    kubectl amc get sparkapp -M

    Expected output:

    NAME       CLUSTER     STATUS      ATTEMPTS   START                  FINISH                 AGE   ADOPTION
    spark-pi   c6xxxxxxx   COMPLETED   1          2025-02-24T12:10:34Z   2025-02-24T12:11:20Z   61s   Y
  4. Run the following command to query the Pod status:

    kubectl amc get pod -M

    Expected output:

    NAME              CLUSTER     READY   STATUS      RESTARTS   AGE
    spark-pi-driver   c6xxxxxxx   0/1     Completed   0          68s
  5. Run the following command to view the full details of the Spark job in the sub-cluster:

    kubectl amc get sparkapp spark-pi -m ${member clusterid} -oyaml
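
To see the job output, you can read the driver Pod's log in the sub-cluster that ran the job. A sketch, assuming you have the kubeconfig file of that sub-cluster; the path is illustrative.

# Run against the sub-cluster the job was scheduled to.
kubectl --kubeconfig ~/.kube/member-kubeconfig logs spark-pi-driver -n default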

Troubleshooting

If the job does not reach COMPLETED, use the following commands to diagnose the issue.

Job stuck in `PENDING` or not scheduled:

Check the events on the SparkApplication to see if the Global Scheduler reported a scheduling failure:

kubectl describe sparkapp spark-pi

Look for events with reason ScheduleBindingFailed or similar. Common causes include insufficient resources across all sub-clusters or a missing PriorityClass on the target cluster.
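
To rule out the second cause, you can check whether the PriorityClass reached the target sub-cluster. A sketch, assuming the kubectl amc plugin and the low-priority name from Step 2:

kubectl amc get priorityclass low-priority -m ${member clusterid}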

Driver Pod not starting:

Check the driver Pod's events in the sub-cluster:

kubectl amc get pod -M
kubectl amc describe pod spark-pi-driver -m ${member clusterid}

Operator not running in sub-cluster:

Confirm that ack-spark-operator is running in the sub-cluster:

kubectl amc get pod -n spark-operator -M

What's next