
Container Service for Kubernetes: Overview of Spark on ACK

Last Updated: Mar 26, 2026

Spark on Container Service for Kubernetes (ACK) lets you build an efficient, flexible, and scalable big data processing platform on Kubernetes without managing the underlying infrastructure. It extends the open-source Spark Operator with ACK-native capabilities: deep integration with Alibaba Cloud storage, observability, and elastic computing resources.

How it works

When you submit a Spark job through Spark Operator, the following happens:

  1. Spark Operator receives the SparkApplication resource and creates a driver pod in your target namespace.

  2. The driver pod creates executor pods, connects to them, and executes application code.

  3. When the job completes, executor pods are cleaned up.

  4. Spark Operator manages the entire lifecycle — configuration, submission, and retry — so you don't need to interact with spark-submit directly.

Key features

Simplified development and operations

  • Portability: Package Spark applications and their dependencies into container images for easy migration between Kubernetes clusters.

  • Observability: Monitor job status through Spark History Server. Collect and analyze logs with Simple Log Service and metrics with Managed Service for Prometheus.

  • Workflow orchestration: Use Apache Airflow or Argo Workflows to manage Spark jobs, automate data pipeline scheduling, and ensure consistent deployments across environments.

  • Multi-version support: Run multiple versions of Spark jobs concurrently in a single ACK cluster.

Job scheduling and resource management

  • Job queue management: Integrated with ack-kube-queue for flexible job queue and resource quota management.

  • Multiple scheduling strategies: Supports Gang Scheduling and Capacity Scheduling through the ACK scheduler.

  • Multi-architecture scheduling: Supports hybrid use of x86 and Arm Elastic Compute Service (ECS) resources.

  • Multi-cluster scheduling: Use ACK One to distribute Spark jobs across multiple clusters.

  • Elastic computing: Combines node autoscaling and instant elasticity with Elastic Container Instance (ECI) and Container Compute Service (ACS) resources for on-demand scaling without maintaining ECS instances.

  • Workload colocation: Integrated with ack-koordinator to colocate multiple workload types and improve cluster resource utilization.
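As a hedged sketch of one of these strategies: with the ACK scheduler, gang scheduling is typically requested by attaching coscheduling labels to the job's pods so the driver and executors are scheduled all-or-nothing. The label keys below follow the common pod-group convention and are an assumption; verify them against the ACK scheduler documentation for your cluster version.

```yaml
# Hypothetical SparkApplication fragment: executor labels requesting
# gang scheduling (assumed pod-group label keys).
executor:
  instances: 4
  labels:
    pod-group.scheduling.sigs.k8s.io/name: spark-pagerank   # Pod group name (assumed key)
    pod-group.scheduling.sigs.k8s.io/min-available: "4"     # Schedule only when all 4 executors fit
```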

Performance and stability

  • Shuffle performance: Use Apache Celeborn as the Remote Shuffle Service (RSS) to achieve storage-compute separation and reduce out-of-memory (OOM) errors and fetch failures.

  • Data access acceleration: Use Fluid's distributed cache to speed up data access for Spark jobs reading from OSS or remote storage.

Architecture

Component | Role
Client | Submit jobs via kubectl or Arena
Workflow | Orchestrate jobs via Apache Airflow or Argo Workflows
Spark Operator | Automate job lifecycle management (configuration, submission, retry)
Observability | Monitor job status and collect logs and metrics via Spark History Server, Simple Log Service, and Managed Service for Prometheus
Remote Shuffle Service (RSS) | Improve shuffle performance and stability via Apache Celeborn
Cache | Accelerate data access via Fluid
Cloud infrastructure | ECS instances, elastic container instances, ACS clusters, Object Storage Service (OSS), File Storage NAS (NAS), disks, elastic network interfaces (ENIs), virtual private clouds (VPCs), and Server Load Balancer (SLB) instances

Billing

Installing Spark-related ACK components (ack-spark-operator, ack-spark-history-server, and others) is free. Standard ACK cluster fees — cluster management fees and associated cloud resource fees — apply. For details, see Billing overview.

Additional fees from other cloud products may apply. For example, Simple Log Service charges for log collection, and OSS or NAS charges apply for data read and write operations by Spark jobs.

Getting started

Running Spark jobs on ACK follows a layered setup: start with the basics, add observability, then tune for performance.


Prerequisites

Before you begin, make sure you have:

  • A running ACK cluster with kubectl access configured

  • Sufficient permissions to create and manage pods in your cluster. Run the following command to verify:

    kubectl auth can-i create pods
  • A dedicated namespace for Spark jobs (this guide uses spark):

    kubectl create namespace spark
  • A service account for Spark driver pods. The driver pod requires permissions to create, list, and delete executor pods and services. Ensure the service account (for example, spark-operator-spark) has the appropriate RBAC permissions in your job namespace before submitting jobs.
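If you need to create that service account and its permissions yourself, the following is a minimal sketch. The role and binding names are illustrative, and the ack-spark-operator component can also create these objects for you; adjust the rules to your operator version.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spark-operator-spark
  namespace: spark
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: spark-driver-role          # Illustrative name
  namespace: spark
rules:
  # The driver creates and cleans up executor pods plus its own
  # services and configmaps in the job namespace.
  - apiGroups: [""]
    resources: ["pods", "services", "configmaps", "persistentvolumeclaims"]
    verbs: ["create", "get", "list", "watch", "delete", "deletecollection"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: spark-driver-rolebinding   # Illustrative name
  namespace: spark
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: spark-driver-role
subjects:
  - kind: ServiceAccount
    name: spark-operator-spark
    namespace: spark
```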

Basic usage

Step 1: Build a Spark container image

Use the open-source Spark image directly, or customize it to add dependencies such as OSS support or Celeborn RSS. The following Dockerfile adds the common dependencies used in this guide.


ARG SPARK_IMAGE=spark:3.5.4

FROM ${SPARK_IMAGE}

# Add dependencies for Hadoop Aliyun OSS support.
# Note the trailing slash on the destination: without it, ADD with a URL
# source writes the download to a file literally named "jars".
ADD --chown=spark:spark --chmod=644 https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aliyun/3.3.4/hadoop-aliyun-3.3.4.jar ${SPARK_HOME}/jars/
ADD --chown=spark:spark --chmod=644 https://repo1.maven.org/maven2/com/aliyun/oss/aliyun-sdk-oss/3.17.4/aliyun-sdk-oss-3.17.4.jar ${SPARK_HOME}/jars/
ADD --chown=spark:spark --chmod=644 https://repo1.maven.org/maven2/org/jdom/jdom2/2.0.6.1/jdom2-2.0.6.1.jar ${SPARK_HOME}/jars/

# Add dependency for log4j-layout-template-json
ADD --chown=spark:spark --chmod=644 https://repo1.maven.org/maven2/org/apache/logging/log4j/log4j-layout-template-json/2.24.1/log4j-layout-template-json-2.24.1.jar ${SPARK_HOME}/jars/

# Add dependency for Celeborn
ADD --chown=spark:spark --chmod=644 https://repo1.maven.org/maven2/org/apache/celeborn/celeborn-client-spark-3-shaded_2.12/0.5.3/celeborn-client-spark-3-shaded_2.12-0.5.3.jar ${SPARK_HOME}/jars/

Build the image and push it to your image repository, then reference it in your SparkApplication resources.

Step 2: Deploy Spark Operator and run your first job

Deploy the ack-spark-operator component and set spark.jobNamespaces=["spark"] so it only watches jobs in the spark namespace.
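As a sketch, if you deploy the component with Helm-style values, the fragment for the setting above (key path taken from the spark.jobNamespaces setting named in this step) would look like:

```yaml
# values fragment for ack-spark-operator:
# restrict the operator to the spark namespace so it ignores
# SparkApplication resources created elsewhere.
spark:
  jobNamespaces:
    - spark
```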

The following is a minimal SparkApplication that runs the SparkPi example — enough to verify that Spark Operator is working:


apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: spark  # Must be in the namespace list specified by spark.jobNamespaces
spec:
  type: Scala
  mode: cluster
  # Replace <SPARK_IMAGE> with your own Spark container image
  image: <SPARK_IMAGE>
  imagePullPolicy: IfNotPresent
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.5.4.jar
  arguments:
    - "5000"
  sparkVersion: 3.5.4
  driver:
    cores: 1
    coreLimit: 1200m
    memory: 512m
    template:
      spec:
        containers:
          - name: spark-kubernetes-driver
        serviceAccount: spark-operator-spark
  executor:
    instances: 1
    cores: 1
    coreLimit: 1200m
    memory: 512m
    template:
      spec:
        containers:
          - name: spark-kubernetes-executor
  restartPolicy:
    type: Never
Setting restartPolicy.type to Never is appropriate for batch jobs that should not retry on failure. For production pipelines that require automatic retries, set the type to OnFailure and configure onFailureRetries and onFailureRetryInterval.

For more information, see Use Spark Operator to run Spark jobs.

Step 3: Read and write OSS data

Spark jobs can access OSS using Hadoop Aliyun SDK, Hadoop AWS SDK, or JindoSDK. Include the corresponding dependencies in your container image and configure the Hadoop parameters in the job.
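The sample below reads credentials from a Secret named spark-oss-secret. A sketch of that Secret follows; the variable names are those read by the EnvironmentVariableCredentialsProvider configured in the job, and the key values are placeholders.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: spark-oss-secret
  namespace: spark
type: Opaque
stringData:
  # Environment variables expected by EnvironmentVariableCredentialsProvider
  OSS_ACCESS_KEY_ID: <YOUR_ACCESS_KEY_ID>
  OSS_ACCESS_KEY_SECRET: <YOUR_ACCESS_KEY_SECRET>
```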


This example runs SparkPageRank and reads input data from OSS. Upload your test dataset to OSS first — see Read and write OSS data in Spark jobs for instructions.

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pagerank
  namespace: spark
spec:
  type: Scala
  mode: cluster
  # Replace <SPARK_IMAGE> with your own Spark image
  image: <SPARK_IMAGE>
  imagePullPolicy: IfNotPresent
  mainClass: org.apache.spark.examples.SparkPageRank
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.5.4.jar
  arguments:
    # Replace <OSS_BUCKET> with your OSS bucket name
    - oss://<OSS_BUCKET>/data/pagerank_dataset.txt
    # Number of iterations
    - "10"
  sparkVersion: 3.5.4
  hadoopConf:
    fs.oss.impl: org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem
    # Replace <OSS_ENDPOINT> with the OSS endpoint, for example oss-cn-beijing-internal.aliyuncs.com
    fs.oss.endpoint: <OSS_ENDPOINT>
    fs.oss.credentials.provider: com.aliyun.oss.common.auth.EnvironmentVariableCredentialsProvider
  driver:
    cores: 1
    coreLimit: "1"
    memory: 4g
    template:
      spec:
        containers:
          - name: spark-kubernetes-driver
            envFrom:
              # Read OSS credentials from a Kubernetes Secret
              - secretRef:
                  name: spark-oss-secret
        serviceAccount: spark-operator-spark
  executor:
    instances: 2
    cores: 1
    coreLimit: "1"
    memory: 4g
    template:
      spec:
        containers:
          - name: spark-kubernetes-executor
            envFrom:
              - secretRef:
                  name: spark-oss-secret
  restartPolicy:
    type: Never

For more information, see Read and write OSS data in Spark jobs.

Observability

Deploy Spark History Server

Deploy ack-spark-history-server in the spark namespace. It reads Spark event logs from a configured storage backend (PVC, OSS/OSS-HDFS, or HDFS) and exposes them through a web UI.
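The examples in this section mount a PersistentVolumeClaim named nas-pvc. As a sketch under stated assumptions, a statically provisioned NAS volume using the ACK CSI driver might look like the following; the mount target and capacity are placeholders:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nas-pv
spec:
  capacity:
    storage: 100Gi
  accessModes: ["ReadWriteMany"]
  csi:
    driver: nasplugin.csi.alibabacloud.com
    volumeHandle: nas-pv
    volumeAttributes:
      # Placeholder NAS mount target address
      server: <NAS_MOUNT_TARGET>.nas.aliyuncs.com
      path: /
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nas-pvc
  namespace: spark
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: ""      # Bind to the statically provisioned PV above
  resources:
    requests:
      storage: 100Gi
```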

The following example configures Spark History Server to read event logs from a NAS file system at /spark/event-logs:


# Spark configuration
sparkConf:
  spark.history.fs.logDirectory: file:///mnt/nas/spark/event-logs

# Environment variables
env:
  - name: SPARK_DAEMON_MEMORY
    value: 7g

# Data volume
volumes:
  - name: nas
    persistentVolumeClaim:
      claimName: nas-pvc

# Data volume mount
volumeMounts:
  - name: nas
    subPath: spark/event-logs
    mountPath: /mnt/nas/spark/event-logs

# Adjust resource size based on the number and scale of Spark jobs
resources:
  requests:
    cpu: 2
    memory: 8Gi
  limits:
    cpu: 2
    memory: 8Gi

Mount the same NAS file system in your Spark jobs and configure spark.eventLog.dir to write event logs to the same path. The following example shows a complete job with event logging enabled:


apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: spark
spec:
  type: Scala
  mode: cluster
  # Replace <SPARK_IMAGE> with your Spark image
  image: <SPARK_IMAGE>
  imagePullPolicy: IfNotPresent
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.5.4.jar
  arguments:
    - "5000"
  sparkVersion: 3.5.4
  sparkConf:
    spark.eventLog.enabled: "true"
    spark.eventLog.dir: file:///mnt/nas/spark/event-logs
  driver:
    cores: 1
    coreLimit: 1200m
    memory: 512m
    template:
      spec:
        containers:
          - name: spark-kubernetes-driver
            volumeMounts:
              - name: nas
                subPath: spark/event-logs
                mountPath: /mnt/nas/spark/event-logs
        volumes:
          - name: nas
            persistentVolumeClaim:
              claimName: nas-pvc
        serviceAccount: spark-operator-spark
  executor:
    instances: 1
    cores: 1
    coreLimit: 1200m
    memory: 512m
    template:
      spec:
        containers:
          - name: spark-kubernetes-executor
  restartPolicy:
    type: Never

For more information, see Use Spark History Server to view information about Spark jobs.

Collect Spark logs with Simple Log Service

When running many jobs in a cluster, use Simple Log Service to centrally collect stdout and stderr logs from all Spark containers for querying and analysis.


This example configures Simple Log Service to collect logs from /opt/spark/logs/*.log in Spark containers.

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: spark
spec:
  type: Scala
  mode: cluster
  # Replace <SPARK_IMAGE> with the Spark image built in Step 1
  image: <SPARK_IMAGE>
  imagePullPolicy: IfNotPresent
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.5.4.jar
  arguments:
    - "5000"
  sparkVersion: 3.5.4
  # Read log4j2.properties from the specified ConfigMap
  sparkConfigMap: spark-log-conf
  sparkConf:
    spark.eventLog.enabled: "true"
    spark.eventLog.dir: file:///mnt/nas/spark/event-logs
  driver:
    cores: 1
    coreLimit: 1200m
    memory: 512m
    template:
      spec:
        containers:
          - name: spark-kubernetes-driver
            volumeMounts:
              - name: nas
                subPath: spark/event-logs
                mountPath: /mnt/nas/spark/event-logs
        serviceAccount: spark-operator-spark
        volumes:
          - name: nas
            persistentVolumeClaim:
              claimName: nas-pvc
  executor:
    instances: 1
    cores: 1
    coreLimit: 1200m
    memory: 512m
    template:
      spec:
        containers:
          - name: spark-kubernetes-executor
  restartPolicy:
    type: Never

For more information, see Use Simple Log Service to collect the logs of Spark jobs.
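One way to wire up the collection side is an AliyunLogConfig resource. The following is a sketch assuming the Logtail components of Simple Log Service are installed in the cluster; the logstore name is a placeholder, and the field layout should be checked against the Simple Log Service CRD documentation for your component version.

```yaml
apiVersion: log.alibabacloud.com/v1alpha1
kind: AliyunLogConfig
metadata:
  name: spark-driver-log
  namespace: spark
spec:
  # Placeholder logstore name
  logstore: spark-logs
  logtailConfig:
    inputType: file
    configName: spark-driver-log
    inputDetail:
      logType: common_reg_log
      # Collect the rolling log files written inside Spark containers
      logPath: /opt/spark/logs
      filePattern: "*.log"
      # Read files from containers rather than the host path
      dockerFile: true
```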

Performance optimization

Improve shuffle performance with RSS

Shuffle operations involve significant disk I/O, data serialization, and network I/O — common sources of OOM errors and fetch failures in large-scale jobs. Configure Apache Celeborn as the Remote Shuffle Service to achieve storage-compute separation and improve shuffle stability.

Deploy the ack-celeborn component first, then reference it in your job configuration. All examples use spark.shuffle.manager: org.apache.spark.shuffle.celeborn.SparkShuffleManager and spark.celeborn.master.endpoints pointing to the Celeborn master pods.


apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pagerank
  namespace: spark
spec:
  type: Scala
  mode: cluster
  # Replace <SPARK_IMAGE> with your Spark image
  image: <SPARK_IMAGE>
  imagePullPolicy: IfNotPresent
  mainClass: org.apache.spark.examples.SparkPageRank
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.5.4.jar
  arguments:
    - oss://<OSS_BUCKET>/data/pagerank_dataset.txt
    - "10"
  sparkVersion: 3.5.4
  hadoopConf:
    fs.oss.impl: org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem
    fs.oss.endpoint: <OSS_ENDPOINT>
    fs.oss.credentials.provider: com.aliyun.oss.common.auth.EnvironmentVariableCredentialsProvider
  sparkConfigMap: spark-log-conf
  sparkConf:
    spark.eventLog.enabled: "true"
    spark.eventLog.dir: file:///mnt/nas/spark/event-logs

    # Celeborn RSS configuration
    spark.shuffle.manager: org.apache.spark.shuffle.celeborn.SparkShuffleManager
    # KryoSerializer is required because Java serializer does not support relocation
    spark.serializer: org.apache.spark.serializer.KryoSerializer
    # Configure based on the number of Celeborn master replicas
    spark.celeborn.master.endpoints: celeborn-master-0.celeborn-master-svc.celeborn.svc.cluster.local,celeborn-master-1.celeborn-master-svc.celeborn.svc.cluster.local,celeborn-master-2.celeborn-master-svc.celeborn.svc.cluster.local
    spark.celeborn.client.spark.shuffle.writer: hash
    spark.celeborn.client.push.replicate.enabled: "false"
    spark.sql.adaptive.localShuffleReader.enabled: "false"
    spark.sql.adaptive.enabled: "true"
    spark.sql.adaptive.skewJoin.enabled: "true"
    spark.shuffle.sort.io.plugin.class: org.apache.spark.shuffle.celeborn.CelebornShuffleDataIO
    spark.dynamicAllocation.shuffleTracking.enabled: "false"
    spark.executor.userClassPathFirst: "false"
  driver:
    cores: 1
    coreLimit: "1"
    memory: 4g
    template:
      spec:
        containers:
          - name: spark-kubernetes-driver
            envFrom:
              - secretRef:
                  name: spark-oss-secret
            volumeMounts:
              - name: nas
                subPath: spark/event-logs
                mountPath: /mnt/nas/spark/event-logs
        volumes:
          - name: nas
            persistentVolumeClaim:
              claimName: nas-pvc
        serviceAccount: spark-operator-spark
  executor:
    instances: 2
    cores: 1
    coreLimit: "1"
    memory: 4g
    template:
      spec:
        containers:
          - name: spark-kubernetes-executor
            envFrom:
              - secretRef:
                  name: spark-oss-secret
  restartPolicy:
    type: Never

For more information, see Use Celeborn as RSS in Spark jobs.

Define elastic resource scheduling priority

Use ECI-based pods with a ResourcePolicy to run Spark jobs on demand and pay only for actual resource usage. The ACK scheduler automatically assigns pods to ECS or ECI resources based on the configured strategy — no changes to the SparkApplication spec are required.


This example prioritizes ECS resources (up to 10 pods) and falls back to elastic container instances (up to 10 pods) when ECS capacity is insufficient:

apiVersion: scheduling.alibabacloud.com/v1alpha1
kind: ResourcePolicy
metadata:
  name: spark
  namespace: spark
spec:
  # Apply this strategy to pods launched by Spark Operator
  selector:
    sparkoperator.k8s.io/launched-by-spark-operator: "true"
  strategy: prefer
  units:
    # First: use ECS resources, up to 10 pods
    - resource: ecs
      max: 10
      podLabels:
        k8s.aliyun.com/resource-policy-wait-for-ecs-scaling: "true"
      nodeSelector:
        node.alibabacloud.com/instance-charge-type: PostPaid
    # Second: use ECI resources, up to 10 pods
    - resource: eci
      max: 10
  ignorePreviousPod: false
  ignoreTerminatingPod: true
  preemptPolicy: AfterAllUnits
  whenTryNextUnits:
    policy: TimeoutOrExceedMax
    # Wait up to 30 seconds for ECS autoscaling before falling back to ECI
    timeout: 30s

For more information, see Use elastic container instances to run Spark jobs.

Configure Dynamic Resource Allocation

Dynamic Resource Allocation (DRA) adjusts executor count based on workload size, preventing both resource starvation and waste. The following example configures DRA together with Celeborn RSS:


apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pagerank
  namespace: spark
spec:
  type: Scala
  mode: cluster
  # Replace <SPARK_IMAGE> with your Spark image
  image: <SPARK_IMAGE>
  imagePullPolicy: IfNotPresent
  mainClass: org.apache.spark.examples.SparkPageRank
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.5.4.jar
  arguments:
    - oss://<OSS_BUCKET>/data/pagerank_dataset.txt
    - "10"
  sparkVersion: 3.5.4
  hadoopConf:
    fs.oss.impl: org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem
    fs.oss.endpoint: <OSS_ENDPOINT>
    fs.oss.credentials.provider: com.aliyun.oss.common.auth.EnvironmentVariableCredentialsProvider
  sparkConfigMap: spark-log-conf
  sparkConf:
    # ====================
    # Event log
    # ====================
    spark.eventLog.enabled: "true"
    spark.eventLog.dir: file:///mnt/nas/spark/event-logs

    # ====================
    # Celeborn
    # Ref: https://github.com/apache/celeborn/blob/main/README.md#spark-configuration
    # ====================
    # Shuffle manager class name changed in 0.3.0:
    # before 0.3.0: `org.apache.spark.shuffle.celeborn.RssShuffleManager`
    # since 0.3.0: `org.apache.spark.shuffle.celeborn.SparkShuffleManager`
    spark.shuffle.manager: org.apache.spark.shuffle.celeborn.SparkShuffleManager
    # Must use KryoSerializer because Java serializer does not support relocation
    spark.serializer: org.apache.spark.serializer.KryoSerializer
    # Configure based on the number of Celeborn master replicas
    spark.celeborn.master.endpoints: celeborn-master-0.celeborn-master-svc.celeborn.svc.cluster.local,celeborn-master-1.celeborn-master-svc.celeborn.svc.cluster.local,celeborn-master-2.celeborn-master-svc.celeborn.svc.cluster.local
    # options: hash, sort
    # Hash shuffle writer uses (partition count) * (celeborn.push.buffer.max.size) * (spark.executor.cores) memory.
    # Sort shuffle writer uses less memory — use it when partition count is large.
    spark.celeborn.client.spark.shuffle.writer: hash
    # Enable server-side data replication if you have more than one worker
    # If your Celeborn is using HDFS, set this to false
    spark.celeborn.client.push.replicate.enabled: "false"
    spark.sql.adaptive.localShuffleReader.enabled: "false"
    spark.sql.adaptive.enabled: "true"
    spark.sql.adaptive.skewJoin.enabled: "true"
    # Required for Spark >= 3.5.0 to support dynamic resource allocation with Celeborn
    spark.shuffle.sort.io.plugin.class: org.apache.spark.shuffle.celeborn.CelebornShuffleDataIO
    spark.executor.userClassPathFirst: "false"

    # ====================
    # Dynamic resource allocation
    # Ref: https://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation
    # ====================
    spark.dynamicAllocation.enabled: "true"
    # Disable shuffle tracking when using Celeborn as RSS (Spark >= 3.4.0)
    spark.dynamicAllocation.shuffleTracking.enabled: "false"
    spark.dynamicAllocation.initialExecutors: "3"
    spark.dynamicAllocation.minExecutors: "0"
    spark.dynamicAllocation.maxExecutors: "10"
    # Release idle executors after 60 seconds
    spark.dynamicAllocation.executorIdleTimeout: 60s
    # Release executors that have cached data blocks after the specified timeout (default: infinity)
    # spark.dynamicAllocation.cachedExecutorIdleTimeout:
    # Request additional executors when scheduling backlog exceeds 1 second
    spark.dynamicAllocation.schedulerBacklogTimeout: 1s
    spark.dynamicAllocation.sustainedSchedulerBacklogTimeout: 1s
  driver:
    cores: 1
    coreLimit: "1"
    memory: 4g
    template:
      spec:
        containers:
          - name: spark-kubernetes-driver
            envFrom:
              - secretRef:
                  name: spark-oss-secret
            volumeMounts:
              - name: nas
                subPath: spark/event-logs
                mountPath: /mnt/nas/spark/event-logs
        volumes:
          - name: nas
            persistentVolumeClaim:
              claimName: nas-pvc
        serviceAccount: spark-operator-spark
  executor:
    cores: 1
    coreLimit: "1"
    memory: 4g
    template:
      spec:
        containers:
          - name: spark-kubernetes-executor
            envFrom:
              - secretRef:
                  name: spark-oss-secret
  restartPolicy:
    type: Never

For more information, see Configure dynamic resource allocation for Spark jobs.

Use Fluid to accelerate data access

If your data is in a remote data center or you're hitting data access bottlenecks, use Fluid's distributed cache to accelerate reads for Spark jobs.
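A sketch of the Fluid side follows, assuming the Fluid component with JindoRuntime support is installed; the bucket, endpoint, replica count, and cache sizing are placeholders to adapt to your data volume:

```yaml
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: spark-data
  namespace: spark
spec:
  mounts:
    # Placeholder OSS bucket holding the input data
    - mountPoint: oss://<OSS_BUCKET>/data
      name: data
      options:
        fs.oss.endpoint: <OSS_ENDPOINT>
---
apiVersion: data.fluid.io/v1alpha1
kind: JindoRuntime
metadata:
  name: spark-data        # Must match the Dataset name to bind them
  namespace: spark
spec:
  replicas: 2
  tieredstore:
    levels:
      # Cache hot data in memory on the cache workers
      - mediumtype: MEM
        quota: 2Gi
```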

For more information, see Use Fluid to accelerate data access for Spark applications.
