E-MapReduce:Submit a Spark job

Last Updated: Mar 26, 2026

EMR on ACK supports three ways to submit a Spark job: using a Custom Resource Definition (CRD), running the spark-submit command via the emr-spark-ack tool, or using the EMR console.

Prerequisites

Before you begin, make sure you have:

  • A Spark cluster created on the EMR on ACK page. For details, see Create a cluster.

Usage notes

The examples use a JAR file packaged directly into the Spark image (local:///opt/spark/examples/spark-examples.jar). To use your own JAR file, upload it to Object Storage Service (OSS) and replace the path with the OSS path in the oss://<yourBucketName>/<path>.jar format. For details, see Simple upload.
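For instance, uploading a JAR with the ossutil CLI might look like the following sketch. The bucket name, object key, and file name here are placeholders, and ossutil must already be configured with your credentials:

```shell
# Build the OSS path in the documented oss://<yourBucketName>/<path>.jar format
# (placeholder values; substitute your own bucket and object key).
BUCKET="my-bucket"
JAR_KEY="jars/spark-examples.jar"
OSS_PATH="oss://${BUCKET}/${JAR_KEY}"
echo "${OSS_PATH}"    # prints oss://my-bucket/jars/spark-examples.jar

# Upload with ossutil (requires configured AccessKey credentials):
# ossutil cp ./spark-examples.jar "${OSS_PATH}"
```

The resulting path is what you substitute for the local: path in the examples below.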

Choose a submission method

  • CRD: Declarative, Kubernetes-native job management. Define the full job spec in a YAML file and let the Spark operator handle the lifecycle.
  • spark-submit (emr-spark-ack): Familiar spark-submit syntax with support for cluster mode, interactive spark-shell, and automatic upload of local file dependencies (Spark 3 or later, EMR V5.X).
  • EMR console: Quick interactive queries or one-off jobs without leaving the browser.

Method 1: Submit a Spark job using a CRD

  1. Connect to the ACK cluster using kubectl. For details, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.

  2. Create a file named spark-pi.yaml with the following content:

    apiVersion: "sparkoperator.k8s.io/v1beta2"
    kind: SparkApplication
    metadata:
      name: spark-pi-simple
    spec:
      type: Scala
      sparkVersion: 3.2.1
      mainClass: org.apache.spark.examples.SparkPi
      mainApplicationFile: "local:///opt/spark/examples/spark-examples.jar"
      arguments:
        - "1000"
      driver:
        cores: 1
        coreLimit: 1000m
        memory: 4g
      executor:
        cores: 1
        coreLimit: 1000m
        memory: 8g
        memoryOverhead: 1g
        instances: 1
    The example uses Spark 3.2.1 for EMR V5.6.0. Adjust sparkVersion to match your EMR version. For all supported fields, see the spark-on-k8s-operator API reference.
  3. Submit the job:

    kubectl apply -f spark-pi.yaml --namespace <your-namespace>

    Replace <your-namespace> with the namespace where your cluster resides. To find the namespace, go to the Cluster Details tab in the EMR console. The expected output is:

    sparkapplication.sparkoperator.k8s.io/spark-pi-simple created
  4. (Optional) View the submitted job on the Job Details tab in the EMR console.
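The application can also be inspected directly with kubectl. A sketch, assuming the spark-pi-simple name from the YAML above and the Spark operator's default naming, in which the driver pod is called <application-name>-driver:

```shell
# Application state as tracked by the Spark operator
kubectl get sparkapplication spark-pi-simple --namespace <your-namespace>

# Full status, including lifecycle events
kubectl describe sparkapplication spark-pi-simple --namespace <your-namespace>

# Driver logs (the operator names the driver pod <application-name>-driver by default)
kubectl logs spark-pi-simple-driver --namespace <your-namespace>
```

These commands require a live connection to the ACK cluster, so their output depends on the job's current state.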

Method 2: Submit a Spark job using spark-submit

The emr-spark-ack tool wraps the standard Spark CLI for ACK clusters. It supports cluster mode, client mode (spark-sql, spark-shell, pyspark), and automatic upload of local file dependencies in Spark 3 or later for EMR V5.X.

  1. Connect to the ACK cluster using kubectl. For details, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.

  2. Download the emr-spark-ack tool and make it executable:

    wget https://ecm-repo-cn-hangzhou.oss-cn-hangzhou.aliyuncs.com/emr-on-ack/util/emr-spark-ack
    chmod 755 emr-spark-ack
  3. Submit a Spark job. The general syntax is:

    ./emr-spark-ack -n <your-namespace> <spark-command>

    <spark-command> can be spark-submit, spark-sql, spark-shell, or pyspark.

    Cluster mode — spark-submit:

    ./emr-spark-ack -n <your-namespace> spark-submit \
        --name spark-pi-submit \
        --deploy-mode cluster \
        --class org.apache.spark.examples.SparkPi \
        local:///opt/spark/examples/spark-examples.jar \
        1000

    Client mode — spark-sql:

    # Prepare a local SQL file
    echo "select 1+1" > test.sql
    # Submit the job
    ./emr-spark-ack -n <your-namespace> spark-sql -f test.sql

    The tool automatically uploads test.sql (and any other local files specified via --jars, --files, or -f) to the Spark cluster before submitting.


    Client mode — spark-shell:

    ./emr-spark-ack -n <your-namespace> spark-shell


  4. (Optional) View the submitted job on the Job Details tab in the EMR console.

  5. (Optional) To stop a running job, use the kill subcommand with the Spark application ID shown in the submission output:

    ./emr-spark-ack -n <your-namespace> kill <Spark_app_id>
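Building on the automatic-upload behavior described in step 3, a cluster-mode submission that mixes local dependencies might be sketched as follows. The class name and file names are hypothetical:

```shell
# Local files passed via --jars and --files are uploaded to the
# Spark cluster automatically before the job starts
# (Spark 3 or later, EMR V5.X).
./emr-spark-ack -n <your-namespace> spark-submit \
    --name spark-etl-job \
    --deploy-mode cluster \
    --class com.example.EtlJob \
    --jars ./local-dep.jar \
    --files ./job.properties \
    ./local-etl-job.jar
```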

Method 3: Submit a Spark job in the EMR console

  1. In the EMR console, click EMR on ACK in the left-side navigation pane.

  2. On the EMR on ACK page, click the cluster name in the Cluster ID/Name column.

  3. Click the Access Links and Ports tab, then click the link in the SparkSubmitGateway UI row. A browser-based Shell terminal opens.

  4. In the Shell terminal, run one of the following commands.

    Interactive query with spark-sql:

    spark-sql

    Batch job with spark-submit:

    spark-submit \
        --name spark-pi-submit \
        --deploy-mode cluster \
        --class org.apache.spark.examples.SparkPi \
        local:///opt/spark/examples/spark-examples.jar \
        1000


  5. (Optional) View the submitted job on the Job Details tab.

What's next