E-MapReduce: Submit a job to the master node

Last Updated: Jun 21, 2026

In a Hadoop cluster, the master node manages the entire cluster and handles job submission, monitoring, and termination. To run a job on a Hadoop cluster, submit it to the master node.

Prerequisites

  • You have created a cluster in EMR on ECS. For details, see Create a cluster.

  • Your local server can connect to the cluster's master node. To allow this, turn on the public network switch when you create the cluster. Alternatively, after the cluster is created, assign a static public IP address or an Elastic IP Address (EIP) to the master node's ECS instance in the ECS console. For details, see Elastic IP Address (EIP).

  • Port 22 is open in the cluster's security group.
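The connectivity prerequisites above can be checked from the local server before you log on. The following is a minimal sketch, assuming that nc (netcat) is installed locally. The function name check_port and the variables MASTER_IP and SSH_USER are hypothetical placeholders, not part of EMR.

```shell
# Sketch: confirm that the master node accepts TCP connections on the SSH port.
# Assumptions: nc (netcat) is installed on the local server; MASTER_IP and
# SSH_USER are placeholders that you set yourself.

# check_port HOST [PORT]
# Returns 0 if the port is reachable, 1 if it is not, 2 if HOST is missing.
check_port() {
  host="$1"
  port="${2:-22}"
  if [ -z "$host" ]; then
    echo "usage: check_port HOST [PORT]" >&2
    return 2
  fi
  # -z: only test the connection, send no data; -w 5: give up after 5 seconds.
  if nc -z -w 5 "$host" "$port" >/dev/null 2>&1; then
    echo "port $port on $host is reachable"
  else
    echo "port $port on $host is not reachable; check the security group and the public IP address" >&2
    return 1
  fi
}

# Example (set the placeholders before running):
# check_port "$MASTER_IP" 22 && ssh "$SSH_USER@$MASTER_IP"
```

If the check fails, the usual causes are a security group that does not allow port 22 or a master node without a public IP address.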

Procedure

  1. Log on to the cluster's master node over SSH. For details, see Log on to a cluster.

  2. After you connect to the node over SSH, run the following command to submit and run a job. This example is for Spark 3.1.1.

    spark-submit --class org.apache.spark.examples.SparkPi \
      --master yarn --deploy-mode client \
      --driver-memory 512m --num-executors 1 \
      --executor-memory 1g --executor-cores 2 \
      /opt/apps/SPARK3/spark-current/examples/jars/spark-examples_2.12-3.1.1.jar 10

    Note: spark-examples_2.12-3.1.1.jar is the example JAR file in your cluster. To locate it, log on to the cluster and look in the /opt/apps/SPARK3/spark-current/examples/jars directory. For details, see Log on to a cluster.

  3. View job execution records. After you submit a job, you can view its execution records on the YARN UI.

    1. Open port 8443. For details, see Manage security groups.

    2. Add a user. For details, see OpenLDAP user management.

      You need a Knox username and password to access the YARN UI.

    3. On the EMR on ECS page, click Cluster Services in the target cluster's row.

    4. Click the Access Links and Ports tab.

    5. Click the public link in the YARN UI row.

      Log on with your Knox username and password.

    6. On the All Applications page, click the target job's ID to view its execution details.

      The top of the page displays Cluster Metrics (including Apps Submitted, Apps Running, Containers Running, Memory Used, and more) and Cluster Nodes Metrics. Below these metrics, a table lists applications with columns such as ID, User, Name, Application Type, Queue, StartTime, State, and FinalStatus. Use the State column to find your job in the list, then click its ID to view the details.
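Steps 2 and 3 can also be approximated from the command line on the master node. The following is a minimal sketch, not part of the documented procedure. It assumes that spark-submit, yarn, and awk are on the PATH and that the spark-submit output includes the YARN application ID at the configured log level. The function names and the default log file name are hypothetical helpers.

```shell
# Sketch: submit SparkPi from the master node and inspect the job with the yarn CLI.
# Assumptions: spark-submit, yarn, and awk are on the PATH; the function names and
# the log file name are hypothetical helpers, not part of EMR.

EXAMPLES_JAR=/opt/apps/SPARK3/spark-current/examples/jars/spark-examples_2.12-3.1.1.jar

# Read log text on stdin and print the first YARN application ID
# (format: application_<cluster timestamp>_<sequence number>).
extract_app_id() {
  awk '!found && match($0, /application_[0-9]+_[0-9]+/) {
         print substr($0, RSTART, RLENGTH); found = 1
       }'
}

# Submit SparkPi with the options used in step 2, keep a copy of the output
# in a log file, and print the application ID found in that log.
# Prints nothing on stdout if the output contains no application ID.
submit_spark_pi() {
  slices="${1:-10}"
  log_file="${2:-spark-pi.log}"
  spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn --deploy-mode client \
    --driver-memory 512m --num-executors 1 \
    --executor-memory 1g --executor-cores 2 \
    "$EXAMPLES_JAR" "$slices" 2>&1 | tee "$log_file" >&2
  extract_app_id < "$log_file"
}

# Print the YARN report (state, final status, tracking URL) for one application.
show_app_status() {
  yarn application -status "$1"
}
```

For example, app_id=$(submit_spark_pi 10) followed by show_app_status "$app_id" prints the report for the submitted job. Running yarn application -list -appStates ALL lists applications in every state, similar to the table on the All Applications page.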