In Hadoop, the cluster master node is responsible for managing the entire cluster, including job submission, monitoring, and termination. To execute a job on a Hadoop cluster, you need to submit the job through the master node.
Prerequisites
A cluster is created in EMR on ECS. For more information, see Create a cluster.
Your on-premises server can connect to the master node of the cluster. To make the master node reachable, turn on the Public Network switch when you create the cluster, or assign a public IP address to the master node in the ECS console after the cluster is created. The master node ECS instance can use either a static public IP address or an Elastic IP Address (EIP). For more information, see Elastic IP Address.
Port 22 is open in the security group to which your cluster belongs.
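The connectivity prerequisites can be checked from your on-premises server before you try to log on. The following is a minimal sketch, assuming a Linux machine with bash and the coreutils timeout command available; check_port is a hypothetical helper name, not part of EMR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of EMR): test whether a TCP port is reachable.
# Returns 0 if the connection succeeds, 2 if an argument is missing,
# and another non-zero code if the connection fails or times out.
check_port() {
  local host="$1" port="$2"
  if [ -z "$host" ] || [ -z "$port" ]; then
    echo "usage: check_port <host> <port>" >&2
    return 2
  fi
  # bash opens a TCP connection when a /dev/tcp/<host>/<port> path is redirected.
  # timeout stops the attempt after 5 seconds if the port is silently filtered.
  timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}
```

For example, check_port <master-public-ip> 22 && ssh root@<master-public-ip> attempts the SSH logon only if port 22 is reachable. The root user name here is an assumption; use the account configured for your cluster. A failed check usually points to the security group rule or a missing public IP address.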
Procedure
Log on to the master node of the cluster using SSH. For more information, see Log on to a cluster.
After you connect to the node over SSH, run the following command to submit and run a job. This example uses Spark 3.1.1:
spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --driver-memory 512m --num-executors 1 --executor-memory 1g --executor-cores 2 /opt/apps/SPARK3/spark-current/examples/jars/spark-examples_2.12-3.1.1.jar 10
Note: spark-examples_2.12-3.1.1.jar is the name of the JAR package in your cluster. To confirm the name, log on to the cluster and check the path /opt/apps/SPARK3/spark-current/examples/jars.
View the job execution records. After you submit the job, you can view its execution records in the YARN web UI, as described in the following steps.
Enable port 8443. For more information, see Manage security groups.
Add a user. For more information, see OpenLDAP user management.
To access the YARN web UI using your Knox account, you must obtain the username and password of the Knox account.
On the EMR on ECS page, click Cluster Services in the row of the target cluster.
Click the Access Links and Ports tab.
Click the public link in the YARN UI row.
Log on with the user that you added to access the YARN web UI.
On the All Applications page, click the ID of the target job to view the details of the job.
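Because the JAR name depends on the Spark version in your cluster, the submission step can be scripted so that the file name is discovered instead of typed by hand. The following is a minimal sketch intended to run on the master node. It assumes the examples directory from the procedure above; find_examples_jar and build_submit_cmd are hypothetical helper names. The script only prints the command so that you can review it before running it.

```shell
#!/usr/bin/env bash
# Sketch: locate the Spark examples JAR and print the spark-submit command
# used in the procedure. The helper names are illustrative, not part of EMR.

# Directory taken from the procedure; override it if your cluster differs.
EXAMPLES_DIR="${EXAMPLES_DIR:-/opt/apps/SPARK3/spark-current/examples/jars}"

# Print the path of the first spark-examples JAR in the given directory.
# Returns 1 if no matching file exists.
find_examples_jar() {
  local dir="$1" jar
  for jar in "$dir"/spark-examples_*.jar; do
    if [ -f "$jar" ]; then
      printf '%s\n' "$jar"
      return 0
    fi
  done
  return 1
}

# Print the SparkPi submission command for the given JAR,
# with the same options as the command in the procedure.
build_submit_cmd() {
  printf 'spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --driver-memory 512m --num-executors 1 --executor-memory 1g --executor-cores 2 %s 10\n' "$1"
}

if jar="$(find_examples_jar "$EXAMPLES_DIR")"; then
  build_submit_cmd "$jar"
else
  echo "No spark-examples JAR found in $EXAMPLES_DIR" >&2
fi
```

As a command-line complement to the YARN web UI, running yarn application -list -appStates ALL on the master node lists applications together with their IDs, and yarn application -status <application ID> shows the details of one application.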
