Create a cluster, job, and execution plan

Last Updated: Apr 20, 2018

Note: Make sure you have completed all the prerequisites.

This tutorial introduces the roles that clusters, jobs, and execution plans play in E-MapReduce and shows how they are used. You will create a Spark Pi job, run it in a cluster, and then view the approximate value of Pi that it calculates on the console page.

  1. Create a cluster.
    1. On the left side of the console, select Cluster and click Create Cluster in the upper-right corner.
    2. Software configurations.
      1. Use the latest EMR product version.
      2. Use the default software configuration.
    3. Hardware configurations.
      1. Select Pay-As-You-Go.
      2. If there is no security group, click New and enter the security group name.
      3. Select 4 cores and 8 GB of memory for the master node.
      4. Select 4 cores and 8 GB of memory for the core node (one instance).
      5. Keep the other settings at their default values.
    4. Basic configurations.
      1. Enter the name of the cluster.
      2. Select the log path for saving job logs and make sure that the logging feature is on. Create an OSS bucket in the same region as the cluster.
      3. Enter the password.
    5. Create the cluster.
  2. Create a job.
    1. On the left side of the console, select Job and click Create Job in the upper-right corner.
    2. Enter the job name.
    3. Select Spark as the job type.
    4. Enter parameters as follows.
      1. --class org.apache.spark.examples.SparkPi --master yarn-client --driver-memory 512m --num-executors 1 --executor-memory 1g --executor-cores 2 /usr/lib/spark-current/examples/jars/spark-examples_2.11-2.1.1.jar 10
      Caution: The name of the JAR file in /usr/lib/spark-current/examples/jars/ depends on the Spark version in the cluster. For example, if the Spark version is 2.1.1, the file name is spark-examples_2.11-2.1.1.jar; if the Spark version is 2.2.0, the file name is spark-examples_2.11-2.2.0.jar.
    5. Keep the other settings at their default values and create the job.
  3. Create an execution plan.
    1. After the cluster is created successfully, its status in the list is shown as Idle.
    2. Select Execution Plan on the left side of the console and click Create Execution Plan in the upper-right corner.
    3. Select Existing Cluster, then choose the newly created cluster to associate it with the execution plan.
    4. Add the job created earlier to the queue.
    5. Enter the name of the execution plan.
    6. Choose Manual Execution, which is the default.
    7. Create the execution plan.
  4. Run the execution plan.
    1. On the execution plan list page, click Run Now.
  5. View job logs and confirm the results.
    1. Click Management to open the management page, and view the Running Log at the bottom of the page.
    2. Click the right side of the running log to view the job list.
    3. Click stdout to see the approximate value of Pi: 3.14xxxx.
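
The job parameters in step 2 differ between clusters only in the JAR file name, which follows the Spark version. The following Python sketch shows one way to assemble the parameter string for a given version. The helper name `spark_pi_parameters` is hypothetical, and the default Scala suffix `2.11` is an assumption taken from the two file names quoted in the caution above. The helper only formats text and does not check that the JAR file exists on the cluster.

```python
def spark_pi_parameters(spark_version: str, scala_version: str = "2.11") -> str:
    """Build the Spark Pi job parameters for a given Spark version.

    Hypothetical helper: scala_version defaults to "2.11" only because both
    file names quoted in this tutorial use that suffix.
    """
    jar = (
        "/usr/lib/spark-current/examples/jars/"
        f"spark-examples_{scala_version}-{spark_version}.jar"
    )
    return (
        "--class org.apache.spark.examples.SparkPi --master yarn-client "
        "--driver-memory 512m --num-executors 1 --executor-memory 1g "
        f"--executor-cores 2 {jar} 10"
    )


if __name__ == "__main__":
    # Print the parameter string for a cluster running Spark 2.2.0.
    print(spark_pi_parameters("2.2.0"))
```

Paste the printed string into the parameters field of the job. If the job fails because the JAR file cannot be found, check the actual file name in /usr/lib/spark-current/examples/jars/ on the cluster.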
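
The value shown in stdout is only approximate because the Spark Pi example estimates Pi by random sampling instead of computing it exactly. The following single-machine Python sketch illustrates that idea. It is not the code of the Spark example itself, and the function name `estimate_pi`, the sample count, and the seed are assumptions chosen for illustration.

```python
import random


def estimate_pi(samples: int, seed: int = 0) -> float:
    """Estimate Pi from random points in the square [-1, 1] x [-1, 1].

    The fraction of points that fall inside the unit circle approaches
    Pi / 4 as the number of samples grows.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same value
    inside = 0
    for _ in range(samples):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples


if __name__ == "__main__":
    print(f"Estimated Pi: {estimate_pi(200_000)}")
```

More samples usually give an estimate closer to Pi, but the digits after the first few still vary from run to run, which is why the tutorial shows the result as 3.14xxxx.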