Create a job

Last Updated: May 03, 2017

To run a computing task, first define a job by following these steps:

  1. Log on to the Job page of the Alibaba Cloud E-MapReduce console.

  2. Select the region where the job will be created.

  3. Click Create Job in the top right corner of the page.

  4. Enter the job name.

  5. Select the job type.

  6. Enter the application parameters of the job. This field must contain everything you would pass on the command line: the full path of the JAR package that the job runs, the data input and output addresses of the job, and any other command-line parameters. To use an OSS path, click “Select OSS Path” to select the OSS resource path. For the parameter configuration of each job type, refer to Job in the user guide.

  7. The command that is actually executed for the job on ECS is displayed. You can copy the displayed command and run it directly in the command-line environment of the E-MapReduce cluster.

  8. Select the policy for failed operations. If you choose to pause the current execution plan, the entire execution plan pauses after this job fails and waits for you to handle the failure. If you choose to continue with the next job, the error is ignored and the next job runs after this job fails.

  9. Click OK to complete the creation.
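The two failure policies in step 8 can be pictured with a small shell sketch. This is only an illustration of the behavior described above, not how E-MapReduce schedules jobs; the function name run_plan, the policy names pause and continue, and the placeholder jobs true and false are all hypothetical.

```shell
# Illustrative only: simulates an execution plan made of several jobs.
# run_plan, "pause" and "continue" are hypothetical names for this sketch.
run_plan() {
  policy="$1"
  shift
  for job in "$@"; do
    if sh -c "$job"; then
      echo "ok: $job"
    else
      echo "failed: $job"
      if [ "$policy" = "pause" ]; then
        # Pause policy: stop the whole plan and wait for manual handling.
        echo "plan paused"
        return 0
      fi
      # Continue policy: ignore the error and move on to the next job.
    fi
  done
  echo "plan finished"
}

# "true" and "false" stand in for a succeeding and a failing job.
run_plan pause "true" "false" "true"
run_plan continue "true" "false" "true"
```

With the pause policy the third job never runs; with the continue policy it runs even though the second job failed.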

Job example

The following is a Spark job. The relevant parameters, as well as the input and output paths, are set in the application parameters.

  spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client --driver-memory 512m --num-executors 1 --executor-memory 1g --executor-cores 2 /opt/apps/spark-1.6.1-bin-hadoop2.7/lib/spark-examples-1.6.1-hadoop2.7.2.jar 100
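To make the command above easier to read, the following sketch rebuilds it one option at a time and prints the result. The variable name params is hypothetical, and the comments reflect the usual meaning of these spark-submit options in Spark 1.6, not anything specific to E-MapReduce. The script only prints the command; it does not submit a job.

```shell
# Rebuilds the example command piece by piece (prints it, does not run it).
params="--class org.apache.spark.examples.SparkPi"   # main class inside the JAR
params="$params --master yarn-client"                # run on YARN, driver on the submitting node
params="$params --driver-memory 512m"                # memory for the driver process
params="$params --num-executors 1"                   # number of executors to request
params="$params --executor-memory 1g"                # memory per executor
params="$params --executor-cores 2"                  # CPU cores per executor
params="$params /opt/apps/spark-1.6.1-bin-hadoop2.7/lib/spark-examples-1.6.1-hadoop2.7.2.jar"
params="$params 100"                                 # argument handed to SparkPi itself

echo "spark-submit $params"
```

Everything after the JAR path is passed to the application rather than to spark-submit, which is why the trailing 100 comes last.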

oss and ossref

The oss:// prefix indicates that the data path points to OSS. When data is read or written, it specifies the operation path in the same way as hdfs://.

ossref:// also points to an OSS path, but it is handled differently: the corresponding code resource is downloaded to the local disk, and the path in the command line is then replaced with the local path. This makes it easy to run native code without logging on to the machine to upload the code and its dependent resource packages.

Note: Do not use ossref:// to download large data resources. Otherwise, the cluster job will fail.
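The difference between the two prefixes can be sketched as a small argument rewrite. This is a simulation under stated assumptions, not the E-MapReduce implementation: the download step is omitted, and the local directory /tmp/emr-job-resources, the bucket my-bucket, the class com.example.WordCount, and the function name rewrite_arg are all hypothetical.

```shell
# Illustrative only: shows which arguments would be replaced by local paths.
# LOCAL_DIR, the bucket name and the class name are hypothetical.
LOCAL_DIR="/tmp/emr-job-resources"

rewrite_arg() {
  case "$1" in
    ossref://*)
      # A real run would first download the object to local disk;
      # here we only compute the local path that replaces the argument.
      printf '%s/%s\n' "$LOCAL_DIR" "${1##*/}"
      ;;
    *)
      # oss:// paths and all other arguments pass through unchanged.
      printf '%s\n' "$1"
      ;;
  esac
}

rewritten=""
for arg in --class com.example.WordCount \
           ossref://my-bucket/jars/wordcount.jar \
           oss://my-bucket/input oss://my-bucket/output; do
  rewritten="${rewritten}${rewritten:+ }$(rewrite_arg "$arg")"
done

echo "spark-submit $rewritten"
```

In this sketch only the JAR, referenced with ossref://, becomes a local path. The oss:// input and output paths stay as they are, because the job reads and writes them directly.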
