This topic describes how to configure a Spark Streaming job.

Prerequisites

  • A project is created. For more information, see Manage projects.
  • The resources required by the job and the data to be processed are available.

Procedure

  1. Log on to the Alibaba Cloud EMR console by using your Alibaba Cloud account.
  2. In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
  3. Click the Data Platform tab.
  4. In the Projects section of the page that appears, find the project you want to edit and click Edit Job in the Actions column.
  5. In the Edit Job pane on the left, right-click the folder in which you want to create the job and select Create Job.
  6. In the Create Job dialog box, specify Name and Description, and select Spark Streaming from the Job Type drop-down list.
  7. Click OK.
  8. In the Content field, specify the command line parameters required to submit the job.
    A Spark Streaming job is submitted by using the following command syntax:
    spark-submit [options] --class [MainClass] xxx.jar args
    The following example shows the Content value for a job whose Name is SlsStreaming. The value contains only the parameters that follow spark-submit:
    --master yarn-client --driver-memory 7G --executor-memory 5G --executor-cores 1 --num-executors 32 --class com.aliyun.emr.checklist.benchmark.SlsStreaming emr-checklist_2.10-0.1.0.jar <project> <logstore> <accessKey> <secretKey>
    Notice
    • If the JAR package of a job is stored in OSS, you can reference the JAR package by using a path in the ossref://xxx/.../xxx.jar format.
    • To have the path completed for you, click + Enter an OSS path in the lower part of the page. In the OSS File dialog box, set File Prefix to OSSREF and specify File Path. The system automatically completes the path of the Spark Streaming script in OSS.
  9. Click Save.
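
The Content value in step 8 follows the pattern [options] --class [MainClass] xxx.jar args. As a minimal sketch of how the pieces fit together, the following Python snippet assembles such a value from its parts. The build_content function is a hypothetical helper written for illustration only; it is not part of EMR or Spark, and the values passed to it are the ones from the SlsStreaming example above.

```python
def build_content(options, main_class, jar, app_args):
    """Assemble a Content value: [options] --class MainClass jar args.

    Hypothetical helper for illustration; not an EMR or Spark API.
    """
    parts = []
    # Each option is written as "--flag value", in the order given.
    for flag, value in options.items():
        parts += [flag, str(value)]
    # The main class and the JAR package come after the options.
    parts += ["--class", main_class, jar]
    # Application arguments are passed to the main class last.
    parts += list(app_args)
    return " ".join(parts)


content = build_content(
    {
        "--master": "yarn-client",
        "--driver-memory": "7G",
        "--executor-memory": "5G",
        "--executor-cores": 1,
        "--num-executors": 32,
    },
    "com.aliyun.emr.checklist.benchmark.SlsStreaming",
    "emr-checklist_2.10-0.1.0.jar",
    ["<project>", "<logstore>", "<accessKey>", "<secretKey>"],
)
print(content)
```

The printed string is what you would paste into the Content field. If the JAR package is stored in OSS, the jar argument would instead be a path in the ossref:// format described in the notice in step 8.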