This topic describes how to configure a Spark Streaming job.

Prerequisites

  • A project is created. For more information, see Manage projects.
  • The resources required by the job and the data to be processed are available.

Procedure

  1. Go to the Data Platform tab.
    1. Log on to the Alibaba Cloud EMR console by using your Alibaba Cloud account.
    2. In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
    3. Click the Data Platform tab.
  2. In the Projects section of the page that appears, find the project that you want to edit and click Edit Job in the Actions column.
  3. Create a Spark Streaming job.
    1. In the Edit Job pane on the left, right-click the folder in which you want to create the job and select Create Job.
    2. In the Create Job dialog box, specify Name and Description, and then select Spark Streaming from the Job Type drop-down list.
    3. Click OK.
  4. Edit the job content.
    1. In the Content field, enter the command line parameters required to submit the job.
      You can use the following command syntax to submit a Spark Streaming job:
      spark-submit [options] --class [MainClass] xxx.jar args
      The following example shows the Content value for a job whose Name is SlsStreaming. Replace the placeholders in angle brackets with your own values:
      --master yarn-client --driver-memory 7G --executor-memory 5G --executor-cores 1 --num-executors 32 --class com.aliyun.emr.checklist.benchmark.SlsStreaming emr-checklist_2.10-0.1.0.jar <project> <logstore> <accessKey> <secretKey>
      Notice
      • If the JAR package of the job is stored in Object Storage Service (OSS), you can reference the JAR package by using a path in the ossref://xxx/.../xxx.jar format.
      • To enter the path, click + Enter an OSS path in the lower part of the page. In the OSS File dialog box, set File Prefix to OSSREF and specify File Path. The system then automatically completes the path of the Spark Streaming script in OSS.
    2. Click Save.
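
The relationship between the Content value in step 4 and the spark-submit syntax can be sketched in shell as follows. This is a minimal illustration, not the console's actual implementation: it assumes that the job is submitted as spark-submit followed by the Content value, which the syntax and the SlsStreaming example suggest. The variable names (options, main_class, jar, args, content, full_command) are hypothetical and exist only for this sketch.

```shell
#!/bin/sh
# Options and main class taken from the SlsStreaming example in step 4.
options='--master yarn-client --driver-memory 7G --executor-memory 5G --executor-cores 1 --num-executors 32'
main_class='com.aliyun.emr.checklist.benchmark.SlsStreaming'

# JAR reference from the example. A JAR package stored in OSS would use an
# ossref:// path here instead; the real path is not known, so none is shown.
jar='emr-checklist_2.10-0.1.0.jar'

# Placeholder arguments, kept exactly as in the example; replace with real values.
args='<project> <logstore> <accessKey> <secretKey>'

# The value to enter in the Content field: [options] --class [MainClass] xxx.jar args
content="${options} --class ${main_class} ${jar} ${args}"

# Assumption: the submitted command is spark-submit followed by the Content value.
full_command="spark-submit ${content}"

printf '%s\n' "${content}"
printf '%s\n' "${full_command}"
```

The first printed line matches the Content value shown in the example. The second line is the equivalent full command under the stated assumption. The script only prints it and does not run it.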