This topic describes how to configure a Spark SQL job.

Prerequisites

A project is created. For more information, see Manage projects.

Procedure

  1. Go to the Data Platform tab.
    1. Log on to the Alibaba Cloud EMR console by using your Alibaba Cloud account.
    2. In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
    3. Click the Data Platform tab.
  2. In the Projects section of the page that appears, find the project that you want to edit and click Edit Job in the Actions column.
  3. Create a Spark SQL job.
    1. In the Edit Job pane on the left, right-click the folder in which you want to create the job and select Create Job.
    2. In the Create Job dialog box, specify Name and Description, and then select SparkSQL from the Job Type drop-down list.
      Note: By default, a Spark SQL job is submitted in yarn-client mode.
      You can use the following command syntax to submit a Spark SQL job:
      spark-sql [options] [cli options] {SQL_CONTENT}                
      The following table describes the parameters in the command syntax.
      Parameter      Description
      options        The value of the SPARK_CLI_PARAMS environment variable. To set it, click Job Settings in the upper-right corner of the job page, click the Advanced Settings tab in the Job Settings panel, and then click the Add icon in the Environment Variables section. Example: SPARK_CLI_PARAMS="--executor-memory 1g --executor-cores 1".
      cli options    Options of the Spark SQL CLI. Examples:
                     • -e <quoted-query-string>: executes the SQL statements enclosed in quotation marks.
                     • -f <filename>: executes the SQL statements in the specified file.
      SQL_CONTENT    The SQL statements that you enter.
    3. Click OK.
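The command syntax in step 3 can be illustrated with a minimal shell sketch. It only prints a command line in the documented form and does not run Spark. The helper name build_spark_sql_cmd is hypothetical, the value given for --executor-cores is an assumed example, and passing the job content through -e is an assumption about how the pieces fit together, not a description of how EMR builds the command internally.

```shell
# Hypothetical helper: prints a command line in the documented form
#   spark-sql [options] [cli options] {SQL_CONTENT}
# $1 = options (the value you would put in SPARK_CLI_PARAMS)
# $2 = SQL statements (must not contain double quotes in this simple sketch)
build_spark_sql_cmd() {
    options="$1"
    sql="$2"
    # -e is the cli option that executes a quoted query string.
    printf 'spark-sql %s -e "%s"\n' "$options" "$sql"
}

# Assumed example values, not defaults taken from EMR.
SPARK_CLI_PARAMS="--executor-memory 1g --executor-cores 1"

# Print the command instead of executing it, so no Spark installation is needed.
build_spark_sql_cmd "$SPARK_CLI_PARAMS" "show databases; show tables;"
```

Printing rather than executing keeps the sketch safe to run anywhere; on a cluster node you would run the printed command only after checking it against your own job settings.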
  4. Edit job content.
    1. Enter the Spark SQL statements in the Content field.
      Example:
      -- SQL statement example 
      -- The size of SQL statements cannot exceed 64 KB. 
      show databases;
      show tables;
      -- LIMIT 2000 is automatically added to the SELECT statement. 
      select * from test1;
    2. Click Save.
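The 64 KB limit noted in the example can be checked before you paste statements into the Content field. The following is a minimal sketch, assuming that 64 KB means 64 * 1024 bytes and that the statements are stored in a local file. The function name sql_within_limit and the temporary file are hypothetical.

```shell
# Assumption: "64 KB" is interpreted as 64 * 1024 bytes.
MAX_SQL_BYTES=$((64 * 1024))

# Hypothetical helper: returns exit status 0 when the file named in $1
# is no larger than MAX_SQL_BYTES, and a non-zero status otherwise.
sql_within_limit() {
    # Arithmetic expansion strips any padding that wc adds on some systems.
    size=$(( $(wc -c < "$1") ))
    [ "$size" -le "$MAX_SQL_BYTES" ]
}

# Write the sample statements from this topic to a temporary file.
job_file=$(mktemp)
printf 'show databases;\nshow tables;\nselect * from test1;\n' > "$job_file"

if sql_within_limit "$job_file"; then
    echo "within limit"
else
    echo "too large"
fi

rm -f "$job_file"
```

The check counts bytes rather than characters, so statements that contain multibyte characters reach the threshold sooner than their character count suggests.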