This topic describes how to configure a Spark SQL job.

Prerequisites

A project is created. For more information, see Manage projects.

Background information

Note By default, a Spark SQL job is submitted in yarn-client mode.

Procedure

  1. Log on to the E-MapReduce console by using an Alibaba Cloud account.
  2. Click the Data Platform tab.
  3. In the Projects section, click Edit Job in the Actions column of the target project.
  4. Right-click the folder where you want to create a job, and select Create Job.
    Note You can also right-click the folder to create a subfolder, rename it, or delete it.
  5. In the dialog box that appears, specify the Name and Description parameters, and select Spark SQL from the Job Type drop-down list.
    You can use the following command syntax to submit a Spark SQL job:
    spark-sql [options] [cli options] {SQL_CONTENT}
    Parameter description:
    • options: the value of the SPARK_CLI_PARAMS environment variable. To configure this variable, click Job Settings in the upper-right corner of the page. In the pane that appears, click the Advanced Settings tab. Then, click the add icon in the Environment Variables section and add the SPARK_CLI_PARAMS parameter.

      For example, set SPARK_CLI_PARAMS to "--executor-memory 1g --executor-cores 2".

    • cli options: the command-line options of the Spark SQL CLI. For example, -e <quoted-query-string> executes the SQL statements enclosed in quotation marks, and -f <filename> executes the SQL statements in the specified file. A sample command that combines these parameters is shown after this list.
    • SQL_CONTENT: the SQL statements that you enter.
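    The following sketch shows what an assembled command might look like. It assumes that SPARK_CLI_PARAMS is set to "--executor-memory 1g --executor-cores 2"; the file path in the second command is a hypothetical placeholder. Adjust the values to your cluster.
    # Run the SQL statements enclosed in quotation marks (-e).
    # The executor settings come from the SPARK_CLI_PARAMS environment variable.
    spark-sql --executor-memory 1g --executor-cores 2 -e "show databases; show tables;"
    # Run the SQL statements in a file (-f). The path /path/to/query.sql is a placeholder.
    spark-sql --executor-memory 1g --executor-cores 2 -f /path/to/query.sql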
  6. Click OK.
  7. Enter the Spark SQL statements in the Content field.
    Example:
    -- SQL statement example
    -- The size of SQL statements cannot exceed 64 KB.
    show databases;
    show tables;
    -- A LIMIT 2000 clause is automatically applied to SELECT statements.
    select * from test1;
  8. Click Save.