This topic describes how to use job logs to check whether a job is submitted and run as expected. MaxCompute provides both Logview and the Spark Web UI for you to check jobs.

Background information

Logs are generated when a job is submitted by using spark-submit. Logs are also generated when you use DataWorks to run Spark jobs. Example:
cd $SPARK_HOME
bin/spark-submit --master yarn-cluster --class SparkPi /tmp/spark-2.x-demo/target/AliSpark-2.x-quickstart-1.0-SNAPSHOT-shaded.jar
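For reference, a minimal sketch of the SparkPi entry class that the --class option references is shown below. Only the class name comes from the preceding command; the body is an assumption for illustration, written against the Spark 2.x Scala API.

import org.apache.spark.sql.SparkSession

object SparkPi {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("SparkPi").getOrCreate()
    val n = 100000
    // Monte Carlo estimate of pi: count random points inside the unit circle.
    val count = spark.sparkContext.parallelize(1 to n).map { _ =>
      val x = math.random * 2 - 1
      val y = math.random * 2 - 1
      if (x * x + y * y <= 1) 1 else 0
    }.reduce(_ + _)
    println(s"Pi is roughly ${4.0 * count / n}") // written to stdout
    spark.stop()
  }
}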
After a job is submitted, MaxCompute creates an instance and prints the Logview URL and the tracking URL of the instance in the logs.
19/01/05 20:36:47 INFO YarnClientImplUtil: logview url: http://logview.odps.aliyun.com/logview/?h=http://service.cn.maxcompute.aliyun.com/api&p=qn_beijing&i=xxx&token=xxx
The operation succeeded if output similar to the following is displayed.
19/01/05 20:37:34 INFO Client:
   client token: N/A
   diagnostics: N/A
   ApplicationMaster host: 11.220.xxx.xxx
   ApplicationMaster RPC port: 30002
   queue: queue
   start time: 1546691807945
   final status: SUCCEEDED
   tracking URL: http://jobview.odps.aliyun.com/proxyview/jobview/?h=http://service.cn.maxcompute.aliyun-inc.com/api&p=project_name&i=xxx&t=spark&id=application_xxx&metaname=xxx&token=xxx

Use Logview to check a job

A URL that starts with logview.odps.aliyun.com is a Logview URL. Logview is a distributed job tracking tool developed for MaxCompute. It can be used to:
  • Obtain the status of a job.
  • Obtain the startup, stop, and scheduling information of each node in a job.
  • Obtain the standard output (stdout) and standard error (stderr) logs of each node in a job. We recommend that you write Spark output to stdout. By default, Spark log4j logs are written to stderr. For where each stream appears, see the sketch after this list.
  • Store execution log data. The data is retained for three to five days. When the local disk is full, both stdout and stderr are cleared.
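The following sketch shows where driver output appears, assuming the default log4j setup that ships with Spark 2.x. The class name and messages are hypothetical; the point is that println output is shown under StdOut in Logview, while log4j messages are shown under StdErr.

import org.apache.log4j.Logger

object LogDemo {
  private val logger = Logger.getLogger(getClass)

  def main(args: Array[String]): Unit = {
    println("result=42")        // standard output: shown under StdOut in Logview
    logger.info("job finished") // log4j output: shown under StdErr by default
  }
}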
  1. Enter the Logview URL in the address bar of the browser and view the execution information of a job of the CUPID type.
  2. On the page that appears, perform the following operations:
    1. Click Detail to view details about the job. master-0 indicates the node on which Spark Driver resides.
    2. Click master-0 and select the All tab to view information about the node.
    3. Click StdOut to view the output of the node.
    4. Click StdErr to view the log4j log of the node.

Use the Spark Web UI to check a job

The tracking URL in the logs indicates that your job has been submitted to MaxCompute. You can use the tracking URL to access the Spark Web UI or HistoryServer. The Spark Web UI can be used to:
  • Obtain information of the native Spark Web UI.
  • View the information of a running job in real time.
  • Transfer events from Spark Driver to HistoryServer after a job is complete. This process may take one to three minutes. If you open the tracking URL immediately after a job is complete, the error message Application application_1560240626712_2078769635 not found. may appear. If this happens, try again later, for example with the retry sketch after this list.
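If you access the tracking URL from a script rather than a browser, a simple retry loop covers this transfer window. The following helper is a hypothetical sketch, not part of any MaxCompute client API; adjust the attempt count and wait time to your needs.

import scala.io.Source
import scala.util.Try

object TrackingUrlFetcher {
  // Poll the tracking URL until HistoryServer has the application,
  // waiting 30 seconds between attempts.
  def fetchWithRetry(url: String, attempts: Int = 6): Option[String] = {
    (1 to attempts).view.map { _ =>
      val page = Try(Source.fromURL(url).mkString).toOption
        .filterNot(_.contains("not found"))
      if (page.isEmpty) Thread.sleep(30000)
      page
    }.collectFirst { case Some(p) => p }
  }
}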
  1. Enter the tracking URL in the address bar of the browser and view the execution information of a Spark job.
  2. On the page that appears, perform the following operations:
    1. Click the Environment tab to check whether the parameters of the Spark job are correctly configured. A sample configuration is sketched after these steps.
    2. Click the Executors tab to check whether dead nodes exist and whether stdout and stderr logs are generated for Spark Driver.
    3. Click stdout to view the output of the node.
    4. Click stderr to view the log4j log of the node.
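For the Environment tab check in step 1, the values you verify typically come from the spark-defaults.conf file of the Spark on MaxCompute client. The following entries are an illustrative sketch with placeholder values; the exact property set is an assumption to verify against your client version. The endpoint shown is the one that appears in the Logview URL earlier in this topic.

spark.hadoop.odps.project.name = your_project_name
spark.hadoop.odps.access.id = your_access_key_id
spark.hadoop.odps.access.key = your_access_key_secret
spark.hadoop.odps.end.point = http://service.cn.maxcompute.aliyun.com/api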