Oozie is an open source workflow scheduling engine for big data. It schedules big data jobs and coordinates complex, multi-step data production workflows. This topic describes how to use Oozie in E-MapReduce (EMR).

Prerequisites

An EMR Hadoop cluster is created with Oozie selected from the optional services during cluster creation. For more information, see Create a cluster.

Access the web UI of Oozie

You can use one of the following methods to access the web UI of Oozie:

Submit a workflow job

ShareLib is installed in EMR clusters by default, so you do not need to install it before you submit an Oozie workflow job.
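To confirm that ShareLib is available before you submit a job, you can list it with the Oozie CLI. The following is a minimal sketch, assuming that the client is at $OOZIE_HOME/bin/oozie (the path used in the commands later in this topic) and that the OOZIE_URL environment variable points to the Oozie server. The function name list_sharelib and the OOZIE_BIN override are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Assumed override for the client path; defaults to the path used in this topic.
OOZIE_BIN="${OOZIE_BIN:-$OOZIE_HOME/bin/oozie}"

# Hypothetical helper: list the ShareLib entries known to the Oozie server.
# An optional argument (for example "hive") lists the files of that ShareLib key.
# Relies on OOZIE_URL being set; otherwise pass "-oozie <server URL>" to the CLI.
list_sharelib() {
  "$OOZIE_BIN" admin -shareliblist "$@"
}
```

Running list_sharelib without arguments on the master node should print the available ShareLib keys. The exact output depends on the services installed in the cluster.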

  1. In the job.properties file, set the nameNode and jobTracker (ResourceManager) properties based on the cluster type.
    • Non-HA cluster
      nameNode=hdfs://emr-header-1:9000
      jobTracker=emr-header-1:8032
    • HA cluster
      nameNode=hdfs://emr-cluster
      jobTracker=rm1,rm2
  2. Submit an Oozie workflow job.
    • Non-HA cluster
      1. Log on to the master node. For more information, see Connect to the master node of an EMR cluster in SSH mode.
        ssh root@<Public IP address of the master node>
      2. Download sample code.
        su oozie
        cd /tmp
        wget http://emr-sample-projects.oss-cn-hangzhou.aliyuncs.com/oozie-examples/oozie-examples.zip
        unzip oozie-examples.zip
      3. Synchronize the Oozie workflow code to HDFS.
        hadoop fs -copyFromLocal examples/ /user/oozie/examples
      4. Submit an Oozie workflow job.
        $OOZIE_HOME/bin/oozie job -config examples/apps/map-reduce/job.properties -run
        If the command runs successfully, a job ID similar to the following is returned:
        job: 0000000-160627195651086-oozie-oozi-W
      5. Access the web UI of Oozie. For more information, see Access the web UI of Oozie.

        You can view the submitted Oozie workflow job.

    • HA cluster
      1. Log on to the primary master node. For more information, see Connect to the master node of an EMR cluster in SSH mode.
        ssh root@<Public IP address of the primary master node>
      2. Download sample code.
        su oozie
        cd /tmp
        wget http://emr-sample-projects.oss-cn-hangzhou.aliyuncs.com/oozie-examples/oozie-examples-ha.zip
        unzip oozie-examples-ha.zip
      3. Synchronize the Oozie workflow code to HDFS.
        hadoop fs -copyFromLocal examples/ /user/oozie/examples
      4. Submit an Oozie workflow job.
        $OOZIE_HOME/bin/oozie job -config examples/apps/map-reduce/job.properties -run
        If the command runs successfully, a job ID similar to the following is returned:
        job: 0000000-160627195651086-oozie-oozi-W
      5. Access the web UI of Oozie. For more information, see Access the web UI of Oozie.

        You can view the submitted Oozie workflow job.
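
The steps above can also be scripted. The following is a rough sketch, not an official EMR tool: it prints the nameNode and jobTracker settings for a cluster type, submits a workflow, captures the returned job ID, and queries the job with the CLI. It assumes the same client path as the commands above and that OOZIE_URL is set in the environment. The function names and the OOZIE_BIN override are hypothetical.

```shell
#!/bin/sh
# Assumed override for the client path; defaults to the path used in this topic.
OOZIE_BIN="${OOZIE_BIN:-$OOZIE_HOME/bin/oozie}"

# Print the nameNode and jobTracker lines for a cluster type ("ha" or "non-ha").
# The values are the ones listed in step 1 of this topic.
cluster_endpoints() {
  case "$1" in
    non-ha)
      printf '%s\n' 'nameNode=hdfs://emr-header-1:9000' 'jobTracker=emr-header-1:8032'
      ;;
    ha)
      printf '%s\n' 'nameNode=hdfs://emr-cluster' 'jobTracker=rm1,rm2'
      ;;
    *)
      echo "usage: cluster_endpoints ha|non-ha" >&2
      return 1
      ;;
  esac
}

# Read submission output on stdin and print the ID from a "job: <ID>" line.
extract_job_id() {
  sed -n 's/^job: *//p'
}

# Submit the workflow described by the given job.properties and print its job ID.
submit_workflow() {
  "$OOZIE_BIN" job -config "$1" -run | extract_job_id
}

# Print the status and actions of a submitted job.
job_info() {
  "$OOZIE_BIN" job -info "$1"
}
```

For example, job_id=$(submit_workflow examples/apps/map-reduce/job.properties) followed by job_info "$job_id" shows the job status on the command line, which can complement the web UI when you work over SSH.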