E-MapReduce: Quick start for Workflow

Last Updated: Mar 26, 2026

This guide walks you through running your first E-MapReduce (EMR) Workflow job end to end, using a HIVECLI node as an example.

By the end, you will have bound a cluster, created a project, defined a workflow with a HIVECLI node, run the job, and verified its logs.

Prerequisites

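Before you begin, ensure that you have access to the EMR console and an existing EMR cluster to bind, along with the Cluster Type, Cluster ID, and vSwitch ID values that Step 1 asks for.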

How it works

EMR Workflow organizes jobs as workflows made up of nodes. Each node runs a specific task. In this example, a HIVECLI node runs a Hive SQL script. Before you can run a workflow, you must bind a cluster, which provides the compute resources, and create a project, which groups your workflows.

The steps below follow this sequence: bind a cluster → create a project → define a workflow → run the workflow → review logs.
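To make the hierarchy concrete, here is a minimal Python sketch of how a project, its workflows, and their nodes relate. The class and field names are hypothetical and chosen only for illustration. They are not part of any EMR API. The object names match the examples used later in this guide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One task in a workflow (illustrative model, not an EMR type)."""
    name: str        # e.g. "hivecli"
    node_type: str   # e.g. "HIVECLI"
    script: str = ""


@dataclass
class Workflow:
    """A named collection of nodes."""
    name: str
    nodes: List[Node] = field(default_factory=list)


@dataclass
class Project:
    """A named group of workflows."""
    name: str
    workflows: List[Workflow] = field(default_factory=list)


# Build the hierarchy this guide creates in Steps 2 and 3.
project = Project("project_test")
workflow = Workflow("workflow_test")
workflow.nodes.append(Node("hivecli", "HIVECLI", "-- Hive SQL goes here"))
project.workflows.append(workflow)

# In this guide, the execution cluster is selected when the workflow
# is started (Step 4), so it is not modeled here.
print(project)
```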

Step 1: Bind a cluster

  1. Log in to the EMR console.

  2. In the left-side navigation pane, choose EMR Studio > Workflow.

  3. Click the Security tab.

  4. On the Cluster Manage page, click Bind Cluster.

  5. In the Bind Cluster dialog box, set Cluster Type, Cluster ID, and vSwitch ID, then click Confirm. The binding process takes 5–10 minutes. Refresh the Cluster Manage page and wait until the State column shows Associated.
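The "refresh and wait" part of the last step is a polling pattern. The Python sketch below shows that pattern in general terms. It assumes you supply your own `get_state` callable. That callable is hypothetical, and this guide does not describe any API for reading the State column. The 15-minute default timeout is also an assumption, chosen to leave margin over the 5–10 minutes stated above.

```python
import time
from typing import Callable


def wait_for_state(
    get_state: Callable[[], str],
    target: str = "Associated",
    timeout_s: float = 15 * 60,   # assumption: margin over the stated 5-10 minutes
    interval_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_state() until it returns target or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if get_state() == target:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Stand-in state source for demonstration. The intermediate state name
# "Associating" is made up; only "Associated" comes from this guide.
states = iter(["Associating", "Associating", "Associated"])
ok = wait_for_state(lambda: next(states), interval_s=0.0)
print(ok)  # True
```

Using a deadline rather than a fixed number of attempts keeps the total wait bounded even if each state check is slow.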

Step 2: Create a project

  1. Click the Project tab.

  2. Click Create Project.

  3. In the Create Project dialog box, enter a project name and click Confirm. This example uses project_test as the project name.

Step 3: Define a workflow

  1. On the Project tab, click project_test.

  2. In the left-side navigation pane, choose Workflow > Workflow Definition.

  3. On the Workflow Definition page, click Create Workflow.

  4. On the Create Workflow page, drag the HIVECLI node to the canvas. For the full list of available node types, see Node types.

  5. In the Current node settings dialog box, configure the required parameters and click Confirm. Leave all other parameters at their default values. For parameter details, see HIVECLI.

    Parameter    Required    Example value
    Node Name    Yes         hivecli
    Script       Yes         See the script below

    Use the following script as the Script value:

        create table if not exists mytable(a string, b int);
        insert into mytable values ('abc', 1), ('def', 2);
        select a, sum(b) from mytable group by a;

  6. Save the workflow.

    1. Click Save in the upper-right corner of the canvas.

    2. In the Basic Information dialog box, enter a workflow name and click Confirm. This example uses workflow_test as the workflow name.
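To preview what the script from step 5 should return before you run it on the cluster, you can execute the same three statements locally. The sketch below uses Python's built-in `sqlite3` module as a stand-in for Hive. This is an assumption made for convenience: SQLite happens to accept these statements, but its type handling and output formatting differ from Hive's, so treat the result only as a rough expectation.

```python
import sqlite3

# The statements from the Script field, unchanged.
SCRIPT_SETUP = """
create table if not exists mytable(a string, b int);
insert into mytable values ('abc', 1), ('def', 2);
"""
QUERY = "select a, sum(b) from mytable group by a;"


def run_once(conn: sqlite3.Connection) -> list:
    """Run the setup statements, then return the query rows sorted by key."""
    conn.executescript(SCRIPT_SETUP)
    return sorted(conn.execute(QUERY).fetchall())


conn = sqlite3.connect(":memory:")
first = run_once(conn)
second = run_once(conn)
print(first)   # [('abc', 1), ('def', 2)]
print(second)  # [('abc', 2), ('def', 4)]
```

The second run shows that the script is not idempotent. Because `create table if not exists` keeps the existing table and `insert into` appends rows, each additional run of the workflow is expected to increase the sums.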

Step 4: Run the workflow

  1. On the Workflow Definition page, find workflow_test and click the run icon in the Operation column.

  2. Click the start icon.

  3. In the Please set the parameters before starting dialog box, select the cluster you bound in Step 1 from the Execution Cluster drop-down list, then click Confirm.

Step 5: View task logs

  1. In the left-side navigation pane, choose Workflow > Workflow Instance to confirm the workflow run started.

  2. Choose Task > Task Instance.

  3. On the Task Instance page, find the task instance and click the log icon in the Operation column to view the run logs.

Step 6: (Optional) Take a workflow offline

On the Workflow Definition page, find the workflow and click the offline icon in the Operation column.
