E-MapReduce: Basic usage

Last Updated: Jun 18, 2025

This topic describes how to submit a Flink job to an E-MapReduce (EMR) Dataflow cluster and view the job status.

Background information

In a Dataflow cluster, Flink is deployed on YARN. You can use SSH to log on to your Dataflow cluster and run a command in the CLI to submit a Flink job.

Dataflow clusters deployed in YARN mode support submitting Flink jobs in Session mode, Per-Job Cluster mode, and Application mode.

The following entries describe each mode and list its features.

Session mode

In session mode, a Flink cluster is created based on the resource parameters you specify. All jobs are submitted to this cluster. The cluster is not automatically released after jobs are complete.

Because the cluster is shared, a failure in one job can affect other jobs. For example, if an exception in a job causes a Task Manager to shut down, all other jobs running on that Task Manager fail. In addition, the cluster has only one Job Manager, so the load on the Job Manager increases with the number of jobs.

  • Advantage: Jobs start faster than in the other modes because less time is spent on resource allocation when a job is submitted.

  • Disadvantage: Because all jobs run in the same cluster, there is competition for resources and jobs can affect each other.

Based on these features, this mode is suitable for jobs that require short startup times and run for a relatively short time.

Per-Job Cluster mode

When you use per-job cluster mode, YARN starts a new Flink cluster for each Flink job you submit and then runs the job. When the job is complete or canceled, the Flink cluster for this job is released.

  • Advantage: Resources are isolated between jobs, and abnormal behavior of one job does not affect other jobs.

    Because each job corresponds to one Job Manager, the issue of a Job Manager being overloaded due to running multiple jobs does not occur.

  • Disadvantage: A dedicated Flink cluster is started for each job, which increases the overhead of starting jobs.

Based on these features, this mode is typically suitable for jobs with longer running times.

Application mode

When you use application mode, YARN starts a new Flink cluster for each Flink application (an application can contain one or more jobs) you submit. When the application is complete or canceled, the Flink cluster for this application is released.

The difference between this mode and per-job cluster mode is that the main() method in the JAR package of the application is executed on the Job Manager in the cluster rather than on the client.

If the submitted JAR package contains multiple jobs, all these jobs are executed in the cluster that belongs to the application.

  • Advantage: Reduces the burden on the client when submitting jobs.

  • Disadvantage: A dedicated Flink cluster is started for each Flink application, which increases the time overhead of starting applications.
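
On the command line, the three modes differ mainly in the subcommand and the -t target that you pass to the flink client. The following is a minimal sketch that prints the submit command used later in this topic for each mode. The function name flink_submit_cmd and the mode labels session, per-job, and application are hypothetical names chosen for illustration. The function only prints a command line and does not run flink.

```shell
# Hypothetical helper: print the submit command that this topic uses for a mode.
# $1 = mode label (session | per-job | application), $2 = path to the job JAR.
# Session mode assumes that a YARN session was already started with yarn-session.sh.
flink_submit_cmd() {
  case "$1" in
    session)     echo "flink run --detached $2" ;;
    per-job)     echo "flink run -t yarn-per-job --detached $2" ;;
    application) echo "flink run-application -t yarn-application $2" ;;
    *)           echo "unknown mode: $1" >&2; return 1 ;;
  esac
}
```

For example, flink_submit_cmd application /opt/apps/FLINK/flink-current/examples/streaming/TopSpeedWindowing.jar prints the application mode command shown in the Application mode steps.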

Prerequisites

A Dataflow cluster that is in Flink mode is created. For more information, see Create a cluster.

Submit a job and view the job status

Note

The TopSpeedWindowing example in Flink is used in this topic. TopSpeedWindowing is a streaming job that runs for a long period of time.

You can select one of the following modes to submit jobs and view the status of the jobs based on your business requirements:

Session mode

  1. Log on to the master node of your cluster by using SSH. For more information, see Log on to the master node of a cluster.

  2. Run the following command to start a YARN session.

    yarn-session.sh --detached

    After the command runs successfully, it returns an application ID, for example, application_1750137174986_0001. In the following steps, <application_XXXX_YY> represents this ID.

  3. Run the following command to submit a job.

    flink run --detached /opt/apps/FLINK/flink-current/examples/streaming/TopSpeedWindowing.jar

    After the job is submitted, the command output includes the job ID. In this example, 3785db18d371326758d7843dd2a1**** is the job ID. In the following steps, <jobId> represents this ID.

  4. Run the following command to view the job status.

    flink list -t yarn-session -Dyarn.application.id=<application_XXXX_YY>

    Information similar to the following output is returned.

    ------------------ Running/Restarting Jobs -------------------
    16.06.2025 18:20:55 : 3785db18d371326758d7843dd2a1**** : CarTopSpeedWindowingExample (RUNNING)

    You can also view the job status on the web UI of Flink. For more information, see View the job status on the web UI of Flink.

  5. Run the following command to stop a job.

    flink cancel -t yarn-session -Dyarn.application.id=<application_XXXX_YY> <jobId>
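
In a script, the application ID can be captured from the command output instead of being copied by hand. The following is a minimal sketch, assuming that the captured output contains the ID in the application_<digits>_<digits> form shown in the example above. The function name extract_app_id is hypothetical and is not part of Flink or YARN.

```shell
# Hypothetical helper: print the first YARN application ID found on stdin.
# Assumption: the ID has the form application_<digits>_<digits>.
extract_app_id() {
  grep -oE 'application_[0-9]+_[0-9]+' | head -n 1
}

# Possible usage on the master node (not executed here):
#   app_id=$(yarn-session.sh --detached 2>&1 | extract_app_id)
#   flink list -t yarn-session -Dyarn.application.id="$app_id"
```

If the output contains no ID, the function prints nothing, so a script should check that the result is not empty before it uses the value.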

Per-job cluster mode

  1. Log on to the master node of your cluster by using SSH. For more information, see Log on to the master node of a cluster.

  2. Run the following command to submit a job.

    flink run -t yarn-per-job --detached /opt/apps/FLINK/flink-current/examples/streaming/TopSpeedWindowing.jar

    After the job is submitted, the command output includes the YARN application ID and the job ID. In this example, application_1750125819948_**** is the application ID and f5f980ac631192b02548235f1bbe**** is the job ID. In the following steps, <application_XXXX_YY> represents the application ID and <jobId> represents the job ID.

  3. Run the following command to view the job status.

    flink list -t yarn-per-job -Dyarn.application.id=<application_XXXX_YY>

    You can also view the job status on the web UI of Flink. For more information, see View the job status on the web UI of Flink.

  4. Run the following command to stop a job.

    flink cancel -t yarn-per-job -Dyarn.application.id=<application_XXXX_YY> <jobId>
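
Because the cancel command takes both an application ID and a job ID, it can help to check the two values before you run it. The following is a minimal sketch that validates the IDs and prints the cancel command. The function names are hypothetical. The ID patterns are assumptions based on the examples in this topic: application_<digits>_<digits> for the application ID and 32 hexadecimal characters for the job ID.

```shell
# Hypothetical helpers: validate the ID formats, then print the cancel command.
# The functions only print a command line; they do not run flink.
is_app_id() { printf '%s\n' "$1" | grep -Eq '^application_[0-9]+_[0-9]+$'; }
is_job_id() { printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{32}$'; }

# $1 = target (for example yarn-per-job), $2 = application ID, $3 = job ID.
flink_cancel_cmd() {
  is_app_id "$2" || { echo "bad application ID: $2" >&2; return 1; }
  is_job_id "$3" || { echo "bad job ID: $3" >&2; return 1; }
  echo "flink cancel -t $1 -Dyarn.application.id=$2 $3"
}
```

The same check works for the yarn-session and yarn-application targets because only the value passed to -t changes.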

Application mode

  1. Log on to the master node of your cluster by using SSH. For more information, see Log on to the master node of a cluster.

  2. Run the following command to submit a job.

    flink run-application -t yarn-application /opt/apps/FLINK/flink-current/examples/streaming/TopSpeedWindowing.jar

    After the job is submitted, the command output includes the YARN application ID of the submitted Flink job. In this example, the ID is application_1750125819948_0004. In the following steps, <application_XXXX_YY> represents this ID.

  3. Run the following command to view the job status.

    flink list -t yarn-application -Dyarn.application.id=<application_XXXX_YY>

    Information similar to the following output is returned. In this output, 4db32b5339e6d64de2a1096c4762**** is the job ID, which is represented by <jobId> in the following sections.

    ------------------ Running/Restarting Jobs -------------------
    16.06.2025 18:20:55 : 4db32b5339e6d64de2a1096c4762**** : CarTopSpeedWindowingExample (RUNNING)

    You can also view the job status on the web UI of Flink. For more information, see View the job status on the web UI of Flink.

  4. Run the following command to stop a job.

    flink cancel -t yarn-application -Dyarn.application.id=<application_XXXX_YY> <jobId>
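
In application mode, the job ID is only available from the flink list output, so a script needs to parse that output. The following is a minimal sketch, assuming that each job line has the "<date> <time> : <jobId> : <name> (<STATUS>)" layout shown in the sample output above. The function name running_job_ids is hypothetical.

```shell
# Hypothetical helper: print the job IDs of jobs reported as RUNNING.
# Reads `flink list` output on stdin and splits each line on " : ".
running_job_ids() {
  awk -F' : ' '/\(RUNNING\) *$/ { print $2 }'
}

# Possible usage on the master node (not executed here):
#   flink list -t yarn-application -Dyarn.application.id=<application_XXXX_YY> | running_job_ids
```

Lines that do not end with (RUNNING), such as the header line, are skipped.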

Configure a job

Flink allows you to use one of the following methods to configure a job:

  • Method 1: Specify the values of parameters in the code of the job. For more information, see Configuration.

  • Method 2: When you run the flink run or flink run-application command to submit a job, use the -D argument to specify the values of parameters for the job. Example: flink run-application -t yarn-application -D state.backend=rocksdb....

  • Method 3: Specify the values of parameters in the /etc/taihao-apps/flink-conf/flink-conf.yaml file.

If you do not specify a parameter value by using any of the preceding methods, the default value is used. For more information, see Apache Flink Documentation.
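
Before you override a parameter with -D, it can be useful to check whether the configuration file already sets it. The following is a minimal sketch, assuming that the file uses flat "key: value" lines. The function name flink_conf_get is hypothetical, and the sketch does not handle nested YAML.

```shell
# Hypothetical helper: print the value of a flat "key: value" entry.
# $1 = parameter name, $2 = path to the configuration file.
flink_conf_get() {
  awk -v k="$1" '
    /^[ \t]*#/ { next }                    # skip comment lines
    {
      i = index($0, ":")                   # split at the first colon only
      if (i == 0) next
      key = substr($0, 1, i - 1)
      val = substr($0, i + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", key)     # trim surrounding whitespace
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      if (key == k) { print val; exit }
    }
  ' "$2"
}
```

For example, flink_conf_get state.backend /etc/taihao-apps/flink-conf/flink-conf.yaml prints the configured value, or prints nothing if the file does not set the parameter.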

View the job status on the web UI of Flink

  1. Access the web UI of Flink.

    1. Log on to the EMR console.

    2. In the left-side navigation pane, choose EMR on ECS.

    3. In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.

    4. On the EMR on ECS page, click the Cluster ID of the target cluster.

    5. Click the Access Links And Ports tab.

    6. On the Access Links And Ports page, click the link for YARN UI.

      For more information about how to access the web UI, see Access the web UIs of open source components.

  2. Click the Application ID.

  3. Click the link of Tracking URL.

    The Apache Flink Dashboard page appears. You can view the status of jobs on this page.

References

For more information about Flink on YARN, see Apache Hadoop YARN.