
Realtime Compute for Apache Flink: View exception logs

Last Updated: Mar 26, 2026

When a job fails during startup or runtime, exception logs are your first stop for diagnosing the root cause. This topic covers three log types: JobManager exceptions, archived logs from failed TaskManagers, and logs from TaskManagers running slow checkpoints.

| Log type | When to use it |
| --- | --- |
| JobManager exceptions | The job failed due to a failover. Check this before diving into raw logs. |
| Failed TaskManager logs | A TaskManager crashed and restarted. View its archived logs to find the cause. |
| TaskManager logs for slow checkpoints | A checkpoint is taking too long. Trace the slow task back to its TaskManager logs. |

If the JobManager fails to start rather than failing during execution, that is a startup failure, not a JobManager exception. Check the startup logs instead.

Prerequisites

Before you begin, ensure that you have:

  • A job instance in the Running state

Log pagination

Logs are paginated. Each page displays up to 1 MB of logs, or roughly 8,000–9,000 lines. In most cases, the first page contains enough information to identify the issue. If the cause is not on the first page, check the subsequent pages.
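The page-size figures above imply an average line length of a little over 100 bytes. As a rough sketch, the following Python estimates how many pages a log of a given size spans and on which page a given line number is likely to appear. It assumes that 1 MB means 1,048,576 bytes and uses 8,500 lines per page as a midpoint. Both values are assumptions for illustration, and the function names are hypothetical.

```python
PAGE_BYTES = 1024 * 1024   # assumption: "1 MB" = 1,048,576 bytes
LINES_PER_PAGE = 8500      # assumption: midpoint of the 8,000 to 9,000 range


def estimate_pages(log_bytes: int) -> int:
    """Estimate how many pages a log of the given size occupies."""
    if log_bytes <= 0:
        return 0
    # Integer ceiling division avoids floating-point rounding.
    return (log_bytes + PAGE_BYTES - 1) // PAGE_BYTES


def estimate_page_of_line(line_number: int) -> int:
    """Estimate the 1-based page on which a 1-based line number appears."""
    if line_number < 1:
        raise ValueError("line_number must be >= 1")
    return (line_number - 1) // LINES_PER_PAGE + 1


if __name__ == "__main__":
    print(estimate_pages(5 * 1024 * 1024 + 1))  # a log just over 5 MB
    print(estimate_page_of_line(20000))
```

Because real line lengths vary, treat the page estimate for a line number as a starting point and check the neighboring pages as well.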

View JobManager exceptions

  1. Log on to the Realtime Compute for Apache Flink console.

  2. In the Actions column of the target workspace, click Console.

  3. In the navigation pane on the left, click Operation Center > Job O&M, then click the name of the target job.

  4. On the Job Log tab, click the Exception Information tab.

    In Exception History, you can view exceptions from the last 7 days and filter them by type.

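If you copy entries from Exception History for offline analysis, the same 7-day window and type filter can be reproduced in a few lines. The following is a minimal sketch. The `ExceptionRecord` structure and its field names are hypothetical and do not reflect any console export format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class ExceptionRecord:
    # Hypothetical record shape, for illustration only.
    timestamp: datetime
    exception_type: str
    message: str


def recent_exceptions(
    records: List[ExceptionRecord],
    now: datetime,
    exception_type: Optional[str] = None,
    window: timedelta = timedelta(days=7),
) -> List[ExceptionRecord]:
    """Return records inside the window, optionally filtered by type, newest first."""
    cutoff = now - window
    selected = [
        r for r in records
        if cutoff <= r.timestamp <= now
        and (exception_type is None or r.exception_type == exception_type)
    ]
    return sorted(selected, key=lambda r: r.timestamp, reverse=True)
```

Sorting newest first puts the most recent failure at the top of the result, which is usually where to start reading.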

View logs of failed TaskManagers

Failed TaskManager logs are only available if log archiving is enabled, and only within the configured retention period.

  1. Log on to the Realtime Compute for Apache Flink console.

  2. In the Actions column of the target workspace, click Console.

  3. In the navigation pane on the left, click Operation Center > Job O&M, then click the name of the target job.

  4. On the Job Log tab, click the Operational Log tab, then select a job instance.


  5. Click the Failed Task Managers tab.

A job in the Normal state has no failed TaskManagers. In a high-risk job, a TaskManager may fail and restart. Use the archived logs to find the likely cause.
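In a Java stack trace, each nested cause is introduced by a line that begins with `Caused by:`, and the last such line in a trace is usually the closest to the root cause. When a log page is long, a small script can surface those lines from copied log text. The following is a sketch only. The function name is hypothetical, and the sample log content is invented rather than real service output.

```python
from typing import List, Tuple


def find_caused_by(log_text: str) -> List[Tuple[int, str]]:
    """Return (1-based line number, stripped line) for every 'Caused by:' line."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("Caused by:"):
            hits.append((number, stripped))
    return hits


# Invented sample for illustration, not real service output.
SAMPLE = """\
2026-01-01 10:00:00 ERROR Task failed
java.lang.RuntimeException: operator failure
\tat com.example.Job.run(Job.java:42)
Caused by: java.io.IOException: connection reset
\tat com.example.Source.read(Source.java:17)
Caused by: java.net.SocketException: broken pipe
\t... 3 more
"""

if __name__ == "__main__":
    for number, line in find_caused_by(SAMPLE):
        print(number, line)
```

The returned line numbers make it easier to jump to the matching position in the log page and read the surrounding context.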

Locate slow checkpoints and view TaskManager logs

Use the End to End Duration column in Checkpoints History to identify slow checkpoints, then drill down to the TaskManager running the slow task.

  1. Log on to the Realtime Compute for Apache Flink console.

  2. In the Actions column of the target workspace, click Console.

  3. In the navigation pane on the left, click Operation Center > Job O&M, then click the name of the target job.

  4. On the Job Log tab, click the Checkpoints tab, then click Checkpoints History.

  5. Check the End to End Duration column to identify checkpoints with a long duration.


  6. Click the plus sign (+) icon to the left of the slow checkpoint's ID to expand the Operators node.

  7. Click the plus sign (+) icon to the left of the Operators node to view individual task details.

  8. Click the ID of the task with a long duration.


  9. On the Running Task Managers tab, view the logs for the TaskManager running the slow task.
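The duration check in the steps above can also be expressed as a short filter over checkpoint records. The following is a minimal sketch that assumes checkpoint history is available as a list of dictionaries. The field names `id`, `status`, and `end_to_end_duration` (in milliseconds) mirror the checkpoint history entries of the open source Apache Flink REST API. Treat that shape as an assumption, because this topic does not describe any export format. The threshold and sample values are hypothetical.

```python
from typing import Dict, List


def slow_checkpoints(history: List[Dict], threshold_ms: int) -> List[Dict]:
    """Return checkpoints whose end-to-end duration exceeds threshold_ms, slowest first."""
    slow = [
        cp for cp in history
        if cp.get("end_to_end_duration", 0) > threshold_ms
    ]
    return sorted(slow, key=lambda cp: cp["end_to_end_duration"], reverse=True)


# Invented sample data for illustration only.
HISTORY = [
    {"id": 101, "status": "COMPLETED", "end_to_end_duration": 1200},
    {"id": 102, "status": "COMPLETED", "end_to_end_duration": 95000},
    {"id": 103, "status": "FAILED", "end_to_end_duration": 600000},
    {"id": 104, "status": "COMPLETED", "end_to_end_duration": 1500},
]

if __name__ == "__main__":
    # Hypothetical threshold: flag anything slower than 60 seconds.
    for cp in slow_checkpoints(HISTORY, threshold_ms=60_000):
        print(cp["id"], cp["status"], cp["end_to_end_duration"])
```

The filter deliberately keeps failed checkpoints, since a checkpoint that timed out is often the one worth tracing back to its TaskManager.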