On the Deployments page of the fully managed Flink console, you can view the numbers and lists of normal jobs, abnormal jobs, and jobs that have risks, so that you can track the health status of your jobs in real time. This topic describes how to view the numbers and lists of abnormal jobs and jobs that have risks.
Background information
Job type | Description |
---|---|
Abnormal job | The system refreshes the job status every minute. If the state of a job is FAILED, the job is considered abnormal. |
Job that has risks | The system refreshes the job status every minute. If the system detects a risk for a job that is running, the system displays the risk type for the job on the right side of RUNNING. The risk type is Unstable, Failing, or ClusterUnreachable. The system classifies jobs that have risks into three levels based on the severity of the risks. |
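The classification rules in the table can be sketched in a few lines of Python. This is only an illustration of the logic described above, not the console's implementation: the `Job` class, the `classify` and `summarize` functions, and the `other` bucket for jobs that match neither rule are hypothetical names introduced here. Only the FAILED and RUNNING states and the three risk types come from this topic.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Risk types listed in this topic.
RISK_TYPES = {"Unstable", "Failing", "ClusterUnreachable"}


@dataclass
class Job:
    """Hypothetical record of a job as it might appear in the job list."""
    name: str
    state: str                       # for example "RUNNING" or "FAILED"
    risk_type: Optional[str] = None  # shown next to RUNNING when a risk is detected


def classify(job: Job) -> str:
    """Apply the two rules from the table to a single job."""
    if job.state == "FAILED":
        return "abnormal"
    if job.state == "RUNNING" and job.risk_type in RISK_TYPES:
        return "at_risk"
    # Assumption: every job that matches neither rule goes into one bucket.
    return "other"


def summarize(jobs: List[Job]) -> Dict[str, List[str]]:
    """Group job names by category, similar to the lists on the Deployments page."""
    groups: Dict[str, List[str]] = {"abnormal": [], "at_risk": [], "other": []}
    for job in jobs:
        groups[classify(job)].append(job.name)
    return groups
```

Calling `summarize` on a list of `Job` objects returns the names in each category, and the length of each list gives the corresponding count.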
Precautions
After you filter abnormal jobs or jobs that have risks, you can click the name of a job to go to its details page, and then click Diagnosis to view the cause of the risk or exception. Based on the recommendations that the system provides, you can fix the issue or tune the performance of the job to restore the job to a normal state. For more information about how to diagnose a job, see Job diagnostics.