This topic describes how to use Application Real-Time Monitoring Service (ARMS) to collect and view metrics of Flink jobs and how to configure alert rules based on the metrics.
Prerequisites
A Flink cluster is created on the EMR on ACK page in the new Alibaba Cloud E-MapReduce (EMR) console. For more information, see Getting started.
ARMS is activated. For more information, see Create a Prometheus instance to monitor an ACK cluster.
Configure Prometheus Service
Go to the Prometheus Monitoring page.
Log on to the EMR console. In the left-side navigation pane, click EMR on ACK.
On the EMR on ACK page, find the cluster that you want to manage and click the link in the ACK Cluster column.
In the left-side navigation pane, choose .
On the Prometheus Monitoring page, wait for the system to automatically install the component and check the dashboards.
After the installation is complete, you can click each tab to view the corresponding metrics.
On the Prometheus Monitoring page, click Go to ARMS Prometheus in the upper-right corner.
Enable service discovery.
In the left-side navigation pane, click Service Discovery.
On the Service Discovery page, click Configure.
On the Default Service Discovery tab, turn on the switch in the Actions column of kubernetes-pods.
In the dialog box that appears, click Enable.
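For orientation, the following is a sketch of what a pod-annotation-based discovery job conventionally looks like in open-source Prometheus. It is an assumption that the kubernetes-pods job in ARMS behaves similarly; the managed configuration is not shown in this topic and may differ. The sketch only illustrates how the prometheus.io/scrape, prometheus.io/path, and prometheus.io/port pod annotations are typically turned into scrape targets.

```yaml
# Illustrative only: the conventional open-source Prometheus job for
# annotation-based pod discovery. Not the actual ARMS configuration.
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true".
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Use the prometheus.io/path annotation as the metrics path.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Rewrite the target address to use the prometheus.io/port annotation.
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
```

With a job of this shape, a pod without the prometheus.io/scrape annotation is dropped from the target list, which is why the annotations on the Flink pods matter.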
Submit a Flink job. For more information, see Submit a Flink job.
Important: You must specify the annotations for Prometheus Metric Reporter in the podTemplate parameter of the YAML file of the Flink job.
Sample YAML file:
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: basic-emr-example
spec:
  flinkVersion: v1_13
  flinkConfiguration:
    state.savepoints.dir: file:///flink-data/flink-savepoints
    state.checkpoints.dir: file:///flink-data/flink-checkpoints
    metrics.reporter.prom.factory.class: org.apache.flink.metrics.prometheus.PrometheusReporterFactory
  serviceAccount: flink
  podTemplate:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "9249"
        prometheus.io/scrape: "true"
    spec:
      serviceAccount: flink
      containers:
        - name: flink-main-container
          volumeMounts:
            - mountPath: /flink-data
              name: flink-volume
          ports:
            - containerPort: 9249
              name: metrics
              protocol: TCP
      volumes:
        - name: flink-volume
          emptyDir: {}
  jobManager:
    replicas: 1
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  job:
    jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
    parallelism: 2
    upgradeMode: stateless
After the job is run, go to the Targets tab of the Service Discovery page. You can view the status of pods and collect the metrics on the JobManager and TaskManager of the Flink job.
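The sample relies on the Flink Prometheus reporter listening on its default port, 9249, which is the port named in the prometheus.io/port annotation and in containerPort. As a minimal sketch, the port can also be set explicitly with the Flink option metrics.reporter.prom.port so that the three values are visibly kept in sync. Whether you need this depends on your setup; the fragment below shows only the affected keys and is not a complete FlinkDeployment.

```yaml
# Fragment only: merge into the spec of the FlinkDeployment shown above.
spec:
  flinkConfiguration:
    metrics.reporter.prom.factory.class: org.apache.flink.metrics.prometheus.PrometheusReporterFactory
    # Optional: 9249 is already the reporter's default port.
    metrics.reporter.prom.port: "9249"
  podTemplate:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        # Must match metrics.reporter.prom.port.
        prometheus.io/port: "9249"
        prometheus.io/scrape: "true"
    spec:
      containers:
        - name: flink-main-container
          ports:
            # Must match metrics.reporter.prom.port.
            - containerPort: 9249
              name: metrics
              protocol: TCP
```

If the reporter port is changed, the annotation and containerPort must be changed to the same value, otherwise the pods are likely to appear as unreachable on the Targets tab.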
Optional: Configure a Grafana dashboard to view the metrics.
On the Prometheus Monitoring page, click the Others tab.
Click the Prometheus tab.
Click Open in New Window.
In the left-side navigation pane, choose .
Important: You can add a dashboard only if you use Grafana Expert Edition.
Click Add new panel.
In the Query section, select a cluster. In the A section, select the metric that you want to view from the Metrics drop-down list. For example, you can select flink_jobmanager_job_lastCheckpointDuration.
Enter the panel title and configure other parameters as required.
Click Save in the upper-right corner. In the dialog box that appears, enter a dashboard name, select your ACK cluster, and then click Save.
Configure and view alerts
Go to the Alerts Rules page.
Log on to the EMR console. In the left-side navigation pane, click EMR on ACK.
On the EMR on ACK page, find the cluster whose alert rules you want to view and click the link in the ACK Cluster column.
In the left-side navigation pane, choose .
On the Prometheus Monitoring page, wait for the system to automatically install the component and check the dashboards.
After the installation is complete, you can click each tab to view the corresponding metrics.
On the Prometheus Monitoring page, click Go to ARMS Prometheus in the upper-right corner.
In the left-side navigation pane, click Alerts Rules.
Configure alert rules.
In the upper-right corner of the Prometheus Alert Rules page, click Create Prometheus Alert Rule.
Create an alert rule.
On the Prometheus Alert Rules page, find the alert that you want to view and click Alert Event History in the Actions column.
The alert is triggered when the specified condition is met.
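To make the idea of an alert condition concrete, the following is a sketch of a rule written in the open-source Prometheus rule file format. It is an assumption that an equivalent PromQL expression can be entered when you create the rule in the console; the exact fields of the Create Prometheus Alert Rule page are not described in this topic. The group name, alert name, 60-second threshold, and 5-minute duration are hypothetical values chosen for illustration. The metric flink_jobmanager_job_lastCheckpointDuration is the one mentioned earlier in this topic, and Flink reports it in milliseconds.

```yaml
# Illustrative only: open-source Prometheus rule syntax with hypothetical
# names and thresholds. Adapt the expression to the ARMS console as needed.
groups:
  - name: flink-checkpoint-alerts        # hypothetical group name
    rules:
      - alert: FlinkCheckpointSlow       # hypothetical alert name
        # Fires when the last checkpoint took longer than 60,000 ms.
        expr: flink_jobmanager_job_lastCheckpointDuration > 60000
        # The condition must hold for 5 minutes before the alert fires.
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "The last Flink checkpoint took longer than 60 seconds"
```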