CloudMonitor is a monitoring and alerting service provided by Alibaba Cloud. You can create threshold-triggered alert rules in the CloudMonitor console to monitor the usage of E-MapReduce (EMR) resources. If the value of a metric exceeds the threshold that is specified in a rule, CloudMonitor automatically sends an alert notification so that you can handle the related exception at the earliest opportunity.
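The threshold-triggered behavior described above can be illustrated with a minimal sketch. This is not the CloudMonitor API or its actual evaluation logic. The `AlertRule` class, the `breaches` function, the 80% threshold, and the idea of requiring several consecutive breaching samples are all assumptions made for illustration. Only the metric name `TotalDFSUsedPercent` comes from the metrics table in this topic.

```python
import operator
from dataclasses import dataclass

# Hypothetical set of comparison operators a rule may use.
_COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


@dataclass(frozen=True)
class AlertRule:
    """Hypothetical model of a threshold-triggered alert rule."""
    metric: str            # metric name, for example TotalDFSUsedPercent
    comparator: str        # one of the keys in _COMPARATORS
    threshold: float       # value the metric is compared against
    consecutive: int = 1   # assumed: breaching samples required in a row (>= 1)


def breaches(rule: AlertRule, samples: list[float]) -> bool:
    """Return True if the most recent `rule.consecutive` samples all
    satisfy the rule's comparison against its threshold."""
    if len(samples) < rule.consecutive:
        return False  # not enough data points to decide
    compare = _COMPARATORS[rule.comparator]
    recent = samples[-rule.consecutive:]
    return all(compare(value, rule.threshold) for value in recent)


# Example: alert when HDFS usage stays above an assumed 80% for 3 samples.
rule = AlertRule("TotalDFSUsedPercent", ">", 80.0, consecutive=3)
if breaches(rule, [70.0, 85.0, 90.0, 95.0]):
    print(f"ALERT: {rule.metric} {rule.comparator} {rule.threshold}")
```

Requiring several consecutive breaching samples is a common way to avoid alerting on a single spike. Whether and how a real rule does this depends on the settings that you configure in the CloudMonitor console.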
Prerequisites
An EMR cluster is created. For more information, see Create a cluster.
Procedure
Metrics
| Service | Metric name | Description |
|---|---|---|
| HDFS | NameNodeIpcPortOpen | The availability of the IPC port of the NameNode. |
| HDFS | TotalDFSUsedPercent | The total Hadoop Distributed File System (HDFS) capacity usage of a cluster. |
| HDFS | DataNodeDfsUsedPercent | The HDFS capacity usage of a DataNode. |
| HDFS | DataNodeIpcPortOpen | The availability of the IPC port of a DataNode. |
| HDFS | JournalNodeRpcPortOpen | The availability of the RPC port of a JournalNode. |
| HDFS | ZKFCPortOpen | The availability of the ZKFailoverController (ZKFC) port. |
| HDFS | dfs.FSNamesystem.MissingBlocks | The number of missing blocks. |
| HDFS | dfs.datanode.VolumeFailures | The number of damaged disks detected by HDFS. |
| YARN | ResourceManagerPortOpen | The availability of the service port of the ResourceManager. |
| YARN | JobHistoryPortOpen | The availability of the service port of Job History. |
| YARN | yarn.ClusterMetrics.NumUnhealthyNM | The number of unhealthy NodeManagers. |
| YARN | ProxyServerPortOpen | The availability of the WebAppProxy port. |
| YARN | TimelineServerPortOpen | The availability of the service port of Timeline Server. |
| Hive | MetastorePortOpen | The availability of the Hive Metastore port. |
| Hive | HiveServer2PortOpen | The availability of the service port of HiveServer2. |
| Hive | ThriftServerPortOpen | The availability of the service port of Thrift Server. |
| HBase | HMasterIpcPortOpen | The availability of the IPC port of HMaster. |
| HBase | HRegionServerIpcPortOpen | The availability of the IPC port of HRegionServer. |
| ZooKeeper | ZKClientPortOpen | The availability of the listening port of the ZooKeeper client. |
| Hue | HuePortOpen | The availability of the Hue port. |
| Storm | StormNimbusThriftPortOpen | The availability of the Thrift port of Storm Nimbus. |
| HOST | proc_total | The total number of processes. |
| HOST | part_max_used | The maximum usage of a disk partition. |
| HOST | disk_free_percent_mnt_disk1 | The percentage of available disk space. |
| HOST | disk_free_percent_rootfs | The percentage of disk space that is occupied by the root file system. |
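Many of the metrics in the preceding table, such as NameNodeIpcPortOpen, report the availability of a port. As a rough illustration of what such a check amounts to, the following sketch tries to open a TCP connection to a host and port. This topic does not describe how CloudMonitor collects these metrics, so the `port_open` function, the 1/0 return encoding, and the timeout value are assumptions for illustration only.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> int:
    """Return 1 if a TCP connection to (host, port) succeeds within
    `timeout` seconds, or 0 otherwise. The 1/0 encoding is an assumption."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return 1
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return 0
```

A successful TCP handshake shows only that a process is listening on the port. It does not show that the service behind the port is healthy, which is why port metrics are usually combined with metrics such as dfs.FSNamesystem.MissingBlocks or yarn.ClusterMetrics.NumUnhealthyNM.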