E-MapReduce (EMR) Doctor is an intelligent O&M diagnosis system that the Alibaba Cloud EMR team developed for open source big data clusters. EMR Doctor reports the health status of your clusters and, if a cluster is in an abnormal state, provides suggestions for O&M and resource optimization. You can use EMR Doctor on the Health Check tab of the cluster details page.

If you are an O&M engineer of an EMR cluster, take note of the following items:
  • The overall cluster stability, which can be evaluated based on factors such as the status of key services in the cluster and exception handling for the services. The key services include YARN, Hadoop Distributed File System (HDFS), Hive, and Spark.
  • The overall effectiveness of the cluster, such as the cluster load and the effective memory and CPU utilization.
  • The service level agreement (SLA) that must be maintained for cluster users. Make sure that sufficient resources are allocated to key tasks and that the key tasks complete as expected.
EMR Doctor serves as a manager for open source big data clusters and provides the following capabilities:
  • Monitors the health status of the cluster in real time, and provides suggestions on the usage of key services. This helps reduce the cluster O&M costs and consistently improve cluster stability.
  • Provides information about the usage and allocation of cluster resources, and helps you allocate suitable hardware resources to improve cluster resource utilization.
  • Helps optimize service components in the cluster and tasks that are run on the cluster, and provides optimization suggestions that can be used to ensure the effectiveness and stability of the overall data link and computing link.
EMR Doctor provides the following features:
  • Real-time check. This feature detects and analyzes task exceptions in the cluster in real time. It also monitors and analyzes the status of the service components in the cluster to identify potential issues and provide optimization suggestions. For more information, see Enable real-time check and analysis.
  • Daily cluster report. This feature is used to generate a score for the cluster based on the health status of the cluster and provide intelligent optimization suggestions. For more information, see View daily cluster reports and analysis results in the reports.
  • Information integration and intelligent diagnostics. EMR Doctor integrates various information in the cluster for analysis and uses intelligent algorithms to diagnose problems. This reduces the heavy and repetitive O&M work required for big data clusters.