This topic provides answers to frequently asked questions (FAQ) about Application Real-Time Monitoring Service (ARMS) Kubernetes Monitoring.

Installation-related FAQ

What do I do if a Kubernetes Monitoring agent fails to be installed?

Issue
  • Log on to the Container Service for Kubernetes console. In the left-side navigation pane of the details page of a cluster, choose Operations > Cluster Topology. On the Cluster Topology page, the "Failed to pass the component precheck" error message is displayed.
  • Log on to the Container Service for Kubernetes console. In the left-side navigation pane of the details page of a cluster, choose Operations > Components. On the Components page, the "Failed to pass the precheck" error message is displayed.
Solution
  1. On the Cluster Topology or Components page, click View Report.
  2. Check whether a Prometheus Monitoring agent of the latest version is installed.
    Note The metrics of Kubernetes Monitoring are collected by Prometheus Service. Therefore, you must install the Prometheus Service agent. For more information about the relationship between Kubernetes Monitoring and Prometheus Service, see Comparison between Kubernetes Monitoring and other ARMS services.
  3. Check whether the cluster is granted the required permissions.
    Check whether ARMS Addon Token exists in the cluster.
    • If the ARMS Addon Token exists, ACK is automatically granted the permissions to access ARMS resources. In this case, you can use the Kubernetes Monitoring agent after you install the Prometheus agent and the Kubernetes Monitoring agent.
    • If the ARMS Addon Token does not exist, you must manually attach permission policies to grant ACK the permissions on ARMS and Tracing Analysis.
    View ARMS Addon Token
    1. On the Clusters page of the Container Service console, click the name of the cluster to go to the Cluster Details page.
    2. In the left-side navigation pane, choose Configuration Management > Secrets. In the top navigation bar, set the Namespace to kube-system and check whether the addon.arms.token secret exists. You can also check for the secret by using kubectl, as shown in the sketch after the Note below.
    Manually add ARMS and Tracing Analysis policies
    1. On the Clusters page of the Container Service console, click the name of the cluster to go to the Cluster Details page.
    2. In the left-side navigation pane, click Cluster Information. On the page that appears, click the Cluster Resources tab.
    3. Click the link next to Worker RAM Role. On the Role page, click Add Authorization.
    4. In the Select Permissions section of the Add Authorization panel, search for the following two policies by keyword, click each policy to add it to the Selected list on the right, and then click OK.
      • AliyunTracingAnalysisFullAccess: grants full permissions on Tracing Analysis.
      • AliyunARMSFullAccess: grants full permissions on ARMS.
    Note
    • ARMS Addon Token may not exist in some ACK managed clusters. If you use an ACK managed cluster, we recommend that you first check whether ARMS Addon Token exists. If ARMS Addon Token does not exist, you must manually grant permissions.
    • By default, ACK dedicated clusters do not have ARMS Addon Token. You must manually grant permissions.
    • By default, registered clusters do not have the ARMS Addon Token, so you must grant permissions manually. However, registered clusters do not have a worker RAM role, so you cannot attach the ARMS and Tracing Analysis permission policies to a worker RAM role as described above. For more information about how to install the Kubernetes Monitoring agent for a registered cluster, see Install a Kubernetes Monitoring agent for a registered cluster.
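
    If you have kubectl access to the cluster, you can also check for the ARMS Addon Token from the command line. The following is a minimal sketch; it assumes that your kubeconfig points to the target cluster and that the token is stored as the addon.arms.token secret in the kube-system namespace, as described above.

      # Check whether the addon.arms.token secret exists in the kube-system namespace.
      kubectl get secret addon.arms.token -n kube-system

      # If the secret exists, kubectl prints its name, type, and age.
      # If it does not exist, kubectl returns an error similar to the following,
      # and you must grant the permissions manually as described above:
      #   Error from server (NotFound): secrets "addon.arms.token" not found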

Usage-related FAQ

What do I do if the container logs indicate that port 8888 is occupied?

The ot-collector container of the cmonitor-agent component of Kubernetes Monitoring uses port 8888 on the cluster nodes. Stop the process that occupies port 8888 on the affected nodes, for example by using the commands in the following sketch.
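
The following commands are a minimal sketch for identifying and stopping the conflicting process. They assume a Linux node on which the ss or lsof tool is available; <pid> is a placeholder for the process ID found in the output.

  # Find the process that is listening on port 8888.
  ss -lntp | grep 8888
  # Alternatively:
  lsof -i :8888

  # Stop the conflicting process. Replace <pid> with the process ID from the output above.
  kill <pid>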

What do I do if the "readiness probe failed" error message appears when I access a Cmonitor agent pod?

Issue
Log on to the Container Service for Kubernetes console. In the left-side navigation pane of the details page of a cluster, choose Workloads > Pods. On the Pods page, click the cmonitor-agent-<xxx> pod in the arms-prom namespace. On the Events tab, the "readiness probe failed" error message is displayed.
Solution

Check whether the cluster meets the environment requirements of Kubernetes Monitoring. For more information, see Environment requirements and limits.
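
If you prefer the command line, you can also view the probe failure events with kubectl, as in the following minimal sketch. The pod name cmonitor-agent-<xxx> is a placeholder that you must replace with the actual pod name in your cluster.

  # List the cmonitor-agent pods in the arms-prom namespace.
  kubectl get pods -n arms-prom | grep cmonitor-agent

  # Show the events of a pod, including readiness probe failures.
  kubectl describe pod cmonitor-agent-<xxx> -n arms-prom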

What do I do if the memory usage and CPU utilization of a cmonitor-agent pod are excessively high?

Issue

Log on to the Container Service for Kubernetes console. In the left-side navigation pane of the details page of a cluster, choose Workloads > Pods. On the Pods page, click the cmonitor-agent-<xxx> pod in the arms-prom namespace. The memory usage and CPU utilization of the cmonitor-agent-<xxx> pod are excessively high.

Solution
  1. Check whether 10,000 or more requests are processed per second on the node where the cmonitor-agent-<xxx> pod resides.
  2. Check whether the CPU utilization and memory usage of the node where the cmonitor-agent-<xxx> pod resides are excessively high. You can check the usage with kubectl, as shown in the sketch after this list.
  3. Join the Kubernetes Monitoring DingTalk group chat (ID: 35568145) for technical support.
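
  The node and pod resource usage mentioned in step 2 can also be checked with kubectl, as in the following sketch. It assumes that the metrics-server component is installed in the cluster so that kubectl top returns data.

    # Check the CPU and memory usage of the cluster nodes.
    kubectl top nodes

    # Check the CPU and memory usage of the cmonitor-agent pods in the arms-prom namespace.
    kubectl top pods -n arms-prom | grep cmonitor-agent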

What do I do if a topology does not show traffic relations?

Issue
On the Cluster Topology page, no lines connect the nodes to indicate the traffic relations between them.
Solution
  1. Check the status of the Kubernetes Monitoring agent.

    Log on to the Container Service for Kubernetes console. Click the name of the cluster that you want to manage. In the left-side navigation pane, choose Workloads > DaemonSets. Select arms-prom from the Namespace drop-down list and check whether cmonitor-agent exists on the DaemonSets page.

    • If cmonitor-agent exists and all the pods are ready, the Kubernetes Monitoring agent is running as expected.
    • If cmonitor-agent does not exist or some pods are not ready, the Kubernetes Monitoring agent failed to be installed. Join the DingTalk group chat (ID: 35568145) for Kubernetes Monitoring technical support.
  2. Check the logs of the agent.

    Click cmonitor-agent and then click the Logs tab.

    Check the pod logs. If a pod runs as expected, logs that contain a version number are continuously generated and no errors are reported. If you find error logs, join the DingTalk group chat (ID: 35568145) for Kubernetes Monitoring technical support. You can also perform this check and the DaemonSet check in step 1 with kubectl, as shown in the sketch after this list.
  3. Check the configurations of the agent.

    In the left-side navigation pane, choose Configurations > ConfigMaps. Select arms-prom from the Namespace drop-down list and check whether otel-collector-config exists on the ConfigMap page. If otel-collector-config does not exist, join the DingTalk group chat (ID: 35568145) for Kubernetes Monitoring technical support.

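
  You can also perform steps 1 and 2 with kubectl, as in the following minimal sketch. It assumes kubectl access to the cluster; the pod name cmonitor-agent-<xxx> is a placeholder.

    # Step 1: check whether the cmonitor-agent DaemonSet exists and whether all of its pods are ready.
    kubectl get daemonset cmonitor-agent -n arms-prom
    kubectl get pods -n arms-prom -o wide | grep cmonitor-agent

    # Step 2: check the logs of a cmonitor-agent pod for errors.
    # If the pod contains more than one container, add -c <container-name>.
    kubectl logs cmonitor-agent-<xxx> -n arms-prom --tail=100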

What do I do if no request details are displayed in the node details panel of a topology?

Issue
  1. On the Cluster Topology page, click a node to open the node details panel.
  2. In the node details panel, click View List. On the page that appears, no API request details are displayed.
Solution
  1. Check the status of the Kubernetes Monitoring agent.

    Log on to the Container Service for Kubernetes console. Click the name of the cluster that you want to manage. In the left-side navigation pane, choose Workloads > DaemonSets. Select arms-prom from the Namespace drop-down list and check whether cmonitor-agent exists on the DaemonSets page.

    • If cmonitor-agent exists and all the pods are ready, the Kubernetes Monitoring agent is running as expected.
    • If cmonitor-agent does not exist or some pods are not ready, the Kubernetes Monitoring agent failed to be installed. Join the DingTalk group chat (ID: 35568145) for Kubernetes Monitoring technical support.
  2. Check the logs of the agent.

    Click cmonitor-agent and then click the Logs tab.

    Check the pod logs. If a pod runs as expected, logs that contain a version number are continuously generated and no errors are reported. If you find error logs, join the DingTalk group chat (ID: 35568145) for Kubernetes Monitoring technical support.
  3. Check the configurations of the agent.

    In the left-side navigation pane, choose Configurations > ConfigMaps. Select arms-prom from the Namespace drop-down list and check whether otel-collector-config exists on the ConfigMap page. If otel-collector-config does not exist, join the DingTalk group chat (ID: 35568145) for Kubernetes Monitoring technical support. You can also check whether the ConfigMap exists with kubectl, as shown in the sketch after this list.

  4. View the sampling rate of API requests.
    1. Log on to the ARMS console. On the Kubernetes monitoring page, click the name of the cluster.
    2. In the left-side navigation pane, click Cluster management. On the Cluster Configurations tab, view the sampling rate of API requests in the Trace section.

      By default, the sampling rate of API requests is 10%. If the number of API requests is small, the request details may not be displayed. To display the request details, you can increase the sampling rate.
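
  The ConfigMap check in step 3 can also be performed with kubectl, as in the following sketch. It assumes that the agent configuration is stored in a ConfigMap named otel-collector-config in the arms-prom namespace, as described above.

    # Check whether the otel-collector-config ConfigMap exists in the arms-prom namespace.
    kubectl get configmap otel-collector-config -n arms-prom

    # Optionally, inspect its contents.
    kubectl describe configmap otel-collector-config -n arms-prom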

What do I do if the API names in the node details panel of a topology are displayed as asterisks (*)?

Issue
In the node details panel of the Cluster Topology page, the API names are displayed as asterisks (*).
Solution

Each API name is stored in Prometheus Service as a label value. If a label has a large number of distinct values, dimensional divergence may occur and degrade the storage and query performance of Prometheus Service. To prevent this, Kubernetes Monitoring retains at most two levels of an API path. If a path contains more than 50 distinct API names, the API names are displayed as asterisks (*). Path segments that contain digits or file extensions, such as .png and .html, are also displayed as asterisks (*).
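
The following sketch is only a rough illustration of the kind of path convergence described above; it is not the actual implementation used by Kubernetes Monitoring, and the sample path and sed expressions are hypothetical.

  # Replace path segments that contain digits or a .png/.html extension with "*",
  # then keep only the first two path levels.
  echo "/api/v1/orders/12345/detail.html" \
    | sed -E 's#/[^/]*[0-9][^/]*#/*#g; s#/[^/]*\.(png|html)$#/*#g' \
    | cut -d/ -f1-3
  # Output: /api/*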