ACK Pro clusters expose Prometheus dashboards for five control plane components: kube-apiserver, etcd, kube-scheduler, cloud-controller-manager, and kube-controller-manager. Use these dashboards to monitor resource utilization, identify anomalies, and maintain cluster stability.
Prerequisites
Before you begin, ensure that you have:
-
An ACK Pro cluster running Kubernetes 1.16 or later
-
Application Real-Time Monitoring Service (ARMS) activated. For more information, see Activate ARMS
-
The ack-arms-prometheus component installed. For more information, see Manage components
View control plane component dashboards
-
Log on to the ACK console. In the left-side navigation pane, click Clusters.
-
On the Clusters page, find the cluster you want to manage and click its name. In the left-side pane, choose Operations > Prometheus Monitoring.
Best practices for accessing control plane components
For clusters with more than 100 nodes and a large number of Kubernetes resources, apply the following practices to maintain cluster stability.
-
Use Informer or Lister to retrieve data from the API server. This reduces the load on the API server and etcd.
-
Control how you list data. To retrieve data from the API server cache and avoid overloading etcd, add resourceVersion=0 to your request. To query etcd directly, use the limit option to paginate results.
-
Use Protobuf as the API serialization protocol. Protobuf uses less memory and network bandwidth than JSON. For more information, see Alternate representations of resources. The following example shows how to configure Protobuf in your client:
    // Build the client configuration from the master URL and kubeconfig path.
    kubeConfig, err := clientcmd.BuildConfigFromFlags(s.Master, s.Kubeconfig)
    if err != nil {
        return nil, err
    }
    // Accept Protobuf responses first and fall back to JSON.
    kubeConfig.AcceptContentTypes = strings.Join([]string{runtime.ContentTypeProtobuf, runtime.ContentTypeJSON}, ",")
    // Send request bodies as Protobuf.
    kubeConfig.ContentType = runtime.ContentTypeProtobuf
    client, err := clientset.NewForConfig(restclient.AddUserAgent(kubeConfig, "content-type-example"))
    ...
-
Delete idle Kubernetes resources promptly. Unused ConfigMaps, Secrets, and persistent volume claims (PVCs) can create pending pods. When the number of pending pods exceeds 1,000, the stability of kube-apiserver, kube-controller-manager, and kube-scheduler is affected.
-
Monitor CPU and memory utilization of control plane components. Sustained high resource usage can cause out-of-memory errors. If usage remains high, delete invalid resources, optimize client behavior, and separate the workloads in the cluster.
-
Review open source components that increase control plane load. Some components can overwhelm the Kubernetes API under heavy traffic. For example, Argo Workflows documents ways to avoid overwhelming the Kubernetes API when Argo is busy. For more information, see Running at massive scale.
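The guidance on controlling list requests can be illustrated with a minimal Go sketch that uses only the standard library. It builds the two kinds of list request paths described above: one that sets resourceVersion=0 so the API server can answer from its cache, and one that sets limit (and continue for follow-up pages) to paginate. The helper names cachedListPath and pagedListPath, the pods endpoint, and the page size of 500 are assumptions chosen for illustration, not values taken from ACK.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// podsPath returns the core API path for pods in a namespace.
func podsPath(namespace string) string {
	return "/api/v1/namespaces/" + url.PathEscape(namespace) + "/pods"
}

// cachedListPath (hypothetical helper) builds a list request that sets
// resourceVersion=0, which allows the API server to serve the result
// from its cache instead of reading from etcd.
func cachedListPath(namespace string) string {
	q := url.Values{}
	q.Set("resourceVersion", "0")
	return podsPath(namespace) + "?" + q.Encode()
}

// pagedListPath (hypothetical helper) builds a paginated list request.
// limit caps the number of items per page; continueToken is the value of
// metadata.continue from the previous page, or "" for the first page.
func pagedListPath(namespace string, limit int, continueToken string) string {
	q := url.Values{}
	q.Set("limit", strconv.Itoa(limit))
	if continueToken != "" {
		q.Set("continue", continueToken)
	}
	// Encode sorts parameters by key, so "continue" precedes "limit".
	return podsPath(namespace) + "?" + q.Encode()
}

func main() {
	fmt.Println(cachedListPath("default"))
	fmt.Println(pagedListPath("default", 500, ""))
	fmt.Println(pagedListPath("default", 500, "abc"))
}
```

In a real client, a loop would keep requesting pages until the response returns an empty continue token. Clients built on client-go would typically set the same options through its list options rather than assembling URLs by hand.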
What's next
Use the following reference topics to understand the metrics available in each dashboard and troubleshoot common anomalies.
| Control plane component | Dashboard | Description | Link |
|---|---|---|---|
| kube-apiserver | ACK Pro APIServer | Metrics supported by kube-apiserver, dashboard usage notes, and troubleshooting guidance for common metric anomalies | kube-apiserver |
| cloud-controller-manager | ACK Pro Cloud Controller Manager | Metrics supported by cloud-controller-manager, dashboard usage notes, and troubleshooting guidance for common metric anomalies | Metrics of cloud-controller-manager |
| etcd | ACK Pro ETCD | Metrics supported by etcd, dashboard usage notes, and troubleshooting guidance for common metric anomalies | Metrics of etcd |
| kube-controller-manager | ACK Pro Kube Controller Manager | Metrics supported by kube-controller-manager and dashboard usage notes | Metrics of kube-controller-manager |
| kube-scheduler | ACK Pro Scheduler | Metrics supported by kube-scheduler, dashboard usage notes, and troubleshooting guidance for common metric anomalies | Metrics of kube-scheduler |
| Custom Prometheus monitoring and alerting | Custom dashboard name | How to collect control plane metrics to a self-managed Prometheus instance and recommended alerting configurations | Use a self-managed Prometheus instance to collect metrics of control plane components and configure alerts |