The node scaling dashboard gives O&M engineers a single view of pod and node scaling activity — including real-time cluster state, historical trends, and event-level detail — so you can identify capacity problems and pinpoint the root cause of scaling failures without manually querying cluster logs.
Prerequisites
Before you begin, make sure that:
- The node scaling dashboard is enabled for your cluster. To enable it, submit a ticket.
- The Kubernetes event center is enabled for your cluster. For more information, see Event monitoring.
- The audit log feature is enabled for your cluster. For more information, see Use cluster auditing.
Dashboard layout
The node scaling dashboard consists of four areas: Overview, Pod details, Node details, and List of scaling activities.
Overview
The overview area shows five key metrics for quick cluster health assessment.
| Metric | What it shows |
|---|---|
| Total number of nodes | Total nodes in the cluster — indicates overall cluster capacity |
| Number of available nodes | Nodes in the KubeletReady state. If this is lower than the total, the remaining nodes are in the KubeletNotReady state, meaning they are either still being added to the cluster or have failed |
| Cluster scalability | Whether cluster-autoscaler can currently scale out. Displays NO when the number of nodes not in the Ready state exceeds the configured upper limit |
| Most recent scale-out activities | Count of scale-out activities in the selected time range |
| Most recent scale-in activities | Count of scale-in activities in the selected time range |
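The Cluster scalability metric can be thought of as a simple threshold check. The sketch below is a minimal illustration of that logic, assuming the dashboard compares the count of not-ready nodes against a configured upper limit; the function name and parameters are hypothetical, not part of the product.

```python
def cluster_scalable(total_nodes: int, ready_nodes: int, max_not_ready: int) -> bool:
    """Hypothetical sketch of the Cluster scalability check: scale-out is
    blocked (shown as NO on the dashboard) when the number of nodes not
    in the Ready state exceeds the configured upper limit."""
    not_ready = total_nodes - ready_nodes
    return not_ready <= max_not_ready

# 10 nodes, 7 ready, limit of 2 not-ready nodes: 3 not-ready exceeds the limit
print(cluster_scalable(10, 7, 2))  # False
# 10 nodes, 9 ready: 1 not-ready is within the limit
print(cluster_scalable(10, 9, 2))  # True
```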
Pod details
| Chart | What it shows |
|---|---|
| Unschedulable pod trend | Number of pods in the Pending state over time — an increase typically signals that the cluster needs to scale out |
| Evicted pod trend | Number of evicted pods over time — a spike indicates that resource consumption on a node has reached its threshold |
Node details
| Chart | What it shows |
|---|---|
| Node status trend | Total nodes, KubeletReady nodes, and KubeletNotReady nodes over time. Nodes added within the last 10 minutes are excluded from the KubeletNotReady count |
| Node scale-out trend | Scale-out activity over time. Each data point maps to the number of ScaledUpGroup events generated — one event per scale-out action by cluster-autoscaler |
| Node scale-in trend | Scale-in activity over time. Each data point maps to the number of ScaleDown events generated — one event per scale-in action by cluster-autoscaler |
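The 10-minute exclusion in the node status trend matters when reading the chart: a freshly added node is still bootstrapping and should not be read as a failure. A minimal sketch of that counting rule, assuming each node exposes a creation time and a ready flag (the data shape here is hypothetical):

```python
from datetime import datetime, timedelta

def not_ready_count(nodes, now, grace=timedelta(minutes=10)):
    """Hypothetical sketch of the KubeletNotReady count in the node status
    trend: nodes added within the last 10 minutes are still bootstrapping,
    so they are excluded rather than reported as NotReady.
    `nodes` is a list of (created_at, is_ready) tuples (assumed shape)."""
    return sum(
        1 for created_at, is_ready in nodes
        if not is_ready and now - created_at >= grace
    )

now = datetime(2024, 1, 1, 12, 0)
nodes = [
    (datetime(2024, 1, 1, 11, 0), True),    # ready: not counted
    (datetime(2024, 1, 1, 11, 0), False),   # not ready for an hour: counted
    (datetime(2024, 1, 1, 11, 55), False),  # added 5 minutes ago: excluded
]
print(not_ready_count(nodes, now))  # 1
```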
List of scaling activities
The scaling activity list displays all scaling-related events in chronological order. Search by pod name, node name, or event type to locate specific activities and inspect their details.
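The search described above amounts to filtering events by pod name, node name, or event type and sorting them by time. A minimal sketch, assuming each event is a dict with `time`, `type`, and optional `pod`/`node` keys (this shape is an assumption for illustration; the real event schema may differ):

```python
def search_activities(events, *, pod=None, node=None, event_type=None):
    """Hypothetical sketch of the scaling activity list's search: filter
    events by pod name, node name, or event type, newest first."""
    hits = [
        e for e in events
        if (pod is None or e.get("pod") == pod)
        and (node is None or e.get("node") == node)
        and (event_type is None or e["type"] == event_type)
    ]
    return sorted(hits, key=lambda e: e["time"], reverse=True)

events = [
    {"time": 1, "type": "ScaledUpGroup", "node": "node-a"},
    {"time": 2, "type": "NotTriggerScaleUp", "pod": "web-1"},
    {"time": 3, "type": "ScaleDown", "node": "node-a"},
]
print(search_activities(events, node="node-a"))  # ScaleDown first, then ScaledUpGroup
```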
Identify issues
Check for abnormal nodes: Compare Total number of nodes with Number of available nodes. If they differ, some nodes are in an abnormal state and need attention.
Assess cluster sizing: Online workloads fluctuate between peak and off-peak hours, and auto scaling is designed to follow that pattern. Open the Node details area, select a time range that covers a recent peak period, and compare the scaling trend against your workload history. If the cluster failed to scale as expected, review your auto scaling configuration.
Troubleshoot scaling failures
Pending pods exist but nodes are not scaling out
1. Check the Cluster scalability metric in the Overview area.
   - If it shows NO, cluster-autoscaler is blocked from scaling out. Troubleshoot the cluster state before continuing.
   - If it shows YES, move to the next step.
2. In the List of scaling activities, search for the pod name or the NotTriggerScaleUp event.
3. Inspect the reason field to identify why the scale-out was not triggered.
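The last two steps reduce to finding the latest matching NotTriggerScaleUp event and reading its reason. A minimal sketch, assuming the same illustrative event shape as above (the `reason` text shown is a placeholder, not an exact cluster-autoscaler message):

```python
def scale_out_block_reason(events, pod_name):
    """Hypothetical sketch: find the latest NotTriggerScaleUp event for a
    pod and return its reason field (assumed event shape)."""
    matches = [
        e for e in events
        if e["type"] == "NotTriggerScaleUp" and e.get("pod") == pod_name
    ]
    if not matches:
        return None
    latest = max(matches, key=lambda e: e["time"])
    return latest["reason"]

events = [
    {"time": 1, "type": "NotTriggerScaleUp", "pod": "web-1",
     "reason": "no node group matched the pod's requests"},  # illustrative reason
    {"time": 2, "type": "ScaledUpGroup", "node": "node-b"},
]
print(scale_out_block_reason(events, "web-1"))
```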
Scale-out triggered but failed to complete
1. In the List of scaling activities, search for the FailedToScaleUpGroup event.
2. Inspect the reason field to identify what prevented cluster-autoscaler from completing the scale-out.
Determine when a scale-out was triggered
In the List of scaling activities, search for the pod name or the NotTriggerScaleUp event, then check the event timestamp.
Determine when a scale-in was triggered
In the List of scaling activities, search for the node name or the ScaleDown event, then check the event timestamp.
Scale-in failed
In the List of scaling activities, search for the node name or the ScaleDownFailed event, then inspect the reason field.
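When scale-in failures recur, tallying the reason fields across ScaleDownFailed events can expose a pattern faster than reading them one by one. A minimal sketch under the same assumed event shape (the reason text is a placeholder for illustration):

```python
from collections import Counter

def scale_in_failure_summary(events, node_name=None):
    """Hypothetical sketch: tally ScaleDownFailed reasons, optionally for a
    single node, to spot recurring scale-in blockers (assumed event shape)."""
    reasons = [
        e["reason"] for e in events
        if e["type"] == "ScaleDownFailed"
        and (node_name is None or e.get("node") == node_name)
    ]
    return Counter(reasons)

events = [
    {"time": 1, "type": "ScaleDownFailed", "node": "node-a",
     "reason": "pods on node cannot be evicted"},  # illustrative reason
    {"time": 2, "type": "ScaleDownFailed", "node": "node-b",
     "reason": "pods on node cannot be evicted"},  # illustrative reason
    {"time": 3, "type": "ScaleDown", "node": "node-c"},
]
print(scale_in_failure_summary(events))  # the same reason appears twice
```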