This topic summarizes common operations for managing worker nodes in the Container Service for Kubernetes (ACK) console and describes the relevant usage notes.
Most operations are accessible on the Nodes page.
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of the cluster that you want to manage. In the left navigation pane, choose Nodes > Nodes.
Node logon
For scenarios such as node troubleshooting, performance monitoring, or running custom scripts, you can log on to the ECS instance that corresponds to the node.
Workbench connection: In the Actions column of the node list, choose the Workbench logon option.
VNC connection: In the Actions column of the node list, choose the VNC logon option.
For additional remote connection methods to ECS instances, see Methods for connecting to an ECS instance.
If the node runs ContainerOS, direct logon is not supported: to mitigate security risks, ContainerOS does not allow untraceable operations and does not provide SSH. For necessary maintenance operations, see Maintain ContainerOS nodes.
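If a node does allow SSH logon, you can also script the connection instead of using the console. The following is a minimal sketch that uses the third-party paramiko library; the host IP address, logon account, and key path are placeholders, and network access to the node is assumed.

    # Minimal sketch: run a diagnostic command on a node over SSH with paramiko.
    # Host, user, and key path below are placeholders.
    import os
    import paramiko

    HOST = "192.168.0.10"                            # node IP address (placeholder)
    USER = "root"                                    # logon account (placeholder)
    KEY_FILE = os.path.expanduser("~/.ssh/id_rsa")   # private key path (placeholder)

    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(HOST, username=USER, key_filename=KEY_FILE)

    # Run a simple health check on the node.
    _, stdout, _ = ssh.exec_command("uptime && df -h /")
    print(stdout.read().decode())
    ssh.close()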
Node draining and scheduling status
Node draining
In the Actions column of the node list, choose Drain and follow the on-screen prompts to drain the node. Draining evicts the existing pods from the node and marks the node as unschedulable so that no new pods are scheduled to it. For an API-level equivalent, see the sketch after the precautions below.
Take note of the following precautions:
Ensure that the other nodes in the cluster have sufficient resources. Otherwise, the evicted application pods may become unschedulable.
Check the node affinity rules and scheduling policies of the pods on the node to be drained to ensure that these pods can still be scheduled to other nodes after the drain.
Pods managed by DaemonSet will not be evicted.
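For reference, the following is a minimal sketch of the same flow through the Kubernetes API, using the official Python client and assuming kubectl access to the cluster. The node name is a placeholder. The console's Drain action (like kubectl drain) handles more edge cases, such as retrying evictions that are temporarily blocked by PodDisruptionBudgets.

    # Minimal drain sketch: cordon the node, then evict all non-DaemonSet pods.
    from kubernetes import client, config

    NODE = "cn-hangzhou.192.168.0.10"   # example node name (placeholder)

    config.load_kube_config()
    v1 = client.CoreV1Api()

    # 1. Mark the node as unschedulable (cordon).
    v1.patch_node(NODE, {"spec": {"unschedulable": True}})

    # 2. Evict every pod on the node except pods managed by a DaemonSet.
    pods = v1.list_pod_for_all_namespaces(field_selector=f"spec.nodeName={NODE}")
    for pod in pods.items:
        owners = pod.metadata.owner_references or []
        if any(o.kind == "DaemonSet" for o in owners):
            continue  # DaemonSet pods are not evicted
        eviction = client.V1Eviction(metadata=client.V1ObjectMeta(
            name=pod.metadata.name, namespace=pod.metadata.namespace))
        v1.create_namespaced_pod_eviction(
            pod.metadata.name, pod.metadata.namespace, eviction)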
Change node scheduling status
In the node list, select the desired node and click Set Scheduling Status at the bottom of the page. Read the precautions in the dialog box carefully and follow the on-screen prompts to complete the operation. An API-level sketch follows the precautions below.
Take note of the following precautions:
Perform this operation during off-peak hours because it may affect your workloads.
After a node is set to unschedulable, it is marked as SchedulingDisabled. Existing pods on the node continue to serve traffic, but new pods are not scheduled to the node.
Pods managed by DaemonSet will not be removed.
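For reference, the following is a minimal sketch of toggling the scheduling status through the Kubernetes API, using the official Python client and assuming kubectl access to the cluster; it corresponds to kubectl cordon and kubectl uncordon. The node name is a placeholder.

    # Minimal sketch: toggle a node's scheduling status.
    from kubernetes import client, config

    NODE = "cn-hangzhou.192.168.0.10"   # example node name (placeholder)

    config.load_kube_config()
    v1 = client.CoreV1Api()

    # Set the node to unschedulable; it is then shown as SchedulingDisabled.
    v1.patch_node(NODE, {"spec": {"unschedulable": True}})

    # Restore the node to a schedulable state.
    v1.patch_node(NODE, {"spec": {"unschedulable": False}})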
Node removal
If you no longer require a worker node, you can remove it from the node pool or cluster in the ACK console during off-peak hours. In the Actions column of the node list, choose Remove, or select the node and click Batch Remove in the lower part of the page. Then, follow the on-screen prompts to complete the operation.
For related precautions and feature details, see Remove a node.
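For reference only, the following sketch deletes just the Node API object through the Kubernetes API (official Python client, assuming kubectl access; the node name is a placeholder). Unlike the console's remove flow, it does not release or retain the underlying ECS instance and does not update the node pool, so the procedure above remains the recommended path.

    # Delete only the Kubernetes Node object; the ECS instance is not affected.
    from kubernetes import client, config

    NODE = "cn-hangzhou.192.168.0.10"   # example node name (placeholder)

    config.load_kube_config()
    client.CoreV1Api().delete_node(NODE)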
Node monitoring
Click Monitor in the Actions column to install the monitoring component, enable Managed Service for Prometheus, and view the node resource dashboards. For information about how to configure monitoring and alerting with Managed Service for Prometheus, see Connect to and configure Managed Service for Prometheus.
For creating custom PromQL alert rules for abnormal node status, see Best practices for configuring alert rules using Prometheus.
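As an illustration of the kind of query that such alert rules and dashboards are built on, the following sketch runs a node CPU usage query against the standard Prometheus HTTP API. The endpoint URL is a placeholder for your Managed Service for Prometheus query endpoint (authentication may also be required), and the metric assumes a node_exporter-style exporter is installed.

    # Minimal sketch: query node CPU usage (%) over the Prometheus HTTP API.
    import requests

    PROM_URL = "https://your-prometheus-endpoint/api/v1/query"   # placeholder
    QUERY = ('100 * (1 - avg by (instance) '
             '(rate(node_cpu_seconds_total{mode="idle"}[5m])))')

    resp = requests.get(PROM_URL, params={"query": QUERY}, timeout=10)
    resp.raise_for_status()
    for result in resp.json()["data"]["result"]:
        print(result["metric"].get("instance"), result["value"][1])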
Node fault diagnosis
To diagnose an abnormal node, click Exception Diagnosis in the Actions column. This initiates an inspection and provides a repair plan. For details about the supported diagnostic scenarios, inspection items, and repair plans, see Node diagnostics.
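Before or alongside a diagnosis, a quick check of the node's conditions and recent events can help narrow down the problem. The following is a minimal sketch through the Kubernetes API, using the official Python client and assuming kubectl access to the cluster; the node name is a placeholder.

    # Minimal sketch: print node conditions and recent node events.
    from kubernetes import client, config

    NODE = "cn-hangzhou.192.168.0.10"   # example node name (placeholder)

    config.load_kube_config()
    v1 = client.CoreV1Api()

    # Node conditions such as Ready, MemoryPressure, and DiskPressure.
    for cond in v1.read_node(NODE).status.conditions:
        print(cond.type, cond.status, cond.reason or "")

    # Recent events that reference this node.
    events = v1.list_event_for_all_namespaces(
        field_selector=f"involvedObject.kind=Node,involvedObject.name={NODE}")
    for ev in events.items:
        print(ev.last_timestamp, ev.reason, ev.message)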
Manage node labels and taints
To manage and schedule cluster resources via labels and taints, navigate to the Nodes page, click Manage Labels and Taints, and follow the guide to configure names and values for labels and taints. For more information, see Manage node labels and taints.
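For reference, the following is a minimal sketch of adding a label and a taint through the Kubernetes API, using the official Python client and assuming kubectl access to the cluster; it corresponds to kubectl label and kubectl taint. The node name, keys, and values are placeholders.

    # Minimal sketch: add a label and append a taint to a node.
    from kubernetes import client, config

    NODE = "cn-hangzhou.192.168.0.10"   # example node name (placeholder)

    config.load_kube_config()
    v1 = client.CoreV1Api()

    # Add (or update) a label on the node.
    v1.patch_node(NODE, {"metadata": {"labels": {"workload-type": "gpu"}}})

    # Read the existing taints, append a new one, and patch the full list back.
    node = v1.read_node(NODE)
    taints = node.spec.taints or []
    taints.append(client.V1Taint(key="dedicated", value="gpu", effect="NoSchedule"))
    v1.patch_node(NODE, {"spec": {"taints": taints}})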
View node information
In the Actions column of the node list, you can view the YAML template of the node.
In the Actions column of the node list, click Details to view the following node information.
CPU and memory usage
CPU request = sum(requested CPU resources by all pods on the node)/total CPU resources on the node
CPU utilization = sum(used CPU resources by all pods on the node)/total CPU resources on the node
Memory request = sum(requested memory resources by all pods on the node)/total memory resources on the node
Memory utilization = sum(used memory resources by all pods on the node)/total memory resources on the node
Note: Allocatable resources = resource capacity - reserved resources - eviction threshold. For more details, see Resource reservation policy.
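The following sketch computes the CPU and memory request ratios described above through the Kubernetes API, using the official Python client and assuming kubectl access to the cluster. It is deliberately simplified: it parses only plain, m, and Ki/Mi/Gi resource quantities, ignores init containers, and divides by the node's allocatable resources; the node name is a placeholder.

    # Minimal sketch: sum pod requests on a node and compare to allocatable.
    from kubernetes import client, config

    NODE = "cn-hangzhou.192.168.0.10"   # example node name (placeholder)

    def cpu_millicores(q):                       # "500m" -> 500, "2" -> 2000
        return int(q[:-1]) if q.endswith("m") else int(float(q) * 1000)

    def mem_bytes(q):                            # "512Mi" / "2Gi" -> bytes
        for suffix, factor in {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}.items():
            if q.endswith(suffix):
                return int(float(q[:-2]) * factor)
        return int(q)

    config.load_kube_config()
    v1 = client.CoreV1Api()

    alloc = v1.read_node(NODE).status.allocatable
    pods = v1.list_pod_for_all_namespaces(field_selector=f"spec.nodeName={NODE}")

    cpu_req = mem_req = 0
    for pod in pods.items:
        for c in pod.spec.containers:
            req = (c.resources.requests or {}) if c.resources else {}
            cpu_req += cpu_millicores(req.get("cpu", "0"))
            mem_req += mem_bytes(req.get("memory", "0"))

    print("CPU request ratio:   ", cpu_req / cpu_millicores(alloc["cpu"]))
    print("Memory request ratio:", mem_req / mem_bytes(alloc["memory"]))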
Basic node information
Includes the node name, IP address, instance ID, container runtime version, operating system, kernel version, and more.
Other information
Details of node CPU and memory resource allocation (Request and Limit), node status, pod list, node events, and more.
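For reference, the following sketch reads similar node details and the full YAML template through the Kubernetes API, using the official Python client (plus PyYAML for output) and assuming kubectl access to the cluster; the node name is a placeholder.

    # Minimal sketch: print basic node information and the node's YAML.
    import yaml
    from kubernetes import client, config

    NODE = "cn-hangzhou.192.168.0.10"   # example node name (placeholder)

    config.load_kube_config()
    v1 = client.CoreV1Api()
    node = v1.read_node(NODE)

    info = node.status.node_info
    print("Runtime:", info.container_runtime_version)
    print("OS:     ", info.os_image)
    print("Kernel: ", info.kernel_version)
    print("IPs:    ", [a.address for a in node.status.addresses])
    print("Provider ID:", node.spec.provider_id)  # typically includes the ECS instance ID

    # Full YAML template of the node, similar to the console's YAML view.
    print(yaml.safe_dump(client.ApiClient().sanitize_for_serialization(node)))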
Batch operations on nodes
You can perform batch operations on worker nodes in your cluster to improve O&M efficiency. Common use cases include securely updating the OS kernel or installing custom software packages for monitoring, security, or auditing. Before using this feature, you must activate CloudOps Orchestration Service (OOS) in the OOS console. OOS enables task automation through the execution of predefined templates.
This feature is not supported on clusters with Auto Mode enabled.
On the Nodes page of your cluster, select the target worker nodes from the list.
Below the node list, click Batch Operations.
In the dialog box that appears, select the desired operation and click OK.
Supported operations include:
Install operating system kernel security updates
Install custom packages
Install or uninstall YUM or APT packages
Run Shell scripts
You will be automatically redirected to the OOS console. Refer to Create an execution and follow the on-screen prompts to configure the basic information and required parameters for the task, and then click Create to submit the execution.
After submission, you will be automatically redirected to the Task Execution Management page in the OOS console. Click the Execution ID of your task to monitor its status, review the individual steps, and see the results.
For more information about managing executions in OOS, see Overview.
References
You can use the resource profiling feature provided by ACK to get resource configuration suggestions for containers based on the historical data of resource usage. This simplifies the configuration of resource requests and limits for containers. For more information, see Resource profiling.
For more information about how to configure resources for application pods, see Create a stateless workload (Deployment).
To configure node labels and a node selector to schedule application pods to specific nodes, see Schedule application pods to the specified node.
For guidance on scaling up or down worker node resources, see Upgrade or downgrade the configurations of a worker node.
To add a data disk to a node for storing resources like the container runtime and kubelet, see Attach data disks to nodes.
For more information about how to resize the data disk or system disk, see Resize the system disk or data disk of a node.
Node upgrades, including kubelet and runtime versions, are managed at the node pool level. For more information, see Update a node pool.