After a scale-in operation in a registered cluster, nodes can remain in the NotReady state and waste cluster resources. Install the cloud-node-controller component to automatically detect and remove these nodes.
Prerequisites
Before you begin, ensure that you have:
- A standard node pool that has been created and scaled out. For more information, see Create and Manage Node Pools.
- kubectl connected to the registered cluster. For more information, see Connect to a Cluster by Using kubectl.
Step 1: Configure permissions for cloud-node-controller
Before installing the component, create a RAM user, grant the required policy, and store the credentials as a Kubernetes Secret.
- Create a RAM user and grant the following custom policy. For more information, see Grant RAM Permissions to a RAM User.

      {
        "Version": "1",
        "Statement": [
          {
            "Action": [
              "ecs:DescribeInstances"
            ],
            "Resource": [
              "*"
            ],
            "Effect": "Allow"
          }
        ]
      }

  The ecs:DescribeInstances permission allows the component to query ECS instance details to determine node health status.
- Set the AccessKey ID and AccessKey Secret of the RAM user as environment variables.

      export ACCESS_KEY_ID=<ACCESS KEY ID>
      export ACCESS_KEY_SECRET=<ACCESS KEY SECRET>

  Replace <ACCESS KEY ID> and <ACCESS KEY SECRET> with the credentials of the RAM user you created.
- Create a Secret named alibaba-addon-secret in the kube-system namespace.

      kubectl -n kube-system create secret generic alibaba-addon-secret \
        --from-literal="access-key-id=${ACCESS_KEY_ID}" \
        --from-literal="access-key-secret=${ACCESS_KEY_SECRET}"

  Note: If the Secret already exists, grant the custom policy to the RAM user associated with the existing Secret instead of creating a new one.
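The last two items of this step can be combined into a small guard that refuses to run with empty credentials and does not try to recreate an existing Secret. The following is a minimal sketch only: the function names check_credentials and ensure_secret are hypothetical helpers introduced for illustration and are not part of the product.

```shell
# Hypothetical helper: report whether both credential variables are set.
check_credentials() {
  if [ -z "${ACCESS_KEY_ID:-}" ] || [ -z "${ACCESS_KEY_SECRET:-}" ]; then
    echo "missing"
    return 1
  fi
  echo "ok"
}

# Hypothetical helper: create alibaba-addon-secret only if it is absent.
ensure_secret() {
  if ! check_credentials >/dev/null; then
    echo "Set ACCESS_KEY_ID and ACCESS_KEY_SECRET first" >&2
    return 1
  fi
  # `kubectl get secret` exits non-zero when the Secret does not exist.
  if kubectl -n kube-system get secret alibaba-addon-secret >/dev/null 2>&1; then
    echo "alibaba-addon-secret already exists; grant the custom policy to its RAM user instead"
  else
    kubectl -n kube-system create secret generic alibaba-addon-secret \
      --from-literal="access-key-id=${ACCESS_KEY_ID}" \
      --from-literal="access-key-secret=${ACCESS_KEY_SECRET}"
  fi
}
```

After exporting the two variables, call ensure_secret once. If it reports that the Secret already exists, follow the note above rather than deleting and recreating the Secret.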
Step 2: Install the cloud-node-controller component
- Log on to the Container Service Management Console. In the left navigation pane, click Clusters.
- On the Clusters page, click the name of your cluster. In the left navigation pane, click Add-ons.
- On the Add-ons page, go to the Core Components tab and find cloud-node-controller. Click Install in the lower-right corner of the card.
Step 3: Verify the component status
Run the following command to confirm the component is running.
kubectl get pods -n kube-system | grep cloud-node-controller
The expected output is similar to:
cloud-node-controller-abcXXX 1/1 Running 0 5m
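If you want to script this check instead of reading the output by eye, one option is to parse the READY and STATUS columns. This is a minimal sketch that assumes the default column layout of kubectl get pods (NAME, READY, STATUS, ...); controller_ready is a hypothetical helper name.

```shell
# Hypothetical helper: reads `kubectl get pods` output on stdin and succeeds
# only if a cloud-node-controller pod is Running with all containers ready.
controller_ready() {
  awk '
    $1 ~ /^cloud-node-controller/ {
      split($2, r, "/")                       # READY column, e.g. "1/1"
      if ($3 == "Running" && r[1] == r[2]) ok = 1
    }
    END { exit ok ? 0 : 1 }
  '
}
```

For example, `kubectl get pods -n kube-system | controller_ready && echo "component is running"` prints the message only when the pod has reached the state shown in the expected output.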
Once the component is running, it automatically detects and removes NotReady nodes caused by scale-in operations.
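To confirm that no stale nodes are left after a scale-in, you can filter the node list for the NotReady status. The sketch below assumes the default kubectl get nodes column layout (NAME, STATUS, ...); list_notready is a hypothetical helper name, and this document does not state how long the component takes to remove a node.

```shell
# Hypothetical helper: reads `kubectl get nodes` output on stdin and prints
# the names of nodes whose STATUS column contains NotReady.
list_notready() {
  # NR > 1 skips the header row; STATUS may be "NotReady" or
  # a combined value such as "NotReady,SchedulingDisabled".
  awk 'NR > 1 && $2 ~ /NotReady/ { print $1 }'
}
```

Run it as `kubectl get nodes | list_notready`. Empty output means no node is currently in the NotReady state.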