
Container Service for Kubernetes: Manually restrict the permissions of a Worker RAM role for an ACK managed cluster

Last Updated: Feb 28, 2026

To improve the security of nodes in an ACK managed cluster, you can manually adjust the permissions of the RAM role assigned to worker nodes based on the principle of least privilege.

The restriction process is iterative: upgrade components to versions that support token-based authentication, collect audit logs to identify which permissions are in use, then progressively remove the permissions that are not needed.

Prerequisites

Before you begin, make sure that you have:

Step 1: Assess current permissions

Determine whether the Worker RAM role has excess permissions that need to be restricted.

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the target cluster. On the Basic Information tab, click the link next to Worker RAM Role to open the RAM console.

  3. On the Permissions tab of the Role page, check whether any access policies are attached.

    • If the list is empty, no restriction is needed. The Worker RAM role has no extra permissions.

    • If the list contains one or more policies (for example, k8sWorkerRolePolicy-db8ad5c7***), the role may have more permissions than necessary. Continue with Step 2.

Note

Review each attached policy to understand which Alibaba Cloud API actions it grants. This helps you set a baseline before making changes.
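One way to build that baseline is to save each policy document locally and list the actions it allows. The following is a minimal sketch, assuming the policy uses the usual RAM JSON layout with a Statement list whose entries carry Effect and Action fields. The sample policy and its action names are illustrative, not taken from a real cluster.

```python
import json


def granted_actions(policy_text):
    """Return the sorted list of actions granted by Allow statements."""
    policy = json.loads(policy_text)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # tolerate a single statement object
        statements = [statements]
    actions = set()
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        value = stmt.get("Action", [])
        if isinstance(value, str):  # Action may be a string or a list
            value = [value]
        actions.update(value)
    return sorted(actions)


# Illustrative policy document (assumed content, for demonstration only).
sample_policy = """
{
  "Version": "1",
  "Statement": [
    {"Effect": "Allow", "Action": ["ecs:DescribeInstances", "ecs:AttachDisk"], "Resource": "*"},
    {"Effect": "Allow", "Action": "log:GetProject", "Resource": "*"}
  ]
}
"""

baseline = granted_actions(sample_policy)
print(baseline)
```

Keeping this list alongside a copy of the original policy gives you something to compare against after each round of restriction.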

Step 2: Upgrade system components

Upgrade the system components in your cluster to the minimum required versions listed below. Newer component versions use token-based authentication through managed service roles instead of relying on the Worker RAM role directly.

Important

Upgrade components one at a time. Confirm that each upgrade succeeds before starting the next. Read the remarks for each component before upgrading.

For more information about managing components, see Manage components.

Components managed through component management

On the Component Management page, upgrade each installed component to at least the minimum version shown below. For components that are already at or above the minimum version, redeploy them with the command listed in the table. You can also redeploy components from the console.
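Deciding whether an installed component is "at or above the minimum version" can be scripted. The sketch below compares only the leading dotted-number part of each version string and ignores the commit hash and -aliyun suffix. That ordering rule is an assumption, not something the component documentation guarantees. The installed versions in the usage lines are hypothetical.

```python
import re


def numeric_prefix(version):
    """Extract the leading dotted numbers, e.g. 'v1.18.8.45-1c5d2cd1-aliyun' -> (1, 18, 8, 45)."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at or above the minimum version."""
    a, b = numeric_prefix(installed), numeric_prefix(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so that 1.2 compares equal to 1.2.0.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Hypothetical installed versions checked against minimums from the tables.
print(meets_minimum("1.1.9", "1.1.11"))
print(meets_minimum("v1.0.10.400-gabc1234-aliyun", "v1.0.10.333-gfd2b7b8-aliyun"))
```

A component that fails the check needs an upgrade. One that passes only needs the redeploy command.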

Monitoring and observability

| Component | Minimum version | Redeploy command | Remarks |
|---|---|---|---|
| metrics-server | v0.3.9.4-ff225cd-aliyun | kubectl -n kube-system rollout restart deployment/metrics-server | None |
| alicloud-monitor-controller | v1.5.5 | kubectl -n kube-system rollout restart deployment/alicloud-monitor-controller | None |
| arms-prometheus | 1.1.11 | kubectl -n arms-prom rollout restart deployment/arms-prometheus-ack-arms-prometheus | None |
| ack-cost-exporter | 1.0.10 | kubectl -n kube-system rollout restart deployment/ack-cost-exporter | Before upgrading, grant permissions for the managed cost role. |
| ack-node-problem-detector | 1.2.16 | kubectl -n kube-system rollout restart deployment/ack-node-problem-detector-eventer | None |

Networking

| Component | Minimum version | Redeploy command | Remarks |
|---|---|---|---|
| terway | v1.0.10.333-gfd2b7b8-aliyun | kubectl -n kube-system rollout restart daemonset/terway | Upgrade the Terway variant that matches your cluster's Terway mode. For more information, see Use the Terway network plug-in. After upgrading, check the Terway component configuration. See Check the Terway component configuration. |
| terway-eni | v1.0.10.333-gfd2b7b8-aliyun | kubectl -n kube-system rollout restart daemonset/terway-eni | See remarks for terway. |
| terway-eniip | v1.0.10.333-gfd2b7b8-aliyun | kubectl -n kube-system rollout restart daemonset/terway-eniip | See remarks for terway. |
| terway-controlplane | v1.2.1 | kubectl -n kube-system rollout restart deployment/terway-controlplane | None |
| mse-ingress-controller | 1.1.5 | kubectl -n mse-ingress-controller rollout restart deployment/ack-mse-ingress-controller | Before upgrading, grant permissions for the managed Microservices Engine (MSE) role. |

Storage

| Component | Minimum version | Redeploy command | Remarks |
|---|---|---|---|
| csi-plugin | v1.18.8.45-1c5d2cd1-aliyun | kubectl -n kube-system rollout restart daemonset/csi-plugin | None |
| csi-provisioner | v1.18.8.45-1c5d2cd1-aliyun | kubectl -n kube-system rollout restart deployment/csi-provisioner | None |
| storage-operator | v1.18.8.55-e398ce5-aliyun | kubectl -n kube-system rollout restart deployment/storage-auto-expander | None |
| | | kubectl -n kube-system rollout restart deployment/storage-cnfs | |
| | | kubectl -n kube-system rollout restart deployment/storage-monitor | |
| | | kubectl -n kube-system rollout restart deployment/storage-snapshot-manager | |
| | | kubectl -n kube-system rollout restart deployment/storage-operator | |
| alicloud-disk-controller | v1.14.8.51-842f0a81-aliyun | kubectl -n kube-system rollout restart deployment/alicloud-disk-controller | None |
| flexvolume | v1.14.8.109-649dc5a-aliyun | kubectl -n kube-system rollout restart daemonset/flexvolume | Migrate FlexVolume to CSI. |

Logging

| Component | Minimum version | Redeploy command | Remarks |
|---|---|---|---|
| logtail-ds | v1.0.29.1-0550501-aliyun | kubectl -n kube-system rollout restart daemonset/logtail-ds | None |
| | | kubectl -n kube-system rollout restart deployment/alibaba-log-controller | |
| loongcollector | v3.0.2 | kubectl -n kube-system rollout restart daemonset/loongcollector-ds | None |
| | | kubectl -n kube-system rollout restart deployment/loongcollector-operator | |

Security and image management

| Component | Minimum version | Redeploy command | Remarks |
|---|---|---|---|
| aliyun-acr-credential-helper | v23.02.06.2-74e2172-aliyun | kubectl -n kube-system rollout restart deployment/aliyun-acr-credential-helper | Before upgrading, grant permissions for the managed Container Registry (ACR) role. If you do not have custom RAM permissions and do not need to pull images across accounts, go to Component Management and set tokenMode to managedRole. If you do not need password-free image pulling for private images, uninstall this component. |
| ack-onepilot | 3.0.11 | kubectl -n ack-onepilot rollout restart deployment/ack-onepilot-ack-onepilot | Before upgrading, grant permissions for the managed MSE role. |

Cluster Autoscaler (installed through a node pool)

| Component | Minimum version | Redeploy command | Remarks |
|---|---|---|---|
| cluster-autoscaler | v1.3.1-bcf13de9-aliyun | kubectl -n kube-system rollout restart deployment/cluster-autoscaler | See the notes after this table. |

To check the current version of cluster-autoscaler, use either method:

    • Method 1: View the version in the ACK console. For more information, see How to view the version of the cluster-autoscaler component.

    • Method 2: Run the following command:

       kubectl -n kube-system get deployment/cluster-autoscaler -o yaml | grep acs/autoscaler

To upgrade, see [Component Upgrade] cluster-autoscaler Upgrade Announcement.

Check the Terway component configuration

If your cluster has the terway, terway-eni, or terway-eniip component installed, verify that the Terway ConfigMap contains the correct credential path.

  1. Run the following command to open the Terway ConfigMap for editing:

       kubectl edit cm eni-config -n kube-system

  2. In the eni_conf section, check for the following configuration item:

       "credential_path": "/var/addon/token-config",

    • If this line is present, no change is needed.

    • If this line is missing, add it below the min_pool_size configuration item.

  3. Redeploy the Terway component workload by running the corresponding redeploy command from the networking table above.
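The check in step 2 can also be run against a saved copy of the eni_conf value before you edit the live ConfigMap. This is a minimal sketch, assuming eni_conf is a JSON object. The sample keys other than min_pool_size and credential_path are illustrative placeholders, and the helper name ensure_credential_path is hypothetical.

```python
import json

CREDENTIAL_PATH = "/var/addon/token-config"  # value given in step 2 above


def ensure_credential_path(eni_conf_text):
    """Return (config, changed); add credential_path after min_pool_size if it is missing."""
    conf = json.loads(eni_conf_text)
    if "credential_path" in conf:
        return conf, False  # line already present, no change needed
    updated = {}
    for key, value in conf.items():
        updated[key] = value
        if key == "min_pool_size":
            updated["credential_path"] = CREDENTIAL_PATH
    # Fall back to appending at the end if min_pool_size is absent.
    updated.setdefault("credential_path", CREDENTIAL_PATH)
    return updated, True


# Illustrative eni_conf content (assumed, not copied from a real cluster).
sample_eni_conf = '{"version": "1", "max_pool_size": 5, "min_pool_size": 0, "security_group": "sg-example"}'

patched, changed = ensure_credential_path(sample_eni_conf)
print(changed)
print(json.dumps(patched, indent=2))
```

Key order has no effect on how JSON is parsed. The sketch places the entry after min_pool_size only to mirror the instruction above and keep the file easy to review.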

Step 3: Collect audit logs

Set up audit logging to track which Alibaba Cloud API operations the Worker RAM role performs. This data tells you which permissions are still in use and which can be safely removed.

Note

Collect audit logs for at least one week.

  1. Log on to the ActionTrail console.

  2. Create a single-account trail for the region where the cluster is located. When creating the trail, select Deliver Events To Simple Log Service (SLS). For more information, see Create a single-account trail.

For the full list of services that ActionTrail supports, see Supported Alibaba Cloud services.

Step 4: Test cluster features

After upgrading components and redeploying workloads, verify that core cluster features still work correctly.

| Feature area | Test case | Reference |
|---|---|---|
| Compute | Scale nodes out and in. | Manually scale a node pool |
| Network | Verify that IP addresses are assigned to pods. | Deployments and releases |
| Storage | Deploy workloads that use external storage (if applicable). | Storage - CSI |
| Monitoring | Confirm that monitoring and alert data is available. | Observability |
| Elasticity | Trigger node autoscaling (if applicable). | Enable node autoscaling |
| Security | Pull a private image without credentials (if applicable). | Install and use the unmanaged password-free component |

Important

In addition to the tests above, test the business logic of the applications deployed in your cluster. Confirm that application-level functionality works as expected.

Step 5: Analyze audit logs

After collecting logs for at least one week, query the data to identify which API operations the Worker RAM role performed.

  1. Log on to the Simple Log Service (SLS) console.

  2. In the Projects section, click the project you specified in Step 3.

  3. On the Log Storage > Logstores tab, click the destination Logstore. The Logstore is named actiontrail_<trail_name>.

  4. Run the following query to list the API operations called through the Security Token Service (STS) token of the Worker RAM role. Replace <worker_role_name> with the name of the Worker RAM role for your cluster. The results show which Alibaba Cloud services and API operations are being called. Use this data in Step 6 to decide which permissions to keep and which to remove.

       * and event.userIdentity.userName: <worker_role_name> | select "event.serviceName", "event.eventName", count(*) as total GROUP BY "event.eventName", "event.serviceName"
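If you export the query results for offline review, a short script can merge and rank them. The sketch below assumes each exported row is a mapping with the three column names that the query selects, and that values arrive as strings. The export format and the sample rows are assumptions for illustration.

```python
def used_operations(rows):
    """Merge rows into ((service, event), total) pairs, most frequently called first."""
    totals = {}
    for row in rows:
        key = (row["event.serviceName"], row["event.eventName"])
        totals[key] = totals.get(key, 0) + int(row["total"])
    # Sort by descending call count, then alphabetically for stable output.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Illustrative rows (assumed values, not real audit data).
sample_rows = [
    {"event.serviceName": "Ecs", "event.eventName": "DescribeDisks", "total": "7"},
    {"event.serviceName": "Ecs", "event.eventName": "DescribeInstances", "total": "42"},
    {"event.serviceName": "Vpc", "event.eventName": "DescribeVSwitches", "total": "7"},
]

for (service, event), total in used_operations(sample_rows):
    print(f"{service:<6} {event:<20} {total}")
```

Operations that never appear in a week of results are the first candidates for removal in Step 6.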

Step 6: Restrict permissions

Based on the audit log analysis, remove permissions that the Worker RAM role does not need.

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the target cluster. On the Basic Information tab, click the link next to Worker RAM Role to open the RAM console.

  3. On the Permissions tab of the Role page, click the name of the access policy that you want to edit to open its Policy Document tab. Click Edit Policy.

    Important

    Back up the existing policy document before you modify it. The backup lets you roll back the permission configuration if needed.

  4. Edit the policy document based on your requirements and the audit log analysis from Step 5. For example, delete the Action entries that do not appear in the query results. If you confirm that the role needs no permissions at all, you can revoke all granted access policies.

  5. Redeploy the system component workloads. Use the redeploy commands listed in Step 2.

  6. Repeat Steps 4, 5, and 6 until the Worker RAM role is granted only the minimum permissions required by the components or applications.
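The backup and pruning in steps 3 and 4 can be rehearsed offline before you paste anything into the console. This is a minimal sketch, assuming the policy follows the usual RAM JSON layout and that the used actions are already written as service:Action strings. Mapping audit log service and event names to that form is left to you, and matching is done case-insensitively here as a simplifying assumption. The sample policy and action names are illustrative.

```python
import copy
import fnmatch
import json


def prune_policy(policy, used_actions):
    """Return a copy of the policy that keeps only Allow actions matching a used action."""
    used = [action.lower() for action in used_actions]
    pruned = copy.deepcopy(policy)  # never mutate the original, which serves as the backup
    kept_statements = []
    for stmt in pruned.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            kept_statements.append(stmt)  # leave Deny statements untouched
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        # A policy action may contain wildcards, so treat it as the pattern.
        kept = [a for a in actions
                if any(fnmatch.fnmatchcase(u, a.lower()) for u in used)]
        if kept:
            stmt["Action"] = kept
            kept_statements.append(stmt)
    pruned["Statement"] = kept_statements
    return pruned


# Illustrative policy and usage data (assumed values).
sample_policy = {
    "Version": "1",
    "Statement": [
        {"Effect": "Allow", "Action": ["ecs:Describe*", "ecs:AttachDisk", "log:GetProject"], "Resource": "*"},
        {"Effect": "Allow", "Action": "oss:PutObject", "Resource": "*"},
    ],
}
used = ["ecs:DescribeInstances", "log:GetProject"]

backup_text = json.dumps(sample_policy, indent=2)  # save this before editing
pruned_policy = prune_policy(sample_policy, used)
print(json.dumps(pruned_policy, indent=2))
```

Treat the output as a proposal to review, not a final policy. An action that was not called during the collection window may still be needed by a rarely used feature, which is why the steps above are repeated after each change.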

References