To improve the security of nodes in an ACK managed cluster, you can manually adjust the permissions of the RAM role assigned to worker nodes based on the principle of least privilege.
Prerequisites
An ACK managed cluster (ACK Managed Cluster Pro or ACK Managed Cluster Basic) of version 1.18 or later is created. For more information, see Create an ACK managed cluster and Manually upgrade a cluster.
If you want to restrict the permissions for an ACK dedicated cluster, migrate it to an ACK Managed Cluster Pro. For more information, see Hot migrate an ACK dedicated cluster to an ACK Managed Cluster Pro.
The default service roles that the ACK managed cluster requires are granted. For more information, see Grant permissions to service roles with one click.
Step 1: Confirm whether restriction is needed
Log on to the ACK console. In the navigation pane on the left, click Clusters.
On the Clusters page, click the name of the target cluster. On the Basic Information tab, click the link next to Worker RAM Role to open the RAM console.
On the Permissions tab of the Role page, check whether any access policies exist.
If the list is empty, no restriction is needed.
If the list is not empty, for example, if it contains k8sWorkerRolePolicy-db8ad5c7***, the permissions of the Worker RAM role may need to be restricted. You can determine whether to proceed based on your business scenario and the principle of least privilege.
Step 2: Upgrade system components
The core system components installed in the ACK managed cluster must be upgraded to the required minimum version or the latest version. For more information, see Manage components.
Do not upgrade multiple components at the same time. Upgrade components one by one. Make sure that a component is successfully upgraded before you upgrade the next one.
Before you upgrade a component, read the remarks for the component.
Components are installed in two ways: through component management or through node pools. The requirements and upgrade methods are described below.
Components installed through component management
On the Component Management page, use the following table to upgrade the installed components in the cluster to the required minimum version or the latest version. For components that do not need to be upgraded, you must redeploy them using the redeploy command in the following table. You can also redeploy the components in the console.
Component Name | Minimum Component Version | Command to Redeploy the Component | Remarks |
metrics-server | v0.3.9.4-ff225cd-aliyun | | None |
alicloud-monitor-controller | v1.5.5 | | None |
logtail-ds | v1.0.29.1-0550501-aliyun | | None |
loongcollector | v3.0.2 | | None |
terway | v1.0.10.333-gfd2b7b8-aliyun | | |
terway-eni | v1.0.10.333-gfd2b7b8-aliyun | | |
terway-eniip | v1.0.10.333-gfd2b7b8-aliyun | | |
terway-controlplane | v1.2.1 | | None |
flexvolume | v1.14.8.109-649dc5a-aliyun | | |
csi-plugin | v1.18.8.45-1c5d2cd1-aliyun | | None |
csi-provisioner | v1.18.8.45-1c5d2cd1-aliyun | | None |
storage-operator | v1.18.8.55-e398ce5-aliyun | | None |
alicloud-disk-controller | v1.14.8.51-842f0a81-aliyun | | None |
ack-node-problem-detector | 1.2.16 | | None |
aliyun-acr-credential-helper | v23.02.06.2-74e2172-aliyun | | Before you upgrade the component, you must first grant permissions. |
ack-cost-exporter | 1.0.10 | | Before you upgrade the component, you must first grant permissions. |
mse-ingress-controller | 1.1.5 | | Before you upgrade the component, you must first grant permissions. |
arms-prometheus | 1.1.11 | | None |
ack-onepilot | 3.0.11 | | Before you upgrade the component, you must first grant permissions. |
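The version check against the table above can be sketched in Python. This is a minimal illustration, not an official tool: it assumes you have already collected the installed versions yourself (for example, from the Component Management page), and it compares only the leading dotted number of each version string, ignoring the commit hash and the -aliyun suffix. The helper names and the MINIMUM_VERSIONS subset are hypothetical.

```python
import re

# A few minimum versions copied from the table above (not the full list).
MINIMUM_VERSIONS = {
    "metrics-server": "v0.3.9.4-ff225cd-aliyun",
    "terway-eniip": "v1.0.10.333-gfd2b7b8-aliyun",
    "ack-node-problem-detector": "1.2.16",
}


def numeric_prefix(version: str) -> tuple:
    """Extract the leading dotted number, e.g. 'v1.0.10.333-gfd...' -> (1, 0, 10, 333)."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare numeric prefixes, padding the shorter one with zeros."""
    a, b = numeric_prefix(installed), numeric_prefix(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def components_to_upgrade(installed: dict) -> list:
    """Return the names of known components that are below their minimum version."""
    return sorted(
        name
        for name, version in installed.items()
        if name in MINIMUM_VERSIONS
        and not meets_minimum(version, MINIMUM_VERSIONS[name])
    )
```

Components that this check does not flag still need to be redeployed, as described above.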
cluster-autoscaler component installed through a node pool
Component Name | Minimum Component Version | Command to Redeploy the Component | Remarks |
cluster-autoscaler | v1.3.1-bcf13de9-aliyun | | You can view the version of the cluster-autoscaler component in the following two ways. To upgrade the component version, see [Component Upgrade] cluster-autoscaler Upgrade Announcement. |
Check the Terway component configuration
If the terway, terway-eni, or terway-eniip component is installed in your cluster, you must also manually check the Terway configuration file. To do this, check the content of the eni_conf configuration in the ConfigMap named eni-config in the kube-system namespace.
Run the following command to edit and view the Terway ConfigMap.
kubectl edit cm eni-config -n kube-system
If the file contains the configuration item "credential_path": "/var/addon/token-config", no further action is required.
If the file does not contain the configuration item "credential_path": "/var/addon/token-config", you must manually modify the eni_conf configuration. Add the following line below the min_pool_size configuration item.
"credential_path": "/var/addon/token-config",
Redeploy the Terway component workload by running the corresponding deployment command.
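The eni_conf check can also be expressed as a short Python sketch. It assumes that the eni_conf value is a JSON object and that you have copied it out of the eni-config ConfigMap as a string. The function name ensure_credential_path is hypothetical, and the sketch only produces the updated text; it does not write anything back to the cluster.

```python
import json

CREDENTIAL_KEY = "credential_path"
CREDENTIAL_VALUE = "/var/addon/token-config"


def ensure_credential_path(eni_conf: str):
    """Return (eni_conf_text, changed) with credential_path set as required.

    Assumes eni_conf is a JSON object. The key is placed directly below
    min_pool_size, or appended at the end if min_pool_size is absent.
    """
    conf = json.loads(eni_conf)
    if conf.get(CREDENTIAL_KEY) == CREDENTIAL_VALUE:
        return eni_conf, False  # already configured, nothing to do

    updated = {}
    for key, value in conf.items():
        if key == CREDENTIAL_KEY:
            continue  # drop a value that differs from the required one
        updated[key] = value
        if key == "min_pool_size":
            updated[CREDENTIAL_KEY] = CREDENTIAL_VALUE
    updated.setdefault(CREDENTIAL_KEY, CREDENTIAL_VALUE)
    return json.dumps(updated, indent=2), True
```

Review the returned text before you paste it into the ConfigMap, then redeploy the Terway workload as described above.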
Step 3: Collect audit logs
You must collect API operation audit logs to analyze the logs generated by the test cluster. This helps you check whether any applications in the cluster still depend on the access policies granted to the Worker RAM role. For more information about the Alibaba Cloud services that ActionTrail supports, see Supported Alibaba Cloud services.
Collect audit logs for at least one week.
In the ActionTrail console, create a single-account trail for the region where the cluster is located. When you create the trail, select Deliver Events To Simple Log Service (SLS). For more information, see Create a single-account trail.
Step 4: Test cluster features
After you complete the restriction operations, you must test the basic features of the cluster to ensure that the system components work as expected.
Basic Feature | Basic Test Case | References |
Compute | Nodes can be scaled out and in as expected. | |
Network | IP addresses can be assigned to pods as expected. | |
Storage | Workloads that use external storage can be deployed as expected (if this feature is used). | |
Monitoring | Monitoring and alert data can be obtained as expected. | |
Elasticity | Node autoscaling can be implemented as expected (if this feature is used). | |
Security | The password-free image pulling feature can be used as expected (if this feature is used). | |
After you test the basic features of the cluster, you must also test the business logic of the applications deployed in the cluster. This ensures that your business works as expected.
Step 5: Analyze audit logs
Log on to the Simple Log Service console.
In the Projects section, click the target project.

On the tab, click the destination Logstore.
Query all audit logs stored in the Logstore that belongs to the log project specified in Step 3. The Logstore is named actiontrail_<trail_name>.
Use the following query statement to count the OpenAPI operations that are called by applications in the cluster using the Security Token Service (STS) token of the Worker RAM role.
Replace <worker_role_name> in the following statement with the name of the Worker RAM role of the cluster.
* and event.userIdentity.userName: <worker_role_name> | select "event.serviceName", "event.eventName", count(*) as total GROUP BY "event.eventName", "event.serviceName"
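If you export the audit events and want to reproduce the same statistics offline, the grouping performed by the query can be sketched in Python. The sketch assumes that each exported record is a nested dictionary whose fields mirror the names used in the query (event.userIdentity.userName, event.serviceName, event.eventName). The actual export layout may differ, and the function name count_calls is hypothetical.

```python
from collections import Counter


def count_calls(records, worker_role_name):
    """Count API calls per (serviceName, eventName) made with the given role.

    Mirrors the query: filter on event.userIdentity.userName, then group
    by event.serviceName and event.eventName.
    """
    totals = Counter()
    for record in records:
        event = record.get("event", {})
        user_name = event.get("userIdentity", {}).get("userName")
        if user_name != worker_role_name:
            continue  # call was not made with the Worker RAM role
        totals[(event.get("serviceName"), event.get("eventName"))] += 1
    return totals
```

Each key in the result corresponds to one row of the query output, and the value corresponds to the total column.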
Step 6: Restrict the permissions granted to the Worker RAM role
Log on to the ACK console. In the navigation pane on the left, click Clusters.
On the Clusters page, click the name of the target cluster. On the Basic Information tab, click the link next to Worker RAM Role to open the RAM console.
On the Permissions tab of the Role page, click the destination access policy to go to the Policy Document tab. Then, click Edit Policy.
Important: Before you modify the policy document, back up the existing policy document. This lets you roll back the permission configuration if needed.
When you modify the policy document, decide whether to delete unnecessary permissions based on your requirements and the analysis results of the audit logs from Step 5. For example, you can delete the action permissions that do not appear in the statistics. If you confirm that no permissions are required, you can revoke all granted access policies.
Redeploy the workloads of the system components. For more information, see the redeployment commands in Step 2.
Repeat the restriction operations in Step 4, Step 5, and Step 6 until the Worker RAM role is granted only the minimum permissions required by the components or applications.
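The pruning decision in Step 6 can be prototyped offline before you edit the policy in the console. The following Python sketch assumes that the policy document is a JSON object with a Statement list whose entries carry Effect and Action fields, and that you have already translated the audit statistics into action names of the form service:Action. That translation is an assumption and must be checked by hand. The function names are hypothetical, and the output is only a candidate for review, not a policy to apply blindly.

```python
import copy
from fnmatch import fnmatchcase


def as_list(value):
    """Action may be a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def prune_policy(policy, observed_actions):
    """Return a copy of policy that keeps only Allow actions seen in the audit logs.

    An action pattern (which may contain * wildcards) is kept if at least
    one observed action matches it, compared case-insensitively.
    Statements that are not plain Allow/Action statements are left untouched.
    """
    observed = {action.lower() for action in observed_actions}
    pruned = copy.deepcopy(policy)  # the input doubles as the backup copy
    kept = []
    for statement in pruned.get("Statement", []):
        if statement.get("Effect") != "Allow" or "Action" not in statement:
            kept.append(statement)
            continue
        actions = [
            pattern
            for pattern in as_list(statement["Action"])
            if any(fnmatchcase(seen, pattern.lower()) for seen in observed)
        ]
        if actions:
            statement["Action"] = actions
            kept.append(statement)  # statements with no remaining actions are dropped
    pruned["Statement"] = kept
    return pruned
```

Because the function works on a deep copy, the original document stays available for rollback. Repeat the tests in Step 4 after every change that you derive from the result.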
References
For more information about the overall authorization system of ACK, see Best practices for authorization.