
Container Service for Kubernetes: Manually limit the permissions of the worker RAM role of an ACK managed cluster

Last Updated: May 26, 2024

To ensure node security for an ACK managed cluster, you can manually limit the permissions of the worker Resource Access Management (RAM) role of the cluster based on the least privilege principle.

Prerequisites

Step 1: Confirm whether permission limits are required

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage. On the cluster details page, click the Cluster Resources tab. Click the hyperlink on the right side of Worker RAM Role to go to the RAM console.

  3. On the Permissions tab of the role details page, check whether policies are displayed.

    • If no policy is displayed, you do not need to limit the permissions of the worker RAM role.

    • If a policy is displayed, such as k8sWorkerRolePolicy-db8ad5c7***, you may need to limit the permissions of the worker RAM role. In this case, we recommend that you do so based on your requirements and the least privilege principle.

Step 2: Update system components

Update the key system components of the ACK managed cluster to the minimum required version or the latest version. For more information, see Manage system components.

Important
  • Do not update multiple components at the same time. Instead, update them one at a time. Before you start to update a component, make sure that the previous component is successfully updated.

  • Before you update a component, we recommend that you read and understand the remarks of the component in the following table.

Components can be installed from the Add-ons page in the ACK console or by using node pools. The following tables describe how to update components installed by each method.

Components installed from the Add-ons page

Go to the Add-ons page and update the installed components based on the descriptions in the following table. If a component is already at the minimum required version or the latest version, redeploy the component by running the corresponding command in the following table or by clicking Redeploy in the ACK console.

Component

Minimum required version

Redeploy command

Remarks

metrics-server

v0.3.9.4-ff225cd-aliyun

kubectl -n kube-system rollout restart deployment/metrics-server

None

alicloud-monitor-controller

v1.5.5

kubectl -n kube-system rollout restart deployment/alicloud-monitor-controller

None

logtail-ds

v1.0.29.1-0550501-aliyun

kubectl -n kube-system rollout restart daemonset/logtail-ds
kubectl -n kube-system rollout restart deployment/alibaba-log-controller

terway

v1.0.10.333-gfd2b7b8-aliyun

kubectl -n kube-system rollout restart daemonset/terway

  • Update the Terway component that corresponds to the Terway mode that is enabled. For more information about the Terway modes, see Introduction to Terway.

  • After Terway is updated, you need to manually modify the configurations of Terway. For more information, see Check the configurations of Terway.

terway-eni

v1.0.10.333-gfd2b7b8-aliyun

kubectl -n kube-system rollout restart daemonset/terway-eni

terway-eniip

v1.0.10.333-gfd2b7b8-aliyun

kubectl -n kube-system rollout restart daemonset/terway-eniip

terway-controlplane

v1.2.1

kubectl -n kube-system rollout restart deployment/terway-controlplane

None

flexvolume

v1.14.8.109-649dc5a-aliyun

kubectl -n kube-system rollout restart daemonset/flexvolume

Upgrade from FlexVolume to CSI.

csi-plugin

v1.18.8.45-1c5d2cd1-aliyun

kubectl -n kube-system rollout restart daemonset/csi-plugin

None

csi-provisioner

v1.18.8.45-1c5d2cd1-aliyun

kubectl -n kube-system rollout restart deployment/csi-provisioner

None

storage-operator

v1.18.8.55-e398ce5-aliyun

kubectl -n kube-system rollout restart deployment/storage-auto-expander
kubectl -n kube-system rollout restart deployment/storage-cnfs
kubectl -n kube-system rollout restart deployment/storage-monitor
kubectl -n kube-system rollout restart deployment/storage-snapshot-manager
kubectl -n kube-system rollout restart deployment/storage-operator

None

alicloud-disk-controller

v1.14.8.51-842f0a81-aliyun

kubectl -n kube-system rollout restart deployment/alicloud-disk-controller

None

ack-node-problem-detector

1.2.16

kubectl -n kube-system rollout restart deployment/ack-node-problem-detector-eventer

None

aliyun-acr-credential-helper

v23.02.06.2-74e2172-aliyun

kubectl -n kube-system rollout restart deployment/aliyun-acr-credential-helper

Before you start the update, you must grant permissions.

  • If you do not need to acquire custom RAM permissions or pull images across Alibaba Cloud accounts, go to the Add-ons page and modify the configurations of the component by setting the tokenMode parameter to managedRole.

  • If you do not need to use the password-free image pulling feature, you can uninstall the component.

ack-cost-exporter

1.0.10

kubectl -n kube-system rollout restart deployment/ack-cost-exporter

Before you start the update, you must grant permissions.

mse-ingress-controller

1.1.5

kubectl -n mse-ingress-controller rollout restart deployment/ack-mse-ingress-controller

Before you start the update, you must grant permissions.

arms-prometheus

1.1.11

kubectl -n arms-prom rollout restart deployment/arms-prometheus-ack-arms-prometheus

None

ack-onepilot

3.0.11

kubectl -n ack-onepilot rollout restart deployment/ack-onepilot-ack-onepilot

Before you start the update, you must grant permissions.
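The one-at-a-time rule can also be scripted. The following is a minimal sketch, not an official ACK tool: it runs the redeploy command for each workload and waits for kubectl rollout status to report success before it moves on to the next one. The function names restart_and_wait and redeploy_in_order, the KUBECTL variable, and the 300-second default timeout are assumptions made for this illustration.

```shell
# KUBECTL names the command to run. It is a hypothetical variable introduced
# for this sketch so that the command can be swapped out for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# restart_and_wait NAMESPACE KIND/NAME [TIMEOUT]
# Restarts one workload, then blocks until the rollout finishes or times out.
restart_and_wait() {
  local ns="$1"
  local target="$2"
  local timeout="${3:-300s}"
  "$KUBECTL" -n "$ns" rollout restart "$target" || return 1
  "$KUBECTL" -n "$ns" rollout status "$target" --timeout="$timeout"
}

# redeploy_in_order NAMESPACE KIND/NAME...
# Processes the workloads one at a time and stops at the first failure.
redeploy_in_order() {
  local ns="$1"
  shift
  local target
  for target in "$@"; do
    if ! restart_and_wait "$ns" "$target"; then
      echo "Stopped: $target did not roll out successfully" >&2
      return 1
    fi
  done
}
```

For example, redeploy_in_order kube-system deployment/metrics-server deployment/alicloud-monitor-controller restarts alicloud-monitor-controller only after the metrics-server rollout has completed.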

cluster-autoscaler installed by using node pools

Component

Minimum required version

Redeploy command

Remarks

cluster-autoscaler

v1.3.1-bcf13de9-aliyun

kubectl -n kube-system rollout restart deployment/cluster-autoscaler

You can use the following methods to view the version of cluster-autoscaler. For more information about how to update cluster-autoscaler, see [Component updates] Update cluster-autoscaler.

  • View the version of cluster-autoscaler in the ACK console. For more information, see View the version of cluster-autoscaler.

  • View the version of cluster-autoscaler by running the following command:

    kubectl -n kube-system get deployment/cluster-autoscaler -o yaml | grep acs/autoscaler
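After you know the installed version of a component, you can compare it with the minimum required version in the table. The following is a minimal sketch of such a comparison. It assumes that only the dotted numeric part of the version matters (a leading "v" and everything after the first hyphen, such as the commit hash and the aliyun suffix, are ignored) and that your sort command supports the -V option, as GNU coreutils does. The function names normalize_version and version_ge are hypothetical.

```shell
# normalize_version VERSION
# Drops a leading "v" and everything from the first "-" onward.
# Assumption: the remaining dotted numbers are enough to order versions.
normalize_version() {
  printf '%s\n' "$1" | sed -e 's/^v//' -e 's/-.*$//'
}

# version_ge INSTALLED REQUIRED
# Succeeds (exit status 0) if INSTALLED is equal to or later than REQUIRED.
version_ge() {
  local have want lowest
  have="$(normalize_version "$1")"
  want="$(normalize_version "$2")"
  # sort -V orders version strings; the required version must sort first.
  lowest="$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)"
  [ "$lowest" = "$want" ]
}
```

For example, version_ge "<installed version>" "v1.3.1-bcf13de9-aliyun" succeeds only if the installed cluster-autoscaler meets the minimum required version.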

Check the configurations of Terway

If terway, terway-eni, or terway-eniip is installed in your cluster, you need to manually check the configuration file of Terway, which is the eni-config ConfigMap in the kube-system namespace.

  1. Run the following command to view and modify the eni-config ConfigMap:

    kubectl edit cm eni-config -n kube-system

    • If the "credential_path": "/var/addon/token-config", setting is included in the eni-config ConfigMap, no additional action is required.

    • If the "credential_path": "/var/addon/token-config", setting is not included in the eni-config ConfigMap, add a new line below the min_pool_size parameter and specify the following setting in that line:

      "credential_path": "/var/addon/token-config",
  2. Run the corresponding command in the preceding table to redeploy Terway.
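If you prefer to check the setting without opening an editor, the check in step 1 can be sketched as a small filter. This is an illustration only: the function name has_credential_path is hypothetical, and the usage comment assumes that the Terway configuration is stored under the eni_conf key of the eni-config ConfigMap, which you should confirm in your cluster.

```shell
# has_credential_path
# Reads the Terway configuration on standard input and succeeds (exit status 0)
# if it contains the credential_path setting described above.
has_credential_path() {
  grep -Eq '"credential_path"[[:space:]]*:[[:space:]]*"/var/addon/token-config"'
}

# Usage (assumption: the configuration is stored under the eni_conf key):
#   kubectl -n kube-system get cm eni-config -o jsonpath='{.data.eni_conf}' \
#     | has_credential_path && echo "credential_path is set"
```

The filter only reports whether the setting is present. Adding the setting still requires the kubectl edit command in step 1.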

Step 3: Use ActionTrail to collect cluster logs

Use ActionTrail to collect API audit logs to analyze the API operations performed in the cluster. This way, you can identify the applications that rely on the RAM policy attached to the worker RAM role of the cluster. For more information about the Alibaba Cloud services that work with ActionTrail, see Services that work with ActionTrail.

Note

We recommend that you collect audit logs over a period of at least one week.

Go to the ActionTrail console and create a single-account trail in the region where the cluster resides. When you create the single-account trail, select Delivery to Simple Log Service. For more information, see Create a single-account trail.

Step 4: Perform a functional test on the cluster

After the preceding steps are completed, perform a functional test on the cluster to check whether the cluster works as expected.

Test item

Description

Reference

Computing

Whether the cluster can scale nodes as expected.

Scale a node pool

Network

Whether the cluster can assign IP addresses to pods as expected.

Application deployment

Storage

Whether the cluster can deploy workloads that use external storage as expected if external storage is enabled.

Storage - CSI

Monitoring

Whether the cluster can generate alerts as expected.

Observability

Scalability

Whether the cluster can automatically scale nodes as expected if auto scaling is enabled.

Auto scaling of nodes

Security

Whether the cluster can use the password-free image pulling feature as expected if the feature is enabled.

Use the aliyun-acr-credential-helper component to pull images without a password

Important

After the functional test is completed, verify the business logic of the applications deployed in your cluster to ensure that they run as expected.

Step 5: Analyze the logs collected by ActionTrail

  1. Log on to the Simple Log Service console.

  2. In the Projects section, click the project that you want to manage.

  3. On the details page of the project, choose Log Storage > Logstores and click the Logstore that you want to manage on the Logstores tab.

    The Logstore that stores the logs collected by ActionTrail in Step 3 is named in the actiontrail_<trail name> format.

  4. Use the following query statement to retrieve the API operations that the worker RAM role of the cluster performs by using STS tokens.

    Replace <worker_role_name> with the name of the worker RAM role of the cluster.

    * and event.userIdentity.userName: <worker_role_name> | select "event.serviceName", "event.eventName", count(*) as total GROUP BY "event.eventName", "event.serviceName"
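The query returns one row for each combination of service and API operation. To compare these rows with the Action section of a RAM policy, you can convert them to service:Operation pairs. The following is a minimal sketch under several assumptions: the query result has been saved as comma-separated text with the columns event.serviceName, event.eventName, and total; the RAM action prefix equals the lowercase service name, which does not hold for every Alibaba Cloud service and must be checked against the policy; and the function name to_action_list is hypothetical.

```shell
# to_action_list
# Reads "serviceName,eventName,total" rows on standard input and prints the
# unique "service:Operation" pairs, one per line.
# Assumption: lowercasing the service name yields the RAM action prefix.
to_action_list() {
  awk -F',' '
    {
      gsub(/"/, "")                      # strip optional quotes around fields
      if (NF < 2 || $1 == "event.serviceName") next   # skip header and blanks
      print tolower($1) ":" $2
    }' | sort -u
}
```

For example, a row that contains Ecs,DescribeInstances,42 is printed as ecs:DescribeInstances.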

Step 6: Limit the permissions of the worker RAM role

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage. On the cluster details page, click the Cluster Resources tab. Click the hyperlink on the right side of Worker RAM Role to go to the RAM console.

  3. On the Permissions tab of the role details page, click the RAM policy that you want to manage. On the Policy Content tab, click Modify Policy Document.

    Important

    Before you modify the policy, make a copy of the original policy content in case you need to roll back the policy.

    Delete permissions from the policy based on your business requirements and the analysis result generated in Step 5. For example, you can delete the API operations that are not included in the analysis result from the Action section of the policy content. If you confirm that none of the API operations in the policy content are required, you can detach the RAM policy from the worker RAM role.

  4. Redeploy the system components. For more information, see the redeploy commands in Step 2.

  5. Repeat Step 4, Step 5, and Step 6 until the worker RAM role provides only the minimum permissions required by the components and applications in your cluster.
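To decide which entries to delete in step 3, it can help to list the policy actions that never appear in the analysis result. The following is a minimal sketch that assumes you have prepared two plain-text files with one action per line: the actions copied from the Action section of the policy, and the actions observed in Step 5. The function name unused_actions and both file names are hypothetical. The comparison is an exact string match, so wildcard entries such as ecs:Describe* are not expanded and must be reviewed manually.

```shell
# unused_actions POLICY_ACTIONS_FILE USED_ACTIONS_FILE
# Prints every line of POLICY_ACTIONS_FILE that does not exactly match a line
# of USED_ACTIONS_FILE. These are candidates for removal, not a final verdict.
unused_actions() {
  # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file.
  # grep exits with status 1 when nothing is printed; treat that as success.
  grep -vxFf "$2" "$1" || [ "$?" -eq 1 ]
}
```

For example, unused_actions policy-actions.txt used-actions.txt prints the actions that are granted by the policy but were not observed in the audit logs. Keep the backup of the original policy content so that you can restore an action if a component later fails.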

References

For more information about the authorization system of ACK, see Best practices of authorization.