When a component operation fails in the Container Service for Kubernetes (ACK) console, the console displays an error code. This topic describes the component error codes, their causes, and their solutions.
Error code reference
| Error code | Description |
|---|---|
| AddonOperationFailed.ResourceExists | A resource required by the component already exists in the cluster |
| AddonOperationFailed.ReleaseNameInUse | A Helm release with the same name as the component already exists |
| AddonOperationFailed.WaitForAddonReadyTimeout | Component pods cannot reach the Ready state after the update request is submitted |
| AddonOperationFailed.APIServerUnreachable | ACK cannot access the Kubernetes API server |
| AddonOperationFailed.ResourceNotFound | Resources required by the component cannot be found |
| AddonOperationFailed.TillerUnreachable | Helm V2 Tiller is inaccessible |
| AddonOperationFailed.FailedCallingWebhook | A mutating webhook for a component resource cannot be called |
| AddonOperationFailed.UserForbidden | Tiller lacks the required role-based access control (RBAC) permissions |
| AddonOperationFailed.TillerNotFound | No Tiller pod is running in the cluster |
| AddonOperationFailed.ErrPatchingClusterRoleBinding | A ClusterRoleBinding required by the component exists but has a conflicting configuration |
| AddonOperationFailed.ErrApplyingPatch | The component's YAML manifests are incompatible between versions |
AddonOperationFailed.ResourceExists
Symptoms
The console displays an error message similar to:
Addon status not match, failed upgrade helm addon arms-cmonitor for cluster c3cf94b952cd34b54b71b10b7********, err: rendered manifests contain a resource that already exists. Unable to continue with update: ConfigMap "otel-collector-config" in namespace "arms-prom" exists and cannot be imported into the current release
Cause
A resource required by the component already exists in the cluster, preventing installation. This typically happens when:
Another version of the component (such as the open-source version) is already installed by a different method.
The component was installed with Helm V2, and its resources were not removed before migrating to Helm V3.
A resource with the same name as one required by the component was created manually.
Solution
Delete the conflicting resources identified in the error message, then retry the installation or update.
The error message tells you which resource to delete. The following sections list the exact commands for specific components.
arms-prometheus
For arms-prometheus, delete the namespace in which arms-prometheus is installed (in most cases, the arms-prom namespace), then run the following commands to delete the cluster-scoped resources left behind by the release. After the resources are deleted, install or update arms-prometheus again.
kubectl delete ClusterRole arms-kube-state-metrics
kubectl delete ClusterRole arms-node-exporter
kubectl delete ClusterRole arms-prom-ack-arms-prometheus-role
kubectl delete ClusterRole arms-prometheus-oper3
kubectl delete ClusterRole arms-prometheus-ack-arms-prometheus-role
kubectl delete ClusterRole arms-pilot-prom-k8s
kubectl delete ClusterRoleBinding arms-node-exporter
kubectl delete ClusterRoleBinding arms-prom-ack-arms-prometheus-role-binding
kubectl delete ClusterRoleBinding arms-prometheus-oper-bind2
kubectl delete ClusterRoleBinding kube-state-metrics
kubectl delete ClusterRoleBinding arms-pilot-prom-k8s
kubectl delete ClusterRoleBinding arms-prometheus-ack-arms-prometheus-role-binding
kubectl delete Role arms-pilot-prom-spec-ns-k8s
kubectl delete Role arms-pilot-prom-spec-ns-k8s -n kube-system
kubectl delete RoleBinding arms-pilot-prom-spec-ns-k8s
kubectl delete RoleBinding arms-pilot-prom-spec-ns-k8s -n kube-system
ack-node-local-dns
For ack-node-local-dns, delete the conflicting MutatingWebhookConfiguration:
kubectl delete MutatingWebhookConfiguration ack-node-local-dns-admission-controller
After the resource is deleted, update ack-node-local-dns. Your workloads are not affected by the deletion. However, do not add pods between the deletion and the update. If you do, delete and recreate those pods after the component update to reinject the DNS cache.
arms-cmonitor
For arms-cmonitor, run the following commands to delete the conflicting resources.
kubectl delete ConfigMap otel-collector-config -n arms-prom
kubectl delete ClusterRoleBinding arms-prom-cmonitor-role-binding
kubectl delete ClusterRoleBinding arms-prom-cmonitor-install-init-role-binding
kubectl delete ClusterRole arms-prom-cmonitor-role
kubectl delete ClusterRole arms-prom-cmonitor-install-init-role
kubectl delete ServiceAccount cmonitor-sa-install-init -n kube-system
After the resources are deleted, install or update arms-cmonitor.
AddonOperationFailed.ReleaseNameInUse
Cause
A Helm release with the same name as the component already exists in the cluster. This prevents Helm from installing or updating the component. Common causes:
Another version of the component is installed by a different method.
A leftover Helm release remains from a previous installation attempt.
Solution
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of the target cluster. In the left pane, choose Applications > Helm.
Find the Helm release named after the component. In the Actions column, click Delete. In the dialog box, select Clear Release Records and click OK.
Install or update the component.
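If you prefer the command line, you can locate and remove the conflicting release with Helm. This is a sketch: the release name and namespace are placeholders that you must replace with the values from the error message.

```shell
# List releases in all namespaces and look for one named after the component.
helm list --all-namespaces

# Uninstall the conflicting release. Replace <release> and <namespace>
# with the values found above.
helm uninstall <release> -n <namespace>
```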
AddonOperationFailed.WaitForAddonReadyTimeout
Cause
The update request was submitted, but the component pods cannot reach the Ready state within the timeout period.
Troubleshooting
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of the target cluster. In the left pane, choose Operations > Event Center.
On the Events (Cluster Resource Events) tab, set Level to Warning, select the namespace where the component is deployed, and set Type to Pod. Review the event details to identify the cause. The following section describes common causes and their solutions.
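The same events can also be inspected with kubectl. A sketch, assuming the component runs in the kube-system namespace (adjust -n as needed):

```shell
# Show Warning events for pods in the component's namespace,
# sorted so the most recent events appear last.
kubectl get events -n kube-system \
  --field-selector type=Warning,involvedObject.kind=Pod \
  --sort-by=.lastTimestamp

# Describe a specific pod to see its full event history.
kubectl describe pod <pod-name> -n kube-system
```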
Common causes and solutions
Cause 1: Pods cannot be scheduled (FailedScheduling)
The nodes in the cluster do not meet the scheduling requirements for the component pods. Check the event details for one of the following messages:
| Event message | Cause | Solution |
|---|---|---|
| Insufficient memory or Insufficient cpu | Nodes lack sufficient resources | Delete unneeded pods, add nodes to the cluster, or upgrade node configurations |
| the pod didn't tolerate | Node taints are not tolerated by the component pods | Remove the taints from the nodes |
| didn't match pod anti-affinity rules | Anti-affinity rules cannot be satisfied | Add nodes to the cluster |
After resolving the scheduling issue, update the component again.
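For example, if the events contain "the pod didn't tolerate", you can list and remove node taints with kubectl. The node name and taint key below are placeholders:

```shell
# List the taints on every node.
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints

# Remove a taint from a node (the trailing "-" deletes the taint).
# Replace <node-name> and <taint-key> with your values.
kubectl taint nodes <node-name> <taint-key>-
```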
Cause 2: Pod sandbox cannot be created (FailedCreatePodSandBox)
The network plugin cannot allocate IP addresses to pods. Check the event details:
If the message contains vSwitch have insufficient IP, the pod vSwitches have run out of IP addresses. Add pod vSwitches for the cluster in Terway mode.
If the message contains transport: Error while dialing, troubleshoot the pod to check whether the cluster's network plugin is working correctly.
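To check whether the network plugin pods are healthy, you can inspect them in the kube-system namespace. A sketch; the label selectors assume the default Terway and Flannel deployments and may differ in your cluster:

```shell
# Terway clusters: check the Terway pods.
kubectl get pods -n kube-system -l app=terway -o wide

# Flannel clusters: check the Flannel DaemonSet pods.
kubectl get pods -n kube-system -l app=flannel -o wide

# Inspect the logs of an unhealthy network plugin pod.
kubectl logs <pod-name> -n kube-system
```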
AddonOperationFailed.APIServerUnreachable
Cause
ACK cannot reach the Kubernetes API server. The most common cause is that the Server Load Balancer (SLB) instance exposing the API server is misconfigured or not working as expected.
Solution
Check the status and configuration of the SLB instance that exposes the API server and restore it to a working state. After the API server is reachable again, retry the component operation.
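You can confirm whether the API server is reachable before retrying the operation:

```shell
# Verify that kubectl can reach the API server through the address
# in your kubeconfig (the SLB address for ACK clusters).
kubectl cluster-info

# A timeout or connection-refused error here points to the SLB instance
# or its listener configuration.
kubectl get --raw /healthz
```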
AddonOperationFailed.ResourceNotFound
Cause
Resources required by the component are missing—likely deleted or modified externally—so the component cannot be updated in place.
Solution
Uninstall the component and install the latest version.
AddonOperationFailed.TillerUnreachable
Cause
The component uses Helm V2, which depends on Tiller for installation and updates. Tiller has encountered an error and is inaccessible.
Solution
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of the target cluster. In the left pane, choose Workloads > Pods.
Select the kube-system namespace. Find the Tiller pod and delete it. The system automatically recreates the pod.
After the Tiller pod reaches the Ready state, retry the component operation.
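The same steps can be performed with kubectl. A sketch, assuming the default labels that Helm V2 applies to the Tiller pod:

```shell
# Find the Tiller pod in the kube-system namespace.
kubectl get pods -n kube-system -l app=helm,name=tiller

# Delete it; the tiller-deploy Deployment recreates it automatically.
kubectl delete pod -n kube-system -l app=helm,name=tiller

# Watch until the new pod is Ready before retrying the operation.
kubectl get pods -n kube-system -l app=helm,name=tiller -w
```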
AddonOperationFailed.FailedCallingWebhook
Symptoms
The console displays an error message similar to:
failed to create: Internal error occurred: failed calling webhook "rancher.cattle.io": failed to call webhook: Post "https://rancher-webhook.cattle-system.svc:443/v1/webhook/mutation?timeout=10s": no endpoints available for service "rancher-webhook"
Cause
A mutating webhook configured for a component resource cannot be called, blocking resource updates.
Solution
Troubleshoot the failing webhook and fix the issue, then update the component again. You can identify the webhook that cannot be called from the error message.
In the example above, the rancher-webhook webhook in the cattle-system namespace is unavailable.
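To confirm which webhook is failing and whether its backing service has endpoints, you can run the following commands. The service name and namespace are taken from the example error message; substitute your own:

```shell
# List mutating webhook configurations in the cluster.
kubectl get mutatingwebhookconfigurations

# Check whether the webhook's service has any ready endpoints.
# "no endpoints available" in the error means this list is empty.
kubectl get endpoints rancher-webhook -n cattle-system

# Check the pods that should back the webhook service.
kubectl get pods -n cattle-system
```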
AddonOperationFailed.UserForbidden
Cause
The cluster uses Helm V2, but Tiller lacks the RBAC permissions needed to query and update resources, preventing component installation or updates.
Solution
Grant the required RBAC permissions to Tiller. For details, see Role-based access control.
AddonOperationFailed.TillerNotFound
Cause
The cluster uses Helm V2, but no Tiller pod is running normally in the cluster.
Solution
Troubleshoot the tiller-deploy pod in the kube-system namespace. After the pod runs normally, retry the component operation. For troubleshooting steps, see Troubleshoot pod issues.
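A sketch of the first troubleshooting steps for the tiller-deploy pod, assuming the default Helm V2 labels:

```shell
# Check whether a Tiller pod exists and is Running.
kubectl get pods -n kube-system -l app=helm,name=tiller

# If the pod is missing or not Ready, inspect the Deployment and pod events.
kubectl describe deployment tiller-deploy -n kube-system
kubectl describe pod -n kube-system -l app=helm,name=tiller
```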
AddonOperationFailed.ErrPatchingClusterRoleBinding
Cause
A ClusterRoleBinding required by the component already exists in the cluster, but its configuration conflicts with what the component expects. This typically occurs when an open-source version of the component is installed separately.
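To see which installation created the existing ClusterRoleBinding and how its configuration conflicts, you can dump it. The name below is a placeholder taken from the error message:

```shell
# Inspect the conflicting ClusterRoleBinding. Helm-managed resources carry
# "app.kubernetes.io/managed-by" labels and "meta.helm.sh" annotations that
# show which release created them.
kubectl get clusterrolebinding <name> -o yaml
```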
Solution
Uninstall the open-source version of the component from the cluster:
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of the target cluster. In the left pane, choose Applications > Helm.
Find the Helm release named after the component. In the Actions column, click Delete. In the dialog box, select Clear Release Records and click OK.
Install or update the component.
AddonOperationFailed.ErrApplyingPatch
Symptoms
The console displays an error message similar to:
spec.template.spec.initContainers[1].name: Duplicate value: "install-cni"
Cause
The YAML manifests of the currently installed component version are incompatible with the target version, preventing the update. This can happen when:
Another version of the component (such as the open-source version) is installed by a different method.
The component's YAML manifests were modified manually.
The currently installed version is no longer supported.
Solution
Modify the component's YAML manifests based on the error message. If you need assistance, submit a ticket.
Example: Flannel container name conflict
If a discontinued Flannel version is installed, the update may fail with:
spec.template.spec.initContainers[1].name: Duplicate value: "install-cni"
To fix this, edit the Flannel DaemonSet manifest:
kubectl -n kube-system edit ds kube-flannel-ds
In the manifest, find the install-cni container definition under spec.template.spec.containers and delete it (lines 7 to 21 in the example below):
containers:
- name: kube-flannel
image: registry-vpc.{{.Region}}.aliyuncs.com/acs/flannel:{{.ImageVersion}}
command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr" ]
...
# Irrelevant lines are not shown. Delete comment lines 7 to 21.
# - command:
# - /bin/sh
# - -c
# - set -e -x; cp -f /etc/kube-flannel/cni-conf.json /etc/cni/net.d/10-flannel.conf;
# while true; do sleep 3600; done
# image: registry-vpc.cn-beijing.aliyuncs.com/acs/flannel:v0.11.0.1-g6e46593e-aliyun
# imagePullPolicy: IfNotPresent
# name: install-cni
# resources: {}
# terminationMessagePath: /dev/termination-log
# terminationMessagePolicy: File
# volumeMounts:
# - mountPath: /etc/cni/net.d
# name: cni
# - mountPath: /etc/kube-flannel/
#   name: flannel-cfg
# Irrelevant lines are not shown. Delete comment lines 7 to 21.
...
Deleting these lines does not interrupt running workloads. A rolling update starts automatically. After it completes, update Flannel from the ACK console. For details, see Manage components.
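After saving the edited manifest, you can watch the rolling update and confirm that every Flannel pod comes back. The label selector assumes the default app=flannel label:

```shell
# Watch the DaemonSet rolling update until it completes.
kubectl -n kube-system rollout status ds kube-flannel-ds

# Confirm all Flannel pods are Running and Ready.
kubectl get pods -n kube-system -l app=flannel
```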