This topic provides answers to some frequently asked questions about using Container Service for Kubernetes (ACK) clusters. This topic also describes the troubleshooting procedure.
ACK cluster exceptions
How do I resolve the issues that occur when I add nodes to a cluster?
How do I troubleshoot network issues of Kubernetes clusters?
For more frequently asked questions about ACK, see FAQ (earlier version).
Troubleshoot application issues in ACK
Pods remain in the Pending state
Pods that remain in the Pending state cannot be scheduled to nodes. In most cases, this is because the cluster does not have sufficient resources to run the pods. You can run the kubectl describe pod command to view the events and troubleshoot the issues.
Pods remain in the Waiting state
If a pod remains in the Waiting state, the pod is scheduled to a node but cannot run as normal. This is because the private image or public image fails to be pulled or the image address is invalid. For more information, see Pods remain in the Waiting state.
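The check described above for Pending and Waiting pods can be sketched as a small shell helper. This is a minimal sketch, not part of ACK: the function name describe_pending is hypothetical, and it assumes that kubectl is already configured for the cluster. Pods whose images are still being pulled are also reported with the Pending phase, so the same helper usually covers both cases.

```shell
# Hypothetical helper (not an ACK tool): describe every pod whose phase is
# Pending in one namespace so that the Events section can be reviewed.
# Usage: describe_pending [namespace]
describe_pending() {
  ns="${1:-default}"
  # -o name prints one "pod/<name>" entry per line.
  kubectl get pods -n "$ns" --field-selector=status.phase=Pending -o name |
  while read -r pod; do
    kubectl describe -n "$ns" "$pod"
  done
}
```

For example, describe_pending kube-system prints the description, including recent events, of each Pending pod in the kube-system namespace.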
Pods keep restarting but remain in the Crashing or Unhealthy state
If a pod remains in the Crashing or Unhealthy state, the pod is scheduled to a node but fails to start. This issue is caused by configuration errors or permission issues. You can view the container log and check if the application in the pod encounters an error. For more information, see Pods remain in the Crashing or Unhealthy state.
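Viewing the container log of a restarting pod could look like the following sketch. The function name crash_logs and the default of 50 lines are assumptions made for illustration. The --previous flag asks kubectl for the log of the last terminated container instance, which is often the one that contains the error.

```shell
# Hypothetical helper: print the tail of the log written by the previous
# (crashed) container instance of a pod.
# Usage: crash_logs <pod-name> [namespace]
crash_logs() {
  pod="$1"
  ns="${2:-default}"
  kubectl logs -n "$ns" "$pod" --previous --tail=50
}
```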
Pods remain in the Running state but do not run as normal
If a pod is in the Running state but does not run as normal, the YAML file may contain invalid fields. You can verify the Deployment of the pod to identify the cause. For more information, see Pods remain in the Running state but do not run as normal.
Services cannot run as normal
If the issue is not caused by the network plug-in, the issue is probably caused by invalid labels. In this case, you can check the endpoints to identify the cause. For more information, see Troubleshoot Services.
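One way to compare the labels with the endpoints is sketched below. The function name service_check is hypothetical, and the sketch assumes that kubectl is configured for the cluster. If the endpoints list is empty, the label selector of the Service probably does not match the labels of any running pod.

```shell
# Hypothetical helper: print the label selector of a Service next to its
# endpoints so that a selector/label mismatch is easy to spot.
# Usage: service_check <service-name> [namespace]
service_check() {
  svc="$1"
  ns="${2:-default}"
  echo "selector:"
  kubectl get service "$svc" -n "$ns" -o jsonpath='{.spec.selector}'
  echo   # jsonpath output has no trailing newline
  echo "endpoints:"
  kubectl get endpoints "$svc" -n "$ns"
}
```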
How do I upgrade an ACK cluster?
You can upgrade an ACK cluster by using one of the following methods:
Upgrade the Kubernetes version of the ACK cluster. For more information, see Update an ACK cluster.
Upgrade an ACK standard cluster to an ACK Pro cluster. For more information, see Hot migration from ACK basic clusters to ACK Pro clusters.
Troubleshooting procedure and common causes
Check whether ECS instances can communicate with each other. For more information, see Fail to ping ECS instances.
Check whether the security group is properly configured. For more information, see Check security group rules.
For more information about how to configure ECS security groups, see Configure security groups in different scenarios.
Check whether the RAM user is granted the required permissions. For more information, see Grant permissions to a RAM user.
Check whether the running environment is normal when you run the docker run command.
Check whether kubectl can be used to log on to a cluster when errors occur in the cluster. Check whether you can run the kubectl get event command as normal. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
Check whether a pod can access another pod in the Kubernetes cluster. For more information, see Pod fails to access pods on another node.
Check whether Services can be used to access applications. For more information, see Use an existing SLB instance to expose an application.
Check whether Ingresses can be used to access applications. For more information, see Access Services by using an ALB Ingress.
Check whether errors are recorded in the logs of the API server, scheduler, and controller.
Check whether errors are recorded in the log of the Docker daemon.
If the "docker daemon is not running" error message is returned, you only need to start the Docker daemon.
If you use Windows, run the following commands in cmd.exe to start the Docker daemon:
cd "C:\Program Files\Docker\Docker"
DockerCli.exe -SwitchDaemon
If you use Linux, run the following command to start the Docker daemon:
service docker restart
How do I troubleshoot errors based on log data?
You can run the following commands to view logs and troubleshoot errors.
Run the kubectl describe **** command to view events.
Run the journalctl -u docker -f command to query the log of Docker.
Run the journalctl -u kubelet -f command to query the log of kubelet.
Run the docker logs <api server container id> command to query the log of the API server.
Note: This command is used to query the log of the API server in ACK dedicated clusters. If you use an ACK managed cluster, see Collect the logs of control plane components in ACK Pro clusters.
Run the docker logs <scheduler container id> command to query the log of the scheduler.
Run the docker logs <worker proxy container id> command to query the log of the worker proxy.
Run the docker logs <master proxy container id> command to query the log of the master proxy.
Run the docker logs <controller container id> command to query the logs of controllers. The controllers are kube-controller, alicloud-monitor-controller, alicloud-disk-controller, and cloud-controller.
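The docker logs commands above need a container ID. The following sketch shows one way to look up the ID and print the log in a single step. The function name component_logs is hypothetical, and the sketch assumes that it runs on a node where the Docker CLI is available and that the container name contains the component name, which may not hold in every cluster.

```shell
# Hypothetical helper: find a running container by a name fragment and print
# the last lines of its log.
# Usage: component_logs <name-fragment> [lines]
component_logs() {
  name="$1"
  lines="${2:-100}"
  # Assumption: the container name contains the component name.
  id=$(docker ps --filter "name=$name" --format '{{.ID}}' | head -n 1)
  if [ -z "$id" ]; then
    echo "no running container matches: $name" >&2
    return 1
  fi
  docker logs --tail "$lines" "$id"
}
```

For example, component_logs scheduler 200 prints the last 200 log lines of the first running container whose name contains scheduler.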
We recommend that you import the logs to Log Service and analyze the logs in Log Service. For more information, see Getting Started.