When you set the service type to Type=LoadBalancer, the ACK Cloud Controller Manager (CCM) automatically creates or configures an Alibaba Cloud Classic Load Balancer (CLB) instance for the service. CCM manages the CLB instance, its listeners, and backend server groups. For more information, see Notes on Server Load Balancer configurations for services.
Prerequisites
Before you begin, make sure your CCM component version is V1.9.3.276-g372aa98-aliyun or later. For upgrade instructions, see Upgrade the CCM component. For release notes, see Cloud Controller Manager.
Diagnostic process
Use the following steps to identify the source of a LoadBalancer service issue.
- Identify the service associated with the CLB instance. Replace `XXX.XXX.XXX.XXX` with the load balancer IP address.

      kubectl get svc -A | grep -i LoadBalancer | grep {XXX.XXX.XXX.XXX}

  A healthy service shows output similar to:

      default   my-svc   LoadBalancer   10.x.x.x   XXX.XXX.XXX.XXX   80:32xxx/TCP   5d

- Check whether the service has error events.

      kubectl -n {your-namespace} describe svc {your-svc-name}

  Look at the Events section at the bottom of the output. A service with errors shows output similar to:

      Events:
        Type     Reason                  Age   From                Message
        ----     ------                  ---   ----                -------
        Warning  SyncLoadBalancerFailed  2m    service-controller  <error message here>

  - If error events exist, match the error message in Service error events and solutions.
  - If no error events exist, use the symptom-based guide in Troubleshooting methods.

Service error events and solutions
Run `kubectl -n {your-namespace} describe svc {your-svc-name}` and match the error message in the Events section to the table below.

| Error message | Cause | Solution |
|---|---|---|
| `The backend server number has reached to the quota limit of this load balancers` | The CLB instance has reached the 200-backend-server quota limit. | Do one of the following: <br>1. Request a quota increase on the SLB Quota Management page. <br>2. Set `externalTrafficPolicy: Local` to reduce backend server consumption. If you keep Cluster mode, add the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-backend-label` annotation to limit which nodes are added as backends. <br>3. Create a new CLB instance instead of sharing one across multiple services. |
| `The loadbalancer does not support backend servers of eni type` | Shared CLB instances do not support Elastic Network Interface (ENI) backends. | Add the annotation `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-spec: "slb.s1.small"` to use a high-performance CLB instance. Verify that the annotation is supported by your CCM version. For the annotation-to-version mapping, see Use annotations to configure a Classic Load Balancer (CLB) instance. |
| `There are no available nodes for LoadBalancer` | The CLB instance has no backend servers. | Check the pod status: <br>- If no pod is associated with the service, add a matching application pod. <br>- If the pod is unhealthy, resolve the pod issue first. For more information, see Troubleshoot pod issues. <br>- If the pod is running but not added as a backend, check whether the pod is on a master node and evict it to a worker node. |
| `alicloud: not able to find loadbalancer named [%s] in openapi, but it's defined in service.loaderbalancer.ingress...` or `alicloud: can not find loadbalancer, but it's defined in service` | The CLB instance referenced by the service cannot be found. | Search for the CLB instance in the Server Load Balancer console using the service's EXTERNAL-IP. <br>- If the CLB instance no longer exists and the service is not needed, delete the service. <br>- If the CLB instance exists and was created manually, add the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-id` annotation. See Use annotations to configure a Classic Load Balancer (CLB) instance. <br>- If the CLB instance was created by CCM, add the `kubernetes.do.not.delete` label to the service. See How do I rename an SLB instance if I am using an earlier version of CCM?. |
| `ORDER.ARREARAGE Message: The account is arrearage.` | The account has an overdue payment. | Settle the overdue payment. |
| `PAY.INSUFFICIENT_BALANCE Message: Your account does not have enough balance.` | The account balance is insufficient. | Top up the account balance. |
| `Status Code: 400 Code: Throttlingxxx` | The CLB OpenAPI is being throttled. | 1. Check your CLB quota on the SLB Quota Management page. <br>2. Check for service errors and resolve them: `kubectl -n {your-namespace} describe svc {your-svc-name}`. |
| `Status Code: 400 Code: RspoolVipExist Message: there are vips associating with this vServer group.` | The listener linked to the vServer group cannot be deleted. | 1. Check whether the service annotation contains a CLB ID: `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-id: {your-clb-id}`. If it does, the CLB instance is being reused. <br>2. In the CLB console, delete the listener that corresponds to the port defined in the service. See Configure listener forwarding rules. |
| `Status Code: 400 Code: NetworkConflict` | The internal-facing CLB instance is in a different Virtual Private Cloud (VPC) than the cluster. | Move the CLB instance to the same VPC as the cluster, or create a new CLB instance in the correct VPC. |
| `Status Code: 400 Code: VSwitchAvailableIpNotExist Message: The specified VSwitch has no available ip.` | The vSwitch has no available IP addresses. | Add the annotation `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-vswitch-id: "${YOUR_VSWITCH_ID}"` to specify a different vSwitch in the same VPC. |
| `The specified Port must be between 1 and 65535.` | ENI mode does not support string values for `targetPort`. | Change `targetPort` to an integer in the service YAML, or upgrade CCM. See Upgrade the CCM component. |
| `Status Code: 400 Code: ShareSlbHaltSales Message: The share instance has been discontinued.` | Older CCM versions create shared CLB instances by default, which are now discontinued. | Upgrade the CCM component. |
| `can not change ResourceGroupId once created` | The CLB resource group cannot be changed after instance creation. | Remove the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-resource-group-id: "rg-xxxx"` annotation from the service. |
| `can not find eniid for ip x.x.x.x in vpc vpc-xxxx` | The ENI IP address cannot be found in the VPC. This occurs when `service.beta.kubernetes.io/backend-type: eni` is set but the cluster uses the Flannel network plugin, which does not support ENI mode. | Remove the `service.beta.kubernetes.io/backend-type: eni` annotation from the service. |
| `The operation is not allowed because the instanceChargeType of loadbalancer is PayByCLCU.` or `User does not have permission modify InstanceChargeType to spec.` | The CLB billing method cannot change from pay-as-you-go (PayByCLCU) to pay-by-specification. | Do one of the following: <br>- Remove the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-spec` annotation. <br>- If the service has the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-instance-charge-type` annotation, set its value to `PayByCLCU`. |
| `SyncLoadBalancerFailed the loadbalancer xxx can not be reused, can not reuse loadbalancer created by kubernetes.` | The CLB instance was created by CCM and cannot be reused via the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-id` annotation. | 1. Find the CLB ID in the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-id` annotation of the service YAML. <br>2. Resolve based on service status: <br> - Service is pending: Replace the CLB ID in the annotation with the ID of a CLB instance you created manually in the Classic Load Balancer (CLB) console. <br> - Service is not pending, CLB IP matches the service EXTERNAL-IP: Delete the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-id` annotation. <br> - Service is not pending, CLB IP does not match: In the CLB console, find the CLB instance matching the service's EXTERNAL-IP and update the annotation with its ID. If no match is found, replace the annotation with the ID of a manually created CLB instance and recreate the service. |
| `alicloud: can not change LoadBalancer AddressType once created. delete and retry` | The CLB instance type cannot be changed after creation. | Delete the service and recreate it. |
| `the loadbalancer lb-xxxxx can not be reused, service has been associated with ip [xxx.xxx.xxx.xxx], cannot be bound to ip [xxx.xxx.xxx.xxx]` | The service is already bound to a CLB instance and cannot be rebound to a different one by changing the annotation. | Delete the service and recreate it with the correct CLB instance ID. |

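
For the backend quota error in the table above, the following is a minimal sketch of a Service that stays in Cluster mode but limits which nodes CCM registers as CLB backends. The node label `workload=web`, the Service name, the selector, and the port numbers are hypothetical placeholders, and the `key=value` form of the annotation value is an assumption. Check the annotation reference for your CCM version before relying on it.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-svc              # hypothetical Service name
  namespace: default
  annotations:
    # Assumption: only nodes carrying this (hypothetical) label become CLB backends.
    service.beta.kubernetes.io/alibaba-cloud-loadbalancer-backend-label: "workload=web"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Cluster
  selector:
    app: my-app             # hypothetical pod selector
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080        # an integer rather than a named port
```

Using an integer `targetPort` as shown also sidesteps the ENI-mode port error listed in the table.
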
Troubleshooting methods
For issues that do not produce error events, use the following symptom-based guide.
| Issue | Symptom | Solution |
|---|---|---|
| CLB access issues | Uneven load distribution across backends | Uneven load distribution across CLB backends |
| CLB access issues | 503 error during application updates | 503 error during application updates |
| CLB access issues | CLB inaccessible from within the cluster | CLB inaccessible from within the cluster |
| CLB access issues | CLB inaccessible from outside the cluster | CLB inaccessible from outside the cluster |
| CLB access issues | "The plain HTTP request was sent to HTTPS port" error | Cannot connect to the backend HTTPS service |
| CLB configuration issues | Service annotations do not take effect | What do I do if service annotations do not take effect? |
| CLB configuration issues | CLB configuration is unexpectedly modified | Why is the configuration of my CLB instance modified? |
| CLB configuration issues | Reusing an existing CLB instance does not take effect | Service FAQ |
| CLB configuration issues | No listener configured when reusing an existing CLB instance | Why is no listener configured when I reuse an existing CLB instance? |
| CLB configuration issues | Inconsistent CLB backends | What do I do if the SLB vServer group is not updated? |
| CLB deletion issues | CLB instance is unexpectedly deleted | When is an SLB instance automatically deleted? |
| CLB deletion issues | CLB instance is not deleted after the service is deleted | When is an SLB instance automatically deleted? |

Uneven load distribution across CLB backends
Cause: The CLB scheduling algorithm is not suited to the traffic pattern.
Symptom: Request load is unevenly distributed across backend servers.
Solution:
- For services with `externalTrafficPolicy: Local`, add the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-scheduler: "wrr"` annotation to use weighted round-robin scheduling.
- For services using persistent connections, add the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-scheduler: "wlc"` annotation to use weighted least connections scheduling. This prevents a backend that already holds many long-lived connections from receiving a disproportionate share of new traffic.
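
As an illustration of the first option, a Service manifest with the scheduler annotation might look like the sketch below. The Service name, selector, and ports are hypothetical placeholders. For the persistent-connection case, the value would be `"wlc"` instead of `"wrr"`.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-svc              # hypothetical Service name
  namespace: default
  annotations:
    # Weighted round-robin; use "wlc" for weighted least connections.
    service.beta.kubernetes.io/alibaba-cloud-loadbalancer-scheduler: "wrr"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    app: my-app             # hypothetical pod selector
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
```
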
503 error during application updates
Cause: Connection draining is not configured on the CLB listener, or graceful termination is not configured on the pod. During a rolling update, CLB may route traffic to pods that are already shutting down.
Symptom: A 503 error is returned when accessing the CLB instance during an application update.
Solution:
- Configure connection draining on the CLB listener by adding the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-connection-drain` annotation. For the full list of listener annotations, see Common operations to manage listeners.
- Configure `readinessProbe` and `preStop` on the pod:
  - `readinessProbe`: A pod is added to the service Endpoint only after it passes the readiness probe. After ACK detects the Endpoint change, it attaches the node to the CLB backend. Set the probe frequency, delay, and failure threshold to match your application's startup time. If the timeout is too short, pods may restart repeatedly.
  - `preStop` and `terminationGracePeriodSeconds`: Set `preStop` to the time your application needs to drain in-flight requests. Set `terminationGracePeriodSeconds` to at least 30 seconds longer than `preStop`.

  Example pod configuration:

      apiVersion: v1
      kind: Pod
      metadata:
        name: nginx
        namespace: default
      spec:
        containers:
        - name: nginx
          image: nginx
          # Liveness probe
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 30
            periodSeconds: 30
            successThreshold: 1
            tcpSocket:
              port: 5084
            timeoutSeconds: 1
          # Readiness probe
          readinessProbe:
            failureThreshold: 3
            initialDelaySeconds: 30
            periodSeconds: 30
            successThreshold: 1
            tcpSocket:
              port: 5084
            timeoutSeconds: 1
          # Graceful termination: wait 30 seconds before the container is stopped
          lifecycle:
            preStop:
              exec:
                command:
                - sleep
                - "30"
        # Pod-level setting: 30 seconds longer than the preStop sleep
        terminationGracePeriodSeconds: 60

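
The connection draining annotation named in the first step could be applied as in the sketch below. The value `"on"`, the companion `connection-drain-timeout` annotation, and the 30-second timeout are assumptions based on the CLB annotation reference, and the Service name, selector, and ports are hypothetical. Confirm the supported values and listener protocols for your CCM version before use.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-svc              # hypothetical Service name
  namespace: default
  annotations:
    # Assumption: "on" enables draining on the listener.
    service.beta.kubernetes.io/alibaba-cloud-loadbalancer-connection-drain: "on"
    # Assumption: drain window in seconds, chosen here to match the preStop sleep.
    service.beta.kubernetes.io/alibaba-cloud-loadbalancer-connection-drain-timeout: "30"
spec:
  type: LoadBalancer
  selector:
    app: my-app             # hypothetical pod selector
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
```

Aligning the drain timeout with the pod's `preStop` delay is one reasonable choice, so that the listener and the pod stop accepting work over the same window.
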
CLB inaccessible from within the cluster
Cause: externalTrafficPolicy: Local is set on the service. With this setting, kube-proxy only forwards traffic to pods running on the same node as the request origin. If the node receiving the request has no backend pod for the service, the connection fails. This affects in-cluster traffic routed to the CLB address. For details, see kube-proxy adds external-lb address to node-local iptables rule.
Symptom: The CLB instance is accessible from outside the cluster but connection fails from within the cluster.
Solution: Use one of the following approaches:
- Access via ClusterIP or service name (recommended for in-cluster access): From within the cluster, use the service's ClusterIP or DNS name instead of the CLB address. For the Ingress service, the service name is `nginx-ingress-lb.kube-system`.
- Switch to `externalTrafficPolicy: Cluster`: This lets in-cluster traffic reach the service regardless of pod placement, but the client's original source IP is not preserved. To modify the Ingress service:

      kubectl edit svc nginx-ingress-lb -n kube-system

  Note: If you use an Ingress CLB instance, pods can only access services exposed through the Ingress or CLB from the node where the Ingress pod is running.

- Use `externalTrafficPolicy: Cluster` with ENI pass-through (Terway only): If your cluster uses Terway with ENIs or multiple IPs per ENI, set `externalTrafficPolicy: Cluster` and add the `service.beta.kubernetes.io/backend-type: "eni"` annotation. This preserves the source IP and allows in-cluster access without issue. For more information, see Use annotations to configure a Classic Load Balancer (CLB) instance.

      apiVersion: v1
      kind: Service
      metadata:
        annotations:
          service.beta.kubernetes.io/backend-type: eni
        labels:
          app: nginx-ingress-lb
        name: nginx-ingress-lb
        namespace: kube-system
      spec:
        externalTrafficPolicy: Cluster

CLB inaccessible from outside the cluster
Cause: An access control list (ACL) is blocking the client IP, the CLB vServer group has no backend servers, or the CLB health check is failing.
Symptom: The CLB instance cannot be reached from outside the cluster.
Solution:
- Check for service error events and resolve them. See Service error events and solutions.

      kubectl -n {your-namespace} describe svc {your-svc-name}

- Check whether an ACL is configured on the CLB instance. If one is configured, verify that it allows inbound traffic from the client IP address. For ACL configuration details, see Resource Access Management.
- Check whether the CLB vServer group is empty. If it is empty, verify that an application pod is associated with the service and running as expected. If the pod is unhealthy, resolve the pod issue first. See Troubleshoot pod issues.
- Check whether the CLB listener health check is passing. If it is failing, verify that the application pod is responding correctly. For health check troubleshooting, see CLB health check FAQ.

Cannot connect to the backend HTTPS service
Cause: When a certificate is configured on the CLB listener, CLB terminates TLS and forwards decrypted HTTP traffic to the backend pod. If targetPort is set to an HTTPS port (for example, 443), the pod receives an HTTP request on an HTTPS port and rejects it with "The plain HTTP request was sent to HTTPS port."
Symptom: Connections to the backend service fail after configuring HTTPS on the CLB listener.
Solution: Set `targetPort` for the HTTPS listener port to the HTTP port that the backend pod listens on. For example, if the Nginx pod serves HTTPS on port 443 and plain HTTP on port 80, set `targetPort` to 80 for both service ports.

    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-protocol-port: "https:443"
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-cert-id: "${YOUR_CERT_ID}"
      name: nginx
      namespace: default
    spec:
      ports:
      - port: 80
        protocol: TCP
        targetPort: 80
      - port: 443
        protocol: TCP
        targetPort: 80
      selector:
        run: nginx
      type: LoadBalancer