If you encounter issues with health checks when you use Server Load Balancer (SLB), refer to this topic for solutions.
How do health checks work?
Health checks verify the availability of backend servers: SLB periodically sends requests to each backend server and uses the responses to determine the server's status.
SLB uses the 100.64.0.0/10 CIDR block for health checks. Backend servers must not block this CIDR block. You do not need to add an allow rule in your ECS security group. However, if you have configured security policies such as iptables, you must allow access from this CIDR block. The 100.64.0.0/10 CIDR block is a reserved address space of Alibaba Cloud and poses no security risk.
For more information, see SLB health checks.
What are the recommended health check configurations?
We recommend the following health check configurations.
Configuration | Recommended value for TCP, HTTP, and HTTPS listeners | Recommended value for UDP listeners |
Response Timeout Period | 5 seconds | 10 seconds |
Health Check Interval | 2 seconds | 5 seconds |
Healthy Threshold | 3 times | 3 times |
Unhealthy Threshold | 3 times | 3 times |
To prevent frequent status changes from affecting service availability, the health status of a backend server is changed only after multiple consecutive successes or failures occur within a specified time window. For more information, see Configure and manage SLB health checks.
This configuration helps services and applications quickly reach a stable state. If you require faster failovers, you can reduce the response timeout period. However, you must ensure that the service processing time is shorter than the specified timeout period.
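The threshold behavior described above can be sketched as a small state tracker. This is an illustrative model only, not SLB's actual implementation: the class name, the assumed initial healthy state, and the timing estimate (each failed probe waits the full timeout, with one interval between probes) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    """Tracks one backend's status from consecutive probe results (hypothetical helper)."""
    healthy_threshold: int = 3
    unhealthy_threshold: int = 3
    healthy: bool = True   # assumed initial state for this sketch
    _streak: int = 0       # consecutive results that contradict the current status

    def record(self, probe_ok: bool) -> bool:
        """Feed one probe result and return the (possibly updated) status."""
        if probe_ok == self.healthy:
            # The result agrees with the current status, so any opposing streak is broken.
            self._streak = 0
            return self.healthy
        self._streak += 1
        needed = self.unhealthy_threshold if self.healthy else self.healthy_threshold
        if self._streak >= needed:
            # Enough consecutive contradicting results: flip the status.
            self.healthy = not self.healthy
            self._streak = 0
        return self.healthy


def estimated_failure_window(timeout_s: float, interval_s: float, unhealthy_threshold: int) -> float:
    """Rough estimate, assuming every failed probe waits the full timeout."""
    return timeout_s * unhealthy_threshold + interval_s * (unhealthy_threshold - 1)
```

Under this model, the recommended TCP, HTTP, and HTTPS values (5-second timeout, 2-second interval, unhealthy threshold of 3) give an estimate of 19 seconds before a failing server is marked unhealthy. Reducing the timeout shortens that estimate, which is why the timeout is the suggested knob for faster failover.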
Can I disable health checks?
Yes, you can. For more information, see Disable health checks.
After you disable health checks, SLB continues to forward requests to unhealthy ECS instances. This can cause service interruptions.
If your business is sensitive to load, frequent health checks may affect normal service access. You can reduce the impact by increasing the health check interval, which lowers the check frequency, or by switching to a Layer 4 health check. However, to ensure continuous service availability, we do not recommend that you disable health checks.
How do I select a health check method for a TCP listener?
TCP listeners support HTTP and TCP health checks:
A TCP health check sends SYN handshake messages to detect whether the server port is active.
An HTTP health check sends HEAD or GET requests to simulate browser access and verify that the server application is running as expected.
TCP health checks consume fewer server performance resources. If your backend servers are highly sensitive to load and you only need to confirm that the port is active, select the TCP health check method. To more accurately confirm the health of your application, select the HTTP health check method.
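To make the difference between the two methods concrete, the following sketch approximates each probe from a client's point of view. The function names and default timeouts are hypothetical, and the TCP probe here closes the connection normally instead of reproducing SLB's exact packet sequence.

```python
import http.client
import socket


def tcp_probe(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_head_probe(host: str, port: int, path: str = "/", timeout: float = 5.0):
    """Return the HTTP status code of a HEAD request, or None if the request fails."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return None
    finally:
        conn.close()
```

The TCP probe only shows that something is listening on the port, while the HEAD probe also requires the application to parse the request and produce a response, which is why it costs the server more but says more about application health.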
What are the effects of setting the weight of an ECS instance to zero on health checks?
SLB stops forwarding traffic to an ECS instance whose weight is set to zero. However, the health check status of the backend server does not change to abnormal. Setting the weight of a backend ECS instance to zero is equivalent to removing the instance from the SLB instance. This operation is typically performed during maintenance activities, such as restarting or reconfiguring an ECS instance.
What is the default method that an HTTP listener uses to perform health checks on backend ECS instances?
HEAD method
If the service on the backend ECS instance does not support the HEAD method, the health check fails. We recommend that you run the following command on the ECS instance to test access to its own IP address using the HEAD method:
curl -v -0 -I -H "Host:" -X HEAD http://IP:port
What are the IP addresses that an HTTP listener uses to perform health checks on backend ECS instances?
SLB uses the 100.64.0.0/10 CIDR block for health checks. Backend servers must not block this CIDR block. You do not need to add an allow rule in your ECS security group. However, if you have configured security policies such as iptables, you must allow access from this CIDR block. The 100.64.0.0/10 CIDR block is a reserved address space of Alibaba Cloud and poses no security risk.
When does the CLB health check start?
A health check starts immediately after it is configured for an SLB instance. SLB then sends health check requests periodically based on the configured health check interval.
How do I handle a health check failure caused by a faulty backend database?
Problem
An ECS instance hosts two websites: www.example.com (a static website) and aliyundoc.com (a dynamic website). Both websites use SLB. Accessing www.example.com returns a 502 error because the backend database service is abnormal.
Cause
The check domain name in the SLB health check configuration is aliyundoc.com. Due to a fault in the ApsaraDB RDS or self-managed database, access to aliyundoc.com fails, which causes the health check to fail.
Solution
Set the SLB health check domain name to www.example.com.
Why do backend service logs show network connection errors even when the TCP port passes health checks?
Problem
After a TCP service port is configured on the SLB backend, network connection error messages frequently appear in the backend service logs. Packet capture analysis shows that the requests come from the SLB server and that SLB actively sends RST packets.

Cause
This issue is related to the SLB health check mechanism. For a TCP service port, the SLB health check only performs a TCP three-way handshake and then sends an RST packet to close the connection. The procedure is as follows:
The SLB server sends a SYN request packet.
The backend server returns a SYN+ACK acknowledgement packet.
After the SLB server receives the acknowledgement, it considers the port to be active and the health check is successful.
The SLB server sends an RST packet to close the connection and does not send any service data.
Because the connection is disconnected immediately after a successful health check, the upper-layer service, such as a Java connection pool, may interpret this as an abnormal connection. This results in error messages such as Connection reset by peer.
Solutions
Change the protocol from TCP to HTTP.
Filter out requests from the SLB IP address range at the application level to ignore these error messages in your logs.
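Application-level filtering could look roughly like the sketch below. It assumes that the application knows the peer IP address of the connection that was reset and that health check traffic comes from the 100.64.0.0/10 CIDR block mentioned earlier in this topic. The function names are hypothetical.

```python
import ipaddress

# CIDR block that this topic says SLB uses for health checks.
SLB_HEALTH_CHECK_NET = ipaddress.ip_network("100.64.0.0/10")


def is_health_check_peer(peer_ip: str) -> bool:
    """Return True if peer_ip falls inside the SLB health check CIDR block."""
    try:
        return ipaddress.ip_address(peer_ip) in SLB_HEALTH_CHECK_NET
    except ValueError:
        # Not a valid IP address string: treat it as ordinary traffic.
        return False


def should_log_connection_error(peer_ip: str) -> bool:
    """Suppress connection-reset noise caused by health check probes."""
    return not is_health_check_peer(peer_ip)
```

A connection-handling layer would call should_log_connection_error before writing a "Connection reset by peer" entry, so that resets from real clients are still recorded.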
Why does a health check show as abnormal when the service is running correctly?
Problem
The SLB HTTP health check always fails, but testing with curl returns a normal status code.
Cause
If the returned status code does not match the normal status code configured in the console, the health check fails. For example, if the normal status code is configured as http_2xx, all non-HTTP 2xx status codes cause the health check to fail.
In a Tengine/Nginx configuration, the curl command works correctly. However, the echo command matches the default site, causing the test.html file to return a 404 error.
Solutions
You can modify the main configuration file to comment out the default site.
Add a domain name to the health check configuration.
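The status code rule from the Cause section can be expressed as a small check. This is a sketch under the assumption that accepted codes are configured as class labels in the http_2xx style shown above. The function name is hypothetical.

```python
def status_matches(status_code: int, accepted_classes) -> bool:
    """Return True if status_code belongs to one of the accepted classes, e.g. {"http_2xx"}."""
    label = f"http_{status_code // 100}xx"
    return label in accepted_classes
```

For example, with only http_2xx accepted, a 404 returned by the default site fails the check even though the service itself is running.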
Why is the health check frequency different from what is recorded in web logs?
To prevent single points of failure, the SLB health check service runs in a cluster, and the SLB service is distributed across multiple nodes. Each node performs health checks independently, which increases the total number of health check requests that your backend server receives. As a result, the health check frequency observed in logs does not match the frequency set in the console. This is normal.
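As a rough illustration of why the logged frequency differs, the sketch below multiplies the per-node probe rate by the number of nodes. The node count is a hypothetical input: this topic does not state how many nodes probe a given backend, and the real number is not something you configure.

```python
def expected_probes_per_minute(interval_s: float, node_count: int) -> float:
    """Estimate the probes a backend logs per minute if node_count nodes each probe every interval_s seconds."""
    return node_count * 60.0 / interval_s
```

With a 2-second interval, a single node would produce about 30 log entries per minute. If, for the sake of the example, four nodes probed independently, the backend would see about 120.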
How can I effectively distinguish between health check logs and service logs on backend servers?
Problem
On backend servers, health check request logs are mixed with normal service logs. This makes log files large and difficult to filter.
Cause
Health checks verify the availability of backend services by sending HTTP, TCP, or UDP requests. Backend services record these health check requests in the same way as normal service requests, so the health check entries are mixed with normal service logs.
Solutions
Reduce the health check frequency: Increase the Health Check Interval to reduce the frequency of health checks and the volume of health check logs.
Adjust the health check path (for HTTP health checks): Set the Health Check Path to a non-service path, such as /health. You can then filter your logs by this path to separate health check logs from service logs.
Disable health checks (not recommended): If you are certain that you do not need the health check feature, disable it to prevent health check logs from being generated.
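Filtering by a dedicated health check path might look like the sketch below. It assumes access log lines that contain a quoted request line such as "HEAD /health HTTP/1.0", as in common Nginx or Apache access log formats. The regular expression and function name are illustrative and may need adjusting to your actual log format.

```python
import re

# Matches the quoted request line, e.g. "HEAD /health HTTP/1.0".
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*"')


def split_logs(lines, health_path="/health"):
    """Split log lines into (health_check_lines, service_lines) by exact request path."""
    health, service = [], []
    for line in lines:
        match = REQUEST_RE.search(line)
        # Ignore any query string when comparing the path.
        path = match.group("path").split("?", 1)[0] if match else None
        (health if path == health_path else service).append(line)
    return health, service
```

Matching the exact path, and not a prefix, keeps service URLs that merely start with the same characters in the service log.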
Failed CLB health check records are not displayed in the Health Check Log console
Health check logs are generated on an hourly basis and are retained for three days by default. If the health status of an SLB listener does not change within a one-hour period, no health check log is generated for that period. For a longer retention period, you can store health check logs in OSS.