This topic provides answers to some frequently asked questions about health checks for Classic Load Balancer (CLB).

How do health checks work?

CLB performs health checks to determine whether backend Elastic Compute Service (ECS) instances are available. If health checks are enabled and a backend ECS instance is found to be unhealthy, the CLB instance stops forwarding requests to that ECS instance until it becomes healthy again.

CLB uses the 100.64.0.0/10 CIDR block for health checks. Make sure that the backend ECS instances do not block this CIDR block. You do not need to perform additional configurations on the ECS security group. However, if security policies such as iptables are configured, make sure that the security policies allow requests from 100.64.0.0/10.
Note: 100.64.0.0/10 is reserved by Alibaba Cloud and other users cannot use this CIDR block. Therefore, requests from 100.64.0.0/10 are safe.
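
When you audit firewall rules or access logs, it can help to confirm whether a source address belongs to the health check CIDR block. The following is a minimal sketch that uses the Python standard library. The function name is_health_check_source is hypothetical and only serves as an illustration.

```python
import ipaddress

# CIDR block that this topic names as the source of CLB health checks.
HEALTH_CHECK_NET = ipaddress.ip_network("100.64.0.0/10")


def is_health_check_source(addr: str) -> bool:
    """Return True if addr falls inside 100.64.0.0/10 (hypothetical helper)."""
    return ipaddress.ip_address(addr) in HEALTH_CHECK_NET
```

For example, is_health_check_source("100.97.1.2") returns True, while is_health_check_source("192.168.0.1") returns False.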

For more information, see Health check overview.

What are the recommended configurations for health checks?

To prevent intermittent failures from causing frequent status switching that affects system availability, the health check status changes only after health checks succeed or fail a specified number of consecutive times within a time window. For more information, see Configure health check.
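
The threshold behavior described above can be sketched as a small state machine. This is an illustration only, not the CLB implementation: the class name HealthState, the initial healthy status, and the rule that a single contrary result resets the streak are assumptions.

```python
from dataclasses import dataclass


@dataclass
class HealthState:
    """Hypothetical model of threshold-based status switching."""
    healthy_threshold: int = 3
    unhealthy_threshold: int = 3
    healthy: bool = True   # assumed initial status
    streak: int = 0        # consecutive results that contradict the current status

    def record(self, probe_ok: bool) -> bool:
        """Feed one probe result and return the resulting status."""
        if probe_ok == self.healthy:
            # The result agrees with the current status, so the streak resets.
            self.streak = 0
        else:
            self.streak += 1
            limit = self.unhealthy_threshold if self.healthy else self.healthy_threshold
            if self.streak >= limit:
                # Enough consecutive contrary results: switch the status.
                self.healthy = probe_ok
                self.streak = 0
        return self.healthy
```

With the default thresholds, two failed probes followed by one successful probe leave the status healthy. Only three failures in a row switch it to unhealthy.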

The following table describes the recommended health check configurations for TCP, HTTP, and HTTPS listeners.

Configuration            Recommended value
Response Timeout         5 seconds
Health Check Interval    2 seconds
Unhealthy Threshold      3

The following table describes the recommended health check configurations for UDP listeners.

Configuration            Recommended value
Response Timeout         10 seconds
Health Check Interval    5 seconds
Unhealthy Threshold      3
Healthy Threshold        3

Note: The recommended configurations help restore the service in a timely manner if the health check of a backend server fails. If you have stricter requirements, you can specify a lower response timeout. However, make sure that the response time under normal conditions is less than the timeout that you specify.
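
To get a feel for how these values interact, the following sketch estimates the worst-case time before a failed server is marked unhealthy. It assumes that every failed probe waits for the full response timeout and that the next probe starts one interval later. This timing model is an assumption made for illustration, so treat the result as a rough upper bound rather than a documented guarantee.

```python
def failure_detection_window(timeout_s: float, interval_s: float, unhealthy_threshold: int) -> float:
    """Estimate the seconds needed to mark a server unhealthy.

    Assumed model: each failed probe takes the full timeout, and one
    interval elapses between consecutive probes.
    """
    return timeout_s * unhealthy_threshold + interval_s * (unhealthy_threshold - 1)


# Recommended TCP/HTTP/HTTPS values: 5 s timeout, 2 s interval, threshold 3.
tcp_window = failure_detection_window(5, 2, 3)   # 19 seconds under the assumed model
# Recommended UDP values: 10 s timeout, 5 s interval, threshold 3.
udp_window = failure_detection_window(10, 5, 3)  # 40 seconds under the assumed model
```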

Can I disable the health check feature?

Yes, you can disable the health check feature. For more information, see Disable the health check feature.

Note: If you disable health checks, requests may be distributed to unhealthy ECS instances, which can affect your business. We recommend that you enable health checks.

How do I select a health check method for TCP listeners?

For TCP listeners, both the HTTP and TCP health check methods are supported:

  • A TCP health check sends SYN packets to an instance to check whether the instance is healthy.
  • An HTTP health check simulates browser access by sending HEAD or GET requests to an instance to check whether the instance is healthy.

A TCP health check consumes fewer server resources. If the traffic load on backend servers is high, select TCP health checks. Otherwise, select HTTP health checks.
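
The difference between the two methods can be sketched with the Python standard library. This is only an approximation of what CLB does: the TCP probe below completes a normal connection instead of stopping after the SYN-ACK, and treating 2xx and 3xx as healthy is an assumption. The function names tcp_check and http_check are hypothetical.

```python
import http.client
import socket


def tcp_check(host: str, port: int, timeout: float = 5.0) -> bool:
    """Port-level probe: healthy if a TCP connection can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(host: str, port: int, path: str = "/", timeout: float = 5.0) -> bool:
    """Application-level probe: healthy if a HEAD request returns 2xx or 3xx."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return 200 <= conn.getresponse().status < 400
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()
```

The TCP probe only confirms that the port accepts connections, while the HTTP probe also exercises the web server, which is why it costs the backend more per check.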

How are health checks impacted if the weight of an ECS instance is zero?

If you set the weight of an ECS instance to zero, CLB no longer forwards traffic to this ECS instance, and the health check still reports the ECS instance as healthy.

Setting the weight of an ECS instance to zero is equivalent to removing the ECS instance from CLB. We recommend that you set the weight of an ECS instance to zero only when you restart or maintain the ECS instance.

What health check method is used for HTTP listeners on backend ECS instances?

The HEAD method.

If the backend ECS instances do not support the HEAD method, they fail the health checks. We recommend that you test this by sending a HEAD request from an ECS instance to its own IP address. Run the following command on the ECS instance:
curl -v -0 -I -H "Host:" -X HEAD http://IP:port
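
The curl command above sends an HTTP/1.0 HEAD request without a Host header. If you prefer to script the same test, the following sketch sends an equivalent raw request and returns the status code. The helper names build_head_request and head_status are hypothetical, and the optional host parameter is included only so that you can compare the result with and without a Host header.

```python
import socket
from typing import Optional


def build_head_request(path: str = "/", host: Optional[str] = None) -> bytes:
    """Build a raw HTTP/1.0 HEAD request, optionally with a Host header."""
    lines = [f"HEAD {path} HTTP/1.0"]
    if host:
        lines.append(f"Host: {host}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def head_status(ip: str, port: int, path: str = "/",
                host: Optional[str] = None, timeout: float = 5.0) -> int:
    """Send the request and return the status code from the response line."""
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        sock.sendall(build_head_request(path, host))
        with sock.makefile("rb") as reader:
            status_line = reader.readline().decode("iso-8859-1")
    # A status line looks like "HTTP/1.1 200 OK"; the second field is the code.
    return int(status_line.split()[1])
```

A server that does not implement HEAD typically answers with a 4xx or 5xx code here, which corresponds to a failed health check.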

Which IP addresses do HTTP listeners use to perform health checks on backend ECS instances?

CLB uses the 100.64.0.0/10 CIDR block for health checks. Make sure that the backend ECS instances do not block this CIDR block. You do not need to perform additional configurations on the ECS security group. However, if security policies such as iptables are configured, make sure that the security policies allow requests from 100.64.0.0/10.
Note: 100.64.0.0/10 is reserved by Alibaba Cloud and other users cannot use this CIDR block. Therefore, requests from 100.64.0.0/10 are safe.

Why do the console and web logs display a different health check frequency?

Health checks are performed by clusters to avoid single points of failure. CLB proxies are deployed on multiple nodes, and each node performs health checks on the backend servers. Therefore, the health check frequency recorded in web logs is different from the frequency configured in the console.

Do health checks consume server resources?

No, health checks do not consume server resources because CLB uses private IP addresses to perform health checks.

How do I handle a health check failure caused by a faulty backend database?

  • Symptom

    The static website www.example.com and the dynamic website aliyundoc.com are deployed on the same ECS instance. Both websites have load balancing configured. When a backend database error occurs, an HTTP 502 error code is returned when www.example.com is accessed, even though the static website does not depend on the database.

  • Cause

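    The domain name configured for the CLB health check is aliyundoc.com. An error of an ApsaraDB RDS instance or a self-managed database interrupts access to aliyundoc.com, so the health check fails. CLB then considers the ECS instance unhealthy and stops forwarding requests to it, which also affects www.example.com.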

  • Solution

    Change the domain name for the CLB health check to www.example.com.

Why is a network connection exception recorded in the backend service logs, but the TCP health check is displayed as successful?

  • Symptom

    After a backend TCP port is configured on a CLB listener, the backend service logs frequently record network connection exceptions. The connections originate from the CLB instance, and the CLB instance also sends RST packets to the backend server.

  • Cause

    The problem is related to the health check mechanism.

    TCP health checks are transparent to upper-layer services and are designed to reduce the cost of health checks and the impact on backend services. A TCP health check performs only a simple handshake and then directly sends an RST packet to terminate the TCP connection. The data exchange process is as follows:
    1. The CLB instance sends a SYN packet to the backend port.
    2. The backend server replies with a SYN-ACK packet if the backend port is normal.
    3. After the CLB instance receives the response from the backend port, the CLB instance determines that the listener and the backend server are healthy.
    4. The CLB instance sends an RST packet to the backend port to terminate the connection. The health check is complete.

    After the health check succeeds, the CLB instance directly sends an RST packet to terminate the connection, and no data is sent afterwards. As a result, upper-layer services such as a Java connection pool determine that the connection is abnormal and report errors such as Connection reset by peer.

  • Solution
    • Change the protocol from TCP to HTTP.
    • Filter the logs for requests from the CIDR block of the CLB server and ignore related error messages.
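
The effect described in the cause above can be reproduced locally. The following sketch is an approximation: user-space code cannot stop after the SYN-ACK the way the health check does, so the prober completes the connection and then aborts it with an RST by setting SO_LINGER to zero. The function names are hypothetical, and the struct layout used for SO_LINGER assumes Linux or macOS.

```python
import socket
import struct
import threading


def probe_and_reset(port: int, accepted: threading.Event) -> None:
    """Connect like a health check, then abort the connection with an RST."""
    sock = socket.create_connection(("127.0.0.1", port), timeout=5)
    accepted.wait(timeout=5)  # wait until the backend has accepted the connection
    # l_onoff=1, l_linger=0: close() discards the connection and sends RST, not FIN.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()


def backend_view() -> str:
    """Return what a backend service observes on the probed connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    accepted = threading.Event()
    port = listener.getsockname()[1]
    prober = threading.Thread(target=probe_and_reset, args=(port, accepted))
    prober.start()
    conn, _ = listener.accept()
    accepted.set()
    prober.join()
    conn.settimeout(5)
    try:
        data = conn.recv(1024)
        return "closed cleanly" if data == b"" else "received data"
    except ConnectionResetError:
        return "connection reset by peer"
    finally:
        conn.close()
        listener.close()
```

The backend sees a connection that carries no data and then fails on read, which is the same pattern that the backend service logs record for each health check.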

Why is the health check result returned as unhealthy even though the service is running as expected?

  • Symptom
    The HTTP health check always fails, but when you run the curl -I command, the status code returned is normal.
    echo -e 'HEAD /test.html HTTP/1.0\r\n\r\n' | nc -t 192.168.0.1 80
  • Cause

    If the returned status code does not match the healthy status codes configured in the console, the backend ECS instances are considered unhealthy. For example, if you specify that HTTP 2xx status codes indicate a healthy status, ECS instances that return any other status code are considered unhealthy.

    When Tengine or NGINX is used, the curl command returns no error. However, the request sent by the echo command carries no Host header, so it matches the default site. As a result, the request for the test file test.html returns an HTTP 404 error code, as shown in the following figure.

  • Solution
    • Modify the main configuration file and comment out the default site.
    • Add the domain name that is used for health checks in the health check configurations.
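
The comparison described in the cause above can be sketched as follows. The notation "2xx" for a status code class and the function name is_healthy_status are assumptions made for illustration and do not reflect how the console stores the setting.

```python
from typing import Iterable


def is_healthy_status(status: int, healthy_classes: Iterable[str] = ("2xx",)) -> bool:
    """Return True if the status code falls in one of the configured classes."""
    return f"{status // 100}xx" in set(healthy_classes)
```

With only 2xx configured, the 404 returned by the default site is treated as unhealthy, even though the same file returns 200 when the request carries the correct domain name.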