Alibaba Cloud DNS: 4. Health Check Template

Last Updated: Dec 23, 2025

You can create a Health Check Template and apply it to an endpoint to monitor its health status. If an endpoint becomes unavailable, Global Traffic Manager uses its switchover feature to automatically remove the abnormal node so that your service continues to run smoothly.

Terms

Health Check Template: A template that contains preset configurations to perform health checks on an address. You can create Health Check Templates that use the PING, TCP, HTTP, or HTTPS protocol.

Health check task: A task that is created when you apply a Health Check Template to an address. You can apply multiple Health Check Templates to a single address, and you can apply a single Health Check Template to multiple addresses.

Create a Health Check Template

  1. Go to Alibaba Cloud DNS - Global Traffic Manager.

  2. Click Health Check Template > Create Health Check Template. In the dialog box that appears, configure the parameters and submit the form.

    Ping health checks

    Form Item

    Description

    Template Name

    The name of the Health Check Template. We recommend that you name the template based on its health check protocol for easy identification.

    Type

    The IP version of the detection nodes. Valid values: IPv4 and IPv6.

    Protocol

    Select ping to monitor metrics such as network reachability, packet loss rate, and latency to the IP address.

    ICMP Packages Sent

    The number of ICMP data packets to send for each ping health check. This value is used to calculate the packet loss rate and is fixed at 20.

    Packet Loss Rate

    The packet loss rate is calculated for each ping health check using the following formula: Packet loss rate = (Lost packets / Total ICMP packets sent) × 100%. An alert is triggered if the packet loss rate reaches the specified threshold. Valid values: 10%, 30%, 80%, 90%, and 100%.
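
As a worked illustration of the formula above, the following minimal Python sketch computes the packet loss rate for one ping health check and compares it with a configured threshold. The function names are hypothetical and are not part of any Alibaba Cloud API. The sketch assumes that "reaches the specified threshold" means greater than or equal to.

```python
ICMP_PACKETS_SENT = 20  # fixed number of ICMP packets per ping health check


def packet_loss_rate(lost_packets: int, total_sent: int = ICMP_PACKETS_SENT) -> float:
    """Return the packet loss rate as a percentage."""
    if not 0 <= lost_packets <= total_sent:
        raise ValueError("lost_packets must be between 0 and total_sent")
    # Multiply first so that whole-number percentages stay exact.
    return lost_packets * 100 / total_sent


def loss_alert(lost_packets: int, threshold_percent: float) -> bool:
    """Assumption: an alert fires when the rate is >= the threshold."""
    return packet_loss_rate(lost_packets) >= threshold_percent
```

For example, 6 lost packets out of 20 give a loss rate of 30%, which would trigger an alert at the 10% or 30% threshold but not at 80%.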

    Interval

    The interval between two consecutive health checks. The default value is 1 minute. The minimum interval is 15 seconds. This feature is available only for Ultimate Edition instances.

    Timeout Period

    The timeout period for a response to a data packet. If a response is not received within this period, the health check is considered to have timed out. Valid values: 2 seconds, 3 seconds, 5 seconds, and 10 seconds.

    Retries

    The number of consecutive failed health checks after which the system marks the application service as abnormal. This setting helps prevent false positives that are caused by transient network jitter. Valid values: 1, 2, and 3.

    • 1: The application service is marked as abnormal after one failed health check.

    • 2: The application service is marked as abnormal after two consecutive failed health checks.

    • 3: The application service is marked as abnormal after three consecutive failed health checks.
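
The retry rule above can be sketched as a counter of consecutive failures. This is only an illustration of the described behavior, not the service's actual implementation. The function name and the list-of-booleans input are assumptions made for the example.

```python
from typing import Iterable


def is_marked_abnormal(results: Iterable[bool], retries: int) -> bool:
    """results: True for a passed check, False for a failed one, oldest first."""
    if retries not in (1, 2, 3):
        raise ValueError("retries must be 1, 2, or 3")
    consecutive_failures = 0
    for passed in results:
        # A passed check resets the counter; a failed check extends the streak.
        consecutive_failures = 0 if passed else consecutive_failures + 1
        if consecutive_failures >= retries:
            return True
    return False
```

With Retries set to 2, the sequence pass, fail, pass, fail does not mark the service as abnormal because the two failures are not consecutive.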

    Important

    If the destination address is unreachable or returns an ICMP destination unreachable message, these events are not counted towards the failure rate and do not trigger alerts. In this case, you must fix the network issue for the address or switch to an HTTP health check task.

    Detection Node

    The system provides the following default detection nodes based on the address type:

    IPv4 detection:

    • Carrier nodes: Changsha Telecom, Nanjing Unicom, Dalian Mobile, Qingdao Telecom, Tianjin Unicom, Dalian Unicom, Zhengzhou Telecom, Shenzhen Mobile, Xi'an Telecom, Nanjing Mobile

    • BGP nodes: Qingdao, Shanghai, Zhangjiakou, Hohhot, Shenzhen, Hangzhou, Beijing

    • International nodes: Malaysia, Japan, Singapore, California, Hong Kong SAR, Germany

    IPv6 detection:

    • BGP nodes: Shanghai, Hohhot, Shenzhen, Beijing

    • International nodes: Hong Kong SAR

    Important

    • If all addresses in the address pool are Alibaba Cloud IP addresses and you use a blackhole filtering policy for fault testing, select carrier nodes. Blackhole filtering is an access control list (ACL) policy that is applied on the Internet between Alibaba Cloud and carrier networks, whereas traffic between Alibaba Cloud IP addresses flows primarily within the Alibaba Cloud internal network. Probes that stay inside the internal network may not be affected by the blackhole policy, which reduces the effectiveness of the detection.

    • Difference between BGP nodes and carrier nodes: Border Gateway Protocol (BGP) nodes can dynamically select the optimal carrier network line. For example, if the China Mobile line in Shanghai fails, a BGP detection point in Shanghai might switch to the China Telecom line for detection. This switchover occurs unless all available lines are down. In contrast, a carrier node uses only its specified carrier network and does not have a line optimization mechanism.

    • To configure a whitelist for access sources on your server, click View Monitoring Node IP Addresses to obtain the IP addresses of the detection points.

    • If the IP address that you want to monitor is located outside China, select international nodes.
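
If you maintain the whitelist in application code rather than in a firewall, the check could look like the following sketch. The networks listed are placeholder documentation ranges (RFC 5737), not real detection node addresses. Replace them with the addresses shown under View Monitoring Node IP Addresses. The names `ALLOWED_PROBE_NETWORKS` and `is_allowed_probe` are hypothetical.

```python
import ipaddress

# Placeholder ranges only; substitute the detection node addresses from the console.
ALLOWED_PROBE_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("203.0.113.16/28"),
]


def is_allowed_probe(source_ip: str) -> bool:
    """Return True if source_ip belongs to one of the whitelisted networks."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in network for network in ALLOWED_PROBE_NETWORKS)
```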

    Detected Node Failure Rate

    The threshold for the percentage of failed detection points. If the percentage of failed detection points among all selected detection points exceeds this threshold, the application service is marked as abnormal. Valid values: 20%, 50%, 80%, and 100%.
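
The threshold logic can be illustrated with a short sketch. The text above says the failure percentage must exceed the threshold. This example assumes that reaching the threshold also counts, because a strict comparison could never be satisfied at 100%. Verify the exact comparison against the console behavior. The function name and the node-to-result mapping are hypothetical.

```python
from typing import Mapping


def service_is_abnormal(node_results: Mapping[str, bool], threshold_percent: int) -> bool:
    """node_results maps a detection node name to True (healthy) or False (failed)."""
    if not node_results:
        raise ValueError("at least one detection node must be selected")
    failed = sum(1 for healthy in node_results.values() if not healthy)
    failure_percent = failed * 100 / len(node_results)
    # Assumption: >= rather than >, see the note above.
    return failure_percent >= threshold_percent
```

For example, if one of five selected nodes reports a failure, the failure percentage is 20%. Under the assumption above, the service would be marked as abnormal at the 20% threshold but not at 50%.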

    TCP health checks

    Form Item

    Description

    Template Name

    The name of the Health Check Template. We recommend that you name the template based on its health check protocol for easy identification.

    Type

    The IP version of the detection nodes. Valid values: IPv4 and IPv6.

    Protocol

    Select tcp to use the TCP protocol to monitor metrics such as network reachability, port availability, and latency of the target IP address.
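
To reproduce a TCP check from your own machine while troubleshooting, a minimal probe can be written with the Python standard library. This is a sketch of what a port availability test does in general, not the code that the detection nodes run. The function name is hypothetical.

```python
import socket


def tcp_port_is_open(host: str, port: int, timeout_seconds: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection performs the TCP three-way handshake.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```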

    Interval

    The interval between two consecutive health checks. The default value is 1 minute. The minimum interval is 15 seconds. This feature is available only for Ultimate Edition instances.

    Timeout Period

    The timeout period for a response to a data packet. If a response is not received within this period, the health check is considered to have timed out. Valid values: 2 seconds, 3 seconds, 5 seconds, and 10 seconds.

    Retries

    The number of consecutive failed health checks after which the system marks the application service as abnormal. This setting helps prevent false positives that are caused by transient network jitter. Valid values: 1, 2, and 3.

    • 1: The application service is marked as abnormal after one failed health check.

    • 2: The application service is marked as abnormal after two consecutive failed health checks.

    • 3: The application service is marked as abnormal after three consecutive failed health checks.

    Detection Node

    The system provides the following default detection nodes based on the address type:

    IPv4 detection:

    • Carrier nodes: Changsha Telecom, Nanjing Unicom, Dalian Mobile, Qingdao Telecom, Tianjin Unicom, Dalian Unicom, Zhengzhou Telecom, Shenzhen Mobile, Xi'an Telecom, Nanjing Mobile

    • BGP nodes: Qingdao, Shanghai, Zhangjiakou, Hohhot, Shenzhen, Hangzhou, Beijing

    • International nodes: Malaysia, Japan, Singapore, California, Hong Kong SAR, Germany

    IPv6 detection:

    • BGP nodes: Shanghai, Hohhot, Shenzhen, Beijing

    • International nodes: Hong Kong SAR

    Important

    • If all addresses in the address pool are Alibaba Cloud IP addresses and you use a blackhole filtering policy for fault testing, select carrier nodes. Blackhole filtering is an access control list (ACL) policy that is applied on the Internet between Alibaba Cloud and carrier networks, whereas traffic between Alibaba Cloud IP addresses flows primarily within the Alibaba Cloud internal network. Probes that stay inside the internal network may not be affected by the blackhole policy, which reduces the effectiveness of the detection.

    • Difference between BGP nodes and carrier nodes: Border Gateway Protocol (BGP) nodes can dynamically select the optimal carrier network line. For example, if the China Mobile line in Shanghai fails, a BGP detection point in Shanghai might switch to the China Telecom line for detection. This switchover occurs unless all available lines are down. In contrast, a carrier node uses only its specified carrier network and does not have a line optimization mechanism.

    • To configure a whitelist for access sources on your server, click View Monitoring Node IP Addresses to obtain the IP addresses of the detection points.

    • If the IP address that you want to monitor is located outside China, select international nodes.

    Detected Node Failure Rate

    The threshold for the percentage of failed detection points. If the percentage of failed detection points among all selected detection points exceeds this threshold, the application service is marked as abnormal. Valid values: 20%, 50%, 80%, and 100%.

    HTTP/HTTPS health checks

    Form Item

    Description

    Template Name

    The name of the Health Check Template. We recommend that you name the template based on its health check protocol for easy identification.

    Type

    The IP version of the detection nodes. Valid values: IPv4 and IPv6.

    Protocol

    Select http or https to monitor metrics of the web server at the target IP address, such as network reachability, service availability, and time to first byte.

    Host Settings

    Specifies the Host field in the header of the HTTP(S) request. This identifies the specific HTTP website to access. The default value is the primary domain name. If the target website has specific Host requirements, modify this field.

    HTTP Path

    The URL path for the HTTP(S) health check. The system default is "/".
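
The relationship between the target IP address, the Host header, and the path can be illustrated with the following sketch, which builds (but does not send) a probe request. The function name, the example IP address `192.0.2.10`, and the path `/healthz` are illustrative assumptions.

```python
import urllib.request


def build_probe_request(
    ip: str, host_header: str, path: str = "/", scheme: str = "http"
) -> urllib.request.Request:
    """Build a GET request addressed to an IP with an explicit Host header."""
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    # The connection goes to the IP address; the Host header selects the website.
    return urllib.request.Request(
        f"{scheme}://{ip}{path}", headers={"Host": host_header}, method="GET"
    )


request = build_probe_request("192.0.2.10", "www.example.com", "/healthz")
```

Passing `request` to `urllib.request.urlopen` with a timeout would send it. A server that hosts several sites on one IP address uses the Host value to decide which site answers.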

    Verification Content

    When performing an HTTP(S) health check, the system determines whether the web server is working correctly based on the returned status code. If the status code is greater than or equal to the alert threshold, the system considers the application service to be abnormal:

    • Failure code is greater than or equal to 400: Bad Request. If an HTTP(S) request contains incorrect parameters, the web server returns a code of 400 or higher. If you set Verification Content to "Failure code is greater than or equal to 400", make sure to enter the exact URL path and parameters in the HTTP Path field.

    • Failure code is greater than or equal to 500: Server Error. If the web server encounters an error, it returns a code of 500 or higher. By default, the system uses a failure code of 500 or higher as the alert threshold.

    • Verify Response: Required. Site monitoring matches this content against the first 64 KB of the HTTP server's response body. If the response message does not contain this content, the health check fails. The content can be in Chinese or English. Regular expressions are not supported.
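
The two kinds of verification described above can be sketched as separate helper functions. The function names are hypothetical, and the sketch assumes that the expected content and the response body are both UTF-8 encoded. How the service combines the two checks is not modeled here.

```python
MAX_BODY_BYTES = 64 * 1024  # only the first 64 KB of the body is inspected


def status_code_fails(status_code: int, failure_threshold: int = 500) -> bool:
    """Return True if the status code reaches the configured failure threshold."""
    if failure_threshold not in (400, 500):
        raise ValueError("failure_threshold must be 400 or 500")
    return status_code >= failure_threshold


def response_contains(body: bytes, expected_text: str) -> bool:
    """Plain substring match (no regular expressions) within the first 64 KB."""
    return expected_text.encode("utf-8") in body[:MAX_BODY_BYTES]
```

A 404 response fails the check only when the threshold is 400, and content that appears after the first 64 KB of the body is not matched.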

    Interval

    The interval between two consecutive health checks. The default value is 1 minute. The minimum interval is 15 seconds. This feature is available only for Ultimate Edition instances.

    Timeout Period

    The timeout period for a response to a data packet. If a response is not received within this period, the health check is considered to have timed out. Valid values: 2 seconds, 3 seconds, 5 seconds, and 10 seconds.

    Retries

    The number of consecutive failed health checks after which the system marks the application service as abnormal. This setting helps prevent false positives that are caused by transient network jitter. Valid values: 1, 2, and 3.

    • 1: The application service is marked as abnormal after one failed health check.

    • 2: The application service is marked as abnormal after two consecutive failed health checks.

    • 3: The application service is marked as abnormal after three consecutive failed health checks.

    Enable SNI

    Server Name Indication (SNI) is an extension to the Transport Layer Security (TLS) protocol that allows a client to specify the hostname it wants to connect to at the start of the TLS handshake. Because the TLS handshake occurs before any HTTP request data is sent, SNI allows the server to know which service the client is trying to access before sending the certificate. This lets the server present the correct certificate to the client. Enable this option if the detection nodes should send the SNI extension during the TLS handshake, for example, when the target server hosts multiple HTTPS sites on one IP address.
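
To see that the hostname travels in the first handshake message, the following sketch generates a TLS ClientHello in memory with Python's `ssl` module, without opening a network connection. The function name is hypothetical. This demonstrates SNI in general and does not reflect how the detection nodes are implemented.

```python
import ssl
from typing import Optional


def client_hello(server_hostname: Optional[str]) -> bytes:
    """Return the raw ClientHello bytes, with SNI if a hostname is given."""
    context = ssl.create_default_context()
    if server_hostname is None:
        # Hostname checking requires a hostname, so disable it for the no-SNI case.
        context.check_hostname = False
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = context.wrap_bio(incoming, outgoing, server_hostname=server_hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the ClientHello is written, then TLS waits for the server.
        pass
    return outgoing.read()
```

Calling `client_hello("www.example.com")` returns bytes that contain the hostname in plain text, whereas `client_hello(None)` returns a ClientHello without it.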

    Follow 3XX Redirection

    • Enabled: If the monitoring node receives a 3xx status code (301, 302, 303, 307, or 308), it follows the redirection.

    • Disabled: The monitoring node does not follow the redirection.
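
The effect of this switch can be sketched as a small decision function. The function name and parameters are hypothetical. The sketch only models the decision of whether a probe requests another URL, and resolves a relative Location header against the current URL.

```python
from typing import Optional
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def next_probe_url(
    current_url: str, status_code: int, location: Optional[str], follow_redirects: bool
) -> Optional[str]:
    """Return the URL to request next, or None if the probe stops at this response."""
    if follow_redirects and status_code in REDIRECT_CODES and location:
        return urljoin(current_url, location)
    return None
```

When redirection is disabled, the 3xx response itself is the final result of the check, which matters if the target always redirects (for example, from HTTP to HTTPS).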

    Detection Node

    The system provides the following default detection nodes based on the address type:

    IPv4 detection:

    • Carrier nodes: Changsha Telecom, Nanjing Unicom, Dalian Mobile, Qingdao Telecom, Tianjin Unicom, Dalian Unicom, Zhengzhou Telecom, Shenzhen Mobile, Xi'an Telecom, Nanjing Mobile

    • BGP nodes: Qingdao, Shanghai, Zhangjiakou, Hohhot, Shenzhen, Hangzhou, Beijing

    • International nodes: Malaysia, Japan, Singapore, California, Hong Kong SAR, Germany

    IPv6 detection:

    • BGP nodes: Shanghai, Hohhot, Shenzhen, Beijing

    • International nodes: Hong Kong SAR

    Important

    • If all addresses in the address pool are Alibaba Cloud IP addresses and you use a blackhole filtering policy for fault testing, select carrier nodes. Blackhole filtering is an access control list (ACL) policy that is applied on the Internet between Alibaba Cloud and carrier networks, whereas traffic between Alibaba Cloud IP addresses flows primarily within the Alibaba Cloud internal network. Probes that stay inside the internal network may not be affected by the blackhole policy, which reduces the effectiveness of the detection.

    • Difference between BGP nodes and carrier nodes: Border Gateway Protocol (BGP) nodes can dynamically select the optimal carrier network line. For example, if the China Mobile line in Shanghai fails, a BGP detection point in Shanghai might switch to the China Telecom line for detection. This switchover occurs unless all available lines are down. In contrast, a carrier node uses only its specified carrier network and does not have a line optimization mechanism.

    • To configure a whitelist for access sources on your server, click View Monitoring Node IP Addresses to obtain the IP addresses of the detection points.

    • If the IP address that you want to monitor is located outside China, select international nodes.

    Detected Node Failure Rate

    The threshold for the percentage of failed detection points. If the percentage of failed detection points among all selected detection points exceeds this threshold, the application service is marked as abnormal. Valid values: 20%, 50%, 80%, and 100%.
