You can create a Health Check Template and apply it to an endpoint to monitor its health status. If an endpoint becomes unavailable, Global Traffic Manager uses its switchover feature to automatically remove the abnormal node. This ensures that your service runs smoothly.
Terms
Health Check Template: A template that contains preset configurations for performing health checks on an address. You can create Health Check Templates that use the PING, TCP, HTTP, or HTTPS protocol.
Health check task: A task that is created when you apply a Health Check Template to an address. You can apply multiple Health Check Templates to a single address, and a single Health Check Template can be applied to multiple addresses.
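The relationship between templates, addresses, and tasks can be pictured with a small sketch. The class and function names below (HealthCheckTemplate, HealthCheckTask, apply_templates) are hypothetical and are not part of any Global Traffic Manager API. The sketch applies every template to every address purely for illustration; it only shows that one task results from each (template, address) pair.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class HealthCheckTemplate:
    """Hypothetical model of a template: a name plus its protocol."""
    name: str
    protocol: str  # one of "ping", "tcp", "http", "https"


@dataclass(frozen=True)
class HealthCheckTask:
    """Hypothetical model of a task: one template applied to one address."""
    template: HealthCheckTemplate
    address: str


def apply_templates(templates, addresses):
    """Return one task per (template, address) pair."""
    return [HealthCheckTask(t, a) for t, a in product(templates, addresses)]


templates = [
    HealthCheckTemplate("ping-default", "ping"),
    HealthCheckTemplate("http-default", "http"),
]
addresses = ["192.0.2.10", "192.0.2.11"]  # documentation-only example IPs

# Two templates applied to two addresses yield four tasks.
tasks = apply_templates(templates, addresses)
```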
Create a Health Check Template
Click Health Check Template > Create Health Check Template. In the dialog box that appears, configure the parameters and submit the form.
Ping health checks
Form Item
Description
Template Name
The name of the Health Check Template. We recommend that you name the template based on its health check protocol for easy identification.
Type
The IP version of the detection nodes. Valid values: IPv4 and IPv6.
Protocol
Select ping to monitor metrics such as network reachability, packet loss rate, and latency to the IP address.
ICMP Packages Sent
The number of ICMP data packets to send for each ping health check. This value is used to calculate the packet loss rate and is fixed at 20.
Packet Loss Rate
The packet loss rate is calculated for each ping health check using the following formula: Packet loss rate = (Lost packets / Total ICMP packets sent) × 100%. An alert is triggered if the packet loss rate reaches the specified threshold. Valid values: 10%, 30%, 80%, 90%, and 100%.
Interval
The interval between two consecutive health checks. The default value is 1 minute. The minimum interval of 15 seconds is available only for Ultimate Edition instances.
Timeout Period
The timeout period for a response to a data packet. If a response is not received within this period, the health check is considered to have timed out. Valid values: 2 seconds, 3 seconds, 5 seconds, and 10 seconds.
Retries
The number of consecutive times a health check can fail before the system marks the application service as abnormal. This setting helps prevent false positives that are caused by transient network jitter. Valid values: 1, 2, and 3.
1: The application service is marked as abnormal after one failed health check.
2: The application service is marked as abnormal after two consecutive failed health checks.
3: The application service is marked as abnormal after three consecutive failed health checks.
Important: If the destination address is unreachable or returns an ICMP destination unreachable message, these events are not counted towards the failure rate and do not trigger alerts. In this case, you must fix the network issue for the address or switch to an HTTP health check task.
Detection Node
The system provides the following default detection nodes based on the address type:
IPv4 detection:
Carrier nodes: Changsha Telecom, Nanjing Unicom, Dalian Mobile, Qingdao Telecom, Tianjin Unicom, Dalian Unicom, Zhengzhou Telecom, Shenzhen Mobile, Xi'an Telecom, Nanjing Mobile
BGP nodes: Qingdao, Shanghai, Zhangjiakou, Hohhot, Shenzhen, Hangzhou, Beijing
International nodes: Malaysia, Japan, Singapore, California, Hong Kong SAR, Germany
IPv6 detection:
BGP nodes: Shanghai, Hohhot, Shenzhen, Beijing
International nodes: Hong Kong SAR
Important: If all addresses in the address pool are Alibaba Cloud IP addresses and you use a blackhole filtering policy for fault testing, select carrier nodes. This is because blackhole filtering is an access control list (ACL) policy that is applied on the Internet between Alibaba Cloud and carrier networks. However, traffic between Alibaba Cloud IP addresses flows primarily within the Alibaba Cloud internal network, which can reduce the effectiveness of the detection.
Difference between BGP nodes and carrier nodes: Border Gateway Protocol (BGP) nodes can dynamically select the optimal carrier network line. For example, if the China Mobile line in Shanghai fails, a BGP detection point in Shanghai might switch to the China Telecom line for detection. This switchover occurs unless all available lines are down. In contrast, a carrier node uses only its specified carrier network and does not have a line optimization mechanism.
To configure a whitelist of access sources on your server, click View Monitoring Node IP Addresses to obtain the IP addresses of the detection points.
If the IP address that you want to monitor is located outside China, select international nodes.
Detected Node Failure Rate
The threshold for the percentage of failed detection points. If the percentage of failed detection points among all selected detection points exceeds this threshold, the application service is marked as abnormal. Valid values: 20%, 50%, 80%, and 100%.
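The packet loss formula and the Detected Node Failure Rate threshold from the table above can be combined into a rough sketch of one round of ping checks. The function names and the sample node results are hypothetical. The sketch ignores the Retries setting, and it assumes that the node failure threshold is met when the percentage reaches the threshold (>=), because a 100% threshold could otherwise never trigger. The actual service logic may differ.

```python
ICMP_PACKETS_SENT = 20  # fixed number of packets per ping health check


def packet_loss_rate(lost_packets, total_sent=ICMP_PACKETS_SENT):
    """Packet loss rate in percent: (lost / total sent) * 100."""
    return lost_packets * 100 / total_sent


def node_check_failed(lost_packets, loss_threshold_pct):
    """A node's check fails when the loss rate reaches the threshold."""
    return packet_loss_rate(lost_packets) >= loss_threshold_pct


def service_abnormal(lost_by_node, loss_threshold_pct, node_failure_threshold_pct):
    """Aggregate one round of per-node results into a service verdict."""
    failed = sum(
        node_check_failed(lost, loss_threshold_pct)
        for lost in lost_by_node.values()
    )
    failed_pct = failed * 100 / len(lost_by_node)
    # Assumption: ">=" rather than ">" so that a 100% threshold can be met.
    return failed_pct >= node_failure_threshold_pct


# Hypothetical results: packets lost out of 20 at each selected node.
lost_by_node = {"Qingdao": 0, "Shanghai": 6, "Beijing": 20, "Hangzhou": 2}

# With a 30% loss threshold, Shanghai (30%) and Beijing (100%) fail: 2 of 4 nodes.
abnormal_at_50 = service_abnormal(lost_by_node, 30, 50)
abnormal_at_80 = service_abnormal(lost_by_node, 30, 80)
```

In this example, half of the nodes fail, so a 50% node failure threshold marks the service as abnormal while an 80% threshold does not.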
TCP health checks
Form Item
Description
Template Name
The name of the Health Check Template. We recommend that you name the template based on its health check protocol for easy identification.
Type
The IP version of the detection nodes. Valid values: IPv4 and IPv6.
Protocol
Select tcp to use the TCP protocol to monitor metrics such as network reachability, port availability, and latency of the target IP address.
Interval
The interval between two consecutive health checks. The default value is 1 minute. The minimum interval of 15 seconds is available only for Ultimate Edition instances.
Timeout Period
The timeout period for a response to a data packet. If a response is not received within this period, the health check is considered to have timed out. Valid values: 2 seconds, 3 seconds, 5 seconds, and 10 seconds.
Retries
The number of consecutive times a health check can fail before the system marks the application service as abnormal. This setting helps prevent false positives that are caused by transient network jitter. Valid values: 1, 2, and 3.
1: The application service is marked as abnormal after one failed health check.
2: The application service is marked as abnormal after two consecutive failed health checks.
3: The application service is marked as abnormal after three consecutive failed health checks.
Detection Node
The system provides the following default detection nodes based on the address type:
IPv4 detection:
Carrier nodes: Changsha Telecom, Nanjing Unicom, Dalian Mobile, Qingdao Telecom, Tianjin Unicom, Dalian Unicom, Zhengzhou Telecom, Shenzhen Mobile, Xi'an Telecom, Nanjing Mobile
BGP nodes: Qingdao, Shanghai, Zhangjiakou, Hohhot, Shenzhen, Hangzhou, Beijing
International nodes: Malaysia, Japan, Singapore, California, Hong Kong SAR, Germany
IPv6 detection:
BGP nodes: Shanghai, Hohhot, Shenzhen, Beijing
International nodes: Hong Kong SAR
Important: If all addresses in the address pool are Alibaba Cloud IP addresses and you use a blackhole filtering policy for fault testing, select carrier nodes. This is because blackhole filtering is an access control list (ACL) policy that is applied on the Internet between Alibaba Cloud and carrier networks. However, traffic between Alibaba Cloud IP addresses flows primarily within the Alibaba Cloud internal network, which can reduce the effectiveness of the detection.
Difference between BGP nodes and carrier nodes: Border Gateway Protocol (BGP) nodes can dynamically select the optimal carrier network line. For example, if the China Mobile line in Shanghai fails, a BGP detection point in Shanghai might switch to the China Telecom line for detection. This switchover occurs unless all available lines are down. In contrast, a carrier node uses only its specified carrier network and does not have a line optimization mechanism.
To configure a whitelist of access sources on your server, click View Monitoring Node IP Addresses to obtain the IP addresses of the detection points.
If the IP address that you want to monitor is located outside China, select international nodes.
Detected Node Failure Rate
The threshold for the percentage of failed detection points. If the percentage of failed detection points among all selected detection points exceeds this threshold, the application service is marked as abnormal. Valid values: 20%, 50%, 80%, and 100%.
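To make the Timeout Period and Retries settings more concrete, the following sketch shows what a single TCP probe and a consecutive-failure counter might look like. It is an illustration only, not the implementation used by the detection nodes. The names tcp_probe and RetryTracker are hypothetical, and the sketch assumes that a passed check resets the failure count, which is what "consecutive" implies.

```python
import socket


def tcp_probe(host, port, timeout_s=5.0):
    """Return True if a TCP connection to (host, port) opens within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:  # refused connections, timeouts, unreachable hosts
        return False


class RetryTracker:
    """Marks a service abnormal after `retries` consecutive failed checks."""

    def __init__(self, retries):
        if retries not in (1, 2, 3):
            raise ValueError("Retries must be 1, 2, or 3")
        self.retries = retries
        self.consecutive_failures = 0

    def record(self, check_passed):
        """Record one check result; return True if the service is now abnormal."""
        if check_passed:
            # Assumption: a passed check resets the consecutive-failure count.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.retries


# With Retries set to 2, an isolated failure does not mark the service abnormal.
tracker = RetryTracker(retries=2)
states = [tracker.record(ok) for ok in (False, True, False, False)]
```

Only the last result in this sequence marks the service as abnormal, because it is the second failure in a row. In a real loop, each result passed to record would come from a call such as tcp_probe(address, port, timeout_s=5.0) made once per interval.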
HTTP/HTTPS health checks
Form Item
Description
Template Name
The name of the Health Check Template. We recommend that you name the template based on its health check protocol for easy identification.
Type
The IP version of the detection nodes. Valid values: IPv4 and IPv6.
Protocol
Select http or https to monitor metrics of the web server at the target IP address, such as network reachability, service availability, and time to first byte.
Host Settings
Specifies the Host field in the header of the HTTP(S) request. This identifies the specific HTTP website to access. The default value is the primary domain name. If the target website has specific Host requirements, modify this field.
HTTP Path
The URL path for the HTTP(S) health check. The system default is "/".
Verification Content
When performing an HTTP(S) health check, the system determines whether the web server is working correctly based on the return code. If the return code reaches the alert threshold, the system considers the application service to be abnormal:
Failure code is greater than or equal to 400: Bad Request. If an HTTP(S) request contains incorrect parameters, the web server returns a code of 400 or higher. If you select this option, make sure that the HTTP Path field contains the exact URL path and parameters.
Failure code is greater than or equal to 500: Server Error. If the web server encounters an error, it returns a code of 500 or higher. By default, the system uses a failure code of 500 or higher as the alert threshold.
Verify Response: Required. Site monitoring matches this content against the first 64 KB of the HTTP server's response body. If the response message does not contain this content, the health check fails. The content can be in Chinese or English. Regular expressions are not supported.
Interval
The interval between two consecutive health checks. The default value is 1 minute. The minimum interval of 15 seconds is available only for Ultimate Edition instances.
Timeout Period
The timeout period for a response to a data packet. If a response is not received within this period, the health check is considered to have timed out. Valid values: 2 seconds, 3 seconds, 5 seconds, and 10 seconds.
Retries
The number of consecutive times a health check can fail before the system marks the application service as abnormal. This setting helps prevent false positives that are caused by transient network jitter. Valid values: 1, 2, and 3.
1: The application service is marked as abnormal after one failed health check.
2: The application service is marked as abnormal after two consecutive failed health checks.
3: The application service is marked as abnormal after three consecutive failed health checks.
Enable SNI
Server Name Indication (SNI) is an extension to the Transport Layer Security (TLS) protocol that allows a client to specify the hostname it wants to connect to at the start of the TLS handshake. Because the TLS handshake occurs before any HTTP request data is sent, SNI allows the server to know which service the client is trying to access before sending the certificate. This lets the server present the correct certificate to the client. When this option is enabled, the health check request includes the SNI extension in the TLS handshake.
Follow 3XX Redirection
Enabled: If the monitoring node receives a 3xx status code (301, 302, 303, 307, or 308), it follows the redirection.
Disabled: The node does not follow the redirection.
Detection Node
The system provides the following default detection nodes based on the address type:
IPv4 detection:
Carrier nodes: Changsha Telecom, Nanjing Unicom, Dalian Mobile, Qingdao Telecom, Tianjin Unicom, Dalian Unicom, Zhengzhou Telecom, Shenzhen Mobile, Xi'an Telecom, Nanjing Mobile
BGP nodes: Qingdao, Shanghai, Zhangjiakou, Hohhot, Shenzhen, Hangzhou, Beijing
International nodes: Malaysia, Japan, Singapore, California, Hong Kong SAR, Germany
IPv6 detection:
BGP nodes: Shanghai, Hohhot, Shenzhen, Beijing
International nodes: Hong Kong SAR
Important: If all addresses in the address pool are Alibaba Cloud IP addresses and you use a blackhole filtering policy for fault testing, select carrier nodes. This is because blackhole filtering is an access control list (ACL) policy that is applied on the Internet between Alibaba Cloud and carrier networks. However, traffic between Alibaba Cloud IP addresses flows primarily within the Alibaba Cloud internal network, which can reduce the effectiveness of the detection.
Difference between BGP nodes and carrier nodes: Border Gateway Protocol (BGP) nodes can dynamically select the optimal carrier network line. For example, if the China Mobile line in Shanghai fails, a BGP detection point in Shanghai might switch to the China Telecom line for detection. This switchover occurs unless all available lines are down. In contrast, a carrier node uses only its specified carrier network and does not have a line optimization mechanism.
To configure a whitelist of access sources on your server, click View Monitoring Node IP Addresses to obtain the IP addresses of the detection points.
If the IP address that you want to monitor is located outside China, select international nodes.
Detected Node Failure Rate
The threshold for the percentage of failed detection points. If the percentage of failed detection points among all selected detection points exceeds this threshold, the application service is marked as abnormal. Valid values: 20%, 50%, 80%, and 100%.
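The Verification Content options in the table above can be approximated with two small helper functions. This is a sketch under stated assumptions, not the service's own logic. The function names are hypothetical, the response body is assumed to be UTF-8 encoded bytes, and the sketch does not assume how the options combine in the console, so each rule is shown separately.

```python
MAX_BODY_BYTES = 64 * 1024  # only the first 64 KB of the body is matched


def status_code_failed(status_code, failure_code_threshold=500):
    """Return True if the return code reaches the failure code threshold."""
    if failure_code_threshold not in (400, 500):
        raise ValueError("threshold must be 400 or 500")
    return status_code >= failure_code_threshold


def response_body_matches(body, expected_text):
    """Return True if expected_text occurs in the first 64 KB of the body.

    Plain substring match: regular expressions are not supported.
    Assumption: the body and the expected text are both UTF-8.
    """
    return expected_text.encode("utf-8") in body[:MAX_BODY_BYTES]


# A 404 fails under the 400 threshold but passes under the default of 500.
fails_at_400 = status_code_failed(404, 400)
fails_at_500 = status_code_failed(404)

# Content beyond the first 64 KB is not inspected.
late_match = response_body_matches(b"x" * MAX_BODY_BYTES + b"ok", "ok")
```

The first comparison shows why the 400 threshold needs an exact HTTP Path: a path that returns 404 fails the check under that threshold, although it would pass under the default threshold of 500.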
