This topic describes how to perform a stress test on a Server Load Balancer (SLB) instance. Layer 4 SLB instances use Linux Virtual Server (LVS) and Keepalived to balance traffic loads, whereas Layer 7 SLB instances use Tengine.
Considerations
Take note of the following considerations when you perform a stress test on an SLB instance:
- Use short-lived connections if you want to test the forwarding capacity of an SLB instance.
Typically, an SLB stress test is used to benchmark the forwarding capacity of an SLB instance, in addition to its session persistence and load balancing capabilities. Short-lived connections are suitable for benchmarking the capacities of both the SLB instance and its backend servers. If you use short-lived connections in a stress test, make sure that the SLB instance has sufficient frontend ports to accept connections from clients.
- Use persistent connections if you want to test the throughput of the SLB instance.
Choose persistent connections if you want to test the bandwidth limit or if the backend application requires persistent connections.
Set the timeout period on the stress testing tool to a small value, such as 5 seconds. If you set the timeout period to a large value, the average round-trip time (RTT) shown in the testing result may increase. As a result, it is difficult to judge whether the SLB instance is under extreme stress conditions. If you set the timeout period to a small value, you can judge whether the SLB instance is able to withstand the load based on the request success rate shown in the testing result.
- Host a static page on the backend server for stress testing. This minimizes the impact of the application logic on the stress testing result.
- We also recommend that you use the following listener configurations:
- Disable session persistence. Otherwise, the SLB instance may distribute network traffic to only some of the backend servers.
- Disable health checks to reduce the number of requests sent to the backend servers.
- Associate at least five elastic IP addresses (EIPs) with the SLB instance if the SLB instance supports up to 5,000 concurrent connections.
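The considerations above (short-lived connections, a small timeout, and judging the result by request success rate) can be illustrated with a minimal Python sketch. It is not a replacement for a dedicated stress testing tool, and the function names, default request count, and concurrency are assumptions chosen for illustration. Each request opens a new connection, waits at most 5 seconds, and counts as successful only if an HTTP 2xx response is fully read.

```python
import concurrent.futures
import http.client
import urllib.error
import urllib.request

# Opener that ignores proxy environment variables so requests go straight to the target.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch_once(url, timeout):
    """Send one GET over a fresh connection; return True on a 2xx response within timeout."""
    try:
        # urllib sends "Connection: close", so every call is a short-lived connection.
        with _OPENER.open(url, timeout=timeout) as resp:
            resp.read()
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Timeouts, refused or reset connections, and non-2xx responses count as failures.
        return False


def run_short_connection_test(url, total_requests=1000, concurrency=50, timeout=5.0):
    """Issue total_requests requests with a thread pool and report the success rate."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda _: fetch_once(url, timeout), range(total_requests)))
    succeeded = sum(results)
    rate = succeeded / total_requests if total_requests else 0.0
    return {"requests": total_requests, "succeeded": succeeded, "success_rate": rate}
```

For example, calling run_short_connection_test with the URL of the static page served through the SLB instance returns a dictionary whose success_rate value drops below 1.0 once requests start to time out or fail.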
Recommended testing tool
We recommend that you do not use Apache ab to perform stress tests. In high concurrency scenarios, the waiting time of Apache ab increases in increments of 3 seconds, such as 3 seconds, 6 seconds, and 9 seconds. Apache ab determines whether requests are successful based on the specified content length. If the SLB instance that you want to benchmark is associated with multiple backend servers, the actual sizes of the responses returned by these backend servers may differ from the specified content length. This makes the stress testing results inaccurate.
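The effect of a length-based success check can be shown with a simplified model. The following Python sketch is not the code of Apache ab. The function name and the sample response sizes are hypothetical. It only compares how many requests are reported as failed when success is judged by content length and when it is judged by HTTP status code.

```python
def count_failures(responses, expected_length):
    """Count failures two ways for a list of (status_code, body_length) tuples.

    Returns (failures_by_length, failures_by_status).
    """
    # Length-based check: any response whose size differs from the expected size fails.
    by_length = sum(1 for _, length in responses if length != expected_length)
    # Status-based check: only non-2xx responses fail.
    by_status = sum(1 for status, _ in responses if not 200 <= status < 300)
    return by_length, by_status


# Two hypothetical backend servers return the same page with slightly different sizes.
sample = [(200, 612), (200, 615), (200, 612), (200, 615)]
print(count_failures(sample, expected_length=612))  # prints (2, 0)
```

In this model, half of the requests are reported as failed by the length-based check even though every request returned HTTP 200.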
Possible causes of low scores on stress tests
Requests distributed by a Layer 4 listener pass through an LVS and then reach a backend server. Requests distributed by a Layer 7 listener must pass through an LVS and a Tengine server, and then finally reach a backend server.
If you use Layer 7 listeners in a stress test, the benchmark scores may be low. Possible causes include the following:
- The SLB instance does not have sufficient frontend ports.
During a stress test, clients fail to establish connections with the SLB instance if the SLB instance does not have sufficient frontend ports. The SLB instance removes the timestamps of TCP connections by default. As a result, the tw_reuse setting (net.ipv4.tcp_tw_reuse) in the Linux protocol stack does not take effect, because reusing connections in the TIME_WAIT state depends on TCP timestamps. Connections in the TIME_WAIT state therefore accumulate and occupy the frontend ports of the SLB instance.
Solution: Configure clients to establish persistent connections instead of short-lived connections. In addition, set the SO_LINGER socket option so that connections are closed with Reset (RST) packets.
- The accept queue on the backend server is full.
If the accept queue on the backend server is full, the backend server can no longer return SYN-ACK packets. As a result, the client times out.
Solution: Run the following command to change the value of net.core.somaxconn, and then restart the application on the backend server. The default value of net.core.somaxconn is 128.
sysctl -w net.core.somaxconn=1024
- Excessive connections are established to the backend server.
Due to the architecture design of Layer 7 SLB, a Tengine server establishes short-lived connections to a backend server instead of persistent connections. As a result, the backend server may have excessive connections, which degrades the performance of the SLB instance in stress tests.
- The dependency of the application on the backend server becomes the performance bottleneck.
The traffic load on the backend server may be below the performance limit of the backend server. However, the application on the backend server may depend on another application, such as a database. In this case, the dependency may limit the performance of the SLB instance in stress tests.
- The health status of the backend server is abnormal.
If the backend server is declared unhealthy or the health status of the backend server changes frequently, this may degrade the performance of the SLB instance in stress tests.
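For the first cause in the list above, closing client connections with RST packets through the SO_LINGER socket option can be sketched in Python as follows. This is a minimal sketch that assumes a Linux client, where the linger structure is two C integers. The function name is hypothetical. Enabling SO_LINGER with a linger time of 0 makes close() abort the connection with an RST packet instead of the normal FIN handshake, so the client socket does not enter the TIME_WAIT state.

```python
import socket
import struct


def open_abortive_connection(host, port, timeout=5.0):
    """Connect to host:port and configure the socket to send RST when it is closed."""
    sock = socket.create_connection((host, port), timeout=timeout)
    # struct linger { int l_onoff; int l_linger; } on Linux:
    # l_onoff=1 enables lingering, l_linger=0 discards unsent data and resets on close().
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    return sock
```

Because an abortive close discards any data that has not been sent yet, use it only in the stress testing client and only after the full response has been read.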