This topic describes the characteristics of malicious crawlers and how to use WAF to block them.
Note that professional crawlers constantly change their crawling methods to bypass the anti-crawling policies set by website administrators, so fixed rules cannot provide perfect protection. In addition, anti-crawling is closely tied to the characteristics of your own business. You must therefore review and update your protection policies regularly to achieve good results.
Distinguish malicious crawlers
Malicious crawlers may send a large number of requests to a specific URL or interface of a domain name within a specific period of time. Such traffic may be an HTTP flood attack disguised as a crawler, or a crawler disguised as a third party that targets sensitive information. When the number of requests is large enough, it usually causes a sharp rise in CPU usage, prevents the website from opening, and interrupts service.
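The pattern described above (many requests from one source to one URL within a short period) can be looked for in access logs. The following is a minimal sketch, not WAF's own detection logic: the function name `find_hot_clients`, the tuple layout of the log entries, and the default window and threshold are all assumptions made for illustration.

```python
from collections import Counter


def find_hot_clients(entries, window_seconds=60, threshold=100):
    """Return (client_ip, url) pairs that exceed `threshold` requests
    inside any single fixed time window.

    `entries` is an iterable of (epoch_seconds, client_ip, url) tuples.
    This layout is hypothetical; adapt it to your own log format.
    """
    counts = Counter()
    for ts, client_ip, url in entries:
        # Assign each request to a fixed-size time bucket.
        bucket = int(ts // window_seconds)
        counts[(client_ip, url, bucket)] += 1

    # Keep each offending (client, URL) pair once, in sorted order.
    offenders = {
        (client_ip, url)
        for (client_ip, url, _bucket), n in counts.items()
        if n > threshold
    }
    return sorted(offenders)


if __name__ == "__main__":
    # Illustrative data: one client hammers a single interface,
    # another browses normally.
    log = [(i * 0.2, "203.0.113.7", "/api/price") for i in range(150)]
    log += [(i * 10.0, "198.51.100.2", "/index") for i in range(5)]
    print(find_hot_clients(log))
```

Fixed windows are simple to compute but can miss a burst that straddles two buckets. Suitable values for the window and the threshold depend on your own traffic.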
WAF provides risk warnings for malicious crawlers and alerts you about the previous day's crawler requests. Based on your actual business situation, you can configure one or more of the following rules to block the corresponding crawler requests.
Configure an HTTP ACL policy to block specific crawlers
Configure custom HTTP flood policies to block malicious requests
Custom HTTP flood protection rules let you define blocking rules for specific URLs that take effect when the access frequency reaches a threshold that you specify.
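To illustrate the idea of a frequency-based rule for a specific URL, the following sketch implements a sliding-window counter per client. It is a simplified model under stated assumptions, not the way WAF implements its rules: the class name `UrlRateRule`, its parameters, and the prefix-matching behavior are hypothetical.

```python
from collections import defaultdict, deque


class UrlRateRule:
    """Hypothetical rule: allow at most `max_requests` requests per client
    to URLs starting with `url_prefix` within `window_seconds`."""

    def __init__(self, url_prefix, max_requests, window_seconds):
        self.url_prefix = url_prefix
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Timestamps of recently allowed requests, per client IP.
        self._hits = defaultdict(deque)

    def allow(self, client_ip, url, now):
        """Return True if the request may pass, False if it is blocked.

        `now` is the request time in seconds; it is passed in explicitly
        so that the rule is easy to test.
        """
        # URLs outside the protected prefix are never limited by this rule.
        if not url.startswith(self.url_prefix):
            return True

        hits = self._hits[client_ip]
        # Discard timestamps that have left the sliding window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()

        if len(hits) >= self.max_requests:
            return False  # Blocked requests are not recorded.

        hits.append(now)
        return True


if __name__ == "__main__":
    rule = UrlRateRule("/login", max_requests=3, window_seconds=10.0)
    for t in (0.0, 1.0, 2.0, 3.0, 10.0):
        print(t, rule.allow("203.0.113.7", "/login", t))
```

This sketch only rejects the requests that exceed the limit. It does not model blocking a client for a period of time after the threshold is reached.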