The bot management module in Web Application Firewall (WAF) provides the scenario-specific configuration feature, which lets you monitor crawlers targeting your domain, block requests from known malicious IP addresses, and configure custom anti-crawler rules without requiring in-house security expertise.
How it works
WAF maintains IP address libraries of malicious crawlers, updated in real time using Alibaba Cloud's network-wide threat intelligence across public clouds and data centers. When a request arrives, WAF checks the source IP address against these libraries and applies the action defined in your scenario-specific configuration rule, for example allowing normal crawlers and blocking malicious ones.
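The lookup described above can be sketched in a few lines of Python. This is an illustration only, not WAF's implementation: the library contents, the function name decide_action, and the action strings are all hypothetical, and the addresses come from ranges reserved for documentation.

```python
import ipaddress

# Hypothetical sample library. The real libraries are maintained by WAF
# and are not exposed in this form.
MALICIOUS_CRAWLER_LIBRARY = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.17/32"),
]


def decide_action(source_ip, library=MALICIOUS_CRAWLER_LIBRARY, on_match="block"):
    """Return the rule action for a request based on its source IP address.

    on_match is the action configured in the (hypothetical) rule, for
    example "block" or "monitor". Requests that match nothing are allowed.
    """
    ip = ipaddress.ip_address(source_ip)
    if any(ip in network for network in library):
        return on_match
    return "allow"
```

Calling decide_action("203.0.113.9") returns the configured match action, while an address outside every listed network falls through to "allow".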
Identifying normal crawlers
Legitimate crawlers typically include a keyword of the form xxspider in the User-Agent header (for example, Baiduspider) and show consistent behavioral patterns: a low request rate, requests scattered across many URLs, and activity spread over a wide time range. To verify the source of a crawler request, run a reverse nslookup or tracert command on the source IP address of the request. For example, running a reverse nslookup on the IP address of the Baidu crawler returns the hostname registered for that address, which shows where the crawler originates.
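A reverse lookup can also be scripted. The sketch below goes one step beyond a plain reverse nslookup: it resolves the returned hostname forward again and checks that it maps back to the same IP address (forward-confirmed reverse DNS), because a reverse record on its own is set by whoever controls the address. The function name verify_crawler and its parameters are hypothetical, and the hostname suffixes you pass in are an assumption that you should take from the search engine's own published guidance.

```python
import socket


def verify_crawler(source_ip, allowed_suffixes,
                   reverse_lookup=socket.gethostbyaddr,
                   forward_lookup=socket.gethostbyname_ex):
    """Check whether source_ip belongs to a crawler with an expected hostname.

    allowed_suffixes is a tuple such as (".baidu.com", ".baidu.jp");
    these values are assumptions to confirm with the crawler's operator.
    The lookup functions are injectable so the check can be tested offline.
    """
    try:
        # Reverse lookup: IP address -> hostname (same idea as reverse nslookup).
        hostname = reverse_lookup(source_ip)[0]
    except OSError:
        return False  # No reverse record, or the lookup failed.

    hostname = hostname.rstrip(".").lower()
    if not any(hostname.endswith(suffix) for suffix in allowed_suffixes):
        return False

    try:
        # Forward lookup: hostname -> list of IP addresses.
        forward_ips = forward_lookup(hostname)[2]
    except OSError:
        return False

    # Accept only if the hostname resolves back to the original address.
    return source_ip in forward_ips
```

A request whose User-Agent claims to be a well-known crawler but fails this check is a reasonable candidate for closer inspection.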

Identifying malicious crawlers
Malicious crawlers send large volumes of requests to a specific URL or port within a specific period. HTTP flood attacks are often disguised as crawler traffic or third-party requests in order to scrape sensitive data. Left unblocked, these attacks can increase CPU utilization, cause website access failures, and interrupt services.
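The pattern of many requests to one URL within a short period can be illustrated with a sliding-window counter. This is a minimal sketch under stated assumptions, not how WAF detects malicious crawlers: the class name BurstDetector, the thresholds, and the choice to key on source IP address and URL are all hypothetical.

```python
from collections import defaultdict, deque


class BurstDetector:
    """Flag a source that exceeds max_requests to one URL within a time window."""

    def __init__(self, max_requests=100, window_seconds=10.0):
        # Both thresholds are illustrative and need tuning for real traffic.
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # (source_ip, url) -> request timestamps

    def record(self, source_ip, url, timestamp):
        """Record one request and return True if the source is over the limit."""
        hits = self._hits[(source_ip, url)]
        hits.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while hits and timestamp - hits[0] >= self.window_seconds:
            hits.popleft()
        return len(hits) > self.max_requests
```

Timestamps are passed in explicitly (for example from time.time() or from access log entries), which keeps the detector usable on both live traffic and recorded logs.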
Prerequisites
Before you begin, ensure that you have:
A WAF instance running the Pro, Business, or Enterprise edition
The bot management module, a value-added service, enabled for your instance
Limitations
Each domain name supports up to 50 scenario-specific configuration rules.