The bot threat intelligence feature provides information about suspicious IP addresses used by dialers, on-premises data centers, and malicious scanners. This feature also maintains an IP address library of malicious crawlers. You can configure bot threat intelligence rules to prevent malicious crawlers from accessing all pages under your domain name or specific directories.

Prerequisites

  • A WAF instance is purchased, and the Bot Manager feature is enabled.
  • Your website is added to WAF. For more information, see Tutorials.

Background information

Bot threat intelligence rules can block requests from crawlers that are recorded in the Alibaba Cloud crawler library. The library is updated in real time based on Alibaba Cloud threat intelligence and the analysis of network traffic that flows through Alibaba Cloud. It detects the IP addresses of malicious crawlers and provides the characteristics of the sources that send malicious requests. The library covers the characteristics of more than 700 types of malicious crawlers.
Note The Alibaba Cloud crawler library covers public clouds and on-premises data centers.

When you configure bot threat intelligence rules, you can specify actions based on the types of threat intelligence libraries. For example, you can block requests or require JavaScript verification or CAPTCHA verification. You can also configure bot threat intelligence rules to protect important endpoints against certain threats. This helps minimize the impact on your services.
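
The idea of assigning an action to each threat intelligence library can be sketched in Python as follows. This is a conceptual illustration only, not how WAF is implemented. The library and action names come from this topic, but the CIDR blocks (reserved documentation ranges), the LIBRARIES and ACTIONS mappings, the action_for function, and the "Allow" fallback value are all hypothetical.

```python
import ipaddress

# Hypothetical library contents, using CIDR blocks reserved for documentation.
LIBRARIES = {
    "Malicious Scanner IP Blacklist": ["192.0.2.0/24"],
    "Credential Stuffing IP Blacklist": ["198.51.100.0/24"],
    "IDC IP Lists": ["203.0.113.0/24"],
}

# Hypothetical action chosen for each library.
ACTIONS = {
    "Malicious Scanner IP Blacklist": "Block",
    "Credential Stuffing IP Blacklist": "Captcha",
    "IDC IP Lists": "JavaScript Validation",
}


def action_for(client_ip: str) -> str:
    """Return the action of the first library that lists client_ip, else "Allow"."""
    ip = ipaddress.ip_address(client_ip)
    for library, cidrs in LIBRARIES.items():
        if any(ip in ipaddress.ip_network(cidr) for cidr in cidrs):
            return ACTIONS[library]
    return "Allow"  # hypothetical label for "no library matched"


print(action_for("198.51.100.7"))  # Captcha
print(action_for("10.0.0.1"))      # Allow
```

In this sketch, a stricter action is attached to the libraries that are less likely to list regular users, and a verification action is used where false positives are more likely.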

Procedure

  1. Log on to the WAF console.
  2. In the top navigation bar, select the resource group and the region to which the WAF instance belongs. The region can be Chinese Mainland or Outside Chinese Mainland.
  3. In the left-side navigation pane, choose Protection Settings > Website Protection.
  4. In the upper part of the Website Protection page, select the domain name for which you want to configure bot threat intelligence rules.
  5. Click the Bot Management tab and find the Bot Threat Intelligence section. Then, turn on Status and click Settings.
    Note After the bot threat intelligence feature is enabled, all requests destined for your website are checked by the feature. To let requests that match specific conditions bypass the check, configure a Bot Management whitelist. For more information, see Configure a whitelist for Bot Management.
  6. In the Bot Threat Intelligence rule list, find the threat intelligence library you want to use, and turn on the switch in the Status column.
    The following table lists the bot threat intelligence libraries that WAF supports.
    Intelligence library: Description
    Malicious Scanner Fingerprint Blacklist: This library contains the characteristics of tens of thousands of scanners based on traffic analysis.
    Malicious Scanner IP Blacklist: This library contains malicious IP addresses that are dynamically updated based on the source IP addresses of scan attacks detected on Alibaba Cloud.
    Credential Stuffing IP Blacklist: This library contains hundreds of thousands of malicious IP addresses that are updated based on the source IP addresses of credential stuffing and brute-force attacks detected on Alibaba Cloud.
    Fake Crawler Blacklist: This library identifies crawlers that use the User-Agent of authorized search engines, such as BaiduSpider, to disguise themselves as authorized programs.
      Important Before you enable this library, make sure that a crawler allowlist is configured. Otherwise, false positives may occur. For more information, see Configure the allowed crawlers function.
    Malicious Crawler Blacklist: This library contains millions of malicious IP addresses that are dynamically updated based on the source IP addresses of crawlers detected on Alibaba Cloud. This library is categorized into three severity levels: low, medium, and high. A higher severity indicates more IP addresses in the library and a higher false positive rate.
      Note We recommend that you use a secondary verification action, such as Captcha or JavaScript Validation, for the high-severity library. In scenarios in which secondary verification cannot be implemented, we recommend that you configure threat intelligence rules based on the low-severity library.
    IDC IP Lists: These libraries contain IP addresses of public clouds and on-premises data centers, including Alibaba Cloud, Tencent Cloud, Meituan Open Services, and 21Vianet. Attackers typically use CIDR blocks of public clouds or on-premises data centers to deploy crawlers or as proxies to access websites. Regular users rarely access websites in this way.

    After you enable a default rule, WAF applies the Monitor action to requests that are sent from IP addresses in the corresponding threat intelligence library to the directories of the protected domain name. The Monitor action allows the requests to reach the destination directories and records the requests in logs.

    To modify a default rule, for example to change the protected URL or the action, configure a custom threat intelligence rule as described in Step 7.

  7. Optional: Configure a custom threat intelligence rule.
    1. Find the rule that you want to modify and click Edit in the Actions column.
    2. In the Edit Intelligence dialog box, configure the following parameters.
      Parameter: Description
      Protected Path: The URL that you want to protect, such as /abc or /login/abc. A forward slash (/) indicates all directories. You must also select a value for Matching. Valid values:
      • Precise Match: The destination URL must be an exact match of the protected URL.
      • Prefix Match: The prefix of the destination URL matches the protected URL.
      • Regular Expression Match: The destination URL matches the specified regular expression of the protected URL.

      You can click Add Protected URL to add more URLs. You can add up to 10 URLs.

      Action: The action that you want to perform after the match conditions of the rule are met. Valid values:
      • Monitor: allows requests to the destination directory and records the requests in logs.
      • Block: blocks requests to the destination directory.
      • JavaScript Validation: requires a client to perform JavaScript verification. Requests are forwarded to the destination directory only after the client passes the verification.
      • Captcha: requires a client to perform slider CAPTCHA verification. Requests are forwarded to the destination directory only after the client passes the verification.
        Note Slider CAPTCHA supports only synchronous requests. To verify asynchronous requests, such as Ajax requests, contact the Alibaba Cloud security team. If you cannot determine whether the protected URL supports slider CAPTCHA, we recommend that you create an IP address or URL-based custom protection policy (ACL) to run a test. For more information, see Create a custom protection policy.
      • Strict Captcha: requires a client to perform strict slider CAPTCHA verification. The request is forwarded to the destination directory only after the client passes the verification. Strict slider CAPTCHA verification has a stricter standard to verify visitor identities.
    3. Click OK.
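
The way the Protected Path matching modes and the Action parameter combine can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not WAF's actual logic: the ProtectedPath class, the path_matches and decide functions, the mode identifiers "precise", "prefix", and "regex", and the "Allow" fallback are hypothetical. This topic does not state whether a regular expression must match the whole URL, so the sketch assumes that a match anywhere in the path is enough.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class ProtectedPath:
    path: str      # a URL path such as "/login", or a regular expression
    matching: str  # hypothetical identifiers: "precise", "prefix", or "regex"


def path_matches(rule: ProtectedPath, request_path: str) -> bool:
    """Check one protected path against the path of a request."""
    if rule.matching == "precise":
        return request_path == rule.path
    if rule.matching == "prefix":
        return request_path.startswith(rule.path)
    if rule.matching == "regex":
        # Assumption: the pattern may match anywhere in the path.
        return re.search(rule.path, request_path) is not None
    raise ValueError(f"unknown matching mode: {rule.matching}")


def decide(paths: List[ProtectedPath], action: str,
           request_path: str, ip_in_library: bool) -> str:
    """Return the rule's action if the IP is listed and any path matches."""
    if ip_in_library and any(path_matches(p, request_path) for p in paths):
        return action
    return "Allow"  # hypothetical label for "rule does not apply"


paths = [
    ProtectedPath("/login", "prefix"),
    ProtectedPath(r"^/api/v\d+/orders$", "regex"),
]
print(decide(paths, "Monitor", "/login/abc", True))    # Monitor
print(decide(paths, "Monitor", "/index.html", True))   # Allow
print(decide(paths, "Block", "/api/v2/orders", True))  # Block
```

A rule whose only protected path is a forward slash (/) with prefix matching would apply to every directory in this model, which mirrors the "all directories" case described for Protected Path.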