The bot management module of Web Application Firewall (WAF) provides the scenario-specific configuration feature. You can use this feature to configure custom anti-crawler rules based on your business requirements and protect your business from malicious crawlers. This topic describes how to configure anti-crawler rules for apps.

Background information

The scenario-specific configuration feature allows you to configure anti-crawler rules based on your business requirements. You can use this feature together with intelligent algorithms to identify crawler traffic in a more precise manner. The feature can also automatically handle the crawler traffic that matches the configured anti-crawler rules. After you configure anti-crawler rules, you can verify the rules in a test environment. This prevents adverse effects on your websites or apps caused by inappropriate rule configurations or compatibility issues. The adverse effects include false positives and undesired protection results.

Prerequisites

  • Subscription WAF instance: Your WAF instance runs the Pro, Business, or Enterprise edition, and the Bot Management module is enabled.
  • Your website is added to WAF. For more information, see Tutorials.
  • Anti-Bot SDK is integrated into the apps that you want to protect. For more information, see Integrate Anti-Bot SDK into apps.

Configure anti-crawler rules for apps

  1. Log on to the WAF console. In the top navigation bar, select the resource group and the region to which your WAF instance belongs. The region can be Chinese Mainland or Outside Chinese Mainland.
  2. In the left-side navigation pane, choose Protection Settings > Website Protection.
  3. In the upper part of the Website Protection page, select the domain name for which you want to configure anti-crawler rules.
  4. If you have not created an anti-crawler rule, click the Bot Management tab. In the Scenario-specific Configuration section, click Start to create an anti-crawler rule. If you have created an anti-crawler rule, click Add in the upper-right corner of the Bot Management tab to create an anti-crawler rule.
    Note You can create up to 50 anti-crawler rules for a domain name.
  5. In the Configure Scenarios step, configure the basic information about the scenario in which you want to protect apps and click Next.
    Parameter | Description
    Scenario | Specify the type of scenario in which you want to protect the apps. Examples: logon, registration, and order placement.
    Service Type | Select Apps to protect native iOS and Android apps.
      Note HTML5 apps are not native iOS or Android apps. If you want to protect HTML5 apps, set the Service Type parameter to Websites.
    Traffic Characteristics | Add match conditions to identify traffic destined for the apps that you want to protect. To add a match condition, you must specify the matching field, logical operator, and matching content. The matching field is a header field of HTTP requests. For more information about the fields in match conditions, see Fields in match conditions. You can specify up to five match conditions.
      Important If you enter an IP address, you must press Enter.
  6. In the Configure Protection Rules step, configure the anti-crawler rule and click Next.
    Parameter | Description
    Check Invalid App Signature | Detects and controls requests that have invalid signatures or no signatures. You cannot disable this feature. Use the Action parameter to specify how these requests are handled. If you set Action to Monitor, WAF allows these requests and records them in security reports and logs. If you set Action to Block, WAF blocks these requests.
    Check Abnormal Device Behavior | After you enable this feature, WAF detects and controls requests from devices that have abnormal characteristics.
      The following characteristics of a device are considered abnormal:
      • Use Simulators: A simulator is used.
      • Use Proxies: A proxy is used.
      • Use Rooted Device: A rooted device is used.
      • Debugging Mode: The debugging mode is enabled.
      • Hooking: Hooking techniques are used.
      • Multiboxing: Multiple protected app processes run on the device at the same time.
      You can set the Action parameter to Monitor or Block based on your business requirements.
    Action | You can set this parameter to Monitor or Block. This setting takes effect for Check Invalid App Signature and Check Abnormal Device Behavior.
    IP Address Throttling | After you enable this feature, you can configure throttling conditions to filter out abnormal requests. This helps mitigate HTTP flood attacks.
      You can specify throttling conditions for IP addresses. If the number of requests from an IP address within the specified time period exceeds the threshold, WAF performs the monitor or block action on subsequent requests. You can also specify the period during which the monitor or block action is performed. You can specify up to three conditions. For more information, see Create a custom protection policy.
    Device Throttling | After you enable this feature, you can configure throttling conditions to filter out abnormal requests. This helps mitigate HTTP flood attacks.
      You can specify throttling conditions for devices. If the number of requests from the same device within the specified time period exceeds the threshold, WAF performs the monitor or block action on subsequent requests. You can also specify the period during which the monitor or block action is performed. You can specify up to three conditions.
    Custom Session-based Throttling | After you enable this feature, you can configure custom throttling conditions to filter out abnormal requests. This helps mitigate HTTP flood attacks.
      You can specify throttling conditions for sessions. If the number of requests from the same session within the specified time period exceeds the threshold, WAF performs the monitor or block action on subsequent requests. You can also specify the period during which the monitor or block action is performed. You can specify up to three conditions. For more information, see Create a custom protection policy.

  7. Optional: In the Verify Actions step, check whether the anti-crawler rule is in effect.
    To skip this step, click Skip in the lower-left corner. We recommend that you complete this step before you publish the rule.
    Description:
    • Step 1: Enter a public IP address. Enter the public IP address of your test device, such as a mobile phone. During the test, the anti-crawler rule takes effect only for this public IP address. Therefore, the test does not affect your business.
      Important To obtain the public IP address of your test device, click Alibaba Network Diagnose Tool. On the page that appears, search for Local IP. The value of Local IP is the public IP address of your test device. You can also use a browser to search for the IP address of your test device.
    • Step 2: Verify the SDK signature. Click Start Test to verify that the SDK signature of the app is valid.
      Note Make sure that Anti-Bot SDK is integrated into the app on the test device. If Anti-Bot SDK is not integrated, the signature verification fails, normal requests are blocked, and the test cannot be completed.
    • Step 3: Select an action. Check whether the Block action is in effect. After you click Start Test, WAF immediately delivers the anti-crawler rule to the test device. In the dialog box that appears, WAF provides the test procedure, expected result, and demonstration. Read them carefully before you run the test.

      After the test is complete, you can click I Have Completed Test to go to the next step. If the test result shows exceptions, you can click Go Back to optimize the anti-crawler rule. Then, perform the test again.

      For more information about the exceptions that may occur during a test and about the solutions to handle these exceptions, see FAQ.

  8. In the Preview and Publish Protection Rules step, confirm the content of the anti-crawler rule and click Publish.
    After the anti-crawler rule is published, the rule immediately takes effect.
    Note If this is the first time that you create an anti-crawler rule, you cannot view the rule ID until the rule is published. The rule ID is displayed on the Bot Management tab of the Security Report page. You can use the ID of an anti-crawler rule to check for requests that match the rule in Log Service for WAF.
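
The Traffic Characteristics parameter in step 5 can be easier to reason about as data. The following Python sketch models a match condition as a matching field, a logical operator, and matching content, and checks a request's headers against a list of conditions. It is an illustration only, not how WAF evaluates rules: the operator names, the MatchCondition class, and the matches function are hypothetical, and the sketch assumes that a request must satisfy all conditions.

```python
from dataclasses import dataclass

# Hypothetical operator names; the operators offered in the console may differ.
OPERATORS = {
    "equals": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: expected in actual,
    "starts_with": lambda actual, expected: actual.startswith(expected),
}

MAX_CONDITIONS = 5  # up to five match conditions can be specified


@dataclass(frozen=True)
class MatchCondition:
    field_name: str  # matching field: the name of an HTTP request header
    operator: str    # logical operator: one of the keys in OPERATORS
    content: str     # matching content: the value to compare against


def matches(headers, conditions):
    """Return True if the headers satisfy every condition (AND is an assumption)."""
    if len(conditions) > MAX_CONDITIONS:
        raise ValueError("at most five match conditions are allowed")
    # HTTP header names are case-insensitive, so normalize them before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    for cond in conditions:
        actual = normalized.get(cond.field_name.lower())
        if actual is None or not OPERATORS[cond.operator](actual, cond.content):
            return False
    return True
```

For example, a condition list of MatchCondition("User-Agent", "contains", "ExampleApp") and MatchCondition("Host", "equals", "api.example.com") would match only requests whose headers meet both conditions. The header values here are placeholders.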

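The throttling parameters in step 6 share one pattern: count requests per IP address, device, or session within a time period, and when the count exceeds the threshold, apply the Monitor or Block action to subsequent requests for a specified period. The following Python sketch models that pattern with a sliding window. It is a simplified illustration under stated assumptions, not WAF's implementation: the Throttle class, its parameter names, and the choice of a sliding window are all hypothetical.

```python
from collections import defaultdict, deque


class Throttle:
    """Illustrative model of one throttling condition (not WAF's implementation)."""

    def __init__(self, threshold, period_s, action, action_duration_s):
        self.threshold = threshold                  # requests allowed per period
        self.period_s = period_s                    # length of the time period, in seconds
        self.action = action                        # "monitor" or "block"
        self.action_duration_s = action_duration_s  # how long the action stays in effect
        self._hits = defaultdict(deque)             # key -> timestamps of recent requests
        self._action_until = {}                     # key -> time at which the action ends

    def check(self, key, now):
        """Return the action for a request from `key` at time `now`, or "allow"."""
        if now < self._action_until.get(key, 0.0):
            return self.action
        hits = self._hits[key]
        hits.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while hits and hits[0] <= now - self.period_s:
            hits.popleft()
        if len(hits) > self.threshold:
            # Threshold exceeded: the action applies to subsequent requests.
            self._action_until[key] = now + self.action_duration_s
            hits.clear()
        return "allow"


# The key can be an IP address, a device ID, or a session ID.
ip_rule = Throttle(threshold=3, period_s=10, action="block", action_duration_s=60)
results = [ip_rule.check("198.51.100.7", t) for t in (0, 1, 2, 3, 4, 62, 63)]
```

In this model, the request that pushes the count over the threshold is still allowed, and only later requests from the same key receive the action until the action period ends. Whether WAF treats the triggering request the same way is not stated in this topic.
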
FAQ

If an exception occurs when you verify the anti-crawler rule in step 7, refer to the following table to resolve the issue.

Error | Cause | Solution
No valid test requests are detected. See WAF documentation or contact us to analyze the possible causes. | The test request failed to send or is not sent to WAF. | Make sure that the test request is sent to the IP address that maps the CNAME provided by WAF.
No valid test requests are detected. See WAF documentation or contact us to analyze the possible causes. | The header fields in the test request do not match the header fields that you configured for Traffic Characteristics in the anti-crawler rule. | Modify the settings of Traffic Characteristics in the anti-crawler rule.
No valid test requests are detected. See WAF documentation or contact us to analyze the possible causes. | The originating IP address of the test request is different from the public IP address that you specified in the anti-crawler rule. | Use the correct public IP address. We recommend that you click Alibaba Network Diagnose Tool to obtain your public IP address.
The test requests failed the verification. See WAF documentation or contact us to analyze the possible causes. | No real user access is simulated. For example, the debugging mode or automation tools are used. | Simulate real user access during the test.
The test requests failed the verification. See WAF documentation or contact us to analyze the possible causes. | An incorrect service type is selected. For example, Websites is selected when you configure an anti-crawler rule for apps. | Change the value of the Service Type parameter.
The test requests failed the verification. See WAF documentation or contact us to analyze the possible causes. | An intermediate domain name is used, but an incorrect intermediate domain name is selected in the anti-crawler rule. | Select Use Intermediate Domain Name. Then, select the correct intermediate domain name from the drop-down list.
The test requests failed the verification. See WAF documentation or contact us to analyze the possible causes. | Compatibility issues occur in the frontend. | Contact customer service in the DingTalk group or submit a ticket.
No verification is triggered. See WAF documentation or contact us to analyze the possible causes. | No test rules are generated. | Perform the test several times until a test rule is generated.
No valid test requests are detected or blocked. See WAF documentation or contact us to analyze the possible causes. | The test request failed to send or is not sent to WAF. | Make sure that the test request is sent to the IP address that maps the CNAME provided by WAF.
No valid test requests are detected or blocked. See WAF documentation or contact us to analyze the possible causes. | The header fields in the test request do not match the header fields that you configured for Traffic Characteristics in the anti-crawler rule. | Modify the settings of Traffic Characteristics in the anti-crawler rule.
No valid test requests are detected or blocked. See WAF documentation or contact us to analyze the possible causes. | The originating IP address of the test request is different from the public IP address that you specified in the anti-crawler rule. | Use the correct public IP address. We recommend that you click Alibaba Network Diagnose Tool to obtain your public IP address.
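
One common reason for an IP address mismatch is entering the address that the test device shows in its own network settings, which is often a private address behind a router or carrier NAT rather than the public address that WAF sees. The following Python sketch uses the standard ipaddress module to check whether a value is a globally routable address before you enter it. The is_public_ip helper is hypothetical and is only a local sanity check. It cannot confirm that the address is the one your test requests actually originate from.

```python
import ipaddress


def is_public_ip(text):
    """Return True if `text` parses as a globally routable IP address."""
    try:
        return ipaddress.ip_address(text.strip()).is_global
    except ValueError:
        # The value is not a valid IPv4 or IPv6 address.
        return False
```

A value such as 192.168.1.10 or 10.0.0.5 fails this check, which suggests that the address was read from the device's local network settings instead of a public IP lookup.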