The bot management module of Web Application Firewall (WAF) allows you to configure anti-crawler rules for websites and apps. You can configure anti-crawler rules for your native iOS or Android apps to protect your services against crawlers. HTML5 apps are not native iOS or Android apps. This topic describes how to configure anti-crawler rules for apps.

Prerequisites

Create an anti-crawler rule template for apps

  1. Log on to the WAF 3.0 console. In the top navigation bar, select the resource group and the region to which the WAF instance belongs. You can select the Chinese Mainland or Outside Chinese Mainland region.
  2. In the left-side navigation pane, choose Protection Configuration > Protection Rules.
  3. Create a template.
    • If no bot management rule template exists, you can click Configure Now in the Bot Management card in the upper part of the Protection Rules page. You can also click Create Template in the Bot Management section in the lower part of the Protection Rules page.
    • If a bot management rule template exists, you can only click Create Template in the Bot Management section in the lower part of the Protection Rules page.
  4. In the Configure Scenarios step, configure the basic information about the app that you want to protect and click Next.
    Parameter: Description
    Template Name: Enter a name for the template. The name can contain letters, digits, and underscores (_).
    Template Description: Enter a description for the template.
    Service Type: Select App to protect native iOS and Android apps.
    App SDK Integration: WAF provides the Anti-Bot SDK to enhance protection for native Android and iOS apps. After the Anti-Bot SDK is integrated, it collects the risk characteristics of clients and adds security signatures to requests. WAF uses these signatures to identify and block unsafe requests. To obtain the SDK package, click Obtain and Copy AppKey and then submit a ticket or join the DingTalk group. For more information, see Integrate the Anti-Bot SDK into Android apps and Integrate the Anti-Bot SDK into iOS apps.
    Traffic Characteristics: Add match conditions to identify traffic destined for the domain name of the apps that you want to protect. Each condition consists of a match field, a logical operator, and match content. The match field is a header field of HTTP requests. For more information about the match fields, see Fields in match conditions. You can add up to five match conditions.
    Notice After you enter an IP address, you must press the Enter key.
  5. In the Configure Protection Rules step, configure anti-crawler rules and click Next.
    Parameter: Description
    Bot Characteristic Detection: By default, Invalid App Signature is selected and cannot be cleared. After the Anti-Bot SDK is integrated, this option blocks requests that carry an invalid signature or no signature.
    Abnormal Device Behavior
    If this feature is enabled, WAF detects and controls requests from devices that behave abnormally. The following behaviors are considered abnormal:
    • Expired Signature: The signature has expired. This behavior is selected by default.
    • Using Simulator: A simulator is used.
    • Using Proxy: A proxy is used.
    • Rooted Device: A rooted device is used.
    • Debugging Mode: The debugging mode is used.
    • Hooking: Hooking techniques are used.
    • Multiboxing: Multiple protected app processes run on the device at the same time.
    Custom Signature Field: Select Header, Parameter, or Cookie from the Field Name drop-down list and enter your custom signature in the Value field.

    If the custom signature is empty, contains special characters, or exceeds the length limit, hash it or process it in another way and enter the result in the Value field.

    Action
    Select Monitor or Block based on your business requirements.
    • Monitor: triggers alerts and does not block requests.
    • Block: blocks requests.
    Secondary Packaging Detection: Click Advanced Protection and select Secondary Packaging Detection.
    Requests sent from apps whose package names or signatures are not in the allowlists are considered secondary packaging requests. You can specify valid application packages.
    • Valid Package Name: Enter the valid application package name. Example: example.aliyundoc.com.
    • Signature: Contact Alibaba Cloud technical support to obtain the signature. This parameter is optional if the package signature does not need to be verified. In this case, WAF verifies only the package name.
      Notice The Signature is not the signature of the application certificate.

    You can add up to five valid iOS or Android application packages. Package names must be unique.

    Select Monitor or Block based on your business requirements.

    Throttling: Configure custom throttling conditions to filter out frequent crawler requests and help prevent HTTP flood attacks.
    • IP Address Throttling (Default):

      You can configure throttling conditions for IP addresses. If the number of requests from the same IP address within the period specified by Statistical Interval (Seconds) exceeds the value of Threshold (Times), WAF performs the specified action on subsequent requests. Select Block or Monitor from the Action drop-down list to specify the action. Throttling Interval (Seconds) specifies the period during which the action is performed. You can configure up to three throttling conditions. For more information, see Configure the custom rule module.

    • Device Throttling

      You can configure throttling conditions for devices. If the number of requests from the same device within the period specified by Statistical Interval (Seconds) exceeds the value of Threshold (Times), WAF performs the specified action on subsequent requests. Select Block or Monitor from the Action drop-down list to specify the action. Throttling Interval (Seconds) specifies the period during which the action is performed. You can configure up to three throttling conditions. For more information, see Configure the custom rule module.

    • Custom Session Throttling

      You can configure throttling conditions for sessions. Use Session Type to specify the session type. If the number of requests from the same session within the period specified by Statistical Interval (Seconds) exceeds the value of Threshold (Times), WAF performs the specified action on subsequent requests. Select Block or Monitor from the Action drop-down list to specify the action. Throttling Interval (Seconds) specifies the period during which the action is performed. For more information, see Configure the custom rule module.

  6. In the Configure Effective Scope step, select the object or object group that you want to protect and add it to the Selected Objects section on the right. Then, click Next.
  7. Optional: In the Verify Protection Effect step, test the effectiveness of the anti-crawler rule.
    Before you publish the anti-crawler rule, we recommend that you verify the protection effect to prevent false positives caused by improper rule configurations or compatibility issues. If you are certain that the rule configurations are correct, click Skip to skip this step.
    Test steps:
    1. Step 1: Enter a public IP address. Enter the public IP address of your test device, such as a computer or mobile phone. The test rule applies only to this public IP address and does not affect your business.
      Notice Do not enter the IP address that you obtained by running the ipconfig command. This command returns an internal IP address. If you are not sure about the public IP address of your test device, you can use a tool or website to query the public IP address.
    2. Step 2: Select an action. Test the protection action that you specified in the Configure Protection Rules step. WAF generates a test rule only for the specified IP address. The action can be JavaScript Validation, Dynamic Token-based Authentication, Slider CAPTCHA Verification, or Block Verification.

      After you click Test for an action, WAF immediately delivers the test rule to the test device. In the dialog box that appears, WAF provides the test procedure, expected result, and demonstration. We recommend that you carefully read them.

      After the test is complete, you can click I Have Completed the Test to go to the next step. If the test result shows exceptions, you can click Go Back to optimize the anti-crawler rule. Then, perform the test again.
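
The throttling conditions in step 5 (Statistical Interval, Threshold, Throttling Interval, and Action) can be illustrated with a minimal Python sketch. It only models the behavior described in this topic and is not WAF's implementation: the ThrottleRule class, its method names, and the sliding-window counting are assumptions made for illustration. The key can be an IP address, a device ID, or a session ID, which corresponds to the three throttling types.

```python
import time
from collections import defaultdict, deque


class ThrottleRule:
    """Hypothetical model of one throttling condition (not WAF's code)."""

    def __init__(self, statistical_interval, threshold,
                 throttling_interval, action="Block"):
        self.statistical_interval = statistical_interval  # seconds
        self.threshold = threshold                        # number of requests
        self.throttling_interval = throttling_interval    # seconds
        self.action = action                              # "Block" or "Monitor"
        self._hits = defaultdict(deque)  # key -> timestamps of recent requests
        self._throttled_until = {}       # key -> time at which throttling ends

    def check(self, key, now=None):
        """Return the action for this request, or None to let it pass."""
        now = time.monotonic() if now is None else now

        # Inside an active throttling interval: apply the action.
        until = self._throttled_until.get(key)
        if until is not None:
            if now < until:
                return self.action
            del self._throttled_until[key]

        # Drop timestamps that fell out of the statistical interval.
        hits = self._hits[key]
        while hits and now - hits[0] >= self.statistical_interval:
            hits.popleft()
        hits.append(now)

        # Count exceeds the threshold: act on subsequent requests.
        if len(hits) > self.threshold:
            self._throttled_until[key] = now + self.throttling_interval
            hits.clear()
        return None


# Example: more than 3 requests in 10 seconds throttles the key for 60 seconds.
rule = ThrottleRule(statistical_interval=10, threshold=3, throttling_interval=60)
for t in (0, 1, 2, 3, 4):
    print(t, rule.check("client-a", now=t))
```

In this sketch, the request that pushes the count over the threshold is still allowed, and the action applies to the requests that follow it, which mirrors the wording "performs the specified action on subsequent requests". Whether WAF counts with a sliding or a fixed window is not stated in this topic.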

FAQ

If an exception occurs during the Verify Protection Effect step, refer to the following table to resolve the issue.

Error: No valid test requests are detected. See WAF documentation or contact us to analyze the possible causes.
  • Cause: The test request failed to be sent or was not sent to WAF.
    Solution: Make sure that the test request is sent to the IP address to which the CNAME provided by WAF resolves.
  • Cause: The header fields in the test request do not match the header fields that you configured for Traffic Characteristics in the anti-crawler rule.
    Solution: Modify the settings of Traffic Characteristics in the anti-crawler rule.
  • Cause: The originating IP address of the test request is different from the public IP address that you specified in the anti-crawler rule.
    Solution: Use the correct public IP address. We recommend that you click Alibaba Network Diagnose Tool to obtain your public IP address.

Error: The test requests failed the verification. See WAF documentation or contact us to analyze the possible causes.
  • Cause: No real user access is simulated. For example, the debugging mode or automation tools are used.
    Solution: Simulate real user access during the test.
  • Cause: An incorrect service type is selected. For example, Websites is selected when you configure an anti-crawler rule for apps.
    Solution: Change the value of the Service Type parameter.
  • Cause: An intermediate domain name is used, but an incorrect intermediate domain name is selected in the anti-crawler rule.
    Solution: Select Use Intermediate Domain Name. Then, select the correct intermediate domain name from the drop-down list.
  • Cause: Compatibility issues occur in the frontend.
    Solution: Contact customer service in the DingTalk group or submit a ticket.

Error: No verification is triggered. See WAF documentation or contact us to analyze the possible causes.
  • Cause: No test rules are generated.
    Solution: Perform the test several times until a test rule is generated.

Error: No valid test requests are detected or blocked. See WAF documentation or contact us to analyze the possible causes.
  • Cause: The test request failed to be sent or was not sent to WAF.
    Solution: Make sure that the test request is sent to the IP address to which the CNAME provided by WAF resolves.
  • Cause: The header fields in the test request do not match the header fields that you configured for Traffic Characteristics in the anti-crawler rule.
    Solution: Modify the settings of Traffic Characteristics in the anti-crawler rule.
  • Cause: The originating IP address of the test request is different from the public IP address that you specified in the anti-crawler rule.
    Solution: Use the correct public IP address. We recommend that you click Alibaba Network Diagnose Tool to obtain your public IP address.
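
Several of the errors above trace back to entering an internal address, such as one reported by ipconfig, instead of a public one. As a quick local sanity check, the following minimal sketch uses Python's standard ipaddress module to tell whether a value is a globally routable address. The function name is_public_ip is hypothetical, and the check only classifies the address you pass in. It does not discover the public IP address of your test device and is not part of WAF.

```python
import ipaddress


def is_public_ip(value):
    """Return True if value parses as a globally routable IP address."""
    try:
        addr = ipaddress.ip_address(value.strip())
    except ValueError:
        # Not a valid IPv4 or IPv6 address at all.
        return False
    # is_global is False for private, loopback, link-local,
    # and other reserved ranges.
    return addr.is_global


for candidate in ("192.168.1.10", "10.0.0.5", "8.8.8.8"):
    print(candidate, is_public_ip(candidate))
```

If the function returns False for the address that you plan to enter in Step 1, the address is most likely an internal one and the test rule will not match your test requests.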