This topic describes how to enable and use the log query feature. The log query and analysis features of Log Service are integrated into Web Application Firewall (WAF). After you enable WAF log collection for a domain name, you can query and analyze the collected log entries in real time, view or edit dashboards, and configure monitoring and alerting rules on the Log Service page in the WAF console.

Procedure

  1. Log on to the Web Application Firewall console.
  2. In the top navigation bar, select the resource group to which the instance belongs and the region in which the instance is deployed: Mainland China or International.
  3. In the left-side navigation pane, choose Log Management > Log Service.
  4. On the Log Service page, select the required domain name. Make sure that Status next to the domain name is turned on.
  5. Click the Log Query tab.
    The log query and analysis page of the Log Service console is integrated into this tab. The WAF console automatically generates a query statement. For example, the matched_host: "<yourDomainName>" statement queries all log entries for the domain name that you select.
  6. Enter a query and analysis statement, select a time range, and then click Search & Analyze.

More operations on the Log Query tab

On the Log Query tab, you can perform the following operations on log entries:
  • Use custom statements to query and analyze log entries

    Log Service supports a wide range of query and analysis statements that can be used in complex scenarios. For more information, see Use custom statements to query and analyze log entries.

  • View the distribution of log entries within a specific time range
    The column chart below the search box shows the time distribution of the log entries that are returned by the query statement. The x-axis indicates the time, and the y-axis indicates the number of log entries. The total number of returned log entries is displayed below the chart.
    Note You can drag across columns in the chart to narrow down the time range. The time picker and the query results are automatically updated to the new time range.
  • View raw logs

    On the Raw Logs tab, the details of each log entry are displayed in a separate section. The details include the time when the log entry was generated and each field in the log content. You can click Display Content Column to set the display mode to Full Line or New Line for long strings in the Content column. You can also click Column Settings to select the columns that you want to view and click the download icon to download the query results to your computer.

  • View analysis charts

    Log analysis results are displayed in charts. You can select various types of charts on the Graph tab. For more information, see Overview.

  • Perform quick analysis

    On the Raw Logs tab, you can view the distribution of a log field within a specific time range with a few clicks. This helps reduce the time required to query key data. For more information, see Quick analysis.


Use custom statements to query and analyze log entries

A query and analysis statement consists of a search clause and an analytics clause that are separated by a vertical bar (|).

$Search | $Analytics
  • Search: Specifies a keyword, fuzzy string, numeric value, range, or a combination of these items as the query condition. If the clause is empty or contains only an asterisk (*), all log entries are queried.
  • Analytics: Calculates and analyzes the query results or all log entries.
Note Both the search and analytics clauses are optional.
  • If the search clause is empty, all log entries within a specific time range are queried and analyzed.
  • If the analytics clause is empty, the query results are returned but are not analyzed.
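The two-clause structure can be illustrated with a short Python sketch. This is not part of Log Service; it makes the simplifying assumption that the first vertical bar separates the clauses and never appears inside a quoted string:

```python
# Split a Log Service query statement into its search and analytics
# clauses. Simplified sketch: assumes the first "|" is the separator
# and that "|" does not appear inside quoted strings.
def split_statement(statement: str):
    search, _, analytics = statement.partition("|")
    return search.strip(), analytics.strip()

print(split_statement("matched_host: www.aliyun.com and cc_blocks: 1 | select count(1) as PV"))
# ('matched_host: www.aliyun.com and cc_blocks: 1', 'select count(1) as PV')
print(split_statement("status: 404"))   # analytics clause empty: query only
# ('status: 404', '')
```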

Search syntax

Log Service supports full-text query and field-based query. Statements in the search box can be displayed in multiple lines and highlighted.
  • Full-text query

    When you enter keywords to query log entries, you do not need to specify fields. To query the log entries that contain an exact phrase, enclose the phrase in a pair of double quotation marks ("). If you enter multiple keywords, separate them with spaces or the and operator.

    Examples
    • Query log entries based on multiple keywords

      You can query the log entries that contain both www.aliyun.com and error.

      www.aliyun.com error or www.aliyun.com and error

    • Query log entries based on a condition

      You can query the log entries that contain www.aliyun.com and contain error or 404.

      www.aliyun.com and (error or 404)
    • Query log entries based on a prefix

      You can query the log entries that contain www.aliyun.com and start with failed_.

      www.aliyun.com and failed_*
      Note The asterisk (*) can be added only as a suffix. It cannot be added as a prefix. For example, *_error is not supported.
  • Field-based query

    You can query log entries based on fields.

    You can specify a field and a value in the format of Field name: Value or, for numeric fields, Field name >= Value. You can also use the and and or operators to combine field conditions. In addition, you can combine field-based queries with full-text queries.
    Note In Log Service for WAF, you can use field-based query to obtain the log entries that record access requests, operations, and attacks on a specific domain name. For more information about the definition, type, and format of each field, see Log fields.
    Examples
    • Query log entries based on multiple fields

      You can query the log entries that record HTTP flood attacks blocked by WAF on www.aliyun.com.

      matched_host: www.aliyun.com and cc_blocks: 1

      You can query the log entries that record access requests from the client whose IP address is 1.2.3.4 to www.aliyun.com. The requests are responded with HTTP status code 404.

      real_client_ip: 1.2.3.4 and matched_host: www.aliyun.com and status: 404
      Note In this example, the matched_host, cc_blocks, real_client_ip, and status fields are defined in WAF logs.
    • Query log entries based on numeric fields

      You can query the log entries that record slow requests whose response time exceeds 5 seconds.

      request_time_msec > 5000

      You can also query log entries based on a time range. For example, you can specify the following time range to query the log entries that record the requests whose response time exceeds 5 seconds but does not exceed 10 seconds:

      request_time_msec in (5000 10000]
      Note You can use the following condition to obtain the same query results: request_time_msec > 5000 and request_time_msec <= 10000.
    • Query log entries based on field existence

      You can query the log entries that contain or do not contain a specific field.

      • You can query all the log entries that contain the ua_browser field.
        ua_browser: *
      • You can query all the log entries that do not contain the ua_browser field.
        not ua_browser: *
For more information about the query statements supported by Log Service, see Overview.
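To make the matching semantics in this section concrete, the following Python sketch evaluates a few of the conditions above against sample log entries represented as dictionaries. This only mimics the behavior described here; it is not how Log Service executes queries, and the sample data is made up:

```python
# Sample WAF log entries as dictionaries. Field names follow the WAF
# log schema used in the examples above; values are made up.
logs = [
    {"matched_host": "www.aliyun.com", "cc_blocks": 1, "status": 404,
     "real_client_ip": "1.2.3.4", "request_time_msec": 7200},
    {"matched_host": "www.aliyun.com", "cc_blocks": 0, "status": 200,
     "real_client_ip": "5.6.7.8", "request_time_msec": 300},
]

# Field-based query: matched_host: www.aliyun.com and cc_blocks: 1
hits = [e for e in logs
        if e["matched_host"] == "www.aliyun.com" and e["cc_blocks"] == 1]
print(len(hits))  # 1

# Numeric range: request_time_msec in (5000 10000]
slow = [e for e in logs if 5000 < e["request_time_msec"] <= 10000]
print(len(slow))  # 1

# Field existence: ua_browser: * matches entries that contain the field
with_ua = [e for e in logs if "ua_browser" in e]
print(len(with_ua))  # 0
```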

Analytics syntax

You can execute SQL-92 statements to analyze log entries.

For more information about the statement syntax and functions supported by Log Service, see Real-time log analysis.
Note
  • You can omit the from Table name clause in standard SQL statements. This clause is equivalent to from log.
  • By default, the first 100 log entries are returned. If you want to adjust this number, you can execute a LIMIT statement. For more information, see LIMIT syntax.
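Because the analytics clause is standard SQL-92, you can prototype an aggregation locally before running it in Log Service. The sketch below uses Python's built-in sqlite3 module for this; the table name log mirrors the implicit from log clause, and the data is made up:

```python
import sqlite3

# In-memory table named "log", mirroring the implicit "from log" table
# of a Log Service analytics clause. Sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (matched_host TEXT, status INTEGER)")
conn.executemany(
    "INSERT INTO log VALUES (?, ?)",
    [("www.aliyun.com", 404), ("www.aliyun.com", 404),
     ("www.aliyun.com", 200)],
)

# Local equivalent of: * | select status, count(1) as pv
#                          group by status order by pv desc limit 100
rows = conn.execute(
    "SELECT status, count(1) AS pv FROM log "
    "GROUP BY status ORDER BY pv DESC LIMIT 100"
).fetchall()
print(rows)  # [(404, 2), (200, 1)]
```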

Examples of log query and analysis

Time-based log query and analysis

Each log entry has a time field, which indicates the time when the log entry was generated. The time is in the format of YYYY-MM-DDThh:mm:ss+Time zone. For example, in 2018-05-31T20:11:58+08:00, the time zone is UTC+8.

Each log entry has a built-in field __time__. This field also indicates the time when the log entry is generated. The time is a UNIX timestamp and is used in time-based calculation. The value of this field indicates the number of seconds that have elapsed since the UTC time 00:00:00, January 1, 1970. For more information, see UNIX time. If you want to display recognizable calculation results, you must convert the format first.
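The relationship between the two representations can be checked with Python's standard datetime module. The sketch below parses the example time from this topic and converts it to and from a UNIX timestamp:

```python
from datetime import datetime, timezone, timedelta

# Parse the human-readable "time" field (example value from this topic).
t = datetime.strptime("2018-05-31T20:11:58+08:00", "%Y-%m-%dT%H:%M:%S%z")

# __time__ holds the equivalent UNIX timestamp: seconds elapsed since
# 1970-01-01T00:00:00 UTC.
unix_ts = int(t.timestamp())
print(unix_ts)  # 1527768718

# Converting back in the UTC+8 time zone recovers the original time.
tz = timezone(timedelta(hours=8))
print(datetime.fromtimestamp(unix_ts, tz).isoformat())
# 2018-05-31T20:11:58+08:00
```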
  • Select and display the time
    You can use the time field to display the time information of logs. For example, you can query the last 10 log entries that record the HTTP flood attacks blocked by WAF on www.aliyun.com within a specific time range. You can also specify the time, real_client_ip, and http_user_agent fields in the query statement.
    matched_host: www.aliyun.com and cc_blocks: 1 
    |  select time, real_client_ip, http_user_agent
        order by time desc
        limit 10
  • Calculate the time
    You can use the __time__ field to calculate the time. For example, you can calculate the number of days that have elapsed since a domain name experienced an HTTP flood attack.
    matched_host: www.aliyun.com  and cc_blocks: 1 
    |  select time, round((to_unixtime(now()) - __time__)/86400, 1) as "days_passed", real_client_ip, http_user_agent
          order by time desc
          limit 10

    In this example, round((to_unixtime(now()) - __time__)/86400, 1) is used to calculate the number of days that have elapsed since the domain name experienced an HTTP flood attack. First, now() is used to obtain the current time, and to_unixtime is used to convert the current time into a UNIX timestamp. Then, the value of the built-in field __time__ is subtracted from the converted timestamp to obtain the number of seconds that have elapsed. Finally, the number of seconds is divided by 86400 (the total number of seconds in a day), and round(data, 1) is used to round the calculation result. Only one decimal place is retained. The final result is the number of days that have elapsed since each attack log entry was generated.

  • Perform statistical analysis by using groupings and a time unit
    You can query the log entries within a specific time range to view the trends of HTTP flood attacks on a domain name.
    matched_host: www.aliyun.com  and cc_blocks: 1 
    | select date_trunc('day', __time__) as dt, count(1) as PV 
          group by dt 
          order by dt

    In this example, the built-in field __time__ is passed to the date_trunc('day', ..) function to align the time of each log entry to the start of its day. Each log entry is grouped by the day on which it was generated, and the total number of log entries in each group is calculated by count(1). The groups are then sorted by time. You can set the first parameter of the date_trunc function to another time unit, such as second, minute, hour, week, month, or year, to group the log entries at a different granularity. For more information about this function, see Date and time functions.

    We recommend that you click the Line Chart icon to view the results.

  • Perform statistical analysis by using groupings and a time interval
    If you want to use more flexible groupings to analyze log entries based on a time interval, the required calculations are more complex. For example, you can query log entries to view the trends of HTTP flood attacks on a domain name at an interval of 5 minutes.
    matched_host: www.aliyun.com  and cc_blocks: 1 
    | select from_unixtime(__time__ - __time__ % 300) as dt, 
             count(1) as PV 
          group by dt 
          order by dt 
          limit 1000

    In this example, __time__ - __time__ % 300 is used to align the time to a 300-second boundary, and the from_unixtime function converts the result into a readable format. Each log entry is grouped based on a time interval of 5 minutes (300 seconds), and the total number of log entries in each group is calculated by count(1). Finally, the query results are sorted by time interval, and the first 1,000 rows are returned.

    We recommend that you click the Line Chart icon to view the results.

You can use the date_parse and date_format functions to convert between time formats. For more information about the functions that can be used to parse time values, see Date and time functions.
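The time arithmetic used in the statements above can be reproduced in plain Python to verify each step: the days-elapsed calculation, the date_trunc('day', ..) alignment, and the modulo-based 5-minute bucketing. The sample __time__ value corresponds to the example time 2018-05-31T20:11:58+08:00:

```python
import time
from datetime import datetime, timezone

log_time = 1527768718  # sample __time__ value (2018-05-31T20:11:58+08:00)
now = int(time.time())

# round((to_unixtime(now()) - __time__)/86400, 1): days elapsed since
# the log entry was generated, rounded to one decimal place.
days_passed = round((now - log_time) / 86400, 1)

# date_trunc('day', __time__): align the timestamp to 00:00:00 of its
# day (computed in UTC here, for simplicity).
day_start = log_time - log_time % 86400
print(datetime.fromtimestamp(day_start, timezone.utc).isoformat())
# 2018-05-31T00:00:00+00:00

# __time__ - __time__ % 300: align to a 5-minute (300-second) bucket.
bucket = log_time - log_time % 300
assert bucket % 300 == 0 and 0 <= log_time - bucket < 300
```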

Client IP address-based log query and analysis

A WAF log contains the real_client_ip field, which indicates the actual IP address of a client. If your website is accessed by using a proxy server or the IP address in a request header is invalid, WAF may fail to obtain the actual IP address of the visitor. In this situation, the remote_addr field of the client that is directly connected to WAF can be used.
  • Classify attackers by country
    You can query log entries to view the distribution of HTTP flood attacks by country.
    matched_host: www.aliyun.com  and cc_blocks: 1 
    | SELECT ip_to_country(if(real_client_ip='-', remote_addr, real_client_ip)) as country, 
             count(1) as "Number of attacks" 
             group by country

    In this example, the if(condition, option1, option2) function returns the actual IP address of a client. If real_client_ip is -, the function returns the value of remote_addr. Otherwise, the function returns real_client_ip. Then, the ip_to_country function returns the country information of the actual IP address.

    We recommend that you click the World Map icon to view the results.

  • Classify visitors by province in China
    If you want to view the distribution of visitors by province in China, you can use the ip_to_province function to obtain the province information of actual IP addresses.
    matched_host: www.aliyun.com  and cc_blocks: 1 
    | SELECT ip_to_province(if(real_client_ip='-', remote_addr, real_client_ip)) as province, 
             count(1) as "Number of attacks" 
             group by province

    In this example, the ip_to_province function returns the province information of the actual IP address of a client. If the actual IP address is in a region outside China, the function attempts to obtain the information about the specific state or province of the IP address. If you choose to display the results on a map of China, the IP addresses that are in regions outside China are not displayed.

    We recommend that you click the Maps icon to view the results.

  • Classify attackers on a heat map
    If you want to view the distribution of attackers on a heat map, you can use the ip_to_geo function to obtain the geographic information of the actual IP addresses of clients that launched attacks. The information includes the latitudes and longitudes, and can be used to generate a heat map.
    matched_host: www.aliyun.com  and cc_blocks: 1 
    | SELECT ip_to_geo(if(real_client_ip='-', remote_addr, real_client_ip)) as geo, 
             count(1) as "Number of attacks" 
             group by geo
             limit 10000

    In this example, the ip_to_geo function returns the latitude and the longitude of the actual IP address of a client. The first 10,000 log entries are returned.

    You can select AMap and click Show Heat Map.

For more information about IP address-based functions, see IP functions. For example, you can use the ip_to_provider function to obtain the information about the provider of an IP address. You can also use the ip_to_domain function to determine whether an IP address is public or internal.
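The if(real_client_ip='-', remote_addr, real_client_ip) fallback used in all three statements above can be expressed in Python as follows. This is a sketch of the selection logic only; ip_to_country and the other IP functions run only inside Log Service:

```python
# Pick the best-known client IP address: fall back to remote_addr when
# WAF cannot determine the real client IP (recorded as "-").
def client_ip(entry: dict) -> str:
    ip = entry.get("real_client_ip", "-")
    return entry["remote_addr"] if ip == "-" else ip

print(client_ip({"real_client_ip": "1.2.3.4", "remote_addr": "10.0.0.1"}))
# 1.2.3.4
print(client_ip({"real_client_ip": "-", "remote_addr": "10.0.0.1"}))
# 10.0.0.1
```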