
Simple Log Service:SQL syntax and functions

Last Updated: Dec 28, 2024

Simple Log Service projects support SQL statements for query result analysis. This topic provides an overview of the basic SQL syntax for analytic statements.

Basic syntax

Query statements and analytic statements are separated by a vertical bar (|). The structure is as follows:

query statement|analytic statement

Query statements can function independently, while analytic statements must accompany query statements. The log analysis feature enables data analysis within search results or across all data in a logstore.
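
A minimal combined example (a sketch that assumes a field named status with a numeric index configured) searches for logs whose status is 404 and counts them:

status: 404 | SELECT count(*) AS pv

Everything before the vertical bar (|) is the query statement, and everything after it is the analytic statement.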

Important
  • We recommend limiting a query statement to 30 conditions or fewer.

  • If an analytic statement does not contain a FROM or WHERE clause, the data of the current logstore is analyzed by default. Analytic statements are case-insensitive, do not support offset, and do not need to end with a semicolon.

Statement types:

  • Query statement: Defines one or more search conditions. A query statement can consist of keywords, numeric values, ranges, a space, or an asterisk (*). A space or an asterisk (*) specifies no condition and matches all logs.

  • Analytic statement: Aggregates or analyzes data from search results or from all logs. For details about the analysis functions and syntax that Simple Log Service supports, see the relevant function and clause topics.

Example of an SQL analytic statement:

* | SELECT status, count(*) AS PV GROUP BY status
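
For comparison, the following are illustrative query statements on their own (the status and request_time fields are hypothetical and assume configured indexes):

error
status: 404 and error
request_time in [1 10]
*

The first statement matches logs that contain the keyword error, the second additionally requires status to be 404, the third matches request_time values between 1 and 10, and the asterisk (*) matches all logs.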

SQL functions and SQL clauses

SQL functions are commonly used for calculations, transformations, and formatting of data, such as summing, averaging, string manipulation, and date processing. SQL functions are typically embedded within SQL clauses.

SQL clauses are used to construct complete SQL queries or data manipulation statements, determining data sources, conditions, grouping, sorting, and more.
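
As a sketch of how the two combine (the request_method, request_time, and status fields are hypothetical), the avg function is embedded in the SELECT clause, while the WHERE and GROUP BY clauses determine the conditions and grouping:

* |
SELECT
  request_method,
  avg(request_time) AS avg_time
WHERE
  status = 200
GROUP BY
  request_method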

Example 1. Query logs from the previous day

Use the current_date function to return the current date. Then, use the date_add function to subtract one day from the current date. Display the results using a pie chart for a more intuitive view of the data.

  • Query and analysis statements

    * |
    SELECT
      *
    FROM log
    WHERE
      __time__ < to_unixtime(current_date)
      AND __time__ > to_unixtime(date_add('day', -1, current_date))

Example 2. View the distribution of log source IPs

Use the ip_to_province function to obtain the province that corresponds to each IP address. Use group by to aggregate the provinces and the count function to calculate the number of occurrences of each. Display the results using a pie chart.

Query and analysis statements:

* |
select  
  count(1) as c,
  ip_to_province(remote_addr) as address
group by
  address
limit
  100

Example 3. Analyze Nginx inbound and outbound traffic

Use the date_trunc function to truncate __time__ to the hour (__time__ is a system field that records the log collection time as a UNIX timestamp in seconds). Use the date_format function to format the truncated result. Use group by to aggregate by the truncated time and the sum function to calculate the total traffic per hour. Display the results using a line chart, with the x-axis set to time and the left y-axis to net_out and net_in.

  • Query and analysis statements

    * |
    select
      sum(body_bytes_sent) as net_out,
      sum(request_length) as net_in,
      date_format(date_trunc('hour', __time__), '%m-%d %H:%i') as time
    group by
      date_format(date_trunc('hour', __time__), '%m-%d %H:%i')
    order by
      time
    limit
      10000

Example 4. View the top 10 Nginx access paths

Use the split_part function to split request_uri at the question mark (?) and take the first part to obtain the request path. Aggregate by this path using group by and calculate the number of accesses to each path using the count function. Sort the counts using order by, with desc indicating descending order. Display the results using a column chart.

  • Query and analysis statements

    * |
    select
      count(1) as pv,
      split_part(request_uri, '?', 1) as path
    group by
      path
    order by
      pv desc
    limit
      10

Example 5. View the trend of request method classification PV

Use the date_trunc function to truncate the time to the minute, then group and aggregate PV by request_method. Sort by time and display the results using a flow chart, with the x-axis as time, the y-axis as PV, and the aggregation column as request_method.

  • Query and analysis statements

    * |
    select
      date_format(date_trunc('minute', __time__), '%m-%d %H:%i') as t,
      request_method,
      count(*) as pv
    group by
      t,
      request_method
    order by
      t asc
    limit
      10000

Example 6. Compare today's PV with yesterday's

First, calculate the total PV using the count function. Then, use the compare function to obtain a day-over-day comparison of today's PV with yesterday's. compare(pv, 86400) returns an array in which diff[1] is the current PV, diff[2] is the PV from 86400 seconds (one day) earlier, and diff[3] is the ratio of the two.

  • Query and analysis statements

    * |
    select
      diff[1] as today,
      round((diff[3] - 1.0) * 100, 2) as growth
    FROM
      (
        SELECT
          compare(pv, 86400) as diff
        FROM
          (
            SELECT
              COUNT(1) as pv
            FROM
              log
          )
      )

Example 7. Predict Nginx access log PV

Use __time__ - __time__ % 60 (subtract from the timestamp its remainder when divided by 60) to obtain a timestamp aligned to the minute, aliased as stamp. Use group by to aggregate by stamp and the count function to calculate the number of logs per minute. Use the result as a subquery, and use the ts_predicate_simple function to predict the next 6 points. Display the query results using a time series chart.

  • Query and analysis statements

    * |
    select
      ts_predicate_simple(stamp, value, 6)
    from
      (
        select
          __time__ - __time__ % 60 as stamp,
          COUNT(1) as value
        from
          log
        GROUP BY
          stamp
        order by
          stamp
      )
    LIMIT
      1000

Example 8. Analyze HTTP_USER_AGENT and display sorted by PV

Group and aggregate by http_user_agent, then query the number of requests from each agent and the total traffic returned to clients. Because the unit is bytes, use the round function to convert the values to MB and retain two decimal places. Use case when to classify status into 2xx, 3xx, 4xx, and 5xx, and calculate the proportion of each class. Display the results in a table for a more intuitive view of the data and its meaning.

Query and analysis statements:

* |
select
  http_user_agent as "User Agent",
  count(*) as pv,
  round(sum(request_length) / 1024.0 / 1024, 2) as "Request Message Traffic (MB)",
  round(sum(body_bytes_sent) / 1024.0 / 1024, 2) as "Returned Client Traffic (MB)",
  round(
    sum(
      case
        when status >= 200
        and status < 300 then 1
        else 0
      end
    ) * 100.0 / count(1),
    6
  ) as "2xx Proportion (%)",
  round(
    sum(
      case
        when status >= 300
        and status < 400 then 1
        else 0
      end
    ) * 100.0 / count(1),
    6
  ) as "3xx Proportion (%)",
  round(
    sum(
      case
        when status >= 400
        and status < 500 then 1
        else 0
      end
    ) * 100.0 / count(1),
    6
  ) as "4xx Proportion (%)",
  round(
    sum(
      case
        when status >= 500
        and status < 600 then 1
        else 0
      end
    ) * 100.0 / count(1),
    6
  ) as "5xx Proportion (%)"
group by
  "User Agent"
order by
  pv desc
limit
  100

Example 9. Analyze the proportion of error requests in Nginx logs

First, obtain the number of error requests (status of 400 or greater) and the total number of requests in a subquery, then calculate the ratio in the outer query. Display the result using a statistical chart.

  • Query and analysis statements

    * |
    select
      round((errorCount * 100.0 / totalCount), 2) as errorRatio
    from
      (
        select
          sum(
            case
              when status >= 400 then 1
              else 0
            end
          ) as errorCount,
          count(1) as totalCount
        from
          log
      )