
Simple Log Service:Common examples of log data analysis

Last Updated:Mar 10, 2025

This topic provides common examples of log data analysis.

Trigger an alert when the error rate exceeds 40% within the previous 5 minutes

Calculate the rate at which HTTP 500 errors are returned within 5-minute windows, and trigger an alert when the error ratio of the most recent window exceeds 40% and the total number of errors exceeds 500.

status:500 |
select
  __topic__,
  max_by(error_count, window_time) / 1.0 / sum(error_count) as error_ratio,
  sum(error_count) as total_error
from (
    select
      __topic__,
      count(*) as error_count,
      __time__ - __time__ % 300 as window_time
    from log
    group by __topic__, window_time
  )
group by __topic__
having
  max_by(error_count, window_time) / 1.0 / sum(error_count) > 0.4
  and sum(error_count) > 500
order by total_error desc
limit 100
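The windowing arithmetic in the subquery can be checked outside of SQL. A minimal Python sketch (with made-up timestamps and counts) of how `__time__ - __time__ % 300` buckets events into 5-minute windows, and how the ratio compares the most recent window against the total:

```python
# Sketch of the query's windowing logic with made-up data.
# Each event is a unix timestamp of one HTTP 500 error.
events = [1700000010, 1700000100, 1700000400, 1700000410, 1700000420]

# __time__ - __time__ % 300 truncates a timestamp to its 5-minute window.
windows = {}
for ts in events:
    window_time = ts - ts % 300
    windows[window_time] = windows.get(window_time, 0) + 1

# max_by(error_count, window_time): the error count of the most recent window.
latest_window = max(windows)
latest_count = windows[latest_window]
total = sum(windows.values())

error_ratio = latest_count / total  # 3 errors in the latest window out of 5 total
print(error_ratio)  # 0.6
```

In the real query, the alert fires only when this ratio exceeds 0.4 and the total error count exceeds 500.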

Collect traffic statistics and configure alerts

Collect traffic statistics every minute and trigger an alert when the traffic volume falls below a specific threshold. The most recent minute may not yet cover a full 60 seconds, so its raw sum would under-report the traffic rate. To normalize, divide the sum by greatest(max(__time__) - min(__time__),1), which is the number of seconds actually covered (at least 1, to avoid division by zero).

* | SELECT SUM(inflow) / greatest(max(__time__) - min(__time__),1) as inflow_per_minute, date_trunc('minute',__time__)  as minute group by minute
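The normalization can be verified with plain arithmetic. A minimal Python sketch with made-up numbers: a minute bucket that contains only 30 seconds of data is divided by its covered span rather than a fixed 60, with the greatest(..., 1) guard preventing division by zero:

```python
# Made-up samples within one minute bucket: (unix_timestamp, inflow_bytes).
samples = [(1700000000, 600), (1700000010, 600), (1700000030, 600)]

total_inflow = sum(v for _, v in samples)
# greatest(max(__time__) - min(__time__), 1): the covered span in seconds, at least 1.
span = max(max(ts for ts, _ in samples) - min(ts for ts, _ in samples), 1)

inflow_rate = total_inflow / span  # 1800 bytes over 30 seconds
print(inflow_rate)  # 60.0
```

Without the guard, a bucket containing a single sample would have max(__time__) == min(__time__) and divide by zero.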

Calculate the average latency by data size

Distribute data to multiple buckets based on the data size and calculate the average latency of data in each bucket.

* | select avg(latency) as latency , case when originSize < 5000 then 's1' when originSize < 20000 then 's2' when originSize < 500000 then 's3' when originSize < 100000000 then 's4' else 's5' end as os group by os
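The CASE WHEN bucketing is a first-match lookup over ascending thresholds. A small Python sketch of the same logic (bucket labels and limits copied from the query above):

```python
# Thresholds mirror the CASE WHEN branches: the first matching upper bound wins.
BUCKETS = [(5000, 's1'), (20000, 's2'), (500000, 's3'), (100000000, 's4')]

def size_bucket(origin_size: int) -> str:
    for upper, label in BUCKETS:
        if origin_size < upper:
            return label
    return 's5'  # the ELSE branch

print(size_bucket(1200))    # s1
print(size_bucket(300000))  # s3
print(size_bucket(10**9))   # s5
```

Because the branches are evaluated in order, the thresholds do not need to exclude each other; each value falls into the first bucket whose upper bound it is below.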

Return the percentages of different results

Return the count for each department and each count's percentage of the total. This query uses a subquery and a window function: sum(c) over () returns the sum of c across all rows.

* |  select   department, c*1.0/ sum(c) over ()  from(select  count(1) as c, department   from log group by department)
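sum(c) over () attaches the same grand total to every row, so each row's count can be divided by it in place. A minimal Python sketch of the percentage calculation (made-up department counts):

```python
# Per-department counts, as produced by the inner GROUP BY subquery.
counts = {'sales': 30, 'dev': 50, 'hr': 20}

# sum(c) over (): the same grand total is visible on every row.
total = sum(counts.values())
ratios = {dept: c / total for dept, c in counts.items()}

print(ratios)  # {'sales': 0.3, 'dev': 0.5, 'hr': 0.2}
```

The multiplication by 1.0 in the query serves the same purpose as Python's true division: it forces a floating-point result instead of integer division.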

Count the number of URLs that meet the query condition

To count the number of URLs based on their characteristics, you can use a CASE WHEN expression or the count_if function. The latter is simpler.

* | select  count_if(uri like '%login') as login_num, count_if(uri like '%register') as register_num, date_format(date_trunc('minute', __time__), '%m-%d %H:%i') as time  group by time order by time limit 100
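count_if(cond) is shorthand for sum(case when cond then 1 else 0 end). A small Python sketch of the equivalence, using made-up URIs and suffix matching in place of the SQL pattern '%login':

```python
# Made-up access-log URIs.
uris = ['/user/login', '/user/register', '/home', '/api/login']

# count_if(uri like '%login'): count the rows matching the predicate.
login_num = sum(1 for u in uris if u.endswith('login'))

# The equivalent CASE WHEN form sums 1 per match and 0 otherwise.
login_num_case = sum(1 if u.endswith('login') else 0 for u in uris)

print(login_num, login_num_case)  # 2 2
```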

General aggregate analysis

Query the global distribution of clients by PV

Use the ip_to_country function to obtain the countries to which the IP addresses belong, aggregate the countries based on the ip_country field, and then count the number of clients in each country. You can configure a world map to display the results. For more information, see Configure a world map.

* |
select
  ip_to_country(client_ip) as ip_country,
  count(*) as pv
group by
  ip_country
order by
  pv desc
limit
  500

Query the category and PV trend of request methods

Truncate time values by minute, aggregate and group the truncated time values to calculate the number of page views (PVs) based on the request_method field, and then sort the time values. You can configure a line chart to display the results. In the line chart, set the x-axis to t, the y-axis to pv, and the aggregate column to request_method. For more information, see Line chart.

* |
select
  date_format(date_trunc('minute', __time__), '%m-%d %H:%i') as t,
  request_method,
  count(*) as pv
group by
  t,
  request_method
order by
  t asc
limit
  10000

Query the distribution of http_user_agent-based requests by PV

Aggregate and group data based on the http_user_agent field, and query the number of requests from each agent and the total request and response traffic. Convert the traffic from bytes to MB and round the values to two decimal places. Then, use the CASE WHEN expression to categorize status codes into classes (2xx, 3xx, 4xx, and 5xx) and calculate the percentage of each class. You can configure a table to display the results. For more information, see Basic table.

* |
select
  http_user_agent as "User agent",
  count(*) as pv,
  round(sum(request_length) / 1024.0 / 1024, 2) as "Request traffic (MB)",
  round(sum(body_bytes_sent) / 1024.0 / 1024, 2) as "Response traffic (MB)",
  round(
    sum(
      case
        when status >= 200
        and status < 300 then 1
        else 0
      end
    ) * 100.0 / count(1),
    6
  ) as "Percentage of status code 2xx (%)",
  round(
    sum(
      case
        when status >= 300
        and status < 400 then 1
        else 0
      end
    ) * 100.0 / count(1),
    6
  ) as "Percentage of status code 3xx (%)",
  round(
    sum(
      case
        when status >= 400
        and status < 500 then 1
        else 0
      end
    ) * 100.0 / count(1),
    6
  ) as "Percentage of status code 4xx (%)",
  round(
    sum(
      case
        when status >= 500
        and status < 600 then 1
        else 0
      end
    ) * 100.0 / count(1),
    6
  ) as "Percentage of status code 5xx (%)"
group by
  "User agent"
order by
  pv desc
limit
  100

Query the daily consumption and the trend forecast results for the current month

Group data by bill number and obtain the time, service identifier, bill type, and payable amount of each bill. Aggregate the expense by day and exclude the most recent day, for which data is still incomplete. Then, use the sls_inner_ts_regression function to obtain the daily consumption and the trend forecast results for the current month. The function takes the time column, the expense column, an array of metric names, the forecast period, and the number of forecast points; the passed metric name can be used as an alias. You can configure a line chart to display the daily consumption and the trend forecast results for the current month. In the line chart, set the x-axis to time and the y-axis to the actual consumption and the forecast consumption. For more information, see Line chart.

source :bill |
select
  date_format(res.stamp, '%Y-%m-%d') as time,
  res.real as "Actual consumption",
  case
    when is_nan(res.real) then res.pred
    else null
  end as "Forecast consumption",
  res.instances
from(
    select
      sls_inner_ts_regression(
        cast(day as bigint),
        total,
        array ['total'],
        86400,
        60
      ) as res
    from
      (
        select
          *
        from
          (
            select
              *,
              max(day) over() as lastday
            from
              (
                select
                  to_unixtime(date_trunc('day', __time__)) as day,
                  sum(PretaxAmount) as total
                from
                  (
                    select
                      RecordID,
                      arbitrary(__time__) as __time__,
                      arbitrary(ProductCode) as ProductCode,
                      arbitrary(item) as item,
                      arbitrary(PretaxAmount) as PretaxAmount
                    from
                      log
                    group by
                      RecordID
                  )
                group by
                  day
                order by
                  day
              )
          )
        where
          day < lastday
      )
  )
limit
  1000

Query the consumption percentage of each service for the current month

Aggregate and calculate the total expense of each service by service name, use the row_number function to sort the services by expense in descending order, search for the top six services that have the highest expense, and then group the other services into Other. You can configure a donut chart to display the results. For more information, see Configure a donut chart.

source :bill |
select
  case
    when rnk > 6
    or pretaxamount <= 0 then 'Other'
    else ProductName
  end as ProductName,
  sum(PretaxAmount) as PretaxAmount
from(
    select
      *,
      row_number() over(
        order by
          pretaxamount desc
      ) as rnk
    from(
        select
          ProductName,
          sum(PretaxAmount) as PretaxAmount
        from
          log
        group by
          ProductName
      )
  )
group by
  ProductName
order by
  PretaxAmount desc
limit
  1000

Calculate the ratio of the expense of the previous day to the expense of the same day in the previous week

Calculate the expense of the previous day, use the COALESCE expression to return 0 if no expense was generated, use the round function to round the result to three decimal places, and then use the compare function with a window of 604800 seconds (7 days) to compare the expense of the previous day with the expense of the same day in the previous week. You can configure a trend chart to display the results. For more information, see Trend chart.

source :bill |
select
  round(diff [1], 2),
  round(diff [2], 2),
  round(diff [3] * 100 -100, 2)
from(
    select
      compare("Expense of the previous day", 604800) as diff
    from(
        select
          round(coalesce(sum(PretaxAmount), 0), 3) as "Expense of the previous day"
        from
          log
      )
  )
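The compare(value, 604800) call returns a three-element array: the current value, the value from 604800 seconds (7 days) earlier, and their ratio. A minimal Python sketch of how the displayed numbers are derived, using made-up expenses:

```python
# Made-up expenses: the previous day's, and the same day one week earlier.
current, previous = 120.0, 100.0

# compare(x, 604800) conceptually yields [current, previous, current / previous].
diff = [current, previous, current / previous]

today = round(diff[0], 2)                    # 120.0
last_week = round(diff[1], 2)                # 100.0
change_pct = round(diff[2] * 100 - 100, 2)   # 20.0, i.e. +20% week over week
print(today, last_week, change_pct)
```

The diff[3] * 100 - 100 expression in the query converts the ratio into a signed percentage change, which is what the trend chart displays.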

Tomcat web service analysis

Query the trend of Tomcat request status

Use the date_trunc function to truncate time values in logs by minute, use the date_format function to extract the hour and minute, and then use the GROUP BY clause to aggregate data based on the extracted time values and the status code to count the occurrences of each status code per minute. You can configure a flow chart to display the results. In the flow chart, set the x-axis to time, the y-axis to c, and the aggregate column to status. For more information, see Flow chart.

* |
select
  date_format(date_trunc('minute', __time__), '%H:%i') as time,
  COUNT(1) as c,
  status
GROUP by
  time,
  status
ORDER by
  time
LIMIT
  1000

Query the distribution of PVs and UVs for Tomcat access over time

Use the time_series function to truncate time values in logs by 2-minute intervals and aggregate data by time. Use count to calculate the number of page views (PVs) and the approx_distinct function to estimate the number of unique visitors (UVs) after deduplication, and then sort the results by time. You can configure a line chart with multiple y-axes to display the results. In the line chart, set the x-axis to time and the y-axis to uv and pv. For more information, see Create a line chart with multiple y-axes.

* |
select
  time_series(__time__, '2m', '%H:%i', '0') as time,
  COUNT(1) as pv,
  approx_distinct(remote_addr) as uv
GROUP by
  time
ORDER by
  time
LIMIT
  1000

Query the number of Tomcat error requests and compare it with that of the previous hour

Use the inner SQL statement to obtain the number of requests whose status code is greater than or equal to 400, and use the compare function in the middle layer of the SQL statement to obtain the value that was generated 3,600 seconds earlier. In the outer SQL statement, c1 specifies the number of error requests at the current time, c2 specifies the number of error requests 3,600 seconds earlier, and c3 specifies the percentage change from c2 to c1. You can configure a donut chart to display the results. In the donut chart, c1 is the display value and c3 is the comparison value. For more information, see Configure a donut chart.

status >= 400 |
SELECT
  diff [1] AS c1,
  diff [2] AS c2,
  round(diff [1] * 100.0 / diff [2] - 100.0, 2) AS c3
FROM
  (
    select
      compare(c, 3600) AS diff
    from
      (
        select
          count(1) as c
        from
          log
      )
  )

Query the top 10 URIs in Tomcat requests

Aggregate and group data based on the request_uri field, calculate the number of PVs for each request_uri, sort the results by PV in descending order, and then obtain the top 10 records. You can configure a basic bar gauge. In the bar gauge, set the x-axis to page and the y-axis to pv. For more information, see Configure a basic bar gauge.

* |
SELECT
  request_uri as page,
  COUNT(*) as pv
GROUP by
  page
ORDER by
  pv DESC
LIMIT
  10

Query the types and distribution of Tomcat clients

Aggregate and analyze data based on the user_agent field, and calculate the number of clients of each type. You can configure a donut chart to display the results. In the donut chart, set the type to user_agent and the value column to c. For more information, see Configure a donut chart.

* |
SELECT
  user_agent,
  COUNT(*) AS c
GROUP BY
  user_agent
ORDER BY
  c DESC

Collect statistics on the outbound Tomcat traffic

Use the time_series function to truncate time values by 10-second intervals and aggregate them to calculate the sum of the values of the body_bytes_sent field. You can configure the x-axis and the y-axis of a line chart to display the results. In the line chart, set the x-axis to time and the y-axis to body_sent. For more information, see Configure the x-axis and the y-axis of a line chart.

* |
select
  time_series(__time__, '10s', '%H:%i:%S', '0') as time,
  sum(body_bytes_sent) as body_sent
GROUP by
  time
ORDER by
  time
LIMIT
  1000

Query the percentage of Tomcat error requests

In the inner SQL statement, query the number of error requests whose status code is greater than or equal to 400 and the total number of requests. Then, calculate the ratio of error requests to total requests. You can configure a dial to display the results. For more information, see Configure a dial.

* |
select
  round((errorCount * 100.0 / totalCount), 2) as errorRatio
from
  (
    select
      sum(
        case
          when status >= 400 then 1
          else 0
        end
      ) as errorCount,
      count(1) as totalCount
    from
      log
  )