This topic provides some examples of log data analysis.
Trigger an alert when the error rate exceeds 40% within the previous 5 minutes
Count HTTP 500 errors in 5-minute windows for each log topic and trigger an alert when the most recent window accounts for more than 40% of all errors and the total number of errors exceeds 500.
status:500 |
select __topic__, max_by(error_count, window_time) / 1.0 / sum(error_count) as error_ratio, sum(error_count) as total_error
from (
    select __topic__, count(*) as error_count, __time__ - __time__ % 300 as window_time
    from log
    group by __topic__, window_time
)
group by __topic__
having max_by(error_count, window_time) / 1.0 / sum(error_count) > 0.4 and sum(error_count) > 500
order by total_error desc
limit 100
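In this query, max_by(error_count, window_time) returns the error_count of the row with the largest window_time, that is, the error count of the most recent 5-minute window. For reference, a minimal sketch that lists the raw per-window error counts that the outer query aggregates, using the same 5-minute alignment:
status:500 |
select __topic__, __time__ - __time__ % 300 as window_time, count(*) as error_count
from log
group by __topic__, window_time
order by window_time desc
limit 100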
Collect traffic statistics and configure alerts
Collect traffic statistics every minute and trigger an alert when the traffic volume falls below a specific threshold. The statistics collected for the most recent minute may not cover a full minute, which would skew the result. To obtain the average traffic per minute, divide the statistical value by greatest(max(__time__) - min(__time__), 1) for normalization.
* | SELECT SUM(inflow) / greatest(max(__time__) - min(__time__),1) as inflow_per_minute, date_trunc('minute',__time__) as minute group by minute
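For example, if the statistics for the most recent minute cover only 20 seconds of data, sum(inflow) is divided by 20 instead of a full 60 seconds, and the greatest(..., 1) wrapper prevents division by zero when max(__time__) equals min(__time__).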
Calculate the average latency by data size
Distribute data to multiple buckets based on the data size and calculate the average latency of data in each bucket.
* |
select
  avg(latency) as latency,
  case
    when originSize < 5000 then 's1'
    when originSize < 20000 then 's2'
    when originSize < 500000 then 's3'
    when originSize < 100000000 then 's4'
    else 's5'
  end as os
group by
  os
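Note that the WHEN branches are evaluated in order and the first match wins, so the bucket boundaries must be listed in ascending order. For example, a record whose originSize is 3000 falls into s1 even though it also satisfies the later conditions.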
Return the percentages of different results
Return the count results of different departments and the percentage of each count. This query uses a subquery and a window function; sum(c) over () computes the sum of c across all rows.
* | select department, c*1.0/ sum(c) over () from(select count(1) as c, department from log group by department)
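For example, if the log contains 30 entries for department A and 70 entries for department B, the subquery returns the counts 30 and 70, and the outer query returns the percentages 0.3 and 0.7.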
Count the number of URLs that meet the query condition
To count the number of URLs based on their characteristics, you can use a CASE WHEN expression or the count_if function. The latter is simpler.
* | select count_if(uri like '%login') as login_num, count_if(uri like '%register') as register_num, date_format(date_trunc('minute', __time__), '%m-%d %H:%i') as time group by time order by time limit 100
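For comparison, a sketch of the login count written with a CASE WHEN expression instead of count_if; the two forms are equivalent:
* | select sum(case when uri like '%login' then 1 else 0 end) as login_num, date_format(date_trunc('minute', __time__), '%m-%d %H:%i') as time group by time order by time limit 100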
General aggregate analysis
Query the global distribution of clients by PV
Use the ip_to_country function to obtain the country to which each client IP address belongs, group the results by the ip_country field, and then count the number of PVs from each country. You can configure a world map to display the results. For more information, see Configure a world map.
* |
select
ip_to_country(client_ip) as ip_country,
count(*) as pv
group by
ip_country
order by
pv desc
limit
500
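SLS also provides the ip_to_province and ip_to_city functions, which work the same way. A sketch that groups PVs by province instead of country:
* |
select
  ip_to_province(client_ip) as ip_province,
  count(*) as pv
group by
  ip_province
order by
  pv desc
limit
  500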
Query the category and PV trend of request methods
Truncate time values by minute, group data by the truncated time and the request_method field to calculate the number of page views (PVs), and then sort the results by time. You can configure a line chart to display the results. In the line chart, set the x-axis to t, the y-axis to pv, and the aggregate column to request_method. For more information, see Line chart.
* |
select
date_format(date_trunc('minute', __time__), '%m-%d %H:%i') as t,
request_method,
count(*) as pv
group by
t,
request_method
order by
t asc
limit
10000
Query the distribution of http_user_agent-based requests by PV
Group data based on the http_user_agent field, and query the number of requests from each agent and the total request and response traffic. Convert the traffic from bytes to MB and round the values to two decimal places. Then, use CASE WHEN expressions to categorize status codes into classes, including 2xx, 3xx, 4xx, and 5xx, and calculate the percentage of each class. You can configure a table to display the results. For more information, see Basic table.
* |
select
http_user_agent as "User agent",
count(*) as pv,
round(sum(request_length) / 1024.0 / 1024, 2) as "Request traffic (MB)",
round(sum(body_bytes_sent) / 1024.0 / 1024, 2) as "Response traffic (MB)",
round(
sum(
case
when status >= 200
and status < 300 then 1
else 0
end
) * 100.0 / count(1),
6
) as "Percentage of status code 2xx (%)",
round(
sum(
case
when status >= 300
and status < 400 then 1
else 0
end
) * 100.0 / count(1),
6
) as "Percentage of status code 3xx (%)",
round(
sum(
case
when status >= 400
and status < 500 then 1
else 0
end
) * 100.0 / count(1),
6
) as "Percentage of status code 4xx (%)",
round(
sum(
case
when status >= 500
and status < 600 then 1
else 0
end
) * 100.0 / count(1),
6
) as "Percentage of status code 5xx (%)"
group by
"User agent"
order by
pv desc
limit
100
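The four CASE WHEN sums can also be written with the count_if function that is used earlier in this topic. For example, the 2xx percentage can be expressed as:
round(count_if(status >= 200 and status < 300) * 100.0 / count(1), 6) as "Percentage of status code 2xx (%)"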
Query the daily consumption and the trend forecast results for the current month
Group data by bill number (RecordID) to deduplicate records, and obtain the time, service identifier, bill type, and payable amount of each bill. Aggregate the payable amounts into a daily expense, find the most recent day on which an expense was generated, and exclude that (incomplete) day. Then, use the sls_inner_ts_regression function to obtain the daily consumption and the trend forecast results for the current month. The function takes the timestamp, the expense, the name of the passed series, the forecast period, and the number of forecast points; the name of the passed series can be used as an alias. You can configure a line chart to display the daily consumption and the trend forecast results for the current month. In the line chart, set the x-axis to time and the y-axis to the actual consumption and the forecast consumption. For more information, see Line chart.
source :bill |
select
date_format(res.stamp, '%Y-%m-%d') as time,
res.real as "Actual consumption",case
when is_nan(res.real) then res.pred
else null
end as "Forecast consumption",
res.instances
from(
select
sls_inner_ts_regression(
cast(day as bigint),
total,
array ['total'],
86400,
60
) as res
from
(
select
*
from
(
select
*,
max(day) over() as lastday
from
(
select
to_unixtime(date_trunc('day', __time__)) as day,
sum(PretaxAmount) as total
from
(
select
RecordID,
arbitrary(__time__) as __time__,
arbitrary(ProductCode) as ProductCode,
arbitrary(item) as item,
arbitrary(PretaxAmount) as PretaxAmount
from
log
group by
RecordID
)
group by
day
order by
day
)
)
where
day < lastday
)
)
limit
1000
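In this query, the arguments of sls_inner_ts_regression map to the description above as follows: cast(day as bigint) is the per-day timestamp, total is the daily expense, array['total'] is the name of the passed series, 86400 is the forecast period in seconds (one day), and 60 is the number of forecast points.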
Query the consumption percentage of each service for the current month
Aggregate the total expense of each service by service name, use the row_number function to rank the services by expense in descending order, keep the six services with the highest expense, and then group the remaining services, together with services that have no positive expense, into Other. You can configure a donut chart to display the results. For more information, see Configure a donut chart.
source :bill |
select
case
when rnk > 6
or pretaxamount <= 0 then 'Other'
else ProductName
end as ProductName,
sum(PretaxAmount) as PretaxAmount
from(
select
*,
row_number() over(
order by
pretaxamount desc
) as rnk
from(
select
ProductName,
sum(PretaxAmount) as PretaxAmount
from
log
group by
ProductName
)
)
group by
ProductName
order by
PretaxAmount desc
limit
1000
Calculate the ratio of the expense of the previous day to the expense of the same day in the previous week
Calculate the expense of the previous day. Use the COALESCE expression to return 0 if no expense was generated, and use the round function to round the result to three decimal places. Then, use the compare function with a window of 604800 seconds (seven days) to compare the expense of the previous day with the expense of the same day in the previous week. You can configure a trend chart to display the results. For more information, see Trend chart.
source :bill |
select
round(diff [1], 2),
round(diff [2], 2),
round(diff [3] * 100 -100, 2)
from(
select
compare("Expense of the previous day", 604800) as diff
from(
select
round(coalesce(sum(PretaxAmount), 0), 3) as "Expense of the previous day"
from
log
)
)
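The compare(x, 604800) function returns an array in which diff[1] is the current value of x, diff[2] is the value 604,800 seconds (seven days) earlier, and diff[3] is the ratio of diff[1] to diff[2]. For example, if the expense of the previous day is 120 and the expense of the same day in the previous week is 100, the query returns 120.0, 100.0, and 20.0, which indicates a 20% increase.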
Tomcat web service analysis
Query the trend of Tomcat request status
Use the date_trunc function to truncate time values in logs by minute, use the date_format function to extract the time values in hours and minutes, and then use the GROUP BY clause to aggregate data by the extracted time values and the status code to count the number of requests for each status code per minute. You can configure a flow chart to display the results. In the flow chart, set the x-axis to time, the y-axis to c, and the aggregate column to status. For more information, see Flow chart.
* |
select
date_format(date_trunc('minute', __time__), '%H:%i') as time,
COUNT(1) as c,
status
GROUP by
time,
status
ORDER by
time
LIMIT
1000
Query the distribution of PVs and UVs for Tomcat access over time
Use the time_series function to align time values in logs to 2-minute windows, aggregate data by time, and then calculate the number of PVs and unique visitors (UVs). The count function returns the number of PVs, and the approx_distinct function estimates the number of distinct remote_addr values as the number of UVs. Sort the results by time. You can configure a line chart with multiple y-axes to display the results. In the line chart, set the x-axis to time and the y-axis to uv and pv. For more information, see Create a line chart with multiple y-axes.
* |
select
time_series(__time__, '2m', '%H:%i', '0') as time,
COUNT(1) as pv,
approx_distinct(remote_addr) as uv
GROUP by
time
ORDER by
time
LIMIT
1000
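In this query, time_series(__time__, '2m', '%H:%i', '0') aligns __time__ to 2-minute windows, formats each window as hours and minutes, and pads windows that contain no data with 0. The approx_distinct function returns an approximate distinct count, which is much cheaper than an exact count(distinct remote_addr) on large log volumes.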
Query the number of Tomcat error requests and compare it with that of the previous hour
The innermost SQL statement counts the requests whose status code is greater than or equal to 400. The compare function in the middle SQL statement retrieves the count that was generated 3,600 seconds earlier. In the outermost SQL statement, c1 is the number of error requests at the current time, c2 is the number of error requests 3,600 seconds earlier, and c3 is the percentage change of c1 relative to c2. You can configure a donut chart to display the results. In the donut chart, c1 is the display value and c3 is the comparison value. For more information, see Configure a donut chart.
status >= 400 |
SELECT
diff [1] AS c1,
diff [2] AS c2,
round(diff [1] * 100.0 / diff [2] - 100.0, 2) AS c3
FROM
(
select
compare(c, 3600) AS diff
from
(
select
count(1) as c
from
log
)
)
Query the top 10 URIs in Tomcat requests
Aggregate and group data based on the request_uri field, calculate the number of PVs for each URI, sort the results by PV in descending order, and then obtain the top 10 records. You can configure a basic bar gauge to display the results. In the bar gauge, set the x-axis to page and the y-axis to pv. For more information, see Configure a basic bar gauge.
* |
SELECT
request_uri as page,
COUNT(*) as pv
GROUP by
page
ORDER by
pv DESC
LIMIT
10
Query the types and distribution of Tomcat clients
Aggregate and analyze data based on the user_agent field, and calculate the number of clients of each type. You can configure a donut chart to display the results. In the donut chart, set the type to user_agent and the value column to c. For more information, see Configure a donut chart.
* |
SELECT
user_agent,
COUNT(*) AS c
GROUP BY
user_agent
ORDER BY
c DESC
Collect statistics on the outbound Tomcat traffic
Use the time_series function to truncate time values and aggregate them to calculate the sum of the values of the body_bytes_sent field. You can configure the x-axis and the y-axis of a line chart to display the results. In the line chart, set the x-axis to time and the y-axis to body_sent. For more information, see Configure the x-axis and the y-axis of a line chart.
* |
select
time_series(__time__, '10s', '%H:%i:%S', '0') as time,
sum(body_bytes_sent) as body_sent
GROUP by
time
ORDER by
time
LIMIT
1000
Query the percentage of Tomcat error requests
In the inner SQL statement, count the error requests whose status code is greater than or equal to 400 and the total number of requests. In the outer SQL statement, calculate the ratio of the number of error requests to the total number of requests. You can configure a dial to display the results. For more information, see Configure a dial.
* |
select
round((errorCount * 100.0 / totalCount), 2) as errorRatio
from
(
select
sum(
case
when status >= 400 then 1
else 0
end
) as errorCount,
count(1) as totalCount
from
log
)