This topic describes how to query and analyze website logs in the Log Service console.
Prerequisites
Step 1: Configure indexes
Step 2: Query logs
Search statement | Analytic statement
. A search statement can be executed alone. However, an analytic statement must be
executed together with a search statement. The analysis feature is based on search
results or all data in the Logstore.
- To query the log entries that contain Chrome, execute the following search statement:
Chrome
- To query the log entries whose request duration is greater than 60 seconds, execute
the following search statement:
request_time > 60
- To query the log entries whose request duration ranges from 60 seconds to 120 seconds,
execute the following search statement:
request_time in [60 120]
- To query the log entries that contain successful GET requests (status code: 200 to
299), execute the following search statement:
request_method : GET and status in [200 299]
- To query the log entries whose value of the request_uri field is /request/path-2, execute the following search statement:
request_uri:/request/path-2/file-2
Step 3: Analyze logs
- Calculate the page views (PVs) of a website.
To calculate the PVs of a website, use the COUNT function in a query statement.
* | SELECT COUNT(*) AS PV
- Calculate the PVs of a website by 1 minute.
Use the date_trunc function to truncate a time by minute and use the GROUP BY clause to group analysis results by time. Then, use the COUNT function to calculate the number of PVs per minute and use the ORDER BY clause to sort the analysis results by time.
* | SELECT COUNT(*) as PV, date_trunc('minute', __time__) as time GROUP BY time ORDER BY time
- Calculate the number of requests for each request method by 5 minutes.
Use __time__ - __time__ %300 to truncate a time by 5 minutes and use the GROUP BY clause to group analysis results by time. Then, use the COUNT function to calculate the number of requests every 5 minutes and use the ORDER BY clause to sort the analysis results by time.
* | SELECT request_method, COUNT(*) as count, __time__ - __time__ %300 as time GROUP BY time, request_method ORDER BY time
- Compare the number of PVs of the current week with the number of PVs of the last week.
Use the COUNT function to calculate the total number of PVs. Then, use the ts_compare function to obtain the ratio of the PVs of the current week to the PVs of the last week. website_log in the following query statement is the Logstore name:
* | SELECT diff[1] as this_week, diff[2] as last_week, time FROM (SELECT ts_compare(pv, 604800) as diff, time FROM (SELECT COUNT(*) as pv, date_trunc('week', __time__) as time FROM website_log GROUP BY time ORDER BY time) GROUP BY time)
- Collect the distribution statistics of client IP addresses.
Use the ip_to_province function to obtain the province to which an IP address belongs, and use the GROUP BY clause to group analysis results by province. Then, use the COUNT function to calculate the number of occurrences of each IP address, and use the ORDER BY clause to sort the analysis results by the number of occurrences.
* | SELECT COUNT(*) as count, ip_to_province(client_ip) as address GROUP BY address ORDER BY count DESC
- Calculate the top 10 accessed request URIs.
Use the GROUP BY clause to group analysis results by request URI. Use the COUNT function to calculate the number of access requests for each URI. Then, use the ORDER BY clause to sort the analysis results by the number of access requests.
* | SELECT COUNT(*) as PV, request_uri as PATH GROUP BY PATH ORDER BY PV DESC LIMIT 10
- Query the log entries whose value of the request_uri field ends with %file-7.
Note In query statements, the wildcard characters asterisk (*) and question mark (?) are used for fuzzy searches. The wildcard characters must be used in the middle or at the end of a word. If you want to query fields that end with a specific character, you can use the LIKE operator in an analytic statement.
* | select * from website_log where request_uri like '%file-7'
website_log in the preceding query statement is the Logstore name.
- Calculate the statistics of request URIs that are accessed.
Use the regexp_extract function to extract the file part from the request_uri field. Then, use the COUNT function to calculate the number of access requests for each URI.
* | SELECT regexp_extract(request_uri, '.*\/(file.*)', 1) file, count(*) as count group by file
- Query the log entries whose value of the request_uri field contains %abc%.
* | SELECT * where request_uri like '%/%abc/%%' escape '/'
Sample logs
__tag__:__client_ip__:192.0.2.0
__tag__:__receive_time__:1609985755
__source__:198.51.100.0
__topic__:website_access_log
body_bytes_sent:4512
client_ip:198.51.100.10
host:example.com
http_host:example.com
http_user_agent:Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10_5_8; ja-jp) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27
http_x_forwarded_for:198.51.100.1
instance_id:i-02
instance_name:instance-01
network_type:vlan
owner_id:%abc%-01
referer:example.com
region:cn-shanghai
remote_addr:203.0.113.0
remote_user:neb
request_length:4103
request_method:POST
request_time:69
request_uri:/request/path-1/file-0
scheme:https
server_protocol:HTTP/2.0
slbid:slb-02
status:200
time_local:07/Jan/2021:02:15:53
upstream_addr:203.0.113.10
upstream_response_time:43
upstream_status:200
user_agent:Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.33 (KHTML, like Gecko) Ubuntu/9.10 Chromium/13.0.752.0 Chrome/13.0.752.0 Safari/534.33
vip_addr:192.0.2.2
vpc_id:3db327b1****82df19818a72