This topic describes how to parse normal and abnormal CSV format logs.
Normal CSV format logs
Raw log
_program_:access
_severity_:6
_priority_:14
_facility_:1
topic:syslog-forwarder
content:10.64.10.20|10/Jun/2019:11:32:16 +0800|m.zf.cn|GET /css/mip-base.css HTTP/1.1|200|0.077|6404|10.11.186.82:8001|200|0.060|https://yz.m.sm.cn/s?q=%25%24%23%40%21&from=wy878378&uc_param_str=dnntnwvepffrgibijbprsvdsei|-|Mozilla/5.0 (Linux; Android 9; HWI-AL00 Build/HUAWEIHWI-A00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Mobile Safari/537.36|-|-
Requirements
If the value of the _program_ field is access, parse the content field as pipe-separated values (PSV), and then delete the content field.
Split the request field into the request_method, request, and http_version fields.
Apply URL decoding to the http_referer field.
Solution
If the value of the _program_ field is access, use the parse-csv instruction to parse the content field, and then delete the original content field. The statement is as follows:
* | where _program_='access' | parse-csv -delim='|' content as remote_addr,time_local,host,request,status,request_time,body_bytes_sent,upstream_addr,upstream_status,upstream_response_time,http_referer,http_x_forwarded_for,http_user_agent,session_id,guid | project-away content
Returned log:
__source__: 1.2.3.4
__tag__:__client_ip__: 2.3.X.X
__tag__:__receive_time__: 1562845168
__topic__:
_facility_: 1
_priority_: 14
_program_: access
_severity_: 6
body_bytes_sent: 6404
guid: -
host: m.zf.cn
http_referer: https://yz.m.sm.cn/s?q=%25%24%23%40%21&from=wy878378&uc_param_str=dnntnwvepffrgibijbprsvdsei
http_user_agent: Mozilla/5.0 (Linux; Android 9; HWI-AL00 Build/HUAWEIHWI-A00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Mobile Safari/537.36
http_x_forwarded_for: -
remote_addr: 10.64.10.20
request: GET /css/mip-base.css HTTP/1.1
request_time: 0.077
session_id: -
status: 200
time_local: 10/Jun/2019:11:32:16 +0800
topic: syslog-forwarder
upstream_addr: 10.11.186.82:8001
upstream_response_time: 0.060
upstream_status: 200
Use the parse-regexp instruction to split the request field into the request_method, request, and http_version fields.
* | parse-regexp request, '(\S+)' as request_method | parse-regexp request, '\S+\s+\S+\s+(\S+)' as http_version | parse-regexp request, '\S+\s+(\S+)' as request
Returned log:
request: /css/mip-base.css
request_method: GET
http_version: HTTP/1.1
URL-decode the http_referer field.
* | extend http_referer=url_decode(http_referer)
Returned log:
http_referer: https://yz.m.sm.cn/s?q=%$#@!&from=wy878378&uc_param_str=dnntnwvepffrgibijbprsvdsei
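The three steps above can be sketched outside Log Service to check the expected field values. The following Python snippet is only an illustration of the same logic, not SPL: the function name parse_access_log and the dict-based log representation are assumptions made for this example.

```python
import re
from typing import Optional
from urllib.parse import unquote

# Field names in the same order as the parse-csv statement above.
FIELDS = [
    "remote_addr", "time_local", "host", "request", "status",
    "request_time", "body_bytes_sent", "upstream_addr", "upstream_status",
    "upstream_response_time", "http_referer", "http_x_forwarded_for",
    "http_user_agent", "session_id", "guid",
]


def parse_access_log(log: dict) -> Optional[dict]:
    """Hypothetical helper that mirrors the three steps on one log entry."""
    # Like the where clause: skip entries whose _program_ is not "access".
    if log.get("_program_") != "access":
        return None
    # Step 1: split content on "|" and drop the original content field.
    out = {k: v for k, v in log.items() if k != "content"}
    out.update(zip(FIELDS, log["content"].split("|")))
    # Step 2: split request into method, path, and HTTP version.
    match = re.match(r"(\S+)\s+(\S+)\s+(\S+)", out["request"])
    if match:
        out["request_method"], out["request"], out["http_version"] = match.groups()
    # Step 3: URL-decode the referer.
    out["http_referer"] = unquote(out["http_referer"])
    return out


sample = {
    "_program_": "access",
    "content": (
        "10.64.10.20|10/Jun/2019:11:32:16 +0800|m.zf.cn|"
        "GET /css/mip-base.css HTTP/1.1|200|0.077|6404|10.11.186.82:8001|200|0.060|"
        "https://yz.m.sm.cn/s?q=%25%24%23%40%21&from=wy878378"
        "&uc_param_str=dnntnwvepffrgibijbprsvdsei|-|"
        "Mozilla/5.0 (Linux; Android 9; HWI-AL00 Build/HUAWEIHWI-A00) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Mobile Safari/537.36|-|-"
    ),
}
parsed = parse_access_log(sample)
print(parsed["request_method"], parsed["request"], parsed["http_version"])
print(parsed["http_referer"])
```

A plain split on the delimiter is sufficient here only because no field in this log contains a pipe character.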
Abnormal CSV format logs
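In the following log, the request value (GET /alyun/htsw/?ad=5|8|6|11| HTTP/1.1) itself contains the pipe delimiter, so splitting the content field on pipes alone would misalign the fields.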
Raw log
__source__: 1.2.3.4
__tag__:__client_ip__: 2.3.X.X
__tag__:__receive_time__: 1562840879
__topic__:
content: 101.132.xx.xx|07/Aug/2019:11:10:37 +0800|www.123.com|GET /alyun/htsw/?ad=5|8|6|11| HTTP/1.1|200|6.729|14559|1.2.3.4:8001|200|6.716|-|-|Mozilla/5.0 (Linux; Android 4.1.1; Nexus 7 Build/JRO03D))||
Requirement
Parse the content field.
Solution
In the content field, replace GET /alyun/htsw/?ad=5|8|6|11| HTTP/1.1 with "GET /alyun/htsw/?ad=5|8|6|11| HTTP/1.1". Then, use the parse-csv instruction and set the quote parameter so that the quoted value is parsed as a single field. Finally, delete the original content field.
* | extend content=replace(content,'GET /alyun/htsw/?ad=5|8|6|11| HTTP/1.1','"GET /alyun/htsw/?ad=5|8|6|11| HTTP/1.1"') | parse-csv -delim='|' -quote='"' content as remote_addr,time_local,host,request,status,request_time,body_bytes_sent,upstream_addr,upstream_status,upstream_response_time,http_referer,http_x_forwarded_for,http_user_agent,session_id,guid | project-away content
Returned log:
__source__: 1.2.3.4
__tag__:__client_ip__: 2.3.X.X
__tag__:__receive_time__: 1562840879
__topic__:
body_bytes_sent: 14559
host: www.123.com
http_referer: -
http_user_agent: Mozilla/5.0 (Linux; Android 4.1.1; Nexus 7 Build/JRO03D))
http_x_forwarded_for: -
remote_addr: 101.132.xx.xx
request: GET /alyun/htsw/?ad=5|8|6|11| HTTP/1.1
request_time: 6.729
status: 200
time_local: 07/Aug/2019:11:10:37 +0800
upstream_addr: 1.2.3.4:8001
upstream_response_time: 6.716
upstream_status: 200
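The quote-then-parse approach can be tried locally as well. The sketch below uses Python's csv module as a stand-in for parse-csv with the quote parameter; it is not SPL, and the helper name parse_quoted is an assumption for this example. As in the statement above, the problematic request string must be known in advance.

```python
import csv

# Field names in the same order as the parse-csv statement above.
FIELDS = [
    "remote_addr", "time_local", "host", "request", "status",
    "request_time", "body_bytes_sent", "upstream_addr", "upstream_status",
    "upstream_response_time", "http_referer", "http_x_forwarded_for",
    "http_user_agent", "session_id", "guid",
]

content = (
    "101.132.xx.xx|07/Aug/2019:11:10:37 +0800|www.123.com|"
    "GET /alyun/htsw/?ad=5|8|6|11| HTTP/1.1|200|6.729|14559|1.2.3.4:8001|200|6.716|"
    "-|-|Mozilla/5.0 (Linux; Android 4.1.1; Nexus 7 Build/JRO03D))||"
)
request = "GET /alyun/htsw/?ad=5|8|6|11| HTTP/1.1"


def parse_quoted(raw: str, literal: str) -> dict:
    """Hypothetical helper: quote one known value, then parse as PSV."""
    # Wrap the known value in double quotes so its pipes are not delimiters.
    quoted = raw.replace(literal, '"' + literal + '"')
    # Parse with "|" as the delimiter and '"' as the quote character.
    row = next(csv.reader([quoted], delimiter="|", quotechar='"'))
    return dict(zip(FIELDS, row))


naive_parts = content.split("|")        # more parts than there are fields
fields = parse_quoted(content, request)  # one value per field name
print(len(naive_parts), len(fields))
print(fields["request"])
```

Comparing the two lengths shows why the quoting step is needed: the naive split yields more parts than there are field names, while the quoted parse yields exactly one value per field.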