Data preparation

Last Updated: Mar 27, 2018

This example uses a real dataset: the HTTP access logs of CoolShell.com.

The data consists of the access logs captured on February 12, 2014 (attachment: coolshell_20140212.log; columns are separated by the backtick character "`").

The data format is as follows:

$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" [unknown_content];

Field name          Description
$remote_addr        IP address of the client that sent the request
$remote_user        Logon name of the client
$time_local         Local time of the server
$request            The request: HTTP request type, request URL, and HTTP protocol version
$status             Status code returned by the server
$body_bytes_sent    Number of bytes returned to the client (excluding the header)
$http_referer       Source URL of the request
$http_user_agent    Information about the client that sent the request, such as the browser
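
To make the field layout concrete, the following is a minimal Python sketch that parses one line written in the log format above into named fields. The regular expression, the function name parse_log_line, and the sample line are assumptions made for illustration; the sample line is hypothetical and is not taken from the dataset, and the trailing [unknown_content] part is simply ignored.

```python
import re

# Regex mirroring the log format described above.
# Assumption: anything after the user agent ([unknown_content]) is ignored.
LOG_PATTERN = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) '
    r'\[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" '
    r'"(?P<http_user_agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields so they can be aggregated later.
    fields["status"] = int(fields["status"])
    fields["body_bytes_sent"] = int(fields["body_bytes_sent"])
    return fields


# Hypothetical sample line, not taken from the dataset.
sample = ('203.0.113.7 - - [12/Feb/2014:03:08:03 +0800] '
          '"GET /feed HTTP/1.1" 200 92446 "-" "ExampleAgent/1.0"')
record = parse_log_line(sample)
print(record["request"], record["status"], record["body_bytes_sent"])
```

Returning None for lines that do not match lets a caller count or skip malformed records instead of failing on the first one.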

A sample record looks like this:

03:08:03`GET /feed HTTP/1.1`200`92446Mozilla/5.0 (Linux; Android 4.4.2; Nexus 4 Build/KOT49H) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/ Mobile Safari/537.36
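
As a rough illustration of how the backtick separator works, the sketch below splits the leading columns of the sample record above and then breaks the request into its three parts. It uses only the first four values of that record; reading them as time, request, status, and bytes sent is an assumption based on their appearance, and the helper name split_request is hypothetical.

```python
SEPARATOR = "`"  # column separator used by the attachment


def split_request(request):
    """Split a $request value into (method, url, protocol)."""
    method, rest = request.split(" ", 1)
    # Split from the right so a URL containing spaces stays intact.
    url, protocol = rest.rsplit(" ", 1)
    return method, url, protocol


# Leading columns of the sample record shown above.
# Assumed meaning: time, request, status, bytes sent.
fragment = "03:08:03`GET /feed HTTP/1.1`200`92446"
time_part, request, status, body_bytes = fragment.split(SEPARATOR)
method, url, protocol = split_request(request)
print(method, url, protocol, int(status), int(body_bytes))
# prints: GET /feed HTTP/1.1 200 92446
```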

