Analyze the following items based on this network log:
Collect Page View (PV) and Unique Visitor (UV) statistics for the website by user device type (such as Android, iPad, iPhone, and PC), and generate a daily statistical report.
Identify the access sources of the website to learn where its traffic comes from.
[Description] Website statistical indicators:
PV and UV are two basic indicators for measuring website traffic. PV counts page views: each time a web page is opened counts as one PV, and repeated views of the same page are counted each time. UV is the number of unique visitors to the website per day: a visitor who accesses the website multiple times in one day is counted only once.
Referer identifies the source of a request, that is, the page from which the visitor arrived. It is a critical indicator for evaluating website advertising: analyzing Referer values shows the complete access sources, the visitors, and their preferences.
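The definitions above can be illustrated with a small Python sketch. It is not part of the ODPS procedure described in this tutorial. The record layout (day, visitor ID, device type, referer), the sample values, and the function names are all hypothetical, and it assumes each log line has already been parsed and that a visitor can be identified by some ID.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical, already-parsed records: (day, visitor_id, device, referer).
# None of these values come from the tutorial's log data.
records = [
    ("2014-02-12", "u1", "android", "http://www.example.com/a"),
    ("2014-02-12", "u1", "android", "http://www.example.com/b"),
    ("2014-02-12", "u2", "iphone", "http://search.example.org/?q=x"),
    ("2014-02-13", "u1", "pc", "-"),
]


def pv_uv_by_device(rows):
    """Return {(day, device): (pv, uv)}."""
    pv = defaultdict(int)
    visitors = defaultdict(set)
    for day, visitor, device, _referer in rows:
        key = (day, device)
        pv[key] += 1                 # every page view counts toward PV
        visitors[key].add(visitor)   # a set keeps each visitor once, for UV
    return {key: (pv[key], len(visitors[key])) for key in pv}


def referer_hosts(rows):
    """Count requests per referer host; an empty host is treated as direct access."""
    counts = defaultdict(int)
    for _day, _visitor, _device, referer in rows:
        host = urlparse(referer).netloc or "(direct)"
        counts[host] += 1
    return dict(counts)
```

Here the visitor u1 opens two pages on Android on the same day, which yields a PV of 2 but a UV of 1 for that day and device.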
The following describes the procedure to meet these two requirements:
1) Import the log data into an ODPS table. From a data warehouse perspective, this table belongs to the ODS layer, so it is named ods_log_tacker.
2) Process the data. As explained in the data description section, the $request field of the log data consists of “HTTP request type + request URL + HTTP protocol version number”. Because subsequent analysis usually queries and aggregates on the request type (for example, GET) and the URL separately, the request field of the original table needs to be split. Write the split data into the table dw_log_parser (a table at the data warehouse layer).
4) In a data warehouse, dimension tables and fact tables are normally created to support in-depth data analysis. In this tutorial, the user dimension table dim_user_info and the website access fact table dw_log_fact are created.
5) Based on the dimension table and fact table at the data warehouse layer, generate the PV and UV statistical table by user device (adm_user_measures, at the application data mart layer) and the statistical table of request source URLs (adm_refer_info, at the application data mart layer).
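The split of the $request field described in step 2 can be sketched in Python as follows. This only illustrates the splitting logic and is not how the step is implemented in ODPS. The function name split_request and the returned keys are hypothetical. It assumes the field has the form "request type, space, URL, space, protocol version" and returns None for anything else.

```python
def split_request(request):
    """Split a '$request' value into request type, URL, and protocol version.

    Returns None when the value does not contain all three parts.
    """
    # The request type is everything before the first space.
    method, _, rest = request.partition(" ")
    # The protocol version is everything after the last space, so a URL
    # that itself contains spaces stays in one piece.
    url, _, protocol = rest.rpartition(" ")
    if not (method and url and protocol):
        return None
    return {"method": method, "url": url, "protocol": protocol}
```

For example, split_request("GET /index.html HTTP/1.1") returns the three parts as separate values, which can then be filtered on the request type or grouped by URL.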