Dynamic Route for CDN (DCDN) log analysis is a log data analysis and query service
that integrates Object Storage Service (OSS) and DCDN. You can enable offline log
delivery in the DCDN console to automatically deliver DCDN logs to your OSS buckets
for long-term management. Then, you can use the serverless SQL engine of Data Lake
Analytics (DLA) to analyze the log data. This topic describes how to use the serverless
SQL engine of DLA to analyze and query DCDN log data.
Prerequisites
Before you analyze DCDN log data, make sure that you have performed the following
operations:
Background information
After you enable offline log delivery in the DCDN console, DLA in the region where
DCDN offline log delivery is enabled is automatically activated and DLA automatically
creates a table to store the delivered logs.
Procedure
This example assumes that DCDN logs are stored in the China (Shanghai) region where
DCDN offline log delivery is enabled.
- Log on to the DLA console.
- Select China (Shanghai) from the region drop-down list in the upper-left corner, as
shown in the following figure.
- In the left-side navigation pane, choose .
- On the Execute page, execute an SQL statement to query and analyze DCDN log data. Examples:
- Top 10 URL path access errors
// Query the HTTP status codes that are greater than 400. Obtain access errors that correspond to each URL path based on the URL paths contained in the requests, excluding aggregated URL paths. Then, sort the errors in descending order of the error quantity to obtain top 10 errors.
SELECT urlPath,count(*) as c from alicdn_offline_log.alicdn_offline_log_dcdn WHERE httpcode > 400 GROUP BY urlPath ORDER BY c DESC LIMIT 10;
- Access source percentage
// Use the domain name filter condition to filter out the logs for which the access source is empty. Aggregate the number of requests from each access source. Then, calculate the percentage of requests for a single domain to all requests. Retain two decimal places for the percentage.
select
refer_domain as "Access source",
c as "Requests",
round(c * 100.0 /(sum(c) over()), 2) as "Percentage (%)"
from
(
select
domain as refer_domain,
count(*) as c
from
alicdn_offline_log.alicdn_offline_log_dcdn
group by
refer_domain
order by
c desc
limit
100
)
where refer_domain ! = "";
- Top domains based on access times
// Use GROUP BY to group access times by domain name, use the COUNT (*) function to calculate the number of times of accesses to each domain, use ORDER BY to sort the domains based on the access times, and use DESC to sort the domains in descending order. In addition, use Limit 100 to return the top 100 domains, use a pie chart for graphical presentation, use return_code for classification, and set the value column to c.
select
Domain,
count(*) as c
FROM
alicdn_offline_log.alicdn_offline_log_dcdn
group by
Domain
order by
c desc
limit
100
- Distribution of offline logs of different data types
// Use the SUM function to calculate the total value of respsize, and use the calculated value as sum_res. Use GROUP BY to group offline logs of different data types by content_type, use ORDER BY to sort offline logs of different data types by sum_res. Then, obtain top 10 data types, use a pie chart for graphical presentation, use content_type for classification, and set the value column to sum_res.
select
contenttype,
sum(respsize) as sum_res
FROM alicdn_offline_log.alicdn_offline_log_dcdn
group by
contenttype
order by
sum_res desc
limit
10;