Dynamic Route for CDN (DCDN) log analysis is a log data analysis and query service that integrates Object Storage Service (OSS) and DCDN. You can enable offline log delivery in the DCDN console to automatically deliver DCDN logs to your OSS buckets for long-term management. Then, you can use the serverless SQL engine of Data Lake Analytics (DLA) to analyze the log data. This topic describes how to use the serverless SQL engine of DLA to analyze and query DCDN log data.
Before you analyze DCDN log data, make sure that you have performed the following operations:
After you enable offline log delivery in the DCDN console, DLA in the region where DCDN offline log delivery is enabled is automatically activated and DLA automatically creates a table to store the delivered logs.
This example assumes that DCDN logs are stored in the China (Shanghai) region where DCDN offline log delivery is enabled.
- Log on to the DLA console.
- Select China (Shanghai) from the region drop-down list in the upper-left corner, as shown in the following figure.
- In the left-side navigation pane, choose .
- On the Execute page, execute an SQL statement to query and analyze DCDN log data. Examples:
- Top 10 URL path access errors
// Query the HTTP status codes that are greater than 400. Obtain access errors that correspond to each URL path based on the URL paths contained in the requests, excluding aggregated URL paths. Then, sort the errors in descending order of the error quantity to obtain top 10 errors. SELECT urlPath,count(*) as c from alicdn_offline_log.alicdn_offline_log_dcdn WHERE httpcode > 400 GROUP BY urlPath ORDER BY c DESC LIMIT 10;
- Access source percentage
// Use the domain name filter condition to filter out the logs for which the access source is empty. Aggregate the number of requests from each access source. Then, calculate the percentage of requests for a single domain to all requests. Retain two decimal places for the percentage. select refer_domain as "Access source", c as "Requests", round(c * 100.0 /(sum(c) over()), 2) as "Percentage (%)" from ( select domain as refer_domain, count(*) as c from alicdn_offline_log.alicdn_offline_log_dcdn group by refer_domain order by c desc limit 100 ) where refer_domain ! = "";
- Top domains based on access times
// Use GROUP BY to group access times by domain name, use the COUNT (*) function to calculate the number of times of accesses to each domain, use ORDER BY to sort the domains based on the access times, and use DESC to sort the domains in descending order. In addition, use Limit 100 to return the top 100 domains, use a pie chart for graphical presentation, use return_code for classification, and set the value column to c. select Domain, count(*) as c FROM alicdn_offline_log.alicdn_offline_log_dcdn group by Domain order by c desc limit 100
- Distribution of offline logs of different data types
// Use the SUM function to calculate the total value of respsize, and use the calculated value as sum_res. Use GROUP BY to group offline logs of different data types by content_type, use ORDER BY to sort offline logs of different data types by sum_res. Then, obtain top 10 data types, use a pie chart for graphical presentation, use content_type for classification, and set the value column to sum_res. select contenttype, sum(respsize) as sum_res FROM alicdn_offline_log.alicdn_offline_log_dcdn group by contenttype order by sum_res desc limit 10;
- Top 10 URL path access errors