Troubleshooting: CDN page optimization does not take effect

Alibaba Cloud CDN provides a page optimization function. When it is enabled, CDN automatically removes redundant comments and repeated whitespace from HTML pages, reducing file size and speeding up delivery. The problem in this case is that the page optimization function was enabled according to the documentation, but the HTML pages served showed no optimization effect.

During troubleshooting and testing, it was found that the page was optimized when the response was fetched directly with curl, but not when the HTML page was accessed in a browser. Further comparison of the curl request with the browser request in the Network panel showed that the browser request header contains Accept-Encoding: gzip, deflate and the response header returns Content-Encoding: gzip, as shown in the following figure.

GZIP encoding over HTTP is a technique used to improve the performance of web applications, and high-traffic websites often use GZIP compression to make pages feel faster for users. A request header containing Accept-Encoding: gzip indicates that the browser can decode gzip-compressed files. If the server supports gzip compression, then after receiving this request header it returns a gzip-compressed body and marks it with Content-Encoding: gzip in the response header. Running the following curl command with the Accept-Encoding request header returns Content-Encoding: gzip, and the page is not optimized.

curl -I 'http://<test URL>' -H 'Accept-Encoding: gzip'

To sum up: when the response contains Content-Encoding: gzip, CDN page optimization does not take effect; when the response does not contain Content-Encoding: gzip, page optimization takes effect.

Because the request chain is Client -> CDN -> origin server, the Gzip compression may come from either CDN or the origin server. This is easy to verify: use curl to send the request directly to the origin IP address.

curl -I 'http://<test URL>' -H 'Accept-Encoding: gzip' -x '<origin IP>:80'

The test shows that the response contains Content-Encoding: gzip, indicating that the compression comes from the origin server. The cause of the problem now starts to emerge; the whole process is as follows:

When a user initiates a request in a browser, the browser carries the Accept-Encoding: gzip request header by default. CDN, acting as a proxy server, forwards the request header from the real client (the browser) to the origin server when it fetches the content back to origin. Because Gzip compression is enabled on the origin server, the origin returns Gzip-compressed content to CDN after receiving this request header. CDN has no Gunzip capability, so it cannot perform page optimization on the Gzip-compressed content, and the page optimization function therefore does not take effect.

The above basically locates the problem, but to make it clearer, I built an Nginx-based web server as the CDN origin and enabled Gzip compression in the nginx.conf configuration file (a sketch of the relevant configuration follows this paragraph). However, the test showed that the page was still optimized when accessed through CDN with the Accept-Encoding: gzip request header, which was strange. To locate the problem, I captured packets directly on the web server and looked at the exchange between CDN and the web server, and found a curious phenomenon: when CDN requests the web server, it forwards Accept-Encoding: gzip, but the web server does not respond with Content-Encoding: gzip. The packets are shown in the following figure:
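For reference, the gzip section of the nginx.conf used in this test is not reproduced in the article; a minimal sketch, assuming typical values for everything except the gzip_proxied line (which is quoted later in this article), might look like this:

# Sketch of the assumed gzip section in nginx.conf; only the gzip_proxied
# values come from this article, the other directives are common defaults.
gzip              on;
gzip_comp_level   5;    # hypothetical compression level
gzip_min_length   256;  # hypothetical minimum response size to compress
gzip_proxied      expired no-cache no-store private auth;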

Based on this phenomenon, I checked the description of the ngx_http_gzip_module module on the official Nginx website; the module has the following configuration directives.

One of these directives, gzip_proxied, draws attention. Its description can be summarized as follows:

Syntax: gzip_proxied off | expired | no-cache | no-store | private | no_last_modified | no_etag | auth | any ...;
Default: gzip_proxied off;
Context: http, server, location

The directive takes effect when Nginx acts as a reverse proxy: it enables or disables compression of responses returned by the backend server, depending on the request and on the headers the backend returns. The parameters are:

off - disables compression for all requests coming from a proxy server
expired - enables compression if the response header contains the Expires field
no-cache - enables compression if the response header contains Cache-Control: no-cache
no-store - enables compression if the response header contains Cache-Control: no-store
private - enables compression if the response header contains Cache-Control: private
no_last_modified - enables compression if the response header does not contain the Last-Modified field
no_etag - enables compression if the response header does not contain the ETag field
auth - enables compression if the request header contains the Authorization field
any - enables compression unconditionally; compressed content is returned for any request coming from a proxy server

Because gzip_proxied expired no-cache no-store private auth is configured in this Nginx setup, the gzip_proxied behavior is in effect. When the web server sees a request coming from a proxy server (here, the request from CDN), it checks the gzip_proxied conditions; since the response carries no Expires, Cache-Control: no-cache, or other listed headers, the server returns the data without Gzip compression. If the Gzip module is instead configured as follows, the server returns Gzip-compressed content for any request from a proxy server:

gzip_proxied any;

Then the question arises: how does the server determine that a request comes from a proxy server rather than from a real client? This involves the HTTP Via header (see the HTTP documentation for details). Via is a general header added by proxy servers; it applies to both forward and reverse proxies and can appear in request and response headers. It can be used to track message forwarding, avoid request loops, and identify the protocol capabilities of senders along the request/response chain. Here, when CDN requests the origin server as a proxy, the request header contains a Via header (as shown in the screenshot above), and the server knows from this Via header that the request comes from an upstream proxy server.
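To see this mechanism directly, the earlier origin test can be repeated with and without a Via header; the header value below is made up for illustration, the placeholders are the same as before, and the expected results assume a gzip_proxied configuration like the sketch above:

# No Via header: the request looks like it comes from a real client,
# so the origin compresses and the response contains Content-Encoding: gzip.
curl -I 'http://<test URL>' -H 'Accept-Encoding: gzip' -x '<origin IP>:80'

# With a Via header the request is treated as proxied; because the response
# carries none of the headers listed in gzip_proxied, compression is skipped.
curl -I 'http://<test URL>' -H 'Accept-Encoding: gzip' -H 'Via: 1.1 example-cdn' -x '<origin IP>:80'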

The problem for the HTTP server is knowing whether the proxy itself can handle a compressed response. The Accept-Encoding: gzip header in the incoming request was most likely set by the original client, but it says nothing about the capability of the proxies or gateways the request passes through; in other words, the server does not know whether the upstream proxy server can handle Gzip-compressed content. In this scenario it is therefore reasonable for the server to take the safest option and not compress the response it sends back. For more information about the impact of the Via header on Gzip compression, see this Akamai article.

Since CDN page optimization does not take effect when the origin responds with Gzip-compressed content, can the origin simply respond with uncompressed content? In that case, however, the client receives uncompressed files, which consumes client bandwidth and hurts page access performance. In addition to page optimization, CDN also provides a Gzip (smart compression) function. Can we then do both page optimization and Gzip compression at the CDN level?

The test shows that when page optimization is enabled on CDN, and regardless of whether Gzip compression or Brotli compression is enabled, page optimization does not take effect as long as the request header contains Accept-Encoding: gzip, deflate, br. In other words, at the CDN level smart compression has a higher priority, and content that has already been compressed can no longer be page-optimized, so this approach is not feasible for now. This part of the strategy needs to be improved at the product level.
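One way to observe this from the client side is to compare a request that advertises compression support with one that does not; this assumes page optimization and smart compression are both enabled on the CDN domain, and <test URL> is the same placeholder as above:

# Advertising compression: the CDN responds with Content-Encoding (gzip or br)
# and the body is not page-optimized.
curl -I 'http://<test URL>' -H 'Accept-Encoding: gzip, deflate, br'

# Without Accept-Encoding: the body comes back uncompressed and the
# comment/whitespace removal done by page optimization is visible in the HTML.
curl -s 'http://<test URL>' | head -n 20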

To sum up, the problem can be summarized as follows:

(1) If the origin responds with Gzip-compressed content, CDN page optimization does not take effect because CDN cannot Gunzip the content.
(2) If you want to use both smart compression and page optimization on CDN, smart compression has a higher priority at the CDN level, so page optimization does not take effect.
(3) If you want to use CDN page optimization, make sure the origin server does not return Gzip-compressed content to CDN. If the origin server is Nginx, adjust the gzip_proxied parameter of the ngx_http_gzip_module module in the Nginx configuration so that requests from proxy servers do not receive Gzip-compressed content (see the sketch below).
(4) Alternatively, you can configure CDN to delete the Accept-Encoding request header at the CDN level.
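For point (3), a minimal sketch of the corresponding Nginx-side change is shown below; it assumes no other proxy in front of the origin relies on receiving compressed responses:

# Do not gzip responses to proxied requests (requests that carry a Via header);
# off is also the default value of this directive.
gzip_proxied off;

After reloading Nginx, repeating the earlier curl test with a Via header against the origin should no longer return Content-Encoding: gzip, and CDN page optimization should take effect.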
