CDN HTTPS optimization practice

Mainstream websites today rely on HTTPS (HTTP over TLS/SSL) for server authentication, data encryption, and integrity protection, and on CDNs for website performance, reliability, and security. Both HTTPS and CDN have become basic services that almost every commercial website needs.

However, HTTPS and CDN were designed and developed independently for a long time. HTTPS was originally designed as an end-to-end protocol, while a CDN works as a man in the middle. How the origin website delegates authority to the intermediate CDN vendor, how identity authentication, key exchange, and data protection are completed along the browser-CDN-origin path, and how that authorization is revoked: neither academia nor industry had considered these questions systematically.

Overall, HTTPS is the trend of the future. It will be applied more and more widely, and CDNs need to adapt accordingly.

Recently, at the Alibaba Yunqi Community 2017 Online Technology Summit, Rong Ke, an Alibaba Cloud CDN technology expert, explained the technical practice behind the CDN HTTPS red envelope. The talk starts with SSL/TLS and HTTP/2, focuses on the CDN HTTPS architecture and optimization practices, and closes with guidance on how to use HTTPS better. A summary of the talk follows.

Introduction to SSL/TLS and HTTP/2

HTTPS

HTTPS adds an SSL/TLS layer underneath HTTP. The structure of the TCP/IP protocol stack is shown in the figure: above the transport layer is the session layer, where SSL/TLS operates, and HTTP sits in the application layer. Without SSL/TLS, the data carried by HTTP is transmitted in clear text.

SSL stands for Secure Sockets Layer and TLS for Transport Layer Security. TLS is based on the SSL 3.0 specification and is its successor. SSL was not deployed at scale until version 3.0. The main difference between the SSL and TLS versions lies in the encryption algorithms they support. At the time of the talk (2017), the latest version in use was TLS 1.2, and TLS 1.3 was still in draft.

The figure shows the SSL/TLS handshake. After the TCP connection is established, the client sends the server its protocol version, the encryption algorithms it supports, and a random number it generated. The server chooses the final protocol version and encryption algorithm and sends them to the client together with the server certificate. The client verifies the certificate. Once verification passes, the two sides negotiate a key that is used to encrypt and decrypt the subsequent HTTPS data.
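
The client side of this handshake can be sketched with Python's standard `ssl` module. This is a minimal illustration, not the CDN's own code: the function names `make_client_context` and `handshake_info` are assumptions made for this example, and the TLS 1.2 floor reflects the versions discussed above.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context that verifies the server certificate."""
    ctx = ssl.create_default_context()            # loads the system CA certificates
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse SSL 3.0 and TLS 1.0/1.1
    return ctx


def handshake_info(host: str, port: int = 443) -> dict:
    """Connect, run the TLS handshake, and report what was negotiated."""
    ctx = make_client_context()
    with socket.create_connection((host, port), timeout=5) as tcp:
        # wrap_socket performs the handshake; server_hostname is sent as SNI
        with ctx.wrap_socket(tcp, server_hostname=host) as tls:
            return {
                "protocol": tls.version(),    # negotiated protocol version
                "cipher": tls.cipher()[0],    # negotiated cipher suite name
                "peer_subject": tls.getpeercert().get("subject"),
            }
```

Calling `handshake_info` with a real host name needs network access. The certificate check and the key negotiation described above both happen inside `wrap_socket`.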

HTTP/2 Protocol Features

HTTP/2 is another key protocol. It was officially published in 2015 to address the inefficiency and insecurity of earlier HTTP versions. It does not overturn HTTP but enhances it: it is a binary protocol that supports header compression, multiplexing, and server push. With server push, the server examines the client's request and proactively pushes some of the resources referenced by the requested page before the client asks for them, which improves transfer efficiency. More importantly, HTTP/2 also strengthens the security of the protocol.

HTTP/2 messages are converted to a binary format: a message is split into its header and its body, and each part is encapsulated in binary frames for transmission.
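
To make the binary framing concrete, the sketch below packs and parses the fixed 9-byte header that precedes every HTTP/2 frame (24-bit payload length, 8-bit type, 8-bit flags, one reserved bit, 31-bit stream identifier). The `FrameHeader` type and the helper names are assumptions made for this example.

```python
import struct
from typing import NamedTuple


class FrameHeader(NamedTuple):
    length: int      # payload length in bytes (24 bits)
    type: int        # frame type, e.g. 0x0 DATA, 0x1 HEADERS
    flags: int       # 8 bits of type-specific flags
    stream_id: int   # 31-bit stream identifier


def pack_frame_header(h: FrameHeader) -> bytes:
    # The 24-bit length is packed as one high byte plus two low bytes.
    return struct.pack(
        ">BHBBI",
        h.length >> 16, h.length & 0xFFFF,
        h.type, h.flags,
        h.stream_id & 0x7FFFFFFF,   # top bit is reserved and sent as 0
    )


def parse_frame_header(data: bytes) -> FrameHeader:
    if len(data) < 9:
        raise ValueError("an HTTP/2 frame header is 9 bytes")
    hi, lo, ftype, flags, sid = struct.unpack(">BHBBI", data[:9])
    return FrameHeader((hi << 16) | lo, ftype, flags, sid & 0x7FFFFFFF)
```

A HEADERS frame on stream 3 and the DATA frames that follow it share the same header layout. Only the type field and the payload differ.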

Header compression is a very important feature of HTTP/2. When the same client and server exchange many requests, some headers are identical across requests, so the same header would be transmitted again and again. HTTP/2 handles this by indexing header information: if a header has been sent before, the next transmission carries only its index number instead of a long string, which reduces the amount of data on the network and improves transfer efficiency. Header compression also has a cost, because both the client and the server must maintain the index table that maps each index value to its HTTP header. It spends more memory to save transmitted data, which can be seen as trading space for time. Given that memory keeps getting larger, improving transfer efficiency matters more.
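
The indexing idea can be illustrated with a toy table that both sides keep in sync. This is a deliberately simplified sketch, not the real HPACK wire format: it has no static table, no Huffman coding, and no eviction, and the class name `HeaderIndexTable` is an assumption made for this example.

```python
class HeaderIndexTable:
    """Toy stand-in for a header index table; not the real HPACK encoding."""

    def __init__(self):
        self._by_pair = {}    # (name, value) -> index
        self._by_index = []   # index -> (name, value)

    def _remember(self, pair):
        self._by_pair[pair] = len(self._by_index)
        self._by_index.append(pair)

    def encode(self, headers):
        out = []
        for pair in headers:
            idx = self._by_pair.get(pair)
            if idx is not None:
                out.append(idx)        # seen before: send only the index
            else:
                self._remember(pair)
                out.append(pair)       # first time: send the literal pair
        return out

    def decode(self, encoded):
        headers = []
        for item in encoded:
            if isinstance(item, int):
                headers.append(self._by_index[item])   # look the index up
            else:
                self._remember(item)                   # mirror the sender's table
                headers.append(item)
        return headers
```

On the second request with the same headers, the encoder emits only small integers. The table held on each side is the memory cost mentioned above.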

HTTP/2 multiplexing is shown in the figure. Earlier HTTP versions support at most keep-alive, which lets multiple HTTP requests share one TCP connection. With basic keep-alive, the next request can only be sent after the previous one has completed. Pipelining builds on this and allows several GET requests to be sent at once in the request direction, but neither is true multiplexing. HTTP/2 adds the concept of a stream on top of the TCP connection. Each stream carries a single HTTP request, multiple streams can be transmitted simultaneously over one TCP connection, and each stream has its own number. This gives true multiplexing.
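
The role of the stream number can be sketched as follows: frames from different streams arrive interleaved on one connection, and the receiver regroups them by stream number. The function name `demultiplex` and the `(stream_id, chunk)` tuples are assumptions made for this example; real frames carry the stream number in their frame header.

```python
from collections import defaultdict


def demultiplex(frames):
    """Group interleaved (stream_id, chunk) frames back into per-stream payloads."""
    streams = defaultdict(bytes)
    for stream_id, chunk in frames:
        streams[stream_id] += chunk   # order within a stream is preserved
    return dict(streams)


# Two requests share one connection; their frames are interleaved on the wire.
wire = [(1, b"GET /a"), (3, b"GET /b"), (1, b" part2"), (3, b" part2")]
payloads = demultiplex(wire)
```

Because each chunk is tagged with its stream number, a slow response on one stream does not force the other stream to wait, which is what keep-alive and pipelining could not offer.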

CDN HTTPS Architecture

Why HTTPS?

HTTPS effectively prevents website content from being tampered with or hijacked and strengthens website security. Several developments also push toward HTTPS: Chrome and Firefox will mark HTTP as insecure in the future, and already show an exclamation-mark warning when a page is accessed over HTTP; Apple ATS will require App Store apps to use HTTPS; HTTP/2 is already in use, and mainstream browsers support HTTP/2 only over TLS; Google gives HTTPS websites extra weight in search ranking; and US and British government websites are required to use HTTPS.

The most important part of the CDN HTTPS architecture is certificate management. A CDN node uses LVS as the layer-4 load balancer and Tengine as the layer-7 load balancer, with the self-developed Swift as the cache software. This architecture processes HTTP requests efficiently. To support HTTPS, there must first be a certificate management center that stores users' certificates and private keys. As an improvement, each node has a dynamic certificate loading function that loads certificates on demand. The access flow is as follows: when a client initiates an HTTPS connection, it is eventually handled by one Tengine instance. When Tengine receives the SSL handshake message, it extracts the SNI, that is, the domain name being accessed, dynamically fetches the certificate and private key for that domain, and then performs the SSL handshake. After the handshake completes, the subsequent HTTP requests are processed. Dynamically loaded certificates and private keys are kept only in memory and stored in obfuscated form to protect them.
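
The on-demand loading flow can be sketched with the SNI callback of Python's standard `ssl` module. This is only an illustration of the idea, not Tengine's implementation: `CertificateCache`, `make_sni_callback`, and the `loader` function that stands in for a fetch from the certificate management center are all hypothetical names.

```python
import ssl
from typing import Callable, Dict, Optional


class CertificateCache:
    """In-memory cache of per-domain TLS contexts, filled on first use."""

    def __init__(self, loader: Callable[[str], Optional[ssl.SSLContext]]):
        self._loader = loader   # hypothetical fetch from the certificate center
        self._contexts: Dict[str, ssl.SSLContext] = {}

    def get(self, domain: str) -> Optional[ssl.SSLContext]:
        ctx = self._contexts.get(domain)
        if ctx is None:
            ctx = self._loader(domain)          # load on demand
            if ctx is not None:
                self._contexts[domain] = ctx    # kept in memory only
        return ctx


def make_sni_callback(cache: CertificateCache):
    def on_sni(ssl_socket, server_name, initial_context):
        ctx = cache.get(server_name) if server_name else None
        if ctx is None:
            # Unknown domain: abort the handshake with a TLS alert.
            return ssl.ALERT_DESCRIPTION_UNRECOGNIZED_NAME
        ssl_socket.context = ctx   # continue with this domain's certificate
        return None
    return on_sni
```

A server would install the callback with `server_context.sni_callback = make_sni_callback(cache)`. The first handshake for a domain pays the load cost, and later handshakes hit the in-memory cache.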

Full link supports HTTPS

HTTPS can be supported across the entire link. In a CDN architecture with two levels of nodes, there are three TCP connections along the link: from the client to the L1 node, from L1 to L2, and from L2 back to the origin site. Each segment supports HTTPS. The first segment uses the user's own certificate. The second segment uses our certificate to keep the data encrypted in transit. The third segment requires the user's origin site to support HTTPS, so that the CDN's back-to-origin requests also use HTTPS.

No private key solution

More and more users care about the security of their certificates and private keys and want to keep the private key on their own private server instead of handing it to the CDN. For this case, the CDN offers a keyless (no private key) solution in which users build their own private key server, the KeyServer. During an HTTPS handshake, Tengine extracts the SNI from the handshake message, determines the requested domain name, and looks up that domain's configuration. If the user has configured a KeyServer for the domain, Tengine sends the data that must be decrypted or signed to the KeyServer, the KeyServer returns the result, and Tengine uses the result to complete the handshake. By moving the steps of the SSL handshake that need the private key onto the KeyServer, this solution keeps the private key on the customer's private server. The CDN has also implemented its own KeyServer program and plans to put together a complete solution for building a KeyServer cluster. Users who want to build their own KeyServer can get the solution and source code from the CDN.
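
The split of responsibilities can be sketched with textbook RSA on toy numbers. Everything here is an assumption made for illustration: the class names, the in-process call that stands in for a network request, and the tiny key (n = 3233, e = 17, d = 2753), which is not secure. Real TLS uses padded signatures and much larger keys.

```python
import hashlib


class KeyServer:
    """Runs on the customer's own server; the only place the private exponent lives."""

    def __init__(self, n: int, d: int):
        self._n, self._d = n, d

    def sign(self, digest_int: int) -> int:
        return pow(digest_int, self._d, self._n)   # textbook RSA private-key operation


class EdgeNode:
    """CDN node: knows only the public key and forwards signing requests."""

    def __init__(self, n: int, e: int, key_server: KeyServer):
        self.n, self.e, self._key_server = n, e, key_server

    def _digest(self, handshake_data: bytes) -> int:
        # Reduce the hash into the toy modulus so it fits the tiny key.
        full = int.from_bytes(hashlib.sha256(handshake_data).digest(), "big")
        return full % self.n

    def sign_handshake(self, handshake_data: bytes) -> int:
        # In a real deployment this would be a network call to the KeyServer.
        return self._key_server.sign(self._digest(handshake_data))

    def verify(self, handshake_data: bytes, signature: int) -> bool:
        return pow(signature, self.e, self.n) == self._digest(handshake_data)


key_server = KeyServer(n=3233, d=2753)            # private half stays with the customer
edge = EdgeNode(n=3233, e=17, key_server=key_server)
signature = edge.sign_handshake(b"handshake parameters")
```

The edge node never holds `d`. It only receives the result of the private-key operation, which is the property the keyless solution relies on.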

CDN HTTPS feature

CDN HTTPS features are as follows:

Dynamic certificates take effect quickly, across the entire network within 1 minute;
Support for SPDY and HTTP/2;
Rich configuration options that can be set dynamically;
Support for a user-operated KeyServer, enabling the keyless service;
Integration with Alibaba Cloud Certificate Center (CAS), where a free certificate can be applied for.

HTTPS Optimization Practices

HTTPS requires multiple handshake round trips, including several asymmetric encryption and decryption operations and certificate transmission, which does cost performance in practice. So is HTTPS necessarily slower?
The figure above does not mean that every HTTPS access gets faster. In some scenarios, however, we have optimized request handling so that response speed does not drop and in some cases even improves.

The optimization methods are as follows:

Reduce handshakes: use SSL Session ID/Session Ticket; TCP KeepAlive is also required;
HTTP/2: multiplexing and header compression effectively improve transfer efficiency;
Domain name consolidation: fewer SSL handshakes and better connection reuse;
Protocol stack optimization: tune the initial TCP window and fast retransmit;
Algorithm preference: ECDSA over RSA.
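
Why reducing handshakes pays off can be seen from a rough round-trip count. The sketch below assumes TLS 1.2 without extensions such as False Start or TCP Fast Open: a full handshake costs two round trips, a resumed one (Session ID or Session Ticket) costs one, and a reused keep-alive connection costs none. The 50 ms round-trip time is an assumed figure for illustration.

```python
TCP_HANDSHAKE_RTT = 1    # SYN / SYN-ACK before data can flow
TLS12_FULL_RTT = 2       # full TLS 1.2 handshake
TLS12_RESUMED_RTT = 1    # abbreviated handshake via Session ID / Session Ticket
HTTP_REQUEST_RTT = 1     # request out, first response byte back


def first_byte_ms(rtt_ms: float, new_connection: bool, resumed: bool) -> float:
    """Estimate time to first byte, counting network round trips only."""
    rtts = HTTP_REQUEST_RTT
    if new_connection:
        rtts += TCP_HANDSHAKE_RTT
        rtts += TLS12_RESUMED_RTT if resumed else TLS12_FULL_RTT
    return rtts * rtt_ms


full = first_byte_ms(50, new_connection=True, resumed=False)      # 4 round trips
resumed = first_byte_ms(50, new_connection=True, resumed=True)    # 3 round trips
reused = first_byte_ms(50, new_connection=False, resumed=False)   # 1 round trip
```

The estimate ignores CPU cost and certificate size, so it understates the benefit of session reuse. It still shows why connection reuse and domain consolidation matter most on high-latency links.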

As for handling the peak: the company's QPS was very large and exceeded anything handled before. How was it handled? The methods are as follows:

Preheat the cache system: load all content onto the first-level nodes to avoid going back to the origin during access;
The dispatching system also has work to do: predict the peak, collect statistics on hotspot regions and shift part of their load to adjacent non-hotspot regions, and distribute traffic in proportion to node capacity;
A peak this large may well exceed expectations, so rate limiting is also needed.
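
Proportional distribution combined with a rate limit can be sketched as below. This is a simplified model, not the actual dispatching system: the function name `allocate`, the node names, and the capacity figures are assumptions made for this example.

```python
def allocate(total_qps: float, capacities: dict) -> tuple:
    """Split traffic across nodes in proportion to capacity; shed what does not fit."""
    total_capacity = sum(capacities.values())
    if total_capacity <= 0:
        raise ValueError("at least one node must have positive capacity")
    admitted = min(total_qps, total_capacity)    # rate limit at total capacity
    shares = {
        node: admitted * cap / total_capacity    # proportional to node capacity
        for node, cap in capacities.items()
    }
    return shares, total_qps - admitted          # (per-node load, rejected QPS)


# Hypothetical nodes: "b" has three times the capacity of "a".
shares, rejected = allocate(900, {"a": 100, "b": 300})
```

When demand exceeds total capacity, every node runs exactly at its limit and the excess is rejected up front instead of overloading the nodes.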
How to better use HTTPS?

Certificate

Apply for a certificate according to your domain names: single-domain, multi-domain, or wildcard. Application channels include Alibaba Cloud CAS and other vendors.

Alibaba Cloud's Cloud Shield Certificate Service can issue Symantec, CFCA, and GeoTrust certificates; CFCA focuses on domestic financial institutions. Certificates come in three types: DV, OV, and EV. DV certificates are validated at the domain level, while OV and EV certificates are validated at the organization level.

Retrofit the source site to support HTTPS

The most tedious part of the work is retrofitting the origin site. At a minimum it covers the following:

Page resources: pages often contain many HTTP links. Referencing them from an HTTPS page triggers warnings, and some of them, asynchronous calls in particular, will not be executed at all;

SSL/TLS: TLS 1.0 or above, with SNI support;

Optimized configuration: enable Session ID/Session Ticket;

Certificate: support SHA-256, since SHA-1 is no longer safe;

HSTS: consider forcing HTTPS.
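
For the page-resource item, a first pass over existing pages can be automated. The sketch below uses Python's standard `html.parser` to list sub-resources that are still referenced over plain HTTP. The class and function names are assumptions made for this example, and it inspects only `src` and `href` attributes, so URLs built in scripts or stylesheets are not caught.

```python
from html.parser import HTMLParser


class MixedContentFinder(HTMLParser):
    """Collect resource URLs that would still be loaded over plain HTTP."""

    RESOURCE_ATTRS = {"src", "href"}

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            return   # navigation links are not loaded as page sub-resources
        for name, value in attrs:
            if (name in self.RESOURCE_ATTRS and value
                    and value.lower().startswith("http://")):
                self.insecure.append((tag, value))


def find_insecure_resources(html: str):
    finder = MixedContentFinder()
    finder.feed(html)
    return finder.insecure
```

Each reported URL needs to be switched to HTTPS, or served from a host that supports HTTPS, before the page itself moves to HTTPS.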

The user enters the domain name directly: how to get HTTPS?

When a user types a domain name into the address bar, most browsers access it over HTTP by default. The website therefore needs to be configured to answer with a 302 redirect that sends every HTTP request to HTTPS. This achieves HTTPS access, but with the 302 alone every later visit still has to jump from HTTP to HTTPS, which costs one extra HTTP request each time. Additional configuration addresses this. After the browser follows the redirect and issues the HTTPS request, the server returns its normal content plus an extra Strict-Transport-Security (STS) header. This header tells the browser that the domain must always be accessed over HTTPS, and it specifies a long timeout.

When the user closes the browser and visits the website again later, the browser converts the HTTP request to HTTPS internally, so the access is HTTPS from the start. The browser keeps using HTTPS for this domain for the duration of the timeout.
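
The two-step policy (redirect first, then the Strict-Transport-Security header) can be sketched as a small decision function. The function name `respond` and the one-year `max-age` are assumptions made for this example. The text above only says that the timeout should be large.

```python
HSTS_MAX_AGE = 31536000   # one year in seconds; an assumed value


def respond(scheme: str, host: str, path: str):
    """Return (status, headers) for the redirect-then-HSTS policy."""
    if scheme == "http":
        # Plain HTTP: send the browser to the same URL over HTTPS.
        return 302, {"Location": f"https://{host}{path}"}
    # HTTPS: serve normally and tell the browser to use HTTPS from now on.
    return 200, {"Strict-Transport-Security": f"max-age={HSTS_MAX_AGE}"}


first_visit = respond("http", "example.com", "/index.html")
after_redirect = respond("https", "example.com", "/index.html")
```

The header is only sent on the HTTPS response, because browsers ignore it when it arrives over plain HTTP. Until `max-age` expires, the browser skips the redirect step entirely.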
