Most Envoy-based gateways share a common issue: when HTTP/2 is enabled, clients occasionally receive a 404 error. The logs show that, for these 404 requests, the domain name in the :authority header does not match the domain name in the Server Name Indication (SNI).
This issue is particularly likely to occur when using a wildcard certificate and configuring routes for multiple domains.
Related community issues:
• https://github.com/envoyproxy/envoy/issues/6767
• https://github.com/istio/istio/issues/13589
• https://github.com/projectcontour/contour/issues/1493
This issue stems from the client's connection reuse mechanism. Multiplexing many requests over a single connection is a core difference between HTTP/2 and HTTP/1. In browsers especially, maximizing connection reuse can significantly improve page load times over TLS (leaving head-of-line blocking aside). The HTTP/2 specification (RFC 7540) describes connection reuse as follows:
Connections that are made to an origin server, either directly or through a tunnel created using the CONNECT method (Section 8.3), MAY be reused for requests with multiple different URI authority components. A connection can be reused as long as the origin server is authoritative (Section 10.1). For TCP connections without TLS, this depends on the host having resolved to the same IP address.
Therefore, browsers like Chrome will reuse an HTTP/2 connection established for domain A to make requests for domain B under the following conditions:
Once a request for domain B is sent over a connection established for domain A, the issue arises where the gateway logs show a mismatch between the SNI and the :authority header, as described above.
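The reuse decision can be sketched in a few lines. This is a simplified model, not Chrome's actual logic: it assumes the two checks implied by RFC 7540 Section 9.1.1 (the new host resolves to the address the connection was opened to, and the server certificate is valid for the new host). The names `Http2Connection`, `cert_covers`, and `can_reuse` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


def cert_covers(cert_name: str, host: str) -> bool:
    """Return True if a certificate name (exact or '*.' wildcard) covers host."""
    cert_name, host = cert_name.lower(), host.lower()
    if cert_name.startswith("*."):
        # A wildcard stands for exactly one left-most label.
        first_label, _, rest = host.partition(".")
        return bool(first_label) and rest == cert_name[2:]
    return cert_name == host


@dataclass(frozen=True)
class Http2Connection:
    sni: str                     # server name sent in the TLS handshake
    remote_ip: str               # address the connection was opened to
    cert_names: Tuple[str, ...]  # names presented in the server certificate


def can_reuse(conn: Http2Connection, host: str, resolved_ip: str) -> bool:
    """Decide whether a request for host may be sent over conn (assumed rules)."""
    same_ip = resolved_ip == conn.remote_ip
    covered = any(cert_covers(name, host) for name in conn.cert_names)
    return same_ip and covered


conn = Http2Connection("a.example.com", "203.0.113.10", ("*.example.com",))
print(can_reuse(conn, "b.example.com", "203.0.113.10"))  # reused: SNI stays a.example.com
print(can_reuse(conn, "b.example.org", "203.0.113.10"))  # not covered by the certificate
```

When the first call succeeds, the request for b.example.com travels on a connection whose SNI is still a.example.com, which is exactly the mismatch seen in the gateway logs.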
In Envoy gateways, SNI values are usually mapped one-to-one to domain routes: a connection matched to SNI A sees only the routing configuration for domain A and has no route for domain B, so the request for domain B gets a 404.
Specifically, Envoy configurations are commonly organized so that each SNI has its own filter chain, and the HTTP connection manager (HCM) in each filter chain references its own separate RDS route configuration.
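A toy model makes the failure mode concrete. The sketch below is not Envoy code; `FILTER_CHAINS` and `route` are hypothetical names, and each filter chain is reduced to a dictionary of the hosts it can route.

```python
# Hypothetical model: one filter chain per SNI, each with its own route table.
FILTER_CHAINS = {
    "a.example.com": {"a.example.com": "cluster_a"},
    "b.example.com": {"b.example.com": "cluster_b"},
}


def route(sni: str, authority: str) -> int:
    """Return the HTTP status a request would get in this simplified model."""
    routes = FILTER_CHAINS[sni]  # the chain is chosen once, at the TLS handshake
    return 200 if authority in routes else 404


print(route("a.example.com", "a.example.com"))  # normal request
print(route("a.example.com", "b.example.com"))  # reused connection: no route for B
```

The filter chain is fixed for the lifetime of the connection, so a later request for a different host can only be matched against the routes of the chain selected by the original SNI.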
Strictly speaking, this issue is not a bug in Envoy, but rather a result of improper configuration organization. It can be resolved by reusing the same filter chain for domains that share the same certificate.
However, this solution has two main drawbacks:
A common method is to return a 421 status code by using a Lua filter, for example:
"@type": "type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua"
inlineCode: |
  -- Reject requests whose :authority does not match the server name of the connection.
  function envoy_on_request(request_handle)
    local sni = request_handle:streamInfo():requestedServerName()
    if sni == "" then
      return
    end
    local authority = request_handle:headers():get(":authority") or ""
    local matched
    if string.sub(sni, 1, 2) == "*." then
      -- Wildcard server name: :authority must end with the suffix (e.g. ".example.com").
      -- A plain suffix comparison avoids treating "." as a Lua pattern wildcard.
      local suffix = string.sub(sni, 2)
      matched = string.sub(authority, -#suffix) == suffix
    else
      matched = (sni == authority)
    end
    if not matched then
      request_handle:respond({[":status"] = "421"}, "Misdirected Request")
    end
  end
This is also based on the recommendation in the HTTP/2 RFC:
In some deployments, reusing a connection for multiple origins can result in requests being directed to the wrong origin server. For example, TLS termination might be performed by a middlebox that uses the TLS Server Name Indication (SNI) [TLS-EXT] extension to select an origin server. This means that it is possible for clients to send confidential information to servers that might not be the intended target for the request, even though the server is otherwise authoritative. A server that does not wish clients to reuse connections can indicate that it is not authoritative for a request by sending a 421 (Misdirected Request) status code in response to the request (see Section 9.1.2).
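RFC 7540 Section 9.1.2 allows a client that receives a 421 to retry the request over a different connection. The sketch below models that round trip under simplifying assumptions: a connection is represented only by its SNI, and `gateway_status` and `fetch` are hypothetical names rather than any real client or gateway API.

```python
from typing import List, Tuple


def gateway_status(conn_sni: str, authority: str) -> int:
    """Hypothetical gateway that rejects any SNI/:authority mismatch with 421."""
    return 200 if conn_sni == authority else 421


def fetch(authority: str, pooled_sni: str) -> Tuple[int, List[str]]:
    """Try the pooled connection first; on 421, retry once on a new connection."""
    attempts = [pooled_sni]
    status = gateway_status(pooled_sni, authority)
    if status == 421:
        # A new connection is opened with an SNI equal to the requested host.
        attempts.append(authority)
        status = gateway_status(authority, authority)
    return status, attempts


print(fetch("b.example.com", pooled_sni="a.example.com"))
```

The request eventually succeeds, but only after a rejected attempt and a second TLS handshake.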
This solution also has two main drawbacks:
If all HTTPS filter chains share the same RDS, the issue can be solved. However, it may lead to an excessively large RDS resource, which would make it impossible to optimize incremental updates using solutions like delta xDS. In addition, any change in the RDS resource will modify the resource checksum, causing Envoy to reload the entire RDS configuration. This leads to significant CPU usage in the main thread due to the need to re-parse the configuration and regenerate data structures.
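The cost of a single large resource can be illustrated with a small checksum experiment. This is only an analogy: the SHA-256 over JSON used here is an assumption made for illustration and is not how Envoy versions or hashes RDS resources.

```python
import hashlib
import json


def checksum(resource: dict) -> str:
    """Stable digest of a resource (illustrative stand-in for a resource version)."""
    return hashlib.sha256(json.dumps(resource, sort_keys=True).encode()).hexdigest()


def per_host_checksums(routes: dict) -> dict:
    """Digest each host's routes as if it were its own resource."""
    return {host: checksum({host: cluster}) for host, cluster in routes.items()}


routes = {
    "a.example.com": "cluster_a",
    "b.example.com": "cluster_b",
    "c.example.com": "cluster_c",
}
monolithic_before = checksum(routes)
sliced_before = per_host_checksums(routes)

routes["b.example.com"] = "cluster_b_v2"  # change a single domain

monolithic_after = checksum(routes)
sliced_after = per_host_checksums(routes)
changed = [host for host in routes if sliced_before[host] != sliced_after[host]]

print(monolithic_before != monolithic_after)  # the whole resource counts as changed
print(changed)                                # only one slice actually changed
```

With one monolithic resource, a change to a single domain invalidates everything; with per-domain slices, only the affected slice would need to be re-sent and re-parsed.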
The current VHDS solution is on-demand based. If the domain name configuration cannot be found in the current request route, the configuration is pulled from the xDS server. This can cause data plane traffic to be forwarded to the control plane. A high volume of 404 requests can thereby stress the control plane, and the availability of the control plane directly affects the availability of the data plane.
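The coupling between 404 traffic and control plane load can be sketched as follows. This is a simplified model with hypothetical class names; it assumes that misses are not cached, which may differ from what a real VHDS deployment does.

```python
class ControlPlane:
    """Hypothetical xDS server that counts on-demand lookups."""

    def __init__(self, known_hosts):
        self.known_hosts = dict(known_hosts)
        self.calls = 0

    def fetch_virtual_host(self, host):
        self.calls += 1
        return self.known_hosts.get(host)


class OnDemandRouter:
    """Data plane that pulls a virtual host the first time it is requested."""

    def __init__(self, control_plane):
        self.control_plane = control_plane
        self.cache = {}

    def handle(self, host):
        if host not in self.cache:
            vhost = self.control_plane.fetch_virtual_host(host)
            if vhost is None:
                return 404  # assumption: misses are not cached
            self.cache[host] = vhost
        return 200


cp = ControlPlane({"a.example.com": "cluster_a"})
router = OnDemandRouter(cp)

for _ in range(1000):
    router.handle("a.example.com")
calls_for_known_host = cp.calls

for i in range(1000):
    router.handle(f"unknown-{i}.example.com")
calls_for_unknown_hosts = cp.calls - calls_for_known_host

print(calls_for_known_host, calls_for_unknown_hosts)
```

A known host costs one lookup no matter how often it is requested, while every request for an unknown host reaches the control plane in this model.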
Currently, Envoy supports routing configuration slicing based on specific headers. The original design was intended to route traffic differently based on cookies. This can be extended to support routing slices based on domain names. The key points for extension are:
Below is an example of how Higress extends ScopedRoutes in its configuration:
// [#next-free-field: 6]
message ScopedRoutes {
  option (udpa.annotations.versioning).previous_message_type =
      "envoy.config.filter.network.http_connection_manager.v2.ScopedRoutes";
  ...
  ...
      message HostValueExtractor {
        option (udpa.annotations.versioning).previous_message_type =
            "envoy.config.filter.network.http_connection_manager.v2.ScopedRoutes.ScopeKeyBuilder."
            "FragmentBuilder.HostValueExtractor";
        // The maximum number of host superset recomputes. If not specified, defaults to 100.
        google.protobuf.UInt32Value max_recompute_num = 1;
      }
      message LocalPortValueExtractor {
        option (udpa.annotations.versioning).previous_message_type =
            "envoy.config.filter.network.http_connection_manager.v2.ScopedRoutes.ScopeKeyBuilder."
            "FragmentBuilder.LocalPortValueExtractor";
      }
      oneof type {
        option (validate.required) = true;
        // Specifies how a header field's value should be extracted.
        HeaderValueExtractor header_value_extractor = 1;
        // Extract the fragment value from the :authority header, and support recompute with the wildcard domains,
        // i.e. ``www.example.com`` can be recomputed with ``*.example.com``, then ``*.com``, then ``*``.
        HostValueExtractor host_value_extractor = 101;
        // Extract the fragment value from local port of the connection.
        LocalPortValueExtractor local_port_value_extractor = 102;
      }
    }
    // The final(built) scope key consists of the ordered union of these fragments, which are compared in order with the
    // fragments of a :ref:`ScopedRouteConfiguration<envoy_v3_api_msg_config.route.v3.ScopedRouteConfiguration>`.
    // A missing fragment during comparison will make the key invalid, i.e., the computed key doesn't match any key.
    repeated FragmentBuilder fragments = 1 [(validate.rules).repeated = {min_items: 1}];
  }
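The wildcard recompute described in the `host_value_extractor` comment can be sketched as below. This is an illustration of the lookup order only, not the Higress implementation; `host_candidates` and `lookup_scope` are hypothetical names, and the host is assumed to carry no port.

```python
from typing import Dict, Iterator, Optional


def host_candidates(host: str, max_recompute_num: int = 100) -> Iterator[str]:
    """Yield the exact host, then progressively wider wildcard keys."""
    host = host.lower()
    yield host
    labels = host.split(".")
    recomputes = 0
    for i in range(1, len(labels)):
        if recomputes >= max_recompute_num:
            return
        yield "*." + ".".join(labels[i:])
        recomputes += 1
    if recomputes < max_recompute_num:
        yield "*"


def lookup_scope(scopes: Dict[str, str], host: str) -> Optional[str]:
    """Return the route configuration of the most specific matching scope key."""
    for key in host_candidates(host):
        if key in scopes:
            return scopes[key]
    return None


scopes = {"*.example.com": "routes-example", "*": "routes-default"}
print(list(host_candidates("www.example.com")))
print(lookup_scope(scopes, "www.example.com"))
print(lookup_scope(scopes, "www.example.org"))
```

Because the scope key is derived from :authority on each request rather than from the SNI of the connection, a reused connection can still reach the route slice of the requested domain.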
When all filter chains share the same route configuration, different filter chains may have different authentication policies. For example, some may require client certificate authentication (mTLS), while others may use IP-based RBAC (Role-Based Access Control).
Exposing all routes to any filter chain indiscriminately is insecure.
A potential solution is to have the control plane identify this security risk. When it detects that a domain must be accessed only through a specific filter chain's authentication, it can implement corresponding protections.
For example, Higress introduces an allow_server_names configuration item for VirtualHosts. When mTLS is enabled, it can be configured to allow access only if the request contains a specific SNI.
// [#protodoc-title: HTTP route components]
// * Routing :ref:`architecture overview <arch_overview_http_routing>`
// * HTTP :ref:`router filter <config_http_filters_router>`
// The top level element in the routing configuration is a virtual host. Each virtual host has
// a logical name as well as a set of domains that get routed to it based on the incoming request's
// host header. This allows a single listener to service multiple top level domain path trees. Once
// a virtual host is selected based on the domain, the routes are processed in order to see which
// upstream cluster to route to or whether to perform a redirect.
// [#next-free-field: 24]
message VirtualHost {
  option (udpa.annotations.versioning).previous_message_type = "envoy.api.v2.route.VirtualHost";
  ...
  ...
  // If non-empty, a list of server names (such as SNI for the TLS protocol) is used to determine
  // whether this request is allowed to access this VirtualHost. If not allowed, 421 Misdirected Request will be returned.
  //
  // The server name can be matched with wildcard domains, i.e. ``www.example.com`` can be matched with
  // ``www.example.com``, ``*.example.com`` and ``*.com``.
  //
  // Note that partial wildcards are not supported, and values like ``*w.example.com`` are invalid.
  //
  // This is useful when exposing all virtual hosts to arbitrary HCM filters (such as using SRDS), and you want to make
  // mTLS-protected routes invisible to requests with different SNIs.
  //
  // .. attention::
  //
  // See the :ref:`FAQ entry <faq_how_to_setup_sni>` on how to configure SNI for more
  // information.
  repeated string allow_server_names = 101;
}
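The matching rule documented for `allow_server_names` can be sketched as follows. This mirrors only the behavior described in the comment above (an empty list means no restriction, wildcards replace whole leading labels, and a rejected request gets a 421); the function names are hypothetical and the code is not taken from Higress.

```python
from typing import Iterable


def server_name_allowed(sni: str, allow_server_names: Iterable[str]) -> bool:
    """Check a connection's SNI against a virtual host's allow list."""
    allowed = [name.lower() for name in allow_server_names]
    if not allowed:
        return True  # empty list: the virtual host is not restricted
    sni = sni.lower()
    labels = sni.split(".")
    # www.example.com may match www.example.com, *.example.com, or *.com.
    candidates = {sni} | {"*." + ".".join(labels[i:]) for i in range(1, len(labels))}
    return any(name in candidates for name in allowed)


def status_for(sni: str, allow_server_names: Iterable[str]) -> int:
    """Return 421 when the SNI is not allowed to reach the virtual host."""
    return 200 if server_name_allowed(sni, allow_server_names) else 421


print(status_for("www.example.com", ["*.example.com"]))
print(status_for("www.example.org", ["*.example.com"]))
print(status_for("www.example.org", []))
```

A request that arrives over a connection negotiated for a different SNI is turned away with 421 instead of being routed, so an mTLS-protected virtual host stays reachable only through its own filter chain.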
Traditional HTTP proxy software like Apache HTTPD does not support routing when the :authority and SNI are inconsistent. Nginx, however, is one of the earliest gateways to implement this feature. Some people have raised issues in the Nginx community, arguing that this poses a security risk: https://trac.nginx.org/nginx/ticket/1694
This is the response from the Nginx maintainers at the time, which clearly states that there is no security risk:
In theory, you are right. SNI was designed to be used with the only one name, and requesting different names over a connection which uses SNI is not correct. Quoting RFC 6066: If the server_name is established in the TLS session handshake, the client SHOULD NOT attempt to request a different server name at the application layer. But in practice, SPDY introduced so-called "connection reuse", which effectively uses a connection with an established SNI for request to different application-level names. And it is followed by HTTP/2 connection reuse, which does the same: a HTTP/2 client can request a different host over an already established connection. The 421 (Misdirected Request) status code, also introduced by HTTP/2 RFC, is expected to be used only when such a connection reuse is not possible due to server limitations. In nginx, 421 is returned when a client tries to request a server protected with client SSL certificates over a connection established to a different server.
Technology is constantly evolving. When RFC 6066 was established, technologies like HTTP/2 multiplexing did not exist. Therefore, it was considered incorrect for a client to send requests for different domains over a connection using SNI.
However, with the development of web technologies, front-end pages are now carrying richer content and making more concurrent requests. This has led to higher demands for performance and faster response times, both for clients and servers. As a result, SPDY and HTTP/2 emerged to meet these needs. Faced with the high cost of TLS connections, connection reuse should be maximized whenever possible. If a server has been authenticated through an HTTPS certificate to handle requests for other domains, then sending requests for different domains over the same connection is entirely secure.
Gateways should naturally accommodate such reasonable client demands. From a user experience perspective, connection reuse can improve page rendering and API access speeds. From a broader perspective, it also simplifies the transmission of information in a secure manner, making it more energy-efficient and environmentally friendly.