If you must use nscd in a Kubernetes cluster

Preface

I wrote a document, "DNS Best Practices in Kubernetes Cluster", which mentioned that you can use nscd as a DNS cache inside the container. I wrote only that one sentence and did not describe how to implement it, because nscd is probably the least recommended option. This article describes the solutions you should prefer, why nscd is not recommended, and how to use nscd sensibly in a Kubernetes cluster if you must.

Better solutions

Connection pool

DNS resolution in a Kubernetes cluster often fails for a variety of reasons, including kernel problems and load problems. You have probably run into these annoying resolution problems before opening this article. In many cases we do not have the time to track down the root cause, but we can avoid DNS resolution as much as possible. How? Use a connection pool. Connection pooling needs little introduction: besides saving the cost of DNS resolution, it also avoids the cost of a TCP handshake on every request.

Connection pools manage long-lived connections, both for calls between microservices and for direct connections to databases. Almost all web and RPC frameworks support connection pools. For how PHP supports connection pooling, please refer to the official documentation.
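To make the saving concrete, here is a minimal sketch of a fixed-size pool in Python. It is not taken from any particular framework: the class name ConnectionPool and the stand-in factory open_connection are hypothetical, and a real factory would call something like socket.create_connection, which is where the DNS lookup and TCP handshake happen.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Tiny fixed-size pool: connections are opened lazily and then reused."""

    def __init__(self, factory, size=4):
        self._factory = factory                      # callable that opens a new connection
        self._slots = queue.LifoQueue(maxsize=size)
        for _ in range(size):
            self._slots.put(None)                    # None marks a slot not opened yet

    @contextmanager
    def connection(self):
        conn = self._slots.get()                     # blocks while every slot is in use
        try:
            if conn is None:
                conn = self._factory()               # DNS lookup + handshake would happen here
            yield conn
        finally:
            # A production pool would also check that the connection is still healthy.
            self._slots.put(conn)


opened = []


def open_connection():
    # Hypothetical stand-in for socket.create_connection((host, port)).
    opened.append(object())
    return opened[-1]


pool = ConnectionPool(open_connection, size=2)
for _ in range(100):
    with pool.connection() as conn:
        pass                                         # send a request over conn
print(len(opened))                                   # prints 1
```

One hundred sequential requests open a single connection, so the name is resolved once instead of one hundred times.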

Node DNS cache

Containers in Kubernetes clusters are generally deployed at high density, so a single cluster node runs many service Pods. We can run a node-level DNS cache component on each cluster node to proxy and cache DNS resolution for all local containers. On ACK we provide the NodeLocal DNSCache component and a corresponding webhook controller. The cache component is itself a DNS server: it listens on an interface on the node and exposes the DNS service through a local IP address. Before a Pod is scheduled, the webhook controller writes the IP address of this local DNS server into the Pod configuration, so that the Pod uses the cache component as its default DNS server once it starts.
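As a rough illustration of the kind of change such a webhook makes, the sketch below patches a Pod manifest held as a Python dictionary. It is not the ACK webhook's actual logic. The function name inject_local_dns is hypothetical, and 169.254.20.10 is only an assumed link-local address for the cache. A real webhook would also have to supply search domains and resolver options, which are left out here.

```python
import copy

# Assumption: link-local address on which the node-level cache listens.
LOCAL_DNS_IP = "169.254.20.10"


def inject_local_dns(pod, local_dns_ip=LOCAL_DNS_IP):
    """Return a copy of `pod` whose first nameserver is the node-local cache."""
    patched = copy.deepcopy(pod)
    spec = patched.setdefault("spec", {})
    # dnsPolicy "None" makes the kubelet build resolv.conf from dnsConfig alone.
    spec["dnsPolicy"] = "None"
    dns_config = spec.setdefault("dnsConfig", {})
    others = [ns for ns in dns_config.get("nameservers", []) if ns != local_dns_ip]
    dns_config["nameservers"] = [local_dns_ip] + others
    return patched


pod = {"metadata": {"name": "demo"}, "spec": {"containers": []}}
print(inject_local_dns(pod)["spec"]["dnsConfig"])
```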

The cache component has the following advantages:

• Each node runs only one cache component, which reduces resource consumption

• The cache component is a DNS server, so it is transparent to the business and has no compatibility issues

• Connections from service Pods to the cache component do not consume conntrack table entries and bypass a large number of iptables rules, compared with requesting CoreDNS directly

• The cache component connects to CoreDNS over TCP to improve availability

Schematic diagram of NodeLocal DNSCache

Why the nscd scheme is not recommended

Compatibility

nscd is a daemon that also provides DNS caching, but it is not a DNS server. It does not even communicate over TCP/IP. By default nscd listens on /var/run/nscd/socket. When your application calls a function such as getaddrinfo through glibc, that function checks whether the socket exists. If it does, glibc lets nscd send the DNS request and cache the result.

The key problem is that not all applications use glibc for domain name resolution. For example, Golang uses its own DNS resolver by default and ignores the nscd socket. As another example, the common Alpine image uses musl instead of glibc, and stock musl does not support nscd for domain name resolution at all.

It is also worth mentioning that some schemes mount the /var/run/nscd/socket directory into the container to provide a DNS cache there. With this approach, if the glibc versions of the host (ECS) and the container differ, the cache stops working and resolution may even fail.
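A quick way to see whether a container could use nscd at all is to look for that socket. The sketch below only tests that a Unix domain socket exists at the path, which is a simplification of what glibc does internally. The helper name nscd_socket_present is hypothetical.

```python
import os
import stat

NSCD_SOCKET = "/var/run/nscd/socket"  # default path named in the text


def nscd_socket_present(path=NSCD_SOCKET):
    """Return True if `path` exists and is a Unix domain socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        return False              # missing path or no permission
    return stat.S_ISSOCK(mode)


print(nscd_socket_present())
```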

Accuracy

If you rely on multiple A records for round-robin load balancing, nscd caches and returns only the first A record. Until the cache entry expires, all traffic is therefore sent to just one of the A records. NodeLocal DNSCache shuffles cached results randomly when returning them, so it does not have this problem.

Timeliness

A DNS change should generally take effect within the record's TTL. With its default parameters, nscd's cache may keep serving a record for up to TTL + CACHE_PRUNE_INTERVAL (15 seconds) before refreshing it.
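The effect on load distribution can be simulated in a few lines. This is a toy model, not nscd or NodeLocal DNSCache code: the three addresses are hypothetical backends, and both cache functions are simplified stand-ins for the behaviours described above.

```python
import random
from collections import Counter

A_RECORDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical backends


def pinned_cache(records):
    """Model of a cache that always hands back the same first record."""
    return records[0]


def shuffling_cache(records, rng):
    """Model of a cache that shuffles the record set on every answer."""
    answer = list(records)
    rng.shuffle(answer)
    return answer[0]              # clients typically connect to the first address


rng = random.Random(0)
pinned = Counter(pinned_cache(A_RECORDS) for _ in range(3000))
shuffled = Counter(shuffling_cache(A_RECORDS, rng) for _ in range(3000))
print("pinned:  ", dict(pinned))
print("shuffled:", dict(shuffled))
```

In the pinned model every one of the 3000 lookups lands on 10.0.0.1, while the shuffling model spreads them across all three addresses.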
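As a small worked example of that bound, the sketch below adds the 15-second prune interval quoted above to a few sample TTLs. The function name worst_case_staleness and the sample TTL values are illustrative only.

```python
CACHE_PRUNE_INTERVAL = 15  # seconds, the default quoted above


def worst_case_staleness(ttl_seconds, prune_interval=CACHE_PRUNE_INTERVAL):
    """Upper bound, in seconds, on how long a changed record may still be served."""
    return ttl_seconds + prune_interval


for ttl in (5, 30, 300):
    print(f"TTL {ttl:>3}s -> stale for up to {worst_case_staleness(ttl)}s")
```

For short TTLs the extra 15 seconds dominates: a record with a 5-second TTL can be served for up to 20 seconds, four times longer than intended.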

Using nscd sensibly

If you have read this far, it seems you are determined to use nscd. Two schemes circulate on the Internet:

Scheme 1: host starts nscd

Idea: Start nscd on the host (ECS) and mount the host path /var/run/nscd/socket into the Pod

Disadvantages: The glibc versions of the host and the container must be kept strictly consistent, otherwise compatibility problems occur

Scheme 2: start nscd in container

Idea: Modify the container image so that, inside a single container, nscd is started as a background process before the business process starts

Disadvantages:

• The image has to be rebuilt

• It violates the single-responsibility principle for containers: the business process is no longer the main process of the container, and graceful shutdown becomes difficult to handle

Recommended solution: start nscd in a sidecar

Idea: Modify the deployment YAML. Alongside the business container, start an independent nscd container and share the nscd socket with the business container by mounting the same directory in both.

Advantages:

• Less intrusive to the business, with no need to rebuild the container image

• The business container and the cache container use the same container image, which avoids inconsistent glibc versions

In the YAML for this scheme, all irrelevant attributes are omitted. The container named wordpress represents your business container, and the nscd container is injected into the same Pod. Note:

1. The nscd container should preferably use the same image as the business container

2. Create a shared emptyDir volume and mount it into both the business container and the nscd container

3. Be sure to limit the resources of the nscd container so that it does not affect the main container

4. nscd can be installed in the container image in advance or after startup. The following example installs it after startup. The YAML of the nscd container is as follows

After integrating with the business container, the YAML is as follows
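As a rough sketch of what such an integrated Pod might look like, the Python below builds the manifest as a dictionary and prints it as JSON, which kubectl also accepts. This is not the author's original manifest. The volume name nscd-socket, the resource limits, and the install command are assumptions. The command assumes a Debian-based image with apt-get and an nscd build that supports the --foreground flag.

```python
import json

SOCKET_DIR = "/var/run/nscd"   # directory that holds nscd's default socket
IMAGE = "wordpress"            # the same image for both containers

# Both containers mount the same emptyDir so the socket is visible to each.
shared_mount = {"name": "nscd-socket", "mountPath": SOCKET_DIR}

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "wordpress"},
    "spec": {
        "volumes": [{"name": "nscd-socket", "emptyDir": {}}],
        "containers": [
            {
                "name": "wordpress",           # business container
                "image": IMAGE,
                "volumeMounts": [shared_mount],
            },
            {
                "name": "nscd",                # sidecar cache container
                "image": IMAGE,
                # Assumption: Debian-based image; install nscd after startup,
                # then run it in the foreground as the container's main process.
                "command": [
                    "sh", "-c",
                    "apt-get update && apt-get install -y nscd && exec nscd --foreground",
                ],
                # Placeholder limits; tune them for your workload.
                "resources": {"limits": {"cpu": "100m", "memory": "256Mi"}},
                "volumeMounts": [shared_mount],
            },
        ],
    },
}

print(json.dumps(pod, indent=2))
```

Saving the output to a file and applying it with kubectl apply -f would create the Pod. In a Deployment, the same two containers and the volume go under the Pod template.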

The nscd container and the main business container are started together. Once nscd has finished initializing in its container, it automatically creates the nscd socket in the declared emptyDir directory. From then on, the business container's domain name resolution requests can use nscd.
