When an application connects to a Microservices Engine (MSE) Nacos instance, DNS resolution for the instance domain name may fail with one of the following errors:
UnknownHostException
No route to host
Unable to resolve host
The following steps help you identify which layer of the DNS resolution path is broken and fix it.
How DNS resolution works
DNS resolution for an MSE Nacos instance follows a layered path. Understanding this path helps you isolate which layer is causing the failure:
The application sends a DNS query for the Nacos instance domain name.
The query reaches the DNS server configured in /etc/resolv.conf on the application node or container.
In Kubernetes environments, the query first goes to CoreDNS, which forwards external domain lookups to the upstream DNS server.
A failure at any layer in this path produces the errors listed above.
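To see which layer fails, it can help to ask each configured DNS server directly instead of relying on the system default. The following is a minimal sketch, not an official MSE tool: the function name `query_each_nameserver` is hypothetical, and it assumes `dig` is installed and that the resolver configuration is in the usual resolv.conf format.

```shell
#!/usr/bin/env bash
# Hypothetical helper: query every nameserver listed in a resolv.conf-style
# file for one host name and report which servers return an answer.
# Usage: query_each_nameserver <host> [resolv_conf_path]
query_each_nameserver() {
  local host conf ns answer
  host="$1"
  conf="${2:-/etc/resolv.conf}"

  # Take the second field of every line whose first field is "nameserver".
  awk '$1 == "nameserver" { print $2 }' "$conf" | while read -r ns; do
    # "dig @server name +short" sends the query to that server only.
    # Lines starting with ";" are dig diagnostics (for example, timeouts).
    answer="$(dig "@${ns}" "$host" +short +time=2 +tries=1 2>/dev/null \
      | grep -v '^;' | head -n 1)"
    if [ -n "$answer" ]; then
      echo "${ns} OK ${answer}"
    else
      echo "${ns} FAIL"
    fi
  done
}
```

For example, `query_each_nameserver "$NACOS_HOST"` (where `NACOS_HOST` is a placeholder variable holding your instance domain name) prints one line per nameserver. A server that reports FAIL while another reports OK points to a problem with that specific server rather than with the domain name itself.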
Common causes
| Cause | Description |
|---|---|
| Misconfigured DNS on the application node | The DNS server address in /etc/resolv.conf is incorrect or unreachable. |
| Container DNS mismatch | The container does not inherit the DNS configuration from the host node, or the network mode prevents DNS queries from reaching the correct server. |
| DNS service failure | The DNS server itself is down. In Kubernetes clusters, this typically means CoreDNS has failed. |
Prerequisites
Before you begin, make sure you have:
SSH or shell access to the application node or container
(Recommended) The dig command-line tool installed. If unavailable, ping works for basic checks.
Troubleshoot with dig (recommended)
The dig command provides detailed DNS query information, including the responding server and query status, which makes root cause analysis faster.
Step 1: Install dig
If dig is not installed, install it. On yum-based distributions such as CentOS or Alibaba Cloud Linux, run:
yum install -y bind-utils
Step 2: Query the Nacos domain name
Run the following command. Replace <mse-nacos-host> with the domain name of your MSE Nacos instance.
dig <mse-nacos-host>
Example output:
; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.1.alios7.2 <<>> <mse-nacos-host>
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 46791
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; QUESTION SECTION:
;<mse-nacos-host>. IN A
;; AUTHORITY SECTION:
com. 900 IN SOA a.gtld-servers.net. nstld.verisign-grs.com. 1670413473 1800 900 604800 86400
;; Query time: 0 msec
;; SERVER: yyy.yyy.yyy.yyy#zz(...)
;; WHEN: Wed Dec 07 19:39:32 CST 2022
;; MSG SIZE rcvd: 73
Step 3: Interpret the results
Check two fields in the output. The domain name is correctly resolved when the status is NOERROR and the SERVER field shows a valid DNS server.
| Field | Location in output | What to look for |
|---|---|---|
| status | HEADER line | NOERROR = the query succeeded; also confirm that the ANSWER count on the flags line is 1 or more. NXDOMAIN or other values = resolution failed. |
| SERVER | Near the bottom | The IP address of the DNS server that handled the query. Verify that this DNS server is valid and reachable. Record this address for troubleshooting. |
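When the same check runs on several nodes, extracting these fields with a small script can be easier than reading the full output. The sketch below is illustrative only: the function name `dig_summary` is hypothetical, and it assumes the dig output layout shown in the example in Step 2. It also prints the ANSWER count, because a NOERROR response with zero answers still returns no address.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize dig output read from standard input.
# Prints three lines: status=<STATUS>, answers=<COUNT>, server=<IP>.
dig_summary() {
  awk '
    /->>HEADER<<-/ {
      # Example: ";; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 46791"
      for (i = 1; i < NF; i++) if ($i == "status:") {
        s = $(i + 1); sub(/,$/, "", s); status = s
      }
    }
    /^;; flags:/ {
      # Example: ";; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ..."
      for (i = 1; i < NF; i++) if ($i == "ANSWER:") {
        a = $(i + 1); sub(/,$/, "", a); answers = a
      }
    }
    /^;; SERVER:/ {
      # Example: ";; SERVER: 10.0.0.2#53(10.0.0.2)"; keep the part before "#".
      server = $3; sub(/#.*/, "", server)
    }
    END {
      print "status=" status
      print "answers=" answers
      print "server=" server
    }
  '
}
```

A possible invocation is `dig "$NACOS_HOST" | dig_summary`, where `NACOS_HOST` is a placeholder variable for your instance domain name. The `server=` line gives the address to record for a support ticket.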
Step 4: Fix the issue based on the results
If dig fails to resolve the domain name:
Elastic Compute Service (ECS) deployment: Record the DNS server IP from the SERVER field, then submit a ticket to ECS or network technical support.
Docker or Kubernetes deployment: Access the host node and run dig <mse-nacos-host> again from the host:
  If the host resolves the domain name successfully, the container DNS configuration differs from the host. Copy the DNS settings from /etc/resolv.conf on the host to the container, or switch the container network mode to use the host network.
  If the host also fails to resolve the domain name, submit a ticket to ECS or network technical support.
Container Service for Kubernetes (ACK) deployment: Submit a ticket to ACK technical support to investigate a possible CoreDNS failure.
If dig resolves the domain name successfully and the application recovers:
The DNS server experienced a transient failure. Submit a ticket to network technical support to investigate the root cause of the DNS service interruption.
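For the Docker or Kubernetes case in Step 4, comparing the nameserver entries of the host and the container shows quickly whether the two DNS configurations differ. The following sketch assumes that you have saved both files locally, for example with `docker exec <container> cat /etc/resolv.conf` or `kubectl exec <pod> -- cat /etc/resolv.conf`. The function names `nameservers_of` and `compare_nameservers` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the nameserver addresses from a
# resolv.conf-style file, sorted and de-duplicated.
nameservers_of() {
  awk '$1 == "nameserver" { print $2 }' "$1" | sort -u
}

# Hypothetical helper: compare the nameserver sets of two files.
# Usage: compare_nameservers <host_resolv_conf> <container_resolv_conf>
# Prints "same", or "different" followed by both sets, and returns
# 0 only when the sets match.
compare_nameservers() {
  local host_ns container_ns
  host_ns="$(nameservers_of "$1")"
  container_ns="$(nameservers_of "$2")"
  if [ "$host_ns" = "$container_ns" ]; then
    echo "same"
  else
    echo "different"
    # paste -s joins the sorted addresses onto a single line.
    echo "host: $(printf '%s\n' "$host_ns" | paste -sd ' ' -)"
    echo "container: $(printf '%s\n' "$container_ns" | paste -sd ' ' -)"
    return 1
  fi
}
```

In Kubernetes, pods usually point to the cluster DNS service (CoreDNS) rather than to the host nameserver, so a "different" result is expected there and does not by itself indicate a fault.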
Troubleshoot with ping (alternative)
Use ping when dig is unavailable. ping confirms whether the domain name resolves but does not show DNS server details.
Step 1: Ping the Nacos domain name
Run the following command. Replace <mse-nacos-host> with the domain name of your MSE Nacos instance.
ping <mse-nacos-host>
Interpret the result:
| Output | Meaning |
|---|---|
| PING <mse-nacos-host> (xxx.xx.xx.xx) 56(84) bytes of data. | Resolution succeeded. The IP address is shown in parentheses. |
| unknown host | Resolution failed. |
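The check in the table can also be scripted. The sketch below classifies ping output read from standard input. It assumes the Linux (iputils) output format shown above. It also treats "Name or service not known" and "Temporary failure in name resolution" as resolution failures, because newer ping versions print those messages instead of "unknown host". The function name `classify_ping_output` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify ping output read from standard input.
# Prints "resolved <ip>", "unresolved", or "unknown" if neither pattern matches.
classify_ping_output() {
  awk '
    /^PING / {
      # Example: "PING host (192.0.2.10) 56(84) bytes of data."
      ip = $3; gsub(/[()]/, "", ip)
      print "resolved " ip; found = 1; exit
    }
    /[Uu]nknown host|Name or service not known|Temporary failure in name resolution/ {
      print "unresolved"; found = 1; exit
    }
    END { if (!found) print "unknown" }
  '
}
```

Because ping writes resolution errors to standard error, redirect it when piping, for example `ping -c 1 "$NACOS_HOST" 2>&1 | classify_ping_output`, where `NACOS_HOST` is a placeholder variable for your instance domain name.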
Step 2: Check the DNS configuration
If the domain name fails to resolve, view the DNS configuration on the application node:
cat /etc/resolv.conf
Record the nameserver entries listed in this file.
Step 3: Fix the issue based on the results
ECS deployment: Record the content of /etc/resolv.conf, then submit a ticket to ECS or network technical support.
Docker or Kubernetes deployment: Access the host node and run ping <mse-nacos-host> again from the host:
  If the host resolves the domain name successfully, the container DNS configuration differs from the host. Copy the DNS settings from /etc/resolv.conf on the host to the container, or switch the container network mode to use the host network.
  If the host also fails to resolve the domain name, submit a ticket to ECS or network technical support.
ACK deployment: Submit a ticket to ACK technical support to investigate a possible CoreDNS failure.
If ping resolves the domain name successfully and the application recovers:
The DNS server experienced a transient failure. Submit a ticket to network technical support to investigate the root cause of the DNS service interruption.