What do I do if the domain name of an MSE Nacos instance fails to be resolved? - Microservices Engine

This topic describes how to troubleshoot the issue that the domain name of a Microservices Engine (MSE) Nacos instance fails to be resolved.

Problem description

When an application is connected to an MSE Nacos instance, the domain name of the instance fails to be resolved and the following error messages may be returned:

UnknownHostException
No route to host
Unable to resolve host

Cause

The DNS server or name server is incorrectly configured for the application node. As a result, the domain name of the MSE Nacos instance fails to be resolved.
The container does not use the DNS server or name server of the host, or the network type is invalid. As a result, the domain name of the MSE instance fails to be resolved.
The DNS server or name server that is configured for the application node fails. For example, CoreDNS that is required by the Kubernetes cluster fails. As a result, the domain name of the MSE instance fails to be resolved.
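
All three causes depend on which DNS servers or name servers the application node or container actually uses. The following is a minimal sketch for listing them, assuming a Linux environment in which the resolver configuration is stored in /etc/resolv.conf. The function name list_nameservers is hypothetical and is not part of MSE or Nacos.

```shell
#!/bin/sh
# list_nameservers: print the address of every "nameserver" entry in a
# resolver configuration file. Defaults to /etc/resolv.conf.
# The function name is illustrative only.
list_nameservers() {
    conf_file="${1:-/etc/resolv.conf}"
    # Only lines whose first field is exactly "nameserver" are printed,
    # so comment lines and "search"/"options" lines are skipped.
    awk '$1 == "nameserver" { print $2 }' "$conf_file"
}

# Example usage (run on the application node or inside the container):
#   list_nameservers
```

If the command prints nothing, or prints addresses that you do not expect, the first or second cause above is a likely candidate.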

Solution

Solution 1: Use a dig command

Run the following command to install the dig tool:

yum install -y bind-utils

Run the following dig command to resolve the domain name of the instance, and check the values of the status and SERVER fields in the output.

If the value of the status field is NOERROR and the SERVER field shows the DNS server or name server that you expect, the domain name is correctly resolved. The following sample output shows a failed resolution: the value of the status field is NXDOMAIN and the ANSWER count is 0.

dig ${mse.nacos.host}

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.1.alios7.2 <<>> ${mse.nacos.host}
;; global options: +cmd
;; Got answer:
## Confirm the value of the status field.
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 46791
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;${mse.nacos.host}. IN A

;; AUTHORITY SECTION:
com.                    900     IN      SOA     a.gtld-servers.net. nstld.verisign-grs.com. 1670413473 1800 900 604800 86400

;; Query time: 0 msec
## Confirm the value of the SERVER field.
;; SERVER: yyy.yyy.yyy.yyy#zz(...)
;; WHEN: Wed Dec 07 19:39:32 CST 2022
;; MSG SIZE  rcvd: 73
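
The two fields to check can also be pulled out of the dig output with a small script. The following is a sketch only, assuming the output layout shown above. The function names dig_status and dig_server and the variable MSE_NACOS_HOST are hypothetical. A separate variable is used because ${mse.nacos.host} is a documentation placeholder and is not a valid shell variable name.

```shell
#!/bin/sh
# dig_status: read dig output on stdin and print the value of the
# status field, for example NOERROR or NXDOMAIN.
dig_status() {
    sed -n 's/.*status: \([A-Z]*\).*/\1/p' | head -n 1
}

# dig_server: read dig output on stdin and print the address of the
# DNS server that answered, taken from the ";; SERVER:" line.
dig_server() {
    sed -n 's/^;; SERVER: \([^#]*\)#.*/\1/p' | head -n 1
}

# Example usage (MSE_NACOS_HOST is an assumed variable that holds the
# domain name of your instance):
#   output=$(dig "$MSE_NACOS_HOST")
#   printf '%s\n' "$output" | dig_status
#   printf '%s\n' "$output" | dig_server
```

A status other than NOERROR, or a server address that does not match the DNS server that you expect the node to use, points to one of the causes listed earlier.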

If you fail to resolve the domain name by using the dig command, fix the issue by using one of the following methods:
- If you use an Elastic Compute Service (ECS) instance to deploy your environment, record the IP address of the DNS server or name server in the SERVER field. Then, submit a ticket to contact ECS or network technical support to help locate the cause of the domain name resolution failure.
- If you use Docker or a Kubernetes cluster to deploy your environment, log on to the host or node and run the dig command again to try to resolve the domain name.
  - If the domain name is resolved on the host or node, either the network type of the container is invalid or the DNS server or name server configured in the container differs from that of the node. In this case, modify the network type, or copy the configuration in the /etc/resolv.conf file on the node to the container, and then try to resolve the domain name again.
  - If the domain name fails to be resolved, submit a ticket to contact ECS or network technical support to help locate the cause of the domain name resolution failure.
If the domain name can be resolved by using the dig command and the application has returned to normal, the DNS server or name server likely failed temporarily. Submit a ticket to contact network technical support to help locate the cause of the DNS server or name server failure.
If you use a Container Service for Kubernetes (ACK) cluster to deploy your environment, submit a ticket to contact ACK technical support to help locate the cause of the CoreDNS failure.
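
To check whether the container and the node use the same DNS servers or name servers, the two resolv.conf files can be compared. The following is a sketch, assuming that you have saved a copy of each file locally. The file names node-resolv.conf and container-resolv.conf and the function names are hypothetical.

```shell
#!/bin/sh
# nameservers_of: print the sorted, de-duplicated nameserver entries
# of a resolv.conf-style file.
nameservers_of() {
    awk '$1 == "nameserver" { print $2 }' "$1" | sort -u
}

# same_nameservers: exit with status 0 only if both files list the
# same set of name servers, regardless of order.
same_nameservers() {
    [ "$(nameservers_of "$1")" = "$(nameservers_of "$2")" ]
}

# Example usage (both file names are assumptions):
#   cp /etc/resolv.conf node-resolv.conf                 # on the node
#   cat /etc/resolv.conf > container-resolv.conf         # in the container
#   same_nameservers node-resolv.conf container-resolv.conf \
#       && echo "same name servers" || echo "name servers differ"
```

In a Kubernetes cluster, a difference is not necessarily an error, because pods are commonly configured to use the cluster DNS service, such as CoreDNS, instead of the name servers of the node.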

Solution 2: Use a ping command

Run the ping ${mse.nacos.host} command to try to resolve the domain name.
- If the message unknown host appears, the domain name fails to be resolved.
- If the message PING ${mse.nacos.host} (xxx.xx.xx.xx) 56(84) bytes of data. appears, the domain name is successfully resolved.
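
The same check can be scripted by looking for the PING line in the output. The following is a sketch, assuming the Linux ping output format quoted above. The function name ping_resolved_ip and the variable MSE_NACOS_HOST are hypothetical.

```shell
#!/bin/sh
# ping_resolved_ip: read ping output on stdin and print the IP address
# that appears in parentheses on the "PING host (ip) ..." line.
# Prints nothing if no such line exists, which is the case when the
# domain name fails to be resolved.
ping_resolved_ip() {
    sed -n 's/^PING [^ ]* (\([^)]*\)).*/\1/p' | head -n 1
}

# Example usage (MSE_NACOS_HOST is an assumed variable that holds the
# domain name of your instance; -c 1 sends a single echo request):
#   ip=$(ping -c 1 "$MSE_NACOS_HOST" 2>&1 | ping_resolved_ip)
#   if [ -n "$ip" ]; then echo "resolved to $ip"; else echo "not resolved"; fi
```

This only shows whether the name was resolved. Whether the resolved address replies to the echo request is a separate question and is not a sign of a DNS problem.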
If the domain name fails to be resolved by using the ping command, view the content of the /etc/resolv.conf file, obtain the IP address of the DNS server or name server, and try to fix the issue by using one of the following methods:
- If you use an ECS instance to deploy your environment, record the content of the resolv.conf file in the /etc path, and submit a ticket to contact ECS or network technical support to help locate the cause of the domain name resolution failure.
- If you use Docker or a Kubernetes cluster to deploy your environment, log on to the host or node and run the ping command again to try to resolve the domain name.
  - If the domain name is resolved on the host or node, either the network type of the container is invalid or the DNS server or name server configured in the container differs from that of the node. In this case, modify the network type, or copy the configuration in the /etc/resolv.conf file on the node to the container, and then try to resolve the domain name again.
  - If the domain name fails to be resolved, submit a ticket to contact ECS or network technical support to help locate the cause of the domain name resolution failure.
If the domain name can be resolved by using the ping command and the application has returned to normal, the DNS server or name server likely failed temporarily. Submit a ticket to contact network technical support to help locate the cause of the DNS server or name server failure.
If you use an ACK cluster to deploy your environment, submit a ticket to contact ACK technical support to help locate the cause of the CoreDNS failure.