Error message

HSFServiceAddressNotFoundException: This error is reported when the address of the service to be called is not found.

Description

The service to be called is xxxx, which is in the xxxx group.

Basic troubleshooting

  1. Check whether the service is published and called correctly.
    • Check whether the service is published. Specifically, query the required service in the service governance console in the corresponding environment, such as the daily, staging, or online environment.
    • Check whether the interface, version, and group configured in the XML files of the provider and the consumer are identical. The values are case-sensitive and must not contain leading or trailing spaces.
  2. Use telnet to check whether the HSF port (12200 by default) at the IP address of the service provider is reachable. If not, a firewall is blocking the port or the network connection has failed. Ask the relevant personnel to troubleshoot.
  3. Check whether multiple network interface controllers (NICs) exist. If so, use -Dhsf.server.ip to specify the IP address of the provider.
    • In the local development environment, you can directly set JVM startup parameters.
    • In the production environment, you need to contact the development personnel to determine the solution.
  4. Check whether the service call is initiated too early. If a call is initiated before ConfigServer pushes the address, an error occurs. To avoid this, add the maxWaitTimeForCsAddress setting to the configuration items of the consumer. For more information, see Create a service consumer.
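
The telnet check in step 2 can also be scripted when several provider addresses need to be tested. The following is a minimal sketch in Python, assuming that a successful TCP connection is an adequate stand-in for the telnet test. The function name can_connect and the 3-second timeout are illustrative choices, not part of HSF.

```python
import socket

HSF_DEFAULT_PORT = 12200  # default HSF port named in step 2


def can_connect(host: str, port: int = HSF_DEFAULT_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: it only proves that the port accepts connections,
    which is the same information a telnet test gives.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling can_connect with the provider IP address from the consumer instance returns False when a firewall blocks the port or the provider is not listening, which are the cases described in step 2.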

Local development environment troubleshooting

When a lightweight configuration center is used for local development, no authentication is required for service publishing and subscription. Therefore, services can be registered and subscribed once they start normally. After the basic troubleshooting, perform the following operations:

  1. Troubleshoot in the lightweight configuration center.
    1. Log on to the lightweight configuration center console and check whether the corresponding service is published and whether the IP address and port of the service provider are correct. If the service fails to be published, perform troubleshooting by following the steps in "Troubleshoot the issue by the service provider."
    2. Check whether the corresponding service has been subscribed to. If the service fails to be subscribed, troubleshoot the issue by following the steps in "Troubleshoot the issue by the service consumer."
    3. On the consumer instance, use telnet to check whether the IP address and port of the service provider are reachable.
  2. Troubleshoot the issue by the service provider.
    If the service fails to be published in the lightweight configuration center console, perform troubleshooting as follows:
    1. Use ping jmenv.tbsite.net to check that the IP address of Address Server is the same as that of the lightweight configuration center.
    2. Clear the /{userhome}/logs/ and /{userhome}/configclient/ directories.
    3. Start the service provider. If Tomcat has been started, restart it.
    4. Check whether the Tomcat startup log contains any exceptions, and check the startup duration (in ms). If any exception occurs, resolve it.
    5. Open the /{userhome}/configclient/logs/configclient.log or /{userhome}/logs/configclient/configclient.log file (the path differs slightly between versions). Find the Connecting to remoting://{IP address} entry and check whether the registry IP address in it is the same as that of the lightweight configuration center. If not, check whether the IP address of the lightweight configuration center has been changed by using -Daddress.server.ip={accessible IP address}.
    6. Check whether the service name, version, and group are the expected ones.
    7. If [Register-ok][Publish-ok] appears, the service provider is registered with the service registry. In the development environment, the service can be registered once it is started.
    Note: During development, multiple service providers, such as A, B, and C, may be started on the same instance. The HSF port numbers of these providers increase in sequence, starting from 12200. You can also specify the IP address and port number by using the JVM parameters -Dhsf.server.ip=<ip> -Dhsf.server.port=<port>. In the lightweight configuration center, check whether the registered provider port is the same as the port actually in use. If not, calls to the provider may fail. You can update the provider port in the lightweight configuration center console, or delete the service and publish the application again.
  3. Troubleshoot the issue by the service consumer
    1. Use ping jmenv.tbsite.net to check that the IP address of Address Server is the same as that of the lightweight configuration center.
    2. Start the service consumer. If Tomcat has been started, restart it.
    3. Check whether the Tomcat startup log contains any exceptions, and check the startup duration (in ms). If any exception occurs, resolve it.
    4. Open the /{userhome}/configclient/logs/configclient.log or /{userhome}/logs/configclient/configclient.log file (the path differs slightly between versions). Find the Connecting to remoting://{IP address} entry and check whether the registry IP address in it is the same as that of the lightweight configuration center. If not, check whether the IP address of the lightweight configuration center has been changed by using -Daddress.server.ip={accessible IP address}.
    5. Check the log for service subscription information. Check whether the specific information about the service provider is received according to [Data-received]. If not, check whether the service provider is registered.
    6. On the consumer instance, use telnet to check whether the IP address and port of the service provider are reachable. If not, a firewall is blocking the port or the network connection has failed. Ask the relevant personnel to troubleshoot.
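
The log checks in the provider and consumer steps above come down to looking for a few markers in configclient.log. The following is a minimal sketch in Python, assuming that a plain substring match on each line is sufficient. The function names count_markers and scan_file are hypothetical helpers, not part of HSF or the lightweight configuration center.

```python
from collections import Counter
from typing import Iterable

# Markers quoted in the steps above; matched as plain substrings.
MARKERS = ("[Register-ok]", "[Publish-ok]", "[Data-received]")


def count_markers(lines: Iterable[str]) -> Counter:
    """Count how many lines contain each marker."""
    counts = Counter()
    for line in lines:
        for marker in MARKERS:
            if marker in line:
                counts[marker] += 1
    return counts


def scan_file(path: str) -> Counter:
    """Scan a configclient.log file; undecodable bytes are replaced, not fatal."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return count_markers(fh)
```

On the provider, a zero count for [Register-ok] or [Publish-ok] suggests that the service was not registered. On the consumer, a zero count for [Data-received] suggests that no provider address has been received yet.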

Troubleshoot the issue online

Applications managed and deployed by Enterprise Distributed Application Service (EDAS) run in the production environment, which enforces strict service authentication and data isolation. Because of authentication, services cannot be called directly across Alibaba Cloud accounts in the production environment, and services in the production environment cannot be called or accessed from the development environment.

  1. Troubleshoot the issue by the service provider
    1. Run cat /home/admin/{taobao-tomcat directory}/bin/setenv.sh and check the Address Server domain name specified by -Daddress.server.domain={Address Server domain name}.
    2. Ping the Address Server domain name to check whether the returned IP address is normal. If the ping fails, the network connection is faulty. Check the network connection.
    3. Clear the /home/admin/logs/, /home/admin/configclient/, and /home/admin/{taobao-tomcat directory}/logs/ directories.
    4. Start the service provider. If Tomcat has been started, restart it.
    5. Check whether the /home/admin/{taobao-tomcat directory}/logs/catalina.out file contains any exceptions, and check the startup duration (in ms). If any exception occurs, resolve it.
    6. Check whether the /home/admin/{taobao-tomcat directory}/logs/localhost-{date}.log file contains any exceptions. If any exception occurs, resolve it.
    7. Open the /home/admin/configclient/logs/configclient.log or /home/admin/logs/configclient/configclient.log file (the path differs slightly between versions). In the [Register-ok] and [Publish-ok] entries of the service, check whether the service name, version, and group are the expected ones. If [Publish or unregister error] appears, troubleshoot the issue.

      Check the edas.hsf.xxxx version in the catalina.out log file.

      1. If the version is earlier than edas.hsf.2114.1.0, create the corresponding service group. Otherwise, authentication fails. Log on to the EDAS console. In the left-side navigation pane, choose Microservice Management > Service Groups to check whether the service group of the application has been created.
      2. If the version is edas.hsf.2114.1.0 or later, multi-tenant data segregation is provided. You do not need to create a service group. The corresponding service is registered twice: registered based on tenants (which is always successful) and based on groups (which may fail but does not affect service calls).
        2018-07-19 10:28:44.716|ERROR|[] [] [%s] [Publish or unregister error] spas-authentication-failed! dataId:com.alibaba.edas.testcase.api.TestCase:1.0.0 group:test erorr:java.lang.Error: A receivedRevision:2 tenant:DEFAULT_TENANT
        2018-07-19 10:28:44.717|INFO|[] [] [] [Register-ok] Publisher (HSFProvider-com.alibaba.edas.testcase.api.TestCase:1.0.0.2 for com.alibaba.edas.testcase.api.TestCase:1.0.0)Tenant:0846c173-decf-4b47-xxxxxxxx in group test in env default
        2018-07-19 10:28:44.717|INFO|[] [] [] [Publish-ok] dataId=com.alibaba.edas.testcase.api.TestCase:1.0.0, clientId=HSFProvider-com.alibaba.edas.testcase.api.TestCase:1.0.0.2, datumId=ecu:ed5b9d2b-a276-4ad7-b7b9-14e432ff2356:192.168.0.1,tenant=0846c173-decf-4b47-xxxxxxxx, rev=2, env=default

        According to the preceding error logs, authentication with tenant:DEFAULT_TENANT fails while publishing with tenant=0846c173-decf-4b47-xxxxxxxx is successful. Ensure that at least one authentication is successful.

      3. If [Register-ok][Publish-ok] appears, the service provider is registered with the service registry.
  2. Troubleshoot the issue by the service consumer
    1. Run cat /home/admin/{taobao-tomcat directory}/bin/setenv.sh and check the Address Server domain name specified by -Daddress.server.domain={Address Server domain name}.
    2. Ping the Address Server domain name to check whether the returned IP address is normal. If the ping fails, the network connection is faulty. Check the network connection.
    3. Clear the /home/admin/logs/, /home/admin/configclient/, and /home/admin/{taobao-tomcat directory}/logs/ directories.
    4. Start the service consumer. If Tomcat has been started, restart it.
    5. Check whether the /home/admin/{taobao-tomcat directory}/logs/catalina.out file contains any exceptions, and check the startup duration (in ms). If any exception occurs, resolve it.
    6. Check whether the /home/admin/{taobao-tomcat directory}/logs/localhost-{date}.log file contains any exceptions. If any exception occurs, resolve it.
    7. Check the /home/admin/configclient/logs/configclient.log or /home/admin/logs/configclient/configclient.log files (slightly different in different versions) for the service subscription information. Search for the corresponding services and check whether the specific information about the service provider is received according to [Data-received]. If not, check whether the service provider is registered.
    8. On the consumer instance, use telnet to check whether the IP address and port of the service provider are reachable. If not, a firewall is blocking the port or the network connection has failed. Ask the relevant personnel to troubleshoot.
    9. Troubleshoot the issue according to relevant logs
      1. Check /home/admin/configclient/snapshot/DEFAULT_ENV/ for the services subscribed by the consumer:
        [root@iZ2ze26awga24ijh93152dZ com.alibaba.edas.carshop.itemcenter.ItemService:1.0.0]# cat HSF-0846c173-decf-4b47-8aa0-xxxxxx.dat
        [
        "192.168.0.1:12200? _p\u003dhessian2\u0026_ENV\u003dDEFAULT\u0026v\u003d2.0\u0026_TIMEOUT\u003d3000\u0026_ih2\u003dy\u0026_TID\u003d0846c173-decf-4b47-8aa0-04b5a5610096\u0026_SERIALIZETYPE\u003dhessian\u0026_auth\u003dy"
        ]
      2. Check /home/admin/logs/hsf/hsf.log for the service call errors.
      3. Check /home/admin/logs/hsf/hsf-remoting.log for the heartbeat check logs of the consumer and the provider.
        2018-06-20 12:35:00.797 ERROR [HSF-Worker-2-thread-1:hsf.remoting] [] [] [HSF-0085] [remoting] fail to connect: /192.168.1.1:12200 in timeout: 4000

        The preceding log indicates that a persistent TCP connection cannot be established between the consumer and the provider.

        1. Check whether the service at the instance IP address is started and whether the related port (for example, port 12200) is being listened on.
        2. If the service is started and the port is being listened on, use telnet on the consumer to check whether the provider port is reachable.
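
The snapshot file shown in the consumer steps above is hard to read because = and & appear as the escapes \u003d and \u0026. The following is a minimal sketch in Python that decodes such an entry, assuming the file content is a JSON array of address strings, as the sample suggests. The function name parse_snapshot is a hypothetical helper, not part of HSF.

```python
import json
from urllib.parse import parse_qsl


def parse_snapshot(text: str):
    """Decode snapshot text into a list of (host, port, params) tuples.

    Assumes each entry looks like "ip:port?key=value&key=value" once the
    JSON escapes (\\u003d for "=", \\u0026 for "&") have been decoded.
    """
    entries = []
    for address in json.loads(text):
        endpoint, _, query = address.partition("?")
        host, _, port = endpoint.strip().rpartition(":")
        # strip() tolerates stray whitespace around the query string.
        params = dict(parse_qsl(query.strip()))
        entries.append((host, int(port), params))
    return entries
```

Each returned tuple gives the provider IP address and port that the consumer received, which are the values to use in the telnet check, together with parameters such as _TIMEOUT and _p.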