Error message

HSFServiceAddressNotFoundException: This error is reported when the address of the service to be called is not found.

Description

The service to be called is xxxx, which is in the xxxx group.

Basic troubleshooting

  1. Check whether the service is published and called correctly.
    • Check whether the service is published. You can query the required service in the service governance console of the corresponding environment, such as the daily environment, staging environment, or online environment.
    • Check whether the interface, version, and group configured in the XML files of the provider and the consumer are identical. These values are case-sensitive and must not contain leading or trailing spaces.
  2. Use telnet to check whether the IP address and the High-speed Service Framework (HSF) port of the service provider are reachable. By default, the port number is 12200. If the connection fails, a firewall may be blocking the port or the network connection may be faulty. Ask the related personnel to troubleshoot the issue.
  3. Check whether multiple network interface controllers (NICs) exist. If multiple NICs exist, use -Dhsf.server.ip to specify the IP address of the provider.
    • In the on-premises development environment, you can set Java Virtual Machine (JVM) startup parameters.
    • In the production environment, you must contact the development personnel to determine the solution.
  4. Check whether the service call is initiated too early. If a call is initiated before ConfigServer pushes the address, this error occurs. To avoid this, add the maxWaitTimeForCsAddress configuration item to the consumer configuration. For more information, see Develop a service consumer.
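
Checks 1 and 2 above can also be scripted. The following Python sketch is an illustration only and is not part of HSF: the function names service_key_mismatches and can_connect are hypothetical, and the TCP check is a stand-in for a manual telnet test against the default port 12200.

```python
import socket


def service_key_mismatches(provider, consumer):
    """Return the fields whose values differ between provider and consumer.

    Both arguments are dicts with the keys "interface", "version", and
    "group". The comparison is exact, so a difference in case or a stray
    leading or trailing space counts as a mismatch.
    """
    return [
        field
        for field in ("interface", "version", "group")
        if provider.get(field) != consumer.get(field)
    ]


def can_connect(host, port=12200, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    This mirrors "telnet <host> <port>": it only proves that the port is
    reachable, not that the HSF service behind it is healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

For example, pass the interface, version, and group values copied from the provider and consumer XML files to service_key_mismatches, and call can_connect with the provider IP address from the consumer instance. A False result from can_connect points to a firewall or network problem rather than to a configuration problem.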

Troubleshooting in the on-premises development environment

When a lightweight configuration center is used for local development, no authentication is required for service publishing and subscription. Therefore, services can be registered and subscribed to as soon as they start properly. After basic troubleshooting, perform the following operations:

  1. Troubleshoot in the lightweight configuration center.
    1. Log on to the lightweight configuration center console and check whether the service is published and whether the IP address and port of the service provider are correct. If the service fails to be published, follow the steps in "Troubleshoot the issue by the service provider".
    2. Check whether the service has been subscribed to. If the service fails to be subscribed to, follow the steps in "Troubleshoot the issue by the service consumer".
    3. On the consumer instance, use telnet to check whether the IP address and port of the service provider are reachable.
  2. Troubleshoot the issue by the service provider.
    If the service fails to be published in the lightweight configuration center console, perform the following steps for troubleshooting:
    1. Use ping jmenv.tbsite.net to check that the IP address of the address server is the same as that of the lightweight configuration center.
    2. Clear the /{userhome}/logs/ and /{userhome}/configclient/ directories.
    3. Start the service provider. If Tomcat has been started, restart it.
    4. Check whether the Tomcat startup log records exceptions, and check the startup duration in ms. If an exception occurs, resolve it.
    5. Check the /{userhome}/configclient/logs/configclient.log file or the /{userhome}/logs/configclient/configclient.log file. The path of the log file varies slightly by version. Find the Connecting to remoting://{IP address} entry and determine whether the IP address of the registry is the same as that of the lightweight configuration center. If the IP addresses are different, check whether the IP address of the lightweight configuration center has been changed by using -Daddress.server.ip={Accessible IP address}.
    6. If [Register-ok][Publish-ok] appears, check whether the service name, version, and group are the expected ones.
    7. If [Register-ok][Publish-ok] appears, the service provider is registered with the service registry. In the development environment, the service can be registered after it is started.
    Note: During development, multiple service providers, such as A, B, and C, may be started on the same instance. In this case, the HSF port number of each provider increments sequentially, starting from 12200. You can also specify the IP address and port number by using the JVM parameters -Dhsf.server.ip=<ip> -Dhsf.server.port=<port>. In the lightweight configuration center, check whether the registered provider port is the same as the port that the provider actually uses. If the ports are different, calls to the provider may fail. In this case, update the provider port in the lightweight configuration center console, or delete the service and publish the application again.
  3. Troubleshoot the issue by the service consumer.
    1. Use ping jmenv.tbsite.net to check that the IP address of the address server is the same as that of the lightweight configuration center.
    2. Start the service consumer. If Tomcat has been started, restart it.
    3. Check whether the Tomcat startup log records exceptions, and check the startup duration in ms. If an exception occurs, resolve it.
    4. Check the /{userhome}/configclient/logs/configclient.log file or the /{userhome}/logs/configclient/configclient.log file. The path of the log file varies slightly by version. Find the Connecting to remoting://{IP address} entry and determine whether the IP address of the registry is the same as that of the lightweight configuration center. If the IP addresses are different, check whether the IP address of the lightweight configuration center has been changed by using -Daddress.server.ip={Accessible IP address}.
    5. Check the log for service subscription information. Search for [Data-received] and check whether the information about the service provider has been received. If the service provider information is not received, check whether the service provider is registered.
    6. On the consumer instance, use telnet to check whether the IP address and port of the service provider are reachable. If the connection fails, a firewall may be blocking the port or the network connection may be faulty. Ask the related personnel to troubleshoot the issue.
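
The configclient.log checks in the preceding provider and consumer steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions: it assumes that the log contains the literal markers named above ([Register-ok], [Publish-ok], and [Data-received]) and an entry of the form Connecting to remoting://{IP address} with an IPv4 address. The function name summarize_configclient_log is hypothetical and not part of HSF or ConfigClient.

```python
import re

# Markers named in the troubleshooting steps above.
MARKERS = ("[Register-ok]", "[Publish-ok]", "[Data-received]")

# Assumption: the registry address is logged as an IPv4 address.
REGISTRY_RE = re.compile(r"Connecting to remoting://(\d{1,3}(?:\.\d{1,3}){3})")


def summarize_configclient_log(text, service_name):
    """Collect registry addresses and marker lines for one service.

    text is the full content of configclient.log, and service_name is the
    service to look for, for example an interface name.
    """
    registry_ips = sorted(set(REGISTRY_RE.findall(text)))
    hits = {marker: [] for marker in MARKERS}
    for line in text.splitlines():
        if service_name not in line:
            continue
        for marker in MARKERS:
            if marker in line:
                hits[marker].append(line.strip())
    return {"registry_ips": registry_ips, "markers": hits}
```

To use the sketch, read the log file into a string and pass it in together with the service name. Compare registry_ips with the IP address of the lightweight configuration center. On the provider, expect entries under [Register-ok] and [Publish-ok]. On the consumer, expect entries under [Data-received].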

Online troubleshooting

Applications managed and deployed in Enterprise Distributed Application Service (EDAS) run in a production environment that enforces strict service authentication and data isolation. Because of authentication, services in the production environment cannot be called across Alibaba Cloud accounts, and they cannot be called or accessed from the development environment.

  1. Troubleshoot the issue by the service provider.
    1. Run cat /home/admin/{taobao-tomcat directory}/bin/setenv.sh and check the domain name of the address server that is specified by -Daddress.server.domain={Address server domain name}.
    2. Use ping {Address server domain name} to check whether the returned IP address is normal. If the ping fails, the network connection is faulty. Check the network connection.
    3. Clear the /home/admin/logs/, /home/admin/configclient/, and /home/admin/{taobao-tomcat directory}/logs/ directories.
    4. Start the service provider. If Tomcat has been started, restart it.
    5. Check whether the /home/admin/{taobao-tomcat directory}/logs/catalina.out file contains exceptions, and check the startup duration in ms. If an exception occurs, resolve it.
    6. Check whether the /home/admin/{taobao-tomcat directory}/logs/localhost-{Date}.log file contains exceptions. If an exception occurs, resolve it.
    7. Check the /home/admin/configclient/logs/configclient.log file or the /home/admin/logs/configclient/configclient.log file. The path of the log file varies slightly by version. If [Register-ok][Publish-ok] appears, check whether the service name, version, and group are the expected ones. If [Publish or unregister error] appears, troubleshoot the issue as follows.

      Check the edas.hsf.xxxx version in the catalina.out log file.

      1. If the version is earlier than edas.hsf.2114.1.0, create the required service group. Otherwise, authentication fails. Log on to the EDAS console. In the left-side navigation pane, choose Microservice Management > HSF, and click Service Groups to check whether the service group of the application has been created.
      2. If the version is edas.hsf.2114.1.0 or later, multi-tenant data isolation is provided, and you do not need to create a service group. The service is registered twice: once by tenant and once by group. Tenant-based registration always succeeds. Group-based registration may fail, but this does not affect service calls.
        2018-07-19 10:28:44.716|ERROR|[] [] [%s] [Publish or unregister error] spas-authentication-failed! dataId:com.alibaba.edas.testcase.api.TestCase:1.0.0 group:test erorr:java.lang.Error: A receivedRevision:2 tenant:DEFAULT_TENANT
        2018-07-19 10:28:44.717|INFO|[] [] [] [Register-ok] Publisher (HSFProvider-com.alibaba.edas.testcase.api.TestCase:1.0.0.2 for com.alibaba.edas.testcase.api.TestCase:1.0.0)Tenant:0846c173-decf-4b47-xxxxxxxx in group test in env default
        2018-07-19 10:28:44.717|INFO|[] [] [] [Publish-ok] dataId=com.alibaba.edas.testcase.api.TestCase:1.0.0, clientId=HSFProvider-com.alibaba.edas.testcase.api.TestCase:1.0.0.2, datumId=ecu:ed5b9d2b-a276-4ad7-b7b9-14e432ff2356:192.168.0.1,tenant=0846c173-decf-4b47-xxxxxxxx, rev=2, env=default

        The preceding logs show that authentication with tenant:DEFAULT_TENANT fails, but service publishing with tenant=0846c173-decf-4b47-xxxxxxxx succeeds. Make sure that at least one of the two registrations succeeds.

      3. If [Register-ok][Publish-ok] appears, the service provider is registered with the service registry.
  2. Troubleshoot the issue by the service consumer.
    1. Run cat /home/admin/{taobao-tomcat directory}/bin/setenv.sh and check the domain name of the address server that is specified by -Daddress.server.domain={Address server domain name}.
    2. Use ping {Address server domain name} to check whether the returned IP address is normal. If the ping fails, the network connection is faulty. Check the network connection.
    3. Clear the /home/admin/logs/, /home/admin/configclient/, and /home/admin/{taobao-tomcat directory}/logs/ directories.
    4. Start the service consumer. If Tomcat has been started, restart it.
    5. Check whether the /home/admin/{taobao-tomcat directory}/logs/catalina.out file contains exceptions, and check the startup duration in ms. If an exception occurs, resolve it.
    6. Check whether the /home/admin/{taobao-tomcat directory}/logs/localhost-{Date}.log file contains exceptions. If an exception occurs, resolve it.
    7. Check the /home/admin/configclient/logs/configclient.log file or the /home/admin/logs/configclient/configclient.log file for the service subscription information. The path of the log file varies slightly by version. Search for the required service and for [Data-received] to check whether the information about the service provider has been received. If the service provider information is not received, check whether the service provider is registered.
    8. On the consumer instance, use telnet to check whether the IP address and port of the service provider are reachable. If the connection fails, a firewall may be blocking the port or the network connection may be faulty. Ask the related personnel to troubleshoot the issue.
    9. Troubleshoot the issue by using related logs.
      1. Check the services that the consumer has subscribed to in /home/admin/configclient/snapshot/DEFAULT_ENV/:
        [root@iZ2ze26awga24ijh93152dZ com.alibaba.edas.carshop.itemcenter.ItemService:1.0.0]# cat HSF-0846c173-decf-4b47-8aa0-xxxxxx.dat
        [
        "192.168.0.1:12200? _p\u003dhessian2\u0026_ENV\u003dDEFAULT\u0026v\u003d2.0\u0026_TIMEOUT\u003d3000\u0026_ih2\u003dy\u0026_TID\u003d0846c173-decf-4b47-8aa0-04b5a5610096\u0026_SERIALIZETYPE\u003dhessian\u0026_auth\u003dy"
        ]
      2. Check the service call errors in /home/admin/logs/hsf/hsf.log.
      3. Check the heartbeat check logs of the consumer and the provider in /home/admin/logs/hsf/hsf-remoting.log.
        2018-06-20 12:35:00.797 ERROR [HSF-Worker-2-thread-1:hsf.remoting] [] [] [HSF-0085] [remoting] fail to connect: /192.168.1.1:12200 in timeout: 4000

        The preceding log shows that a persistent TCP connection cannot be established between the consumer and the provider.

        1. Check whether the service on the instance with that IP address is started and whether the related port, such as port 12200, is being listened on.
        2. If the service is started and the port is being listened on, use telnet on the consumer to check whether the provider port is reachable.
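
The log checks in step 9 can be sketched in Python as follows. This is an illustration under stated assumptions, not an HSF tool: it assumes that a snapshot .dat file is a JSON array of address strings, as the sample above suggests, and that connection failures in hsf-remoting.log follow the "fail to connect: /{IP address}:{port} in timeout: {ms}" pattern shown above. The function names parse_snapshot and failed_endpoints are hypothetical.

```python
import json
import re

# Pattern taken from the sample hsf-remoting.log line shown above.
FAIL_RE = re.compile(r"fail to connect: /?([\w.\-]+):(\d+) in timeout: (\d+)")


def parse_snapshot(text):
    """Decode a snapshot .dat file into a list of (host, port, params).

    json.loads turns escapes such as \\u003d and \\u0026 back into "=" and
    "&", so the query part can be split into key-value pairs.
    """
    providers = []
    for entry in json.loads(text):
        address, _, query = entry.partition("?")
        host, _, port = address.strip().rpartition(":")
        params = {}
        for pair in query.strip().split("&"):
            if pair:
                key, _, value = pair.partition("=")
                params[key] = value
        providers.append((host, int(port), params))
    return providers


def failed_endpoints(log_text):
    """Return (host, port, timeout_ms) for every "fail to connect" line."""
    return [(h, int(p), int(t)) for h, p, t in FAIL_RE.findall(log_text)]
```

An empty list from parse_snapshot means that the consumer has not received any provider address for the service. An endpoint that is returned by both functions is an address that ConfigServer pushed but that the consumer cannot reach, which points to the port and network checks above.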