Troubleshoot the exceptions that occur when you access an ECS instance

Last Updated: Apr 02, 2024

This topic describes the factors that can cause exceptions anywhere along the access path when you access services on an Elastic Compute Service (ECS) instance over a private network or from an on-premises device over the Internet. It also describes the possible symptoms and the troubleshooting solutions.

Background information

Exceptions may occur when you access an ECS instance over a private network or from an on-premises device over the Internet. This topic describes the factors that cause the access exceptions, the possible symptoms, the troubleshooting solutions, and how to run tests and record the test results.

Note

This topic does not take into account the factors of Alibaba Cloud CDN or third-party content delivery networks (CDNs).

Descriptions of the factors that cause ECS access exceptions

Descriptions of the factors that cause exceptions when you access an ECS instance over a private network

If a client accesses an ECS instance over a private network, the link is relatively simple. Check the following factors that may cause access exceptions and the possible symptoms of the client:

  • Configurations of the source instance

    • Descriptions of factors:

      Access exceptions are caused by security policies of the source instance, such as the firewall and security software, or operating system issues, such as virus infection.

    • Possible symptoms and causes:

      • Ping packet loss: Network errors occur due to operating system issues of the source instance, such as virus infection.

      • Ping failure: Security policies of the source instance, such as the security software, block the ping command.

      • Telnet failure for all ports: Network errors occur due to operating system issues of the source instance, such as virus infection.

      • Telnet failure for specific ports: Security policies of the source instance, such as the security software, deny the access to specific ports.

  • Security group configurations of the source instance

    • Descriptions of factors:

      The security group rules associated with the source instance block access to the destination instance.

    • Possible symptoms and causes:

      • Ping failure: An outbound rule that blocks the ping command is configured for the source instance.

      • Telnet failure for all ports: An outbound rule that denies traffic on all ports is configured for the source instance.

      • Telnet failure for specific ports: An outbound rule that denies traffic on specific ports is configured for the source instance.

  • IP address whitelist of an SLB instance

    • Descriptions of factors:

      If the destination instance is a Server Load Balancer (SLB) instance and the IP address whitelist configured for the destination instance is enabled for the related listener ports, only the specified IP address or CIDR blocks can access the ports.

    • Possible symptoms and causes:

      You fail to run the telnet command for specific ports. This indicates that the IP address of the source instance is not added to the IP address whitelist of the destination instance.

  • Security group configurations of the destination instance

    • Descriptions of factors:

      The security group rules associated with the destination instance block access from the source instance.

    • Possible symptoms and causes:

      • Ping failure: An inbound rule that blocks the ping command is configured for the destination instance.

      • Telnet failure for all ports: An inbound rule that denies traffic on all ports is configured for the destination instance.

      • Telnet failure for specific ports: An inbound rule that denies traffic on specific ports is configured for the destination instance.

  • Configurations of the destination instance

    • Descriptions of factors:

      Access exceptions are caused by security policies of the destination instance, such as the firewall and security software, or operating system issues, such as virus infection.

    • Possible symptoms and causes:

      • Ping packet loss: Network errors occur due to operating system issues of the destination instance, such as virus infection.

      • Ping failure: Security policies of the destination instance, such as the security software, block the ping command.

      • Telnet failure for all ports: Access exceptions occur due to operating system issues of the destination instance, such as virus infection.

      • Telnet failure for specific ports: Security policies of the destination instance, such as the security software, deny the access to specific ports.
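
The symptom lists above can also be read in reverse, as a lookup from an observed symptom to the factors worth checking. The following Python sketch encodes only the private-network factors listed in this section. The names `PRIVATE_NETWORK_FACTORS` and `candidate_factors` are hypothetical and are not part of any Alibaba Cloud tool.

```python
# Hypothetical lookup table built from the private-network factor list above.
_SOURCE_CONFIG = "Configurations of the source instance"
_SOURCE_SG = "Security group configurations of the source instance"
_SLB_WHITELIST = "IP address whitelist of an SLB instance"
_DEST_SG = "Security group configurations of the destination instance"
_DEST_CONFIG = "Configurations of the destination instance"

PRIVATE_NETWORK_FACTORS = {
    "ping packet loss": [_SOURCE_CONFIG, _DEST_CONFIG],
    "ping failure": [_SOURCE_CONFIG, _SOURCE_SG, _DEST_SG, _DEST_CONFIG],
    "telnet failure for all ports": [_SOURCE_CONFIG, _SOURCE_SG, _DEST_SG, _DEST_CONFIG],
    "telnet failure for specific ports": [
        _SOURCE_CONFIG, _SOURCE_SG, _SLB_WHITELIST, _DEST_SG, _DEST_CONFIG,
    ],
}


def candidate_factors(symptom: str) -> list:
    """Return the factors to check for a symptom, or an empty list if it is unknown."""
    return PRIVATE_NETWORK_FACTORS.get(symptom.strip().lower(), [])
```

For example, `candidate_factors("Ping packet loss")` returns only the two operating-system-level factors, which narrows the check to the source and destination instances themselves.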

Descriptions of the factors that cause exceptions when you access an ECS instance over the Internet

If a client accesses an ECS instance over the Internet, numerous factors are involved. Check the following main factors:

Client network environment

Check the following factors that may cause access exceptions and the possible symptoms of the client in the client network environment:

  • User on-premises network

    • Descriptions of factors:

      If exceptions occur on the user on-premises network, specific or all IP addresses may become inaccessible.

    • Possible symptoms and causes:

      Neither the destination instance nor other non-Alibaba Cloud IP addresses can be accessed.

  • On-premises DNS hijacking

    • Descriptions of factors:

      Domain Name System (DNS) hijacking is detected in the user on-premises network environment or local carrier network environment. As a result, when you attempt to access the related services on the destination instance, you are redirected to an unexpected page or experience advertising interruption.

    • Possible symptoms and causes:

      • Unexpected redirection: You are redirected to an unexpected page when you attempt to access the related services on the destination instance.

      • Advertising interruption: You are interrupted by advertisements when you attempt to access the related services on the destination instance.
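
One way to look for the DNS hijacking described above is to compare the addresses that the local resolver returns for your domain with the public IP address that you expect. The following is a minimal sketch that uses only the Python standard library. `resolved_addresses` and `unexpected_addresses` are hypothetical helper names, and the domain and IP address in the comment are placeholders.

```python
import socket
from typing import Set


def resolved_addresses(domain: str) -> Set[str]:
    """Return the IP addresses that the local resolver reports for a domain."""
    infos = socket.getaddrinfo(domain, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def unexpected_addresses(resolved: Set[str], expected: Set[str]) -> Set[str]:
    """Return the resolved addresses that are not among the expected ones."""
    return resolved - expected


# Example call with placeholder values (replace with your own domain and public IP):
#   unexpected_addresses(resolved_addresses("example.com"), {"203.0.113.10"})
```

A non-empty result is a hint rather than proof of hijacking. Repeating the check after switching to a different DNS server helps confirm whether the local resolver is the cause.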

Carrier network environment

Check the following factors that may cause access exceptions and the possible symptoms of the client in the carrier network environment:

  • Carrier network policies

    • Descriptions of factors:

      Carriers may perform DNS hijacking, or block access to specific IP addresses, domain names, or ports based on their network policies.

    • Possible symptoms and causes:

      • Advertising interruption: You are interrupted by advertisements when you attempt to access the related services on the destination instance.

      • Domain name access fails but IP address access succeeds: Carriers block access to specific illegal domain names.

      • Telnet failure for all ports: Carriers block access to specific illegal IP addresses.

      • Telnet failure for specific ports: Carriers block access to specific high-risk ports.

  • ICP filing

    • Descriptions of factors:

      ICP filing is required for domestic servers based on administrative control requirements.

    • Possible symptoms and causes:

      • Unexpected redirection: The domain name that is associated with the destination instance is not filed. As a result, when you attempt to access the related services on the destination instance, you are redirected to the page on which you are required to perform ICP filing.

      • Domain name access fails but IP address access succeeds: The domain name that is associated with the destination instance is not filed. As a result, access by domain name is redirected to the page on which you are required to perform ICP filing, but access by IP address works as expected.

Alibaba Cloud network environment

Check the following factors that may cause access exceptions and the possible symptoms of the client in the Alibaba Cloud network environment:

  • Security Center - Shutdown of zombies

    • Descriptions of factors:

      The destination instance is shut down by Security Center because it keeps launching external attacks, for example after it is infected or turned into a zombie computer.

    • Possible symptoms and causes:

      • Ping failure: The instance is shut down and cannot be pinged as a result.

      • Telnet failure for all ports: All ports are inaccessible due to instance shutdown.

  • Security Center - Access block

    • Descriptions of factors:

      The source instance is blocked by Security Center due to its behaviors, such as continuous scanning, probing, and attacking.

      Note

      If the on-premises network of the source instance uses Network Address Translation (NAT) to access the Internet, the source of the attack may not necessarily be the instance used by the client, but possibly other instances within the same network. Due to the sharing of the same public IP address, after Security Center blocks the related IP address, access from the source instance is affected.

    • Possible symptoms and causes:

      • Ping failure: The IP address of the source instance is blocked by Security Center. As a result, the ping command is blocked.

      • Telnet failure for all ports: The IP address of the source instance is blocked by Security Center. As a result, all ports are inaccessible.

  • Content Moderation - Content blocking

    • Descriptions of factors:

      The URLs related to the destination instance contain illegal content and access to the destination instance is blocked.

    • Possible symptoms and causes:

      • Unexpected redirection: Exceptions occur on the services on the source instance. As a result, you are redirected to the page on which you are notified of source instance exceptions in Anti-DDoS Proxy or the same kind of page in Web Application Firewall (WAF).

      • Failure in accessing specific URLs: The client cannot access the URLs that meet the conditions of a WAF rule and you are redirected to the page on which you are notified of content blocking.

  • Anti-DDoS Proxy and WAF

    • Descriptions of factors:

      Exceptions occur on the services on the destination instance or the access to the destination instance from the source instance is blocked by an Anti-DDoS Proxy or WAF rule. As a result, access exceptions occur.

    • Possible symptoms and causes:

      • Ping failure: The instance is shut down and cannot be pinged as a result.

      • Telnet failure for all ports: All ports are inaccessible due to instance shutdown.

  • IP address whitelist of an SLB instance

    • Descriptions of factors:

      If the destination instance is an SLB instance and the IP address whitelist configured for the destination instance is enabled for the related listener ports, only the specified IP address or CIDR blocks can access the ports.

    • Possible symptoms and causes:

      You fail to run the telnet command for specific ports. This indicates that the IP address of the source instance is not added to the IP address whitelist of the destination instance.

  • Security group configurations of the destination instance

    • Descriptions of factors:

      The security group rules associated with the destination instance block access from the source instance.

    • Possible symptoms and causes:

      • Ping failure: An inbound rule that blocks the ping command is configured for the destination instance.

      • Telnet failure for all ports: An inbound rule that denies traffic on all ports is configured for the destination instance.

      • Telnet failure for specific ports: An inbound rule that denies traffic on specific ports is configured for the destination instance.

Environment of the destination instance

Check the following factors that may cause access exceptions and the possible symptoms of the client in the destination instance environment:

  • Destination instance suspension due to overdue payments

    • Descriptions of factors:

      The destination instance cannot be accessed because it is suspended due to overdue payments.

    • Possible symptoms and causes:

      • Ping failure: The destination instance cannot be pinged because it is suspended due to overdue payments.

      • Telnet failure for all ports: All ports are inaccessible due to overdue payments of the destination instance.

  • Configurations of the destination instance

    • Descriptions of factors:

      Access exceptions are caused by security policies of the destination instance, such as the firewall and security software, or operating system issues, such as virus infection.

    • Possible symptoms and causes:

      • Ping packet loss: Network errors occur due to operating system issues of the destination instance, such as virus infection.

      • Ping failure: Security policies of the destination instance, such as the security software, block the ping command.

      • Telnet failure for all ports: Access exceptions occur due to operating system issues of the destination instance, such as virus infection.

      • Telnet failure for specific ports: Security policies of the destination instance, such as the security software, deny the access to specific ports.

  • Access control of the source IP address

    • Descriptions of factors:

      The service software of the destination instance performs access control on the source IP address. As a result, the source instance cannot access the destination instance.

    • Possible symptoms and causes:

      You fail to run the telnet command for specific ports. This indicates that the service software on the related ports performs access control on the source IP address, and therefore access from the IP address of the source instance is blocked.

Troubleshooting ideas and solutions to ECS access exceptions

Troubleshooting is divided into two parts: access over a private network and access over the Internet.

Troubleshoot exceptions when you access an ECS instance over a private network

If exceptions occur when you use a client to access an ECS instance over a private network, check the following items to identify and troubleshoot the exceptions. Start with a comparative test: access the destination instance from other instances at the same time and check whether the exception occurs on all of them.

An exception occurs on all the instances that you use to access the destination instance

If an exception occurs on all the instances that you use to access the destination instance, the exception may be caused by the security groups that are associated with the destination instance or by internal issues of the destination instance. You need to further analyze whether the internal access of the destination instance is normal. For example, you can log on to the destination instance by connecting to a Linux instance via a management terminal and use the 127.0.0.1 address in the destination instance to conduct a comparative test and check whether the destination instance has internal issues.

  • Internal access issues exist in the destination instance

    If internal access issues exist in the destination instance, you need to contact the service provider or business O&M personnel to check code configurations and software running status.

  • Internal access of the destination instance is normal

    • If the internal access of the destination instance is normal, check whether the security groups that are associated with the destination instance or the security software in the operating system of the destination instance block access from the source instance. For more information, see Security group FAQ.

    • If you find no obvious exception in the security groups or the security software configurations, refer to Guidelines for capturing packets when network exceptions occur. When the exception recurs, capture packets on the client and the server at the same time, and submit a ticket with the packet capture results to Alibaba Cloud technical support.
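
The 127.0.0.1 comparative test described above can be sketched in Python as follows. The sketch assumes that it runs on the destination instance and that the service listens on a TCP port that you supply. `can_connect` and `internal_access_next_step` are hypothetical names, and the returned messages only restate the two branches above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def internal_access_next_step(port: int) -> str:
    """Probe the service on 127.0.0.1 and return the matching branch from the text."""
    if can_connect("127.0.0.1", port):
        return ("Internal access is normal: check the security groups and the "
                "security software of the operating system.")
    return ("Internal access fails: ask the service provider or O&M personnel "
            "to check code configurations and software running status.")
```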

An exception occurs only when you use the source instance to access the destination instance

If an exception occurs only when you use the source instance to access the destination instance, the exception may be caused by the security groups that are associated with the source instance, internal issues of the source instance, or exceptions on the network between the source instance and the destination instance. Run a telnet port test to determine whether only the ping command fails while the desired port remains accessible.

  • The source instance cannot ping the destination instance, but the port test performed by running the telnet command is normal

    If the ping command fails but you can access the desired port, check whether the security groups that are associated with the destination instance or the security software in the operating system of the destination instance block the ping command from the source instance. For more information, see Security group FAQ.

  • Neither the telnet port test nor the ping test from the source instance to the destination instance is normal

    If exceptions occur on both telnet port tests and ping tests, further troubleshooting is required. Check whether you can succeed in pinging the gateway from inside the source instance.

    • Exceptions occur when you ping the gateway from inside the source instance.

      If exceptions occur when you ping the gateway from inside the source instance, such as ping failure or packet loss, you need to view system logs to check the running status of the source instance, such as loads and network configurations of the source instance.

    • You succeed in pinging the gateway from inside the source instance.

      If you succeed in pinging the gateway from inside the source instance, further troubleshooting is required. Check whether you can succeed in pinging the gateway of the destination instance from the source instance.

      • You succeed in pinging the gateway of the destination instance from the source instance.

        If you succeed in pinging the gateway from inside the source instance and pinging the gateway of the destination instance from the source instance, you need to view system logs to check the running status of the destination instance, such as loads and network configurations of the destination instance.

      • Exceptions occur when you ping the gateway of the destination instance from the source instance.

        If no exceptions occur when you ping the gateway from inside the source instance but exceptions such as ping failure or packet loss occur when you ping the gateway of the destination instance from the source instance, the exceptions may be caused by issues on the network between the source instance and the destination instance. For more information, see Guidelines for capturing packets when network exceptions occur. When the exception recurs, capture packets on the client and the server at the same time, and submit a ticket with the packet capture results to Alibaba Cloud technical support.
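
The private-network branches above can be condensed into a small decision function. The following Python sketch takes the test outcomes as Boolean inputs that you collect yourself and returns the next step described in this section. The function name `private_network_next_step` and its parameters are hypothetical, and the final fallback message is an addition for combinations that this section does not cover.

```python
def private_network_next_step(*, all_sources_fail: bool, loopback_ok: bool,
                              ping_ok: bool, telnet_ok: bool,
                              source_gateway_ok: bool, dest_gateway_ok: bool) -> str:
    """Map recorded test outcomes to the next troubleshooting step in this section."""
    if all_sources_fail:
        # Every instance fails: look at the destination instance first.
        if not loopback_ok:
            return "Contact the service provider or O&M personnel to check code and software status."
        return ("Check the destination security groups and OS security software; "
                "if nothing is found, capture packets and submit a ticket.")
    if telnet_ok and not ping_ok:
        return "Check whether the destination security groups or OS security software block ping."
    if not telnet_ok and not ping_ok:
        if not source_gateway_ok:
            return "Check the source instance: system logs, loads, and network configurations."
        if dest_gateway_ok:
            return "Check the destination instance: system logs, loads, and network configurations."
        return "Suspect the network between the instances: capture packets on both ends and submit a ticket."
    # Assumption: combinations not described above are only recorded, not diagnosed.
    return "Not covered by the steps above; record the test results."
```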

Troubleshoot exceptions when you access an ECS instance over the Internet

If exceptions occur when you use a client to access an ECS instance over the Internet, you can check the following items to determine and troubleshoot the exceptions.

  1. URL-related issues: Check whether you experience advertising interruption when you attempt to use a client to access the related services on the destination instance.

    • You experienced advertising interruption

      If you experienced advertising interruption, you need to check whether the internal access of the destination instance is normal. For example, you can log on to the destination instance by connecting to a Linux instance via a management terminal and use the 127.0.0.1 address in the destination instance to conduct a comparative test and check whether the destination instance has internal issues.

      • Internal access issues exist in the destination instance.

        If internal access issues exist in the destination instance, you need to contact the service provider or business O&M personnel to check code configurations and software running status.

      • Internal access of the destination instance is normal.

        If the internal access of the destination instance is normal, you need to check whether the issue is caused by exceptions on the on-premises network or hijacking from the local carrier. You can modify the IP address of the on-premises DNS server to check whether the issue can be resolved. If the issue persists, we recommend that you contact the local network department for troubleshooting and analysis, or provide feedback to the local carrier.

    • You did not experience advertising interruption

      If you did not experience advertising interruption, you need to check whether you are redirected to an unexpected page when you attempt to use a client to access the related services on the destination instance.

  2. Issue scope: If the issue is not a URL access exception, identify the scope of the issue through comparative analysis. You can use a third-party network testing platform to run comparative access tests from locations across the country and check whether access to the destination instance shows the same exception from all networks.

    • Exceptions occur for all networks

      If the test results show that exceptions occur when you access the destination instance from all external networks, you can log on to the destination instance by connecting to a Linux instance via a management terminal and use the 127.0.0.1 address in the destination instance to conduct a comparative test and check whether the destination instance has internal issues.

      • Internal access issues exist in the destination instance.

        If internal access issues exist in the destination instance, you need to contact the service provider or business O&M personnel to check code configurations and software running status.

      • Internal access of the destination instance is normal.

        If the internal access of the destination instance is normal, check whether the security groups that are associated with the destination instance or the security software in the operating system of the destination instance block access from the source instance. For more information, see Security group FAQ.

    • An exception occurs only when you use the source instance to access the destination instance

      If an exception occurs only when you use the source instance to access the destination instance, you can perform the subsequent steps to further troubleshoot the issue.

  3. Issue symptom: If an exception occurs only when you use the source instance to access the destination instance, you need to perform tests such as ping tests or telnet port tests to further troubleshoot the issue. Check whether you can succeed in pinging the IP address of the destination instance from the client.

    • You experience ping packet loss or ping failure when you ping the IP address of the destination instance

      If you experience ping packet loss or ping failure when you use the client to ping the IP address of the destination instance, it may be caused by the exceptions that occur on the intermediate link or peer instance. In this case, you need to use the MTR tool to further troubleshoot the issue.

    • You succeed in pinging the IP address of the destination instance but fail in accessing specific ports

      If you succeed in pinging the IP address of the destination instance but fail in accessing specific ports, you need to check whether the security groups that are associated with the destination instance or the configurations of security software of the operating system of the destination instance block the access from the source instance.

      • The destination instance blocks access to specific ports from the client.

        If you confirm that a policy on the destination instance blocks access from the source instance, adjust the policy. For more information, see Security group FAQ.

      • The destination instance does not block the access from the source instance.

        If the destination instance has no policy that blocks access from the source instance, the issue may be caused by interception from a carrier. In this case, use tools such as tracetcp to further analyze the port blocking issue.
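
The telnet port test referred to in the steps above can be approximated with a short Python sketch that also reports how the connection failed. The distinction between a refusal and a timeout is a general heuristic and is not stated in this topic. A refusal usually means that the host was reached but nothing accepted the connection, while a timeout usually means that packets were dropped somewhere along the path. `probe_tcp_port` is a hypothetical name.

```python
import socket


def probe_tcp_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connection and return 'open', 'refused', 'timeout', or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer answered with a reset: reachable host, no accepting service.
        return "refused"
    except socket.timeout:
        # No answer within the timeout: packets may be dropped on the path.
        return "timeout"
    except OSError as exc:
        # Other failures, such as an unreachable network or a name resolution error.
        return f"error: {exc}"
```

A `timeout` result on a single port while ping succeeds is consistent with, but does not prove, blocking by a security group, security software, or a carrier. Tools such as tracetcp are still needed to locate where the traffic stops.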

Instructions for submitting a ticket about ECS access exceptions

If you need to conduct relevant tests and record the test results when you troubleshoot issues, you can perform the following operations:

The client accesses an ECS instance over a private network

If a client accesses an ECS instance over a private network, perform the following steps to conduct relevant tests and record the test results:

  1. Access the destination instance from different instances and check whether the same exception occurs.

  2. Check whether you can succeed in pinging the IP address of the destination instance.

  3. Check whether you can succeed in running the telnet command for specific ports of the destination instance.

  4. Check whether you can succeed in pinging the gateway from inside the source instance.

  5. Check whether you can succeed in pinging the gateway from inside the destination instance.

  6. Check whether you can succeed in pinging the gateway of the destination instance from the source instance.

  7. Check whether you can succeed in pinging the gateway of the source instance from the destination instance.

  8. Check whether you need to capture packets based on your business requirements. If you need to capture packets, refer to Guidelines for capturing packets when network exceptions occur. If an exception occurs, capture packets from the source instance and destination instance at the same time.
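
To keep the results of the preceding checks in a form that is easy to attach to a ticket, you can record them in a small structure. The following Python sketch is one possible layout. The class name `PrivateNetworkTestRecord` and its field names are hypothetical, and `None` means that a check has not been run yet.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class PrivateNetworkTestRecord:
    """One field per check in the list above; True means the check succeeded."""
    same_exception_from_other_instances: Optional[bool] = None
    ping_destination_ip: Optional[bool] = None
    telnet_destination_ports: Optional[bool] = None
    ping_gateway_inside_source: Optional[bool] = None
    ping_gateway_inside_destination: Optional[bool] = None
    ping_destination_gateway_from_source: Optional[bool] = None
    ping_source_gateway_from_destination: Optional[bool] = None
    packet_capture_files: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the record so that it can be pasted into a ticket."""
        return json.dumps(asdict(self), indent=2)
```

For example, `PrivateNetworkTestRecord(ping_destination_ip=False).to_json()` produces a JSON document in which the ping result is `false` and every check that has not been run is `null`.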

The client accesses an ECS instance over the Internet

If a client accesses an ECS instance over the Internet, perform the following steps to conduct relevant tests and record the test results:

  1. Access the destination instance from instances that are deployed in different regions and different network environments and check whether the same exception occurs.

  2. Check whether you are interrupted by advertisements.

  3. Check whether you are redirected to an unexpected page.

  4. Check whether you can succeed in pinging the IP address of the destination instance from the client.

    If a ping exception occurs, such as packet loss or interruption, perform a test and record the test data.

  5. Check whether you can succeed in running the telnet command for the service ports of the destination instance from the client.

    If you succeed in pinging the IP address of the destination instance but fail in accessing specific ports, perform a test and record the test data.

  6. Check whether you need to capture packets based on your business requirements. If you need to capture packets, refer to Guidelines for capturing packets when network exceptions occur. If an exception occurs, capture packets from the source instance and destination instance at the same time.
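
When you record ping test data, the packet loss rate can be extracted from the saved output of the ping command. The following Python sketch assumes the summary line that common Linux and macOS ping implementations print, which contains the text "N% packet loss". Other implementations, such as the Windows ping command, print a different summary and need a different pattern. `packet_loss_percent` is a hypothetical name.

```python
import re
from typing import Optional

# Assumption: the summary line contains text such as "25% packet loss".
_LOSS_PATTERN = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def packet_loss_percent(ping_output: str) -> Optional[float]:
    """Return the packet loss percentage from ping output, or None if it is not found."""
    match = _LOSS_PATTERN.search(ping_output)
    return float(match.group(1)) if match else None
```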