Troubleshooting guidelines for ECS instance access exceptions

Last Updated: May 06, 2022

Disclaimer: This article may contain information about third-party products. Such information is for reference only. Alibaba Cloud does not make any guarantee, express or implied, with respect to the performance and reliability of third-party products, as well as potential impacts of operations on the products.

Introduction

This article describes the access exceptions that may occur when you access services on an ECS instance over a private network or from your local network over the public network. It first lists the factors along the entire link that can cause access exceptions and the symptoms that each produces, then describes how to troubleshoot an exception when it occurs, and finally explains what to prepare before you submit a ticket.

Note: This article does not cover scenarios that involve Alibaba Cloud CDN or a third-party CDN.

Background

Alibaba Cloud reminds you that:

  • Before you perform operations that may cause risks, such as modifying instance configurations or data, we recommend that you check the disaster recovery and fault tolerance capabilities of the instances to ensure data security.
  • If you modify the configurations and data of instances including but not limited to ECS and RDS instances, we recommend that you create snapshots or enable RDS log backup.
  • If you have authorized or submitted security information such as the logon account and password in the Alibaba Cloud Management console, we recommend that you modify such information in a timely manner.

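This article covers the following topics: the causes of ECS instance access exceptions, a troubleshooting flow chart, the troubleshooting procedures, and instructions for submitting a ticket.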

ECS instance access exception

The causes of ECS instance access exceptions are grouped below by access path.

Factors related to private network access exceptions

If the client accesses the target server over a private network, the link is relatively simple. The possible causes of access exceptions and the corresponding client symptoms are as follows.

  • Internal configurations of the source server.
    • Factors: security policies on the source server, such as the internal firewall or security software, or internal system problems such as a virus infection, cause access exceptions.
    • Possible symptoms and causes are as follows.
      • ping packet loss: the network is abnormal because of internal operating system problems on the source server, such as a virus infection.
      • ping failed: security policies on the source server, such as security software, block ping.
      • telnet fails on all ports: the source server has internal system problems, such as a virus infection.
      • telnet fails on only some ports: security policies on the source server, such as security software, block access to those ports.
  • Security group configuration of the source server.
    • Factors: a security group rule for the source server blocks access to the target server.
    • Possible symptoms and causes are as follows.
      • ping failed: the security group of the source server has an outbound rule that prohibits ping.
      • telnet fails on all ports: the security group of the source server has outbound drop rules for all ports.
      • telnet fails on only some ports: the security group of the source server has outbound drop rules for the specified ports.
  • Server Load Balancer whitelist.
    • Factors: if the target server is a Server Load Balancer instance and the whitelist is enabled for a listening port, only the specified IP addresses or CIDR blocks can access that port.
    • Possible symptom and cause: telnet fails on only some ports. The IP address of the source server is not in the whitelist, so the corresponding listening port cannot be accessed.
  • The security group of the target server.
    • Factors: a security group rule for the target server blocks access from the source server.
    • Possible symptoms and causes are as follows.
      • ping failed: the security group of the target server has an inbound rule that prohibits ping.
      • telnet fails on all ports: the security group of the target server has inbound drop rules for all ports.
      • telnet fails on only some ports: the security group of the target server has inbound drop rules for the specified ports.
  • Internal configurations of the target server.
    • Factors: security policies on the target server, such as the internal firewall or security software, or internal system problems such as a virus infection, cause access exceptions.
    • Possible symptoms and causes are as follows.
      • ping packet loss: access is abnormal because of internal system problems on the target server, such as a virus infection.
      • ping failed: security policies on the target server, such as security software, block ping.
      • telnet fails on all ports: access is abnormal because of internal operating system problems on the target server, such as a virus infection.
      • telnet fails on only some ports: security policies on the target server, such as security software, block access to those ports.
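
The telnet symptoms listed above ("all ports" versus "only some ports") can be told apart with a small script. The following is a minimal sketch, not part of the original article: the function names `port_is_open` and `classify_port_results` are hypothetical, and it only checks whether a TCP connection can be established, which is what a telnet port test does.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def classify_port_results(results):
    """Map a {port: reachable} dict onto the symptom names used in this article."""
    if not results:
        return "no ports tested"
    if all(results.values()):
        return "all ports reachable"
    if not any(results.values()):
        return "telnet fails on all ports"
    return "telnet fails on only some ports"
```

For example, `classify_port_results({p: port_is_open("192.0.2.10", p) for p in (22, 80, 443)})` tests three ports on a placeholder address (replace it with the IP address of your target server). The classification only names the symptom; the cause still has to be found with the factor lists above.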

Factors related to access exceptions through the public network

If the client accesses the target server over the public network, more factors are involved. They fall into the following types.

Client Network Environment

For the client network environment, the possible causes of access exceptions and the client symptoms are as follows.

  • Your local network.
    • Factors: if your local network is abnormal, some or all IP addresses may become inaccessible.
    • Possible symptom and cause: the target server cannot be accessed, and IP addresses outside Alibaba Cloud cannot be accessed either.
  • Local DNS hijacking.
    • Factors: DNS hijacking occurs in the local network or at the local carrier, causing abnormal redirects or ad insertion when you access the service associated with the target server.
    • Possible symptoms and causes are as follows.
      • Abnormal redirect: because of DNS hijacking, requests to the service associated with the target server are redirected to unrelated websites.
      • Ad insertion: because of DNS hijacking, an ad is inserted into the page when you access the service associated with the target server.
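
One way to look for signs of DNS hijacking is to compare the addresses that your local resolver returns with the public IP addresses you know belong to the target server. The sketch below is an illustration under assumptions: the helper names `resolve_ipv4` and `unexpected_addresses` are hypothetical, and a mismatch is only a hint, not proof, because a domain may legitimately resolve to several addresses.

```python
import socket


def resolve_ipv4(hostname):
    """Return the set of IPv4 addresses the local resolver gives for hostname."""
    try:
        infos = socket.getaddrinfo(
            hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
        )
    except socket.gaierror:
        # Resolution failed entirely; report no addresses.
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def unexpected_addresses(resolved, expected):
    """Return addresses from DNS that are not among the ones you expect."""
    return set(resolved) - set(expected)
```

If `unexpected_addresses(resolve_ipv4("www.example.com"), {"203.0.113.5"})` is not empty (both values here are placeholders), repeat the lookup after switching to a different DNS server and compare the results.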

Carrier network environment

For the carrier network environment, the possible causes of access exceptions and the client symptoms are as follows.

  • The network policy of the carrier.
    • Factors: based on its policies, the carrier may perform DNS hijacking or block access to certain IP addresses, domain names, or ports.
    • Possible symptoms and causes are as follows.
      • Ad insertion: because of DNS hijacking, an ad is inserted into the page when you access the service associated with the target server.
      • The domain name cannot be accessed but access by IP address is normal: the carrier blocks access to some non-compliant domain names.
      • telnet fails on all ports: the carrier blocks access to some non-compliant IP addresses.
      • telnet fails on only some ports: the carrier blocks access to some high-risk ports.
  • ICP filing.
    • Factors: for servers in the Chinese mainland, ICP filing is required by regulation.
    • Possible symptoms and causes are as follows.
      • Abnormal redirect: the domain name associated with the target server has no ICP filing, so requests to the associated service are redirected to the filing prompt page.
      • The domain name cannot be accessed but access by IP address is normal: if the domain name associated with the target server has no ICP filing, requests to the domain name are redirected to the filing prompt page, but access by IP address is not affected.

Alibaba Cloud network environment

For the Alibaba Cloud network environment, the possible causes of access exceptions and the client symptoms are as follows.

  • Alibaba Cloud Security: shutdown of compromised (zombie) servers.
    • Factors: Alibaba Cloud Security shuts down the target server because it has been compromised, for example it is used as a zombie in attacks or is infected with a virus.
    • Possible symptoms and causes are as follows.
      • ping failed: the server is shut down and cannot be pinged.
      • telnet fails on all ports: the server is shut down and no port can be connected.
  • Alibaba Cloud Security: access interception.
    • Factors: Alibaba Cloud Security blocks the source server because it continuously scans, probes, or attacks other hosts.
      Note: if the local network of the source server accesses the public network through shared NAT, the attack source is not necessarily the customer's own server; it may be another server in the same network. Because these servers share the same public IP address, access from the source server is also affected when Alibaba Cloud Security blocks that IP address.
    • Possible symptoms and causes are as follows.
      • ping failed: Alibaba Cloud Security intercepts the IP address of the source server, so ping is blocked.
      • telnet fails on all ports: Alibaba Cloud Security intercepts the IP address of the source server, so no port can be accessed.
  • Green Network: blocking of non-compliant content.
    • Factors: access to the relevant URL of the target server is blocked because of illegal content.
    • Possible symptoms and causes are as follows.
      • Abnormal redirect: the service on the origin server is abnormal, so requests are redirected to the origin exception prompt page of Anti-DDoS Pro or Web Application Firewall.
      • Some URLs cannot be accessed: the client request for the URL hits a Web Application Firewall rule, so the corresponding blocking prompt page is displayed.
  • Anti-DDoS Pro and Web Application Firewall.
    • Factors: access exceptions occur because the service on the target server is abnormal, or because access behavior from the source server hits a blocking rule of Anti-DDoS Pro or Web Application Firewall.
    • Possible symptoms and causes are as follows.
      • ping failed: the server is shut down and cannot be pinged.
      • telnet fails on all ports: the server is shut down and no port can be connected.
  • Server Load Balancer whitelist.
    • Factors: if the target server is a Server Load Balancer instance and the whitelist is enabled for a listening port, only the specified IP addresses or CIDR blocks can access that port.
    • Possible symptom and cause: telnet fails on only some ports. The IP address of the source server is not in the whitelist, so the corresponding listening port cannot be accessed.
  • The security group of the target server.
    • Factors: a security group rule for the target server blocks access from the source server.
    • Possible symptoms and causes are as follows.
      • ping failed: the security group of the target server has an inbound rule that prohibits ping.
      • telnet fails on all ports: the security group of the target server has inbound drop rules for all ports.
      • telnet fails on only some ports: the security group of the target server has inbound drop rules for the specified ports.

Internal environment of the target ECS instance

For the internal environment of the target ECS instance, the factors that may cause access exceptions and the client symptoms are as follows.

  • The target server is out of service.
    • Factors: access fails because the target server has been stopped for overdue payment.
    • Possible symptoms and causes are as follows.
      • ping failed: the target server cannot be pinged because it is out of service for overdue payment.
      • telnet fails on all ports: the target server is out of service because of overdue payment.
  • Internal configurations of the target server.
    • Factors: security policies on the target server, such as the internal firewall or security software, or internal system problems such as a virus infection, cause access exceptions.
    • Possible symptoms and causes are as follows.
      • ping packet loss: access is abnormal because of internal system problems on the target server, such as a virus infection.
      • ping failed: security policies on the target server, such as security software, block ping.
      • telnet fails on all ports: access is abnormal because of internal operating system problems on the target server, such as a virus infection.
      • telnet fails on only some ports: security policies on the target server, such as security software, block access to those ports.
  • Source address access control in service software.
    • Factors: the service software on the target server restricts access by source IP address, so the source server cannot access the service.
    • Possible symptom and cause: telnet fails on only some ports. The service software on the corresponding port applies source IP address access control and blocks access from the source server.

Flow chart for troubleshooting ECS server access exception

Troubleshooting for ECS access exceptions

Troubleshooting is divided into the following two scenarios.

Troubleshooting for private network access exceptions

If the client accesses the target server over a private network, perform the following steps to identify and handle the access exception. Start with a comparative test: access the target server from other servers at the same time to see whether the exception occurs for all servers.

    • An exception occurs when all servers access the target server.
      If all servers fail to access the target server, the problem probably lies in the security group that the target server belongs to or inside the target server itself, and further troubleshooting is required. Check whether internal access is normal: log on to the server through the management terminal (see connect to a Linux instance by using a management terminal) and run a comparative access test against 127.0.0.1 on the server.
      • Internal access to the target server is also abnormal.
        If internal access to the target server is abnormal, contact the service provider or O&M personnel to check the code configuration and software running status.
      • Internal access to the target server is normal.
        • If internal access to the server is normal, check whether the source server is blocked by the security group that the target server belongs to or by the configuration of security software in the operating system. For more information about security group usage, see security group FAQ.
        • If no obvious exception is found in the security group or in the security software configuration of the operating system, see operations for capturing packets when a network exception occurs. When the exception occurs, capture packets on the client and the server at the same time, then open a ticket and submit the packet capture results to Alibaba Cloud technical support.
    • An exception occurs when only the source server accesses the target server.
      If only the source server fails to access the target server, the problem probably lies in the security group that the source server belongs to, inside the source server, or in the network between the source server and the target server, and further troubleshooting is required. Check whether the telnet port test is normal, that is, whether only ping from the source server to the target server fails while port access is normal.
      • The source server cannot ping the target server, but the telnet port test is normal.
        If the source server cannot ping the IP address of the target server but the port is accessible, check the security group and the security software configuration in the operating system of the target server for rules that disable ping from the source server. For more information about security group usage, see security group FAQ.
      • Both the telnet and ping tests from the source server to the target server are abnormal.
        If both the telnet port test and the ping test are abnormal, further troubleshooting is required. ping the gateway of the source server from the source server to check whether it is normal.
        • The source server cannot ping its own gateway normally.
          If ping from the source server to its own gateway also shows exceptions, such as ping failure or packet loss, check the running status of the source server through system logs, for example server load and network configuration.
        • The source server can ping its own gateway normally.
          If the source server can ping its own gateway, further troubleshooting is required. ping the gateway of the target server from the source server to check whether it is normal.
          • The source server can ping the target gateway.
            If the source server can ping both its own gateway and the gateway of the target server, check the running status of the target server through system logs, for example server load and network configuration.
          • The source server cannot ping the target gateway.
            If the source server can ping its own gateway but ping to the gateway of the target server fails or loses packets, the cause may be an exception in the intermediate network. See operations for capturing packets when a network exception occurs. When the exception occurs, capture packets on the client and the server at the same time, then open a ticket and submit the packet capture results to Alibaba Cloud technical support.
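
The decision steps above can be summarized as a small function that returns the next action for a given set of observations. This is a sketch for illustration only: the function name `next_step_private` and its parameters are hypothetical, and the "ping normal but port fails" branch is inferred from the factor lists earlier in this article rather than from the steps above. Each argument is a result you have observed yourself (True or False); None means the test has not been run yet.

```python
def next_step_private(all_sources_fail, local_ok=None, ping_ok=None,
                      port_ok=None, own_gw_ok=None, target_gw_ok=None):
    """Return the next troubleshooting action for private network access."""
    if all_sources_fail:
        # Every server fails: suspect the target security group or the target itself.
        if local_ok is None:
            return "test the service on the target server through 127.0.0.1"
        if not local_ok:
            return "contact the service provider or O&M personnel"
        return "check the target security group and OS security software"

    # Only this source server is affected.
    if ping_ok is None or port_ok is None:
        return "run ping and telnet from the source server to the target server"
    if ping_ok and port_ok:
        return "no exception observed at the network level"
    if port_ok:
        return "check security group and security software rules that disable ping"
    if ping_ok:
        # Inferred from the factor lists, not from the steps above.
        return "check security group and security software rules for that port"

    # Both ping and telnet fail: walk the gateways.
    if own_gw_ok is None:
        return "ping the gateway of the source server"
    if not own_gw_ok:
        return "check the source server through system logs"
    if target_gw_ok is None:
        return "ping the gateway of the target server from the source server"
    if target_gw_ok:
        return "check the target server through system logs"
    return "capture packets on both ends and open a ticket"
```

Calling the function again after each new test result walks through the same branches as the list above, one step at a time.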

Troubleshooting for access exceptions through the public network

If the client accesses the target server over the public network, perform the following steps to identify, investigate, and handle the access exception.

  1. URL access problem determination: check whether an ad is inserted into the page when the client accesses the services of the target server.
    • An ad is inserted into the page.
      If an ad is inserted into the page, check whether internal access on the server is normal. Log on to the server through the management terminal (see connect to a Linux instance by using a management terminal) and run a comparative access test against 127.0.0.1 on the server.
      • Internal access to the target server is also abnormal.
        If internal access to the target server is abnormal, contact the service provider or O&M personnel to check the code configuration and software running status.
      • Internal access to the target server is normal.
        If internal access to the target server is normal, the cause is an abnormal local network or hijacking by the local carrier. Try changing the address of the local DNS server to see whether this solves the problem. If the problem persists, contact your local network department for troubleshooting or report the issue to your local carrier.

    • No ad is inserted into the page.
      If no ad is inserted into the page, check whether an abnormal redirect occurs when the client accesses the services of the target server.
      • An abnormal redirect occurs on the page.
        If an abnormal redirect occurs, log on to the server through the management terminal (see connect to a Linux instance by using a management terminal) and run a comparative access test against 127.0.0.1 on the server.
        • Internal access to the target server is also abnormal.
          If internal access to the target server is abnormal, contact the service provider or O&M personnel to check the code configuration and software running status.
        • Internal access to the target server is normal.
          If internal access to the target server is normal, handle the issue according to the page that you are redirected to, as follows:
          • ICP filing prompt page: see the website still prompts for not filing after filing.
          • Anti-DDoS Pro exception page: see Alibaba Cloud Security Anti-DDoS Pro reports a 502 error and Alibaba Cloud Security Anti-DDoS Pro reports a 504 error. For more information, see the Anti-DDoS Pro documentation on errors.
          • Blocked pages: for information about how to unblock pages, see harmful Internet information.
          • Web Application Firewall blocking page: if the Web Application Firewall of Alibaba Cloud Security returns a 405 error, see how to resolve 405 error.

      • No abnormal redirect occurs on the page.
        If no abnormal redirect occurs, continue with the following steps for further troubleshooting and analysis.
  2. Problem scope determination: if the problem is not a URL access exception, determine its scope through comparative analysis. Use a third-party probe (dial test) platform to run access tests from locations across the country and determine whether all networks see the same exception when accessing the target server.
    • All networks fail to access the target server.
      If access fails from all external networks, log on to the server through the management terminal (see connect to a Linux instance by using a management terminal) and run a comparative access test against 127.0.0.1 on the server.
      • Internal access to the target server is also abnormal.
        If internal access to the target server is abnormal, contact the service provider or O&M personnel to check the code configuration and software running status.
      • Internal access to the target server is normal.
        If internal access to the target server is normal, check whether the security group of the target server or the security configuration in the system restricts access. For more information about security group usage, see security group FAQ.

    • An exception occurs when only the source server accesses the target server.
      If only the source server fails to access the target server, continue with the following steps for further troubleshooting and analysis.
  3. Problem and symptom determination: if only the source server fails to access the target server, troubleshoot further with ping and telnet tests. First, ping the IP address of the target server from the client to check whether it is normal.
    • The client cannot ping the target server normally.
      If ping from the client to the target server loses packets or fails, the cause may be an exception in the intermediate link or on the peer server. Run an MTR link test for further troubleshooting and analysis. For more information, see link testing description for ping packet loss or ping failure.
      Note: see the link test procedure section of that document.
    • The client can ping the target server, but the port cannot be accessed.
      If the client can ping the target server but the port cannot be accessed, check the security group that the target server belongs to and its internal security settings for a policy that blocks access to the corresponding port from the client.
      • The target server blocks the client's access to certain ports.
        If a blocking policy applies to the source server, adjust the policy accordingly. For more information about security group usage, see security group FAQ.
      • The target server has no blocking policy.
        If the target server has no blocking policy for the source server, the port may be blocked by the carrier. Use tracetcp or similar tools to trace where the port is blocked. For more information, see port availability test when ping succeeds but Port fails.
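
The 127.0.0.1 comparison test used in the steps above can be sketched as follows for a web service. This is a minimal sketch under assumptions: it assumes the service on the target server speaks plain HTTP, and the function name `local_http_status`, the default port 80, and the path "/" are illustrative values that do not come from the original article. Run it on the target server itself.

```python
import urllib.error
import urllib.request


def local_http_status(port=80, path="/", timeout=5.0):
    """Request http://127.0.0.1:<port><path> and return the HTTP status code.

    Returns None if nothing answers on that port.
    """
    url = f"http://127.0.0.1:{port}{path}"
    # An empty ProxyHandler keeps the request off any configured HTTP proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The service answered, but with an error status such as 404 or 502.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or otherwise failed.
        return None
```

If the local request returns the expected status while clients on the public network see an ad, a redirect, or a failure, the cause is more likely outside the server (local network, carrier, or security products) than in the service itself.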

Instructions on submitting a ticket for ECS access exceptions

If the problem persists after the preceding steps, run the following tests, record each result, and then open a ticket to contact Alibaba Cloud technical support.

Client access through a private network

If the client accesses the target server over a private network, perform the following tests and record the results.

  1. Run the same access test against the target server from different servers to check whether they show the same exception symptoms.
  2. ping the IP address of the target server to check whether it is normal.
  3. telnet the corresponding port on the target server to check whether it is normal.
  4. ping the gateway of the source server from the source server to check whether it is normal.
  5. ping the gateway of the target server from the target server to check whether it is normal.
  6. ping the gateway of the target server from the source server to check whether it is normal.
  7. ping the gateway of the source server from the target server to check whether it is normal.
  8. Determine whether to capture packets based on the actual situation. If you need to capture packets, see operations for capturing packets when a network exception occurs. When the exception occurs, capture packets on the source server and the target server at the same time.
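
The ping checks in the list above can be scripted so that the results are recorded in a consistent way. The sketch below is an illustration, not an official tool: the function names are hypothetical, it assumes a Linux or macOS `ping` command that accepts `-c <count>` and prints a summary line containing "% packet loss", and the mapping to "ping normal", "ping packet loss", and "ping failed" follows the symptom names used in this article.

```python
import re
import subprocess

# Matches the summary line of Linux/macOS ping, e.g. "20% packet loss".
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def parse_packet_loss(ping_output):
    """Extract the packet loss percentage from ping output, or None if absent."""
    match = LOSS_RE.search(ping_output)
    return float(match.group(1)) if match else None


def ping_loss(host, count=10, timeout_s=30):
    """Run the system ping command and return packet loss in percent."""
    try:
        proc = subprocess.run(
            ["ping", "-c", str(count), host],
            capture_output=True, text=True, timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return parse_packet_loss(proc.stdout)


def describe_loss(loss):
    """Translate a loss percentage into the symptom names used in this article."""
    if loss is None:
        return "ping result unknown"
    if loss == 0:
        return "ping normal"
    if loss >= 100:
        return "ping failed"
    return "ping packet loss"
```

Run `describe_loss(ping_loss(address))` once for each address in steps 2 and 4 to 7, from the server named in that step, and note the result next to the step number.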

Client access through the public network

If the client accesses the target server over the public network, perform the following tests and record the results.

  1. Run the same access test against the target server from different network environments in different regions to check whether they show the same exception symptoms.
  2. Check whether an ad is inserted into the page.
  3. Check whether an abnormal redirect occurs on the page.
  4. ping the target server from the client to check whether it is normal.
  5. telnet the corresponding service port on the target server from the client to check whether it is normal.
  6. If the ping test shows an exception, such as packet loss or interruption, see link test description for ping packet loss or ping failure, perform the test, and record the test data.
    Note: see the link test procedure section of that document.
  7. If the ping test is normal but the port cannot be accessed, see description of port availability test when the ping succeeds but the port fails, perform the test, and record the test data.
  8. Determine whether to capture packets based on the actual situation. If you need to capture packets, see operations for capturing packets when a network exception occurs. When the exception occurs, capture packets on the source server and the target server at the same time.
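
To keep the recorded results together when you open a ticket, you can collect them in one structured report. The following is a minimal sketch: the report layout and the names `make_report` and `report_to_text` are assumptions made for illustration, not a format required by Alibaba Cloud technical support.

```python
import json
from datetime import datetime, timezone


def make_report(source, target, checks):
    """Build a report from (step, result) pairs collected during the tests."""
    return {
        "recorded_at": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "source": source,
        "target": target,
        "checks": [{"step": step, "result": result} for step, result in checks],
    }


def report_to_text(report):
    """Serialize the report as indented JSON for pasting into a ticket."""
    return json.dumps(report, indent=2, ensure_ascii=False)
```

For example, `report_to_text(make_report("203.0.113.5", "198.51.100.7", [("ping target", "ping packet loss"), ("telnet 443", "failed")]))` produces a text block that lists each step with its result; the addresses here are placeholders.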

Application scope

  • Elastic Compute Service