Failed to connect to the Linux instance

Last Updated: Dec 15, 2020

Disclaimer: This article may contain information about third-party products. Such information is for reference only. Alibaba Cloud does not make any guarantee, express or implied, with respect to the performance and reliability of third-party products, as well as potential impacts of operations on the products.

Introduction

This article describes cases where you cannot remotely log on to a Linux instance and the troubleshooting methods.

Background

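This topic covers two approaches to resolving a remote logon failure on a Linux instance: common error cases that are identified by their error message, and a step-by-step troubleshooting process.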

Common error cases

The following are common cases in which you cannot remotely log on to a Linux instance through SSH. You can select different solutions for troubleshooting based on the actual error message.

PAM security framework

The Linux PAM security framework loads security modules that enforce the account policies and logon policies of ECS instances. If the related configuration is abnormal or a policy is triggered, SSH logon may fail. See the following common cases for solutions according to different error messages:

Configure the Linux environment

If an exception occurs in the Linux system environment, such as a virus infection, an incorrect account configuration, or an incorrect environment variable configuration, SSH logon may also fail. See the following common cases for solutions according to different error messages:

SSH service and parameter configuration

The default configuration file for the SSH service is /etc/ssh/sshd_config. If parameters in this file are configured incorrectly, or if certain features or policies are enabled, SSH logon may also fail. See the following common cases for solutions according to different error messages:

SSH service associated directory or file configuration

For security reasons, the SSH service checks the permissions and the owner group of related directories and files while it runs. If the permissions are too permissive or too restrictive, the service may run abnormally and the client may fail to log on. See the following common cases for solutions according to different error messages:

SSH service key configuration

The SSH service uses asymmetric encryption technology to encrypt the transmitted data. The client and server exchange and verify the validity of the relevant key information. See the following common cases for solutions according to different error messages:

Troubleshooting

If none of the common error cases resolves the problem, follow this process for troubleshooting:

  1. Check CPU load, bandwidth, and memory usage
  2. Client troubleshooting
  3. Intermediate network
  4. Security group check
  5. Case for problems

Description

  • The following operations have been tested on a CentOS 6.5 64-bit operating system and may differ on other Linux distributions. For details, see the official documentation for the corresponding Linux distribution.
  • You can use SSH to connect to a Linux instance from a client. The management terminal can also be used for temporary O&M operations or for troubleshooting when an SSH logon exception occurs on the client.
  • When you cannot remotely log on to a Linux instance through SSH, many factors may be involved.

Check CPU load, bandwidth, and memory usage

  1. Check whether the CPU load is too high. For more information about how to check and troubleshoot CPU load, see query and case analysis of Linux CPU load.
    • If the CPU load is too high during a certain period of time, the remote connection may fail. We recommend that you check whether the running programs or the instance resources fail to meet your current requirements.
    • If the CPU load is not too high, continue to the next step.
  2. Check whether the public network bandwidth is insufficient, which can also cause the remote connection to fail. To solve this problem, renew the ECS instance and then restart it. For more details, see manual renewal or auto-renewal. To check the bandwidth, perform the following operations:
    1. Log on to the ECS console.
    2. Locate the instance and click manage in the actions column. On the instance details page that appears, view the network monitoring data.
    3. Check whether the server bandwidth is "1K" or "0K". If you did not purchase public network bandwidth when you purchased the instance, later upgraded the public network bandwidth, and did not select the required bandwidth when renewing the instance, the bandwidth becomes "1K".
  3. Check whether memory is insufficient. If the session cannot be displayed after you enter the password for the remote connection, and the connection exits without any error message, the server may be short of memory. To check the memory usage of the server, perform the following operations:
    1. Use the remote connection function in the console to log on to the Linux instance.
    2. Check the memory usage. For more information, see the linked topic on how to view physical CPU and memory information on a Linux instance. After you confirm that memory is insufficient, handle the shortage accordingly.
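
The memory check in step 3 can be sketched as a small shell function. This is only an illustration, not part of the original procedure: the function name mem_used_pct and its optional file argument are hypothetical, and the calculation assumes that free, buffer, and page-cache memory together approximate what is reclaimable, which suits older kernels (such as the one in CentOS 6) that do not report MemAvailable.

```shell
#!/bin/sh
# mem_used_pct [FILE]: print the percentage of memory in use as an integer.
# FILE defaults to /proc/meminfo; the argument exists only so that the
# function can be tried against a saved copy of the file.
# Assumption: MemFree + Buffers + Cached approximates reclaimable memory.
mem_used_pct() {
    awk '
        /^MemTotal:/ { total = $2 }
        /^MemFree:/  { free = $2 }
        /^Buffers:/  { buffers = $2 }
        /^Cached:/   { cached = $2 }
        END {
            # Fail instead of dividing by zero when MemTotal is missing.
            if (total <= 0) exit 1
            printf "%d\n", (total - free - buffers - cached) * 100 / total
        }
    ' "${1:-/proc/meminfo}"
}
```

Calling mem_used_pct with no argument in the management terminal prints a single number. A value that stays close to 100 suggests that the instance is short of memory. The uptime and free -m commands show the load average and the memory figures directly.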

Client troubleshooting

If you cannot log on from a client, use a different SSH client to test the logon with the same account information. If that logon succeeds, the configuration of the original client is at fault, and you need to troubleshoot the client configuration or the running status of the client software. For more information about how to log on to a Linux instance from an SSH client, see connect to a Linux instance.

Step 1: Log on to the instance by using the management terminal

If you cannot connect to the instance remotely for any reason, first use the remote connection function provided by Alibaba Cloud to confirm that the instance is still responding and is not completely down. Then troubleshoot the fault according to its cause.

  1. Log on to the ECS console. In the left-side navigation pane, choose instances > remote connection.
  2. When you connect to the instance for the first time or if you forget the password, click change the VNC connection password to modify the password for the remote connection.
  3. Then, connect to the instance by using the remote connection password.

Step 2: Check whether the local network of the client is abnormal

Check whether a local fault prevents the client from connecting to the Internet.

  • If such a fault exists, check whether the network interface controller driver is abnormal. Then log on to the instance by using the management terminal, view the /etc/hosts.deny file, and check whether it blocks the IP address of the client. If it does, delete that IP configuration.
  • If no such fault exists, continue to the next check.
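
The /etc/hosts.deny check described above can be sketched as follows. The function name denied_entries and the example addresses are hypothetical and serve only as an illustration. The sketch assumes that a literal match on the client address is enough, so entries that block a whole subnet or use wildcards are not detected.

```shell
#!/bin/sh
# denied_entries IP [FILE]: print the active (non-comment) lines of FILE that
# mention IP as a whole word. FILE defaults to /etc/hosts.deny.
# The exit status is 0 when at least one matching entry is found.
# Assumption: only literal addresses are matched, not subnet or wildcard rules.
denied_entries() {
    grep -v '^[[:space:]]*#' "${2:-/etc/hosts.deny}" | grep -F -w -e "$1"
}
```

For example, denied_entries 203.0.113.5 prints every active line that names that address. If a line is printed, remove it from /etc/hosts.deny and test the logon again.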

Step 3: Restart the instance

Make sure that the logon password is correct. If you have reset the instance password, check whether the instance was restarted after the reset. If there is a record of an instance password change but no record of an instance restart, the new password may not have taken effect. Follow these steps to restart the instance:

  1. Log on to the ECS console. In the left-side navigation pane, click instances.
  2. At the top of the page, select a region, choose more > Instance status > Restart the instance, and then click OK.

Intermediate network

Troubleshooting the intermediate network includes a network check and a port check.

Network check

If you cannot remotely connect to a Linux instance, check whether the network is normal.

  1. Test the connection from computers in different network environments, such as different network segments or different carriers, to determine whether the problem is caused by the local network or by the server. If the problem is caused by the local network or the carrier, contact the local IT personnel or the carrier. If the network interface controller driver is abnormal, reinstall it. After you rule out local network faults, proceed to the next step.
  2. Run the ping command on the client to test the network connectivity with the instance.

Port check

After the network is checked, further check whether the port is normal.

  1. Log on to the instance using the management terminal and run the following command to edit the SSH configuration file:
    vi /etc/ssh/sshd_config
  2. Locate the line that contains "#Port 22", and check whether the default port 22 has been changed and whether the leading "#" has been deleted. If not, delete the leading "#" and change port 22 to another port. Save the settings and exit.
    Note: The listening port must be in the range 0 to 65535. If the listening port is configured incorrectly, the remote connection service fails to listen.
  3. Run the following command to restart the SSH service:
    /etc/init.d/sshd restart
    Note: You can also run the service sshd restart command to restart the SSH service.
  4. Use the Web server that comes with Python to create a temporary listening port for testing.
    python -m SimpleHTTPServer [$Port]
    Note: SimpleHTTPServer is a Python 2 module. With Python 3, run python3 -m http.server [$Port] instead.
  5. If the modified port number is not allowed in the ECS security group rules, you must add it to the rules. For more information about how to add a security group rule, see add security group rules.
    Note: By default, the ECS security group rules allow port 22. After you modify the remote connection port, you must allow the modified port in the security group rules.
  6. Run the following command on the client to test whether the port configured in the previous steps is reachable. If the port test fails, see the description of port availability tests for the case in which the ping command succeeds but the port is unreachable.
    telnet [$IP] [$Port]
    Note:
    • [$IP] indicates the IP address of a Linux instance.
    • [$Port] indicates the SSH Port number of a Linux instance.
    For example, run the telnet 192.168.0.1 22 command. Normally, the server returns the version number of its SSH software.
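
Before testing the port from the client, it can help to confirm which port the configuration file actually specifies. The following is a minimal sketch. The function name sshd_port is hypothetical, and the sketch assumes that only the first active Port directive matters, although sshd accepts several.

```shell
#!/bin/sh
# sshd_port [FILE]: print the port that sshd is configured to listen on.
# FILE defaults to /etc/ssh/sshd_config. A commented-out "#Port 22" line is
# ignored, in which case sshd uses its built-in default port, 22.
# Assumption: only the first active Port directive is reported.
sshd_port() {
    awk '
        # Keywords in sshd_config are case-insensitive.
        tolower($1) == "port" { print $2; found = 1; exit }
        END { if (!found) print 22 }
    ' "${1:-/etc/ssh/sshd_config}"
}
```

The printed value is the port to use as [$Port] in the telnet test and the port to allow in the security group rules.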

Security group check

Check whether the security group configuration allows the remote connection port.

  1. To view the security group rules, see search security group rules. If the remote connection port is not allowed, see set security group policies for Linux instances after enabling SSH.
  2. Check whether the ECS instance can be pinged. If the ping still fails after you have ruled out iptables and the network interface controller IP configuration and have rolled back the system, the default Internet rules of the ECS instance security group may have been deleted. In this case, you need to reconfigure the Internet rules of the security group. For more information, see the default Internet rule of the ECS instance security group is deleted and cannot be pinged. If this is not the case, continue to the next step.

Case for problems

If you still cannot log on after troubleshooting according to the preceding scenarios, we recommend that you perform further troubleshooting and analysis as follows:

  1. Compare access from different SSH clients and from the management terminal to determine whether the problem is caused by the configuration or software of an individual client.
  2. Test network connectivity as described in the Intermediate network section.
  3. Log on to the ECS instance by using the management terminal, and run the following command to view the related logs while you perform access tests from the client.
    tailf /var/log/secure
    Note: If the tailf command is not available, run tail -f /var/log/secure instead.
  4. Run the following command on the client, for example ssh -v 192.168.0.1, to obtain a detailed log of the SSH logon interaction in a Linux environment.
    ssh -v [$IP]
  5. Log on to the Linux instance through the management terminal and follow these steps to check the running status of the SSH service.
    1. Run the following commands to check the service status and restart the service:
      service sshd status
      service sshd restart
      Normally, the running status and process ID of the SSH service are returned. A similar output is displayed:
      [root@centos ~]# service sshd status
      openssh-daemon (pid   31350) is running...
      [root@centos ~]# service sshd restart
      Stopping sshd:                                             [  OK  ]
      Starting sshd:                                             [  OK  ]
    2. Run the following command to check whether the service is listening:
      netstat -ano | grep 0.0.0.0:22
      Normally, the corresponding listening port is returned. The command output is similar to the following:
      tcp     0    0 0.0.0.0:22      0.0.0.0:*      LISTEN    off (0.00/0/0)
    3. Log on to the Linux instance through the management terminal and run the following command. If the logon succeeds, the system firewall or an external security group policy is probably misconfigured and is preventing the client from logging on.
      ssh 127.0.0.1
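
The listener check in step 5 can be made slightly stricter with a small filter. This sketch is an illustration only, and the function name is_listening is hypothetical. It reads netstat output on standard input and compares the port exactly, so a listener on port 2222 is not mistaken for port 22, which a plain grep for 0.0.0.0:22 would allow. It assumes the usual netstat column layout, in which the local address is column 4 and the state is column 6.

```shell
#!/bin/sh
# is_listening PORT: read "netstat -ant"-style lines on standard input and
# succeed if a socket in the LISTEN state is bound to PORT on any address.
# Assumption: column 4 is the local address and column 6 is the state.
is_listening() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            # The port is the text after the last ":" (also works for :::22).
            n = split($4, parts, ":")
            if (parts[n] == port) hit = 1
        }
        END { exit !hit }
    '
}
```

For example, netstat -ant | is_listening 22 && echo "port 22 is listening" prints the message only when a listener on exactly port 22 exists.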

 

If you cannot connect to the instance by using the remote connection function provided by Alibaba Cloud, restart the instance. A restart operation will stop your instance from running and interrupt business. Exercise caution when performing this operation.

Note: Before you restart an instance, create a snapshot or an image of the instance to back up its data. For more information about how to create a snapshot, see create a snapshot.

  1. Log on to the ECS console. In the left-side navigation pane, click instances.
  2. At the top of the page, select a region, choose more > Instance status > Restart the instance, and then click OK.

Application scope

  • Elastic Compute Service