What do I do if I cannot connect to a Linux instance?

Last Updated: Aug 11, 2023

This topic describes how to troubleshoot failures to connect to a Linux Elastic Compute Service (ECS) instance.

Causes

An SSH connection failure may be caused by various factors, such as the Pluggable Authentication Module (PAM) framework, security group settings, and SSH configurations. When you cannot connect to a Linux instance, perform different operations based on one of the following scenarios to troubleshoot the issue:

You want to log on to the Linux instance

If you want to log on to the Linux instance, perform the following steps to check the status of the instance and then send a command to the Linux instance by using Cloud Assistant or log on to the instance by using Virtual Network Computing (VNC):

Step 1: Check the status of the instance

Before you can identify the cause of the connection failure, you must check the status of the instance. An instance can provide external services only if the instance is in the Running state. Perform the following steps:

  1. Log on to the ECS console.

  2. In the left-side navigation pane, choose Instances & Images > Instances.

  3. In the upper-left corner of the top navigation bar, select the region where the Linux instance resides.

  4. On the Instances page, click the ID of the Linux instance, check the lifecycle status and health status of the instance, and then use an appropriate tool to log on to the instance.

    • If the instance is in a lifecycle state and a health state as described in the following table, you can perform the operations in Step 2: Log on to the instance by using VNC.

      Instance lifecycle status   Instance health status   Logon tool
      Starting                    Initializing             VNC
      Running                     Initializing             VNC
      Running                     OK or Impaired           VNC or Workbench
      Stopping                    InsufficientData         VNC
      Stopped                     InsufficientData         None

    • If the instance is in a lifecycle state that is not described in the preceding table, resolve the issue based on the lifecycle state.

      For more information about instance lifecycle states, see Instance lifecycle.
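The table above can be read as a simple lookup from a state pair to a logon tool. The following is a minimal sketch of that lookup in shell. The function name logon_tool is hypothetical; the state names are taken from the table.

```shell
#!/bin/sh
# logon_tool LIFECYCLE HEALTH
# Prints the logon tool that the table lists for a lifecycle/health state pair.
# The function name is hypothetical; the state names come from the table.
logon_tool() {
    case "$1/$2" in
        Starting/Initializing)        echo "VNC" ;;
        Running/Initializing)         echo "VNC" ;;
        Running/OK|Running/Impaired)  echo "VNC or Workbench" ;;
        Stopping/InsufficientData)    echo "VNC" ;;
        Stopped/InsufficientData)     echo "None" ;;
        *)                            echo "Unknown: see Instance lifecycle" ;;
    esac
}

logon_tool Running OK
```

Any pair that is not in the table falls through to the last branch, which matches the guidance to resolve the issue based on the lifecycle state.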

Step 2: Log on to the instance by using VNC

If Cloud Assistant is not available or cannot meet your business requirements, you can use VNC to log on to the instance.

  1. Log on to the ECS console.

  2. In the left-side navigation pane, choose Instances & Images > Instances.

  3. In the upper-left corner of the top navigation bar, select the region where the Linux instance resides.

  4. On the Instances page, find the instance and click Connect in the Actions column.

  5. In the Remote connection dialog box, click Show Other Logon Methods and then click Sign in now in the VNC section.

  6. Log on to the operating system of the instance.

    1. Enter a username, such as root or ecs-user, and press the Enter key.

    2. Enter the password that corresponds to the username and press the Enter key.

      Note

      The characters of the password are hidden when you enter the password to log on to the Linux instance.

Step 3: Use Cloud Assistant to send a command to the Linux instance

Use Cloud Assistant provided by Alibaba Cloud to send a command to the Linux instance.

  1. Log on to the ECS console.

  2. In the left-side navigation pane, choose Instances & Images > Instances.

  3. In the upper-left corner of the top navigation bar, select the region where the Linux instance resides.

  4. On the Instances page, click the ID of the Linux instance.

  5. On the Instance Details page, click the Remote Commands/Files tab and then click Send Remote Commands.

  6. Enter a command and click Run to run the command on the Linux instance without the need to log on to the instance.

    For information about Cloud Assistant, see Overview.
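As an example of a command that could be sent this way, the following sketch reports whether the SSH daemon is active on the instance. It assumes a systemd-based distribution. The function name sshd_state is hypothetical, and the unit names sshd and ssh are assumptions that depend on the distribution.

```shell
#!/bin/sh
# sshd_state
# Reports whether the SSH daemon is active. The unit is typically named
# "sshd" on RHEL-family systems and "ssh" on Debian-family systems, so
# both names are tried. Assumes systemd; the function name is hypothetical.
sshd_state() {
    for unit in sshd ssh; do
        if systemctl is-active --quiet "$unit" 2>/dev/null; then
            echo "active ($unit)"
            return 0
        fi
    done
    echo "inactive"
    return 1
}

sshd_state || true
```

If the output is inactive, the SSH service is not running under either unit name, which would explain a failed SSH connection even when the network path is healthy.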


No error message is returned

If no error message is returned when you cannot connect to a Linux instance that is in the Running state, perform the following steps to troubleshoot the issue:

Step 1: Use Alibaba Cloud Workbench to connect to the instance

Use Workbench to connect to the Linux instance. If you cannot connect to the instance by using Workbench, Workbench reports an error message and a solution. Perform the following steps:

  1. Log on to the ECS console.

  2. In the left-side navigation pane, choose Instances & Images > Instances.

  3. In the upper-left corner of the top navigation bar, select the region where the Linux instance resides.

  4. On the Instances page, find the instance and click Connect in the Actions column.

  5. In the Remote connection dialog box, click Sign in now in the Workbench section.

  6. Check whether you can connect to the instance.

    In the Instance Login dialog box, the basic information about the instance is automatically populated by Workbench. Make sure that the basic information is correct, and enter a username and authentication information for the instance. Perform actions based on whether you can connect to the instance by using Workbench. For information about how to use Workbench to connect to a Linux instance, see Connect to a Linux instance by using a password or key.

    • If you cannot connect to the instance by using Workbench, troubleshoot the issue based on the error message and solution that are returned by Workbench. After you troubleshoot the issue, use Workbench to connect to the instance.

    • If you can connect to the instance by using Workbench, SSH works as expected on the instance. In this case, perform the operations in Step 2: Check network connectivity.

Step 2: Check network connectivity

If you cannot connect to a Linux instance, check the network connectivity of the instance.

  1. Use computers in different CIDR blocks or on different carrier networks to connect to the instance, to determine whether the issue lies with your local network or with the server.

    • If the issue is related to your local network or your carrier, contact your local IT personnel or your carrier.

    • If an exception occurs on a network interface controller (NIC) driver, reinstall the driver.

  2. Run the ping command on your on-premises client to test the network connectivity of the instance.
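The ping test in the last step can be sketched as a small shell function. This is only an illustration: the function name check_ping is hypothetical, and the -c option is the one accepted by the ping implementations on Linux and macOS. Note that a failed ping does not by itself prove that the instance is down, because ICMP traffic may be blocked along the path.

```shell
#!/bin/sh
# check_ping HOST
# Sends four ICMP echo requests to HOST and reports whether ping succeeded.
# The function name is hypothetical.
check_ping() {
    if ping -c 4 "$1" >/dev/null 2>&1; then
        echo "reachable"
    else
        echo "unreachable"
    fi
}
```

For example, check_ping 203.0.113.10 tests a placeholder address from the documentation range; replace it with the public IP address of the instance.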

Step 3: Check the ports and security groups of the instance

Check whether the required connection ports are open in the security groups of the instance.

  1. Log on to the ECS console.

  2. In the upper-left corner of the top navigation bar, select the region where the Linux instance resides.

  3. On the Instances page, click the ID of the instance.

  4. Click the Security Groups tab. In the security group list section, find a security group and click Add Rules in the Actions column.

  5. Click a tab based on the direction of the security group rule that you want to add.

  6. On the Security Group Rules page, use one of the following methods to add a security group rule. For more information, see Add a security group rule.

    • Method 1: Use the Quick Add feature to add a security group rule

      • Action: Allow

      • Port Range: SSH (22)

      • Authorization Object: 0.0.0.0/0, which indicates all IP addresses

    • Method 2: Manually add a security group rule

      • Action: Allow.

      • Priority: 1, which indicates the highest priority. A smaller number indicates a higher priority.

      • Protocol Type: Custom TCP.

      • Port Range: SSH (22).

      • Authorization Object: 0.0.0.0/0, which indicates all IP addresses. Alternatively, specify authorization objects based on your business requirements.

  7. Run the following command to check whether the SSH port is open on the Linux instance:

    telnet [$IP] [$Port]

    Note

    • [$IP] specifies the IP address of the instance.

    • [$Port] specifies the SSH port number of the instance.

    Sample command: telnet 192.168.0.1 22. The following command output indicates that the SSH port is open on the instance:

    Trying 192.168.0.1 ...
    Connected to 192.168.0.1.
    Escape character is '^]'.

    If the SSH port is not open on the instance, perform the operations that are described in What do I do if I cannot ping the public IP address of an ECS instance? to troubleshoot the issue.
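On clients where telnet is not installed, the same port check can be approximated with bash alone. The following sketch is an alternative to the telnet command, not a replacement prescribed by this topic. The function name port_open is hypothetical, and the sketch assumes that bash and the coreutils timeout command are available on the client.

```shell
#!/bin/sh
# port_open HOST PORT
# Returns 0 if a TCP connection to HOST:PORT succeeds within 5 seconds.
# Relies on the /dev/tcp pseudo-device of bash and on the timeout command.
# The function name is hypothetical.
port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# 192.168.0.1 and 22 are the sample values from the telnet example.
if port_open 192.168.0.1 22; then
    echo "port 22 is open"
else
    echo "port 22 is closed or filtered"
fi
```

A result of closed or filtered cannot distinguish between a security group rule that drops the traffic and an SSH service that is not listening, so the security group rules and the service state both need to be checked.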

Step 4: Check the CPU load, bandwidth usage, and memory usage of the instance

A connection failure may occur because the instance has a high CPU load, insufficient public bandwidth, or insufficient memory.

  1. Check the CPU load on the instance and perform operations based on the check result:

    • If the CPU load is high, upgrade the instance type of the instance.

      If the applications that are hosted on the instance perform large numbers of disk read/write operations, initiate large numbers of network requests, or generate compute-intensive workloads, the CPU load on the instance becomes high. In this case, we recommend that you upgrade the instance type to resolve resource bottleneck issues. For more information, see Overview of instance configuration changes.

      Note

      For information about how to resolve the high-CPU-load issue, see Query and case analysis Linux CPU load.

    • If the CPU load is not high, proceed to the next step.

  2. Troubleshoot the low-public-bandwidth issue.

    Insufficient public bandwidth can also prevent you from connecting to the instance. To troubleshoot the issue, perform the following steps:

    1. Log on to the ECS console.

    2. In the upper-left corner of the top navigation bar, select the region where the Linux instance resides.

    3. On the Instances page, click the ID of the instance. In the Basic Information section, view the value of the Current Bandwidth parameter.

      If the value is 0 Mbps, the instance does not have public bandwidth. To allocate public bandwidth to the instance, upgrade the public bandwidth configurations. For more information, see the Modify public bandwidth section of the "Overview of instance configuration changes" topic.

  3. Troubleshoot the low-memory issue.

    If the desktop of the Linux instance is not displayed as expected and the session exits without an error message after you connect to the instance, a possible cause is insufficient memory. In this case, check the memory usage of the instance. Perform the following steps:

    1. Log on to the instance by using VNC.

      For more information, see Connect to a Linux instance by using a password.

    2. Check the memory usage. If the instance memory is low, we recommend that you upgrade the instance to a larger instance type. For more information, see Overview of instance configuration changes.

      For information about how to view the memory information about a Linux instance, see How to View physical CPU and memory information for ECS instances in Linux.
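The CPU load and memory checks in this step can be run from a VNC session with standard Linux tools. The following is a minimal sketch. The function name load_is_high is hypothetical, and the threshold (a 1-minute load average above the number of CPU cores) is a common rule of thumb assumed here, not a value defined by this topic.

```shell
#!/bin/sh
# load_is_high LOAD CORES
# Succeeds when LOAD exceeds CORES. Comparing the load average with the
# core count is a rule of thumb (an assumption, not a documented threshold).
# The function name is hypothetical.
load_is_high() {
    awk -v load="$1" -v cores="$2" 'BEGIN { exit !(load > cores) }'
}

load1=$(cut -d ' ' -f 1 /proc/loadavg)   # 1-minute load average
cores=$(nproc)                            # number of available CPU cores

if load_is_high "$load1" "$cores"; then
    echo "CPU load is high: $load1 on $cores cores"
else
    echo "CPU load is not high: $load1 on $cores cores"
fi

# Total and available memory as reported by the kernel, in kB.
grep -E '^(MemTotal|MemAvailable):' /proc/meminfo
```

The free -m command presents the same memory figures in MiB. A MemAvailable value that is small relative to MemTotal suggests that the instance is short on memory.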

An error message is returned

In most cases, an error message is returned when a connection failure occurs. You can identify and resolve the issue based on the error message.

PAM framework

The PAM framework in Linux can load appropriate security modules and enforce access control policies, such as account policies and logon policies. If configurations are invalid or relevant policies are triggered, SSH logon may fail. In this case, refer to one of the following topics based on the returned error message to troubleshoot the logon failure issue:

System environment of the Linux instance

Exceptions, such as virus infection, invalid account configurations, and invalid environment configurations, in the system environment of a Linux instance may also cause SSH logon to fail. In this case, refer to one of the following topics based on the returned error message to troubleshoot the logon failure issue:

SSH service and parameter settings

The default configuration file of the SSH service is /etc/ssh/sshd_config. If the parameter settings in the configuration file are invalid or if relevant features or policies are enabled in the configuration file, SSH logon may fail. In this case, refer to one of the following topics based on the returned error message to troubleshoot the logon failure issue:
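Before you work through an error-specific topic, it can help to see which values are actually set in the configuration file. The following sketch prints the first uncommented value of a keyword, relying on the documented sshd behavior that the first value found for a keyword is used and that keywords are case-insensitive. The function name sshd_setting is hypothetical, and the sketch ignores Include directives and Match blocks.

```shell
#!/bin/sh
# sshd_setting FILE KEYWORD
# Prints the first uncommented value of KEYWORD in an sshd_config-style file.
# Ignores Include directives and Match blocks. The function name is hypothetical.
sshd_setting() {
    awk -v key="$2" 'tolower($1) == tolower(key) { print $2; exit }' "$1"
}

cfg=/etc/ssh/sshd_config
if [ -r "$cfg" ]; then
    for key in Port PermitRootLogin PasswordAuthentication; do
        # An empty value means the keyword is not set and the default applies.
        echo "$key: $(sshd_setting "$cfg" "$key")"
    done
fi
```

On the instance itself, running sshd -t as root checks the configuration file for syntax errors, and sshd -T prints the effective configuration including defaults.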

Directories or files that are associated with the SSH service

The SSH service checks the permission configurations and groups of relevant directories or files at runtime to ensure security. Improper permissions for the directories or files may cause the SSH service not to run as expected and result in logon failures from clients. In this case, refer to one of the following topics based on the returned error message to troubleshoot the logon failure issue:
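A quick way to inspect the permissions involved is to compare the mode of each path with the expected value. The following is a minimal sketch that assumes GNU stat (the -c option). The function name check_mode is hypothetical, and the modes 700 for the .ssh directory and 600 for authorized_keys are commonly recommended values, not values stated in this topic.

```shell
#!/bin/sh
# check_mode PATH EXPECTED
# Compares the octal permission bits of PATH with EXPECTED.
# Returns 0 on a match, 1 on a mismatch, and 2 if PATH cannot be read.
# Uses GNU stat syntax. The function name is hypothetical.
check_mode() {
    actual=$(stat -c '%a' "$1") || return 2
    if [ "$actual" = "$2" ]; then
        echo "ok: $1 ($actual)"
    else
        echo "mismatch: $1 is $actual, expected $2"
        return 1
    fi
}

# Commonly recommended modes (assumptions, adjust to your environment).
check_mode "$HOME/.ssh" 700 || echo "review the permissions of $HOME/.ssh"
check_mode "$HOME/.ssh/authorized_keys" 600 || echo "review the permissions of $HOME/.ssh/authorized_keys"
```

When the StrictModes option is enabled in sshd_config, sshd checks file modes and ownership of the user's files and home directory before it accepts a logon, so a mismatch reported here is worth resolving first.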

SSH key configurations

SSH uses asymmetric encryption to encrypt data. The client and server exchange and validate keys for message integrity and encryption. If SSH logon fails due to invalid key configurations, troubleshoot the issue based on the error message by referring to the following topic:

What do I do if the "Host key verification failed" error message appears when I connect to an instance by using SSH?