This topic describes how to troubleshoot SSH (Secure Shell) connection failures to a Linux ECS instance.
If you need immediate access to a Linux instance for operations and maintenance (O&M), use a VNC connection. VNC bypasses SSH entirely. For more information, see Connect to an instance using VNC.
Common causes
Before you begin detailed troubleshooting, verify the following common causes. These checks resolve most SSH connection failures.
| Check | What to verify | Quick fix |
|---|---|---|
| Instance status | The instance is in the Running state and passes status checks. | Start the instance from the ECS console if it is stopped. |
| Public IP address | The instance has a public IP address or EIP assigned. | Assign a public IP or bind an EIP in the ECS console. |
| Security group rules | Port 22 (or your custom SSH port) is open in an inbound rule. | Add an inbound rule: Action = Allow, Protocol = Custom TCP, Destination = SSH (22). |
| Correct username | You are using the correct OS username (root for most images, or ecs-user if it was selected as the logon name during instance creation). | Try root unless the image documentation specifies otherwise. |
| Authentication method | You are using the correct password or key pair for the instance. | Reset the password in the ECS console if needed. |
| Public bandwidth | The instance bandwidth is greater than 0 Mbit/s. | Upgrade the bandwidth. See Change bandwidth configurations. |
| Local firewall or proxy | Your local network, firewall, or corporate proxy is not blocking port 22. | Test from a different network or device. |
If none of these apply, continue with the sections below based on whether you receive a specific error message.
No specific error message
When no error message is returned, follow these steps in order to isolate the problem.
Step 1: Use the self-service troubleshooting tool
The Alibaba Cloud self-service troubleshooting tool checks security group configurations, the instance internal firewall, and listener status for common ports. It generates a diagnostic report.
Go to the self-service troubleshooting page and switch to the target region.
If the tool does not identify the issue, proceed to Step 2.
Step 2: Test the connection with Workbench
Use Workbench to test whether the SSH service on the instance is functioning.
- Go to ECS console - Instances.
- In the top navigation bar, select the region and resource group of the resource that you want to manage.
- Click the ID of the target instance. On the instance details page, click Connect.
- In the Remote connection dialog box, find Workbench and click Sign in now.
- Confirm the prepopulated information, enter your username and authentication credentials, and attempt to log on.
For more information about Workbench, see Remotely log on to a Linux instance using Workbench.
Evaluate the result:
- Logon succeeds: The SSH service is running correctly. The issue is in the network path between your client and the instance. Proceed to Step 3.
- Logon fails: Workbench returns an error message and a solution. Follow the prompts to resolve the issue, then test again. For common Workbench errors, see Issues with VNC connections to an instance.
Step 3: Check the network
Verify that network connectivity exists between your client and the instance.
- Test from a different network. Connect from a different network segment or a different internet service provider (ISP) to determine whether the issue is in your local network or on the server side.
  - If the issue is in your local network or ISP, contact your local IT staff or ISP for assistance.
  - If a network interface card driver is not working correctly, reinstall it.
- Run a ping test. On your local machine, run:

  ping <instance-public-IP>

  - Packet loss or timeout: Use tracert (Windows) or mtr (Linux/macOS) to identify where packets are dropped. For more information, see Use MTR for network link analysis.
  - Network anomalies: Capture packets for analysis. For more information, see Use a packet capture tool to capture network packets.
  - Ping blocked: If the kernel does not prohibit ping requests but ping still fails, the instance operating system firewall may have a policy that drops ICMP packets from your IP.
  - For more information, see Troubleshoot a failure to ping the public IP address of an ECS instance.
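The ping triage above can be sketched as a small script. This is a minimal sketch, not an official tool: the function names packet_loss and check_ping are hypothetical, and it assumes the summary line that Linux and macOS ping print (Windows ping uses different options and output).

```shell
# Extract the packet-loss percentage from ping's summary line, for example:
#   "4 packets transmitted, 4 received, 0% packet loss, time 3004ms"
# Reads ping output on stdin and prints the integer part of the percentage.
packet_loss() {
  sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p' | cut -d. -f1
}

# Hypothetical wrapper: send 4 echo requests and suggest the next step.
# Usage: check_ping INSTANCE_PUBLIC_IP
check_ping() {
  host="$1"
  loss="$(ping -c 4 "$host" 2>/dev/null | packet_loss)"
  if [ -z "$loss" ]; then
    echo "no summary line: check the address, routing, or ICMP filtering"
  elif [ "$loss" -eq 0 ]; then
    echo "no packet loss: continue with the port check"
  else
    echo "${loss}% packet loss: trace the route with mtr or tracert"
  fi
}
```

A result of 100% loss does not by itself prove the instance is unreachable, because ICMP may be filtered while TCP port 22 is still open.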
Step 4: Check the port and security group
Verify that the security group allows inbound traffic on the SSH port.
- Go to ECS console - Instances.
- In the top navigation bar, select the region and resource group of the resource that you want to manage.
- In the instance list, click the corresponding instance ID.
- On the Security Groups tab, locate the security group and click Manage Rules in the Operation column.
- On the Security Group Details page, in the Rules area, on the Inbound tab, click Add Rule and configure the rule with the following parameters:
| Parameter | Value |
|---|---|
| Action | Allow |
| Priority | 1 (a smaller value indicates a higher priority; 1 is the highest priority) |
| Protocol | Custom TCP |
| Source | Your IP address. Find your IP address by visiting https://cip.cc/. |
| Destination (Current Instance) | SSH (22) |

- Run the following command to verify the port is reachable:

  telnet <instance-IP> <port>

  Note: Replace <instance-IP> with the IP address of the Linux instance and <port> with the SSH port number (default: 22).

  A successful response looks like this:

  Trying 192.168.0.1...
  Connected to 192.168.0.1.
  Escape character is '^]'.

  Note: For deeper SSH connection debugging, run:

  ssh -vvv <username>@<instance-IP>

  The verbose output shows each stage of the SSH handshake and helps pinpoint where the connection fails.
If the port test fails, see Troubleshoot port failures when an ECS instance can be pinged.
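If telnet is not installed on your client, the same reachability test can be approximated with bash alone. The following is a sketch under stated assumptions: port_open is a hypothetical helper name, and it relies on bash's /dev/tcp pseudo-device and the coreutils timeout command being available on the client.

```shell
# Succeeds (exit status 0) if a TCP connection to HOST:PORT opens within 5 seconds.
# Usage: port_open HOST PORT
port_open() {
  # The inner bash opens the socket on file descriptor 3 and exits immediately.
  # "_" fills $0 so that HOST and PORT arrive as $1 and $2 inside the inner shell.
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example (replace the placeholders before running):
#   if port_open INSTANCE_IP 22; then echo "port reachable"; else echo "port blocked or closed"; fi
```

A failure here only shows that the TCP handshake did not complete. It does not distinguish between a security group rule, an operating system firewall, and an SSH service that is not listening.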
Step 5: Check CPU load, bandwidth, and memory
High resource utilization can cause connection failures or timeouts.
CPU load
If your application has high disk, network, or computing requirements, a high CPU load is expected. Upgrade the instance type to resolve the bottleneck. For more information, see Overview of instance type upgrades and downgrades.
For solutions to high CPU load, see Query and analyze CPU load on Linux systems.
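As a rough first check, you can compare the 1-minute load average with the number of CPUs on the instance. The sketch below is illustrative only: load_state and current_load_state are hypothetical names, and the rule of thumb "load above the CPU count is high" is an assumption, not a threshold defined in this topic.

```shell
# Print "high" if the load average exceeds the CPU count, otherwise "ok".
# Usage: load_state LOAD_AVERAGE CPU_COUNT
load_state() {
  awk -v l="$1" -v c="$2" 'BEGIN { if (l + 0 > c + 0) print "high"; else print "ok" }'
}

# On a Linux instance: read the 1-minute load from /proc/loadavg and the CPU count from nproc.
current_load_state() {
  load_state "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
}
```

Run current_load_state from a VNC session on the instance, because SSH itself may be unavailable while the load is high.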
If the CPU load is not high, check the bandwidth.
Public bandwidth
- Go to ECS console - Instances.
- In the top navigation bar, select the region and resource group of the resource that you want to manage.
- In the instance list, click the instance ID. In the Configuration Information section, view the Internet Bandwidth.
If the bandwidth is 0 Mbit/s, the instance has no public bandwidth. Upgrade the bandwidth. For more information, see Change bandwidth configurations (network resources).
Memory
Insufficient memory can cause connections to drop without an error message.
- Log on to the instance using a VNC connection. For more information, see Log on to a Linux instance using a password.
- Check memory usage. If memory is insufficient, upgrade the instance type. For more information, see Overview of instance type upgrades and downgrades.
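One way to check memory usage from the VNC session is to read /proc/meminfo (the free -m command reports the same data in a table). The helper below is a sketch: mem_available_pct is a hypothetical name, and it assumes a kernel that reports the MemAvailable field.

```shell
# Print the percentage of memory that is still available.
# Reads /proc/meminfo-style text on stdin.
mem_available_pct() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%d\n", avail * 100 / total }'
}

# Usage on the instance:
#   mem_available_pct < /proc/meminfo
```

What counts as insufficient depends on the workload, so treat a low percentage as a prompt to investigate rather than as a fixed limit.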
Specific error messages
When a connection fails, the system usually returns an error message. Find the matching error category and linked article below.
PAM security framework
The Pluggable Authentication Modules (PAM) security framework enforces account and logon policies on the ECS instance. Incorrect PAM configurations can cause SSH logon to fail.
| Error or symptom | Article |
|---|---|
| Correct password is rejected | Cannot log on to a Linux ECS instance with the correct password |
| requirement 'uid >= 1000' not met by user 'root' | "requirement 'uid >= 1000' not met by user 'root'" error when you use SSH to log on to a Linux ECS instance |
| Account locked after multiple incorrect password attempts | User account is locked due to multiple consecutive incorrect password entries during SSH logon to a Linux instance |
Linux system environment
Issues in the Linux system environment, such as a virus, incorrect account configuration, or incorrect environment variables, can cause SSH logon to fail.
| Error or symptom | Article |
|---|---|
| ssh_exchange_identification: read: Connection reset by peer | "ssh_exchange_identification: read: Connection reset by peer" error when you use SSH to log on to an ECS instance |
| fatal: mm_request_send: write: Broken pipe (virus-related) | "fatal: mm_request_send: write: Broken pipe" error due to an SSH service exception caused by a virus |
| main process exited, code=exited on SSH service start | "main process exited, code=exited" error when the SSH service starts |
| System exception after logon due to ulimit restrictions | System exception after SSH logon to a Linux instance due to ulimit restrictions |
| Error when running the SSH command | Error when you use the SSH command to log on to a Linux ECS instance |
| SSH connection exception due to SELinux | SSH remote connection exception on a Linux instance because the SELinux service is enabled |
SSH service and parameter settings
The SSH service configuration file is /etc/ssh/sshd_config. Incorrect parameter settings or policies in this file can cause SSH logon to fail.
To check the current SSH service status on the instance, run:
systemctl status sshd
To verify which port SSH is listening on, run:
ss --listening --tcp --processes --numeric | grep sshd
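To see which port the configuration file requests, you can read the Port directive directly. The sketch below uses a hypothetical helper name, sshd_ports, and deliberately ignores Include files and Match blocks. For the effective configuration, sshd -T (run as root) prints the settings the daemon would use, and sshd -t checks the file for syntax errors.

```shell
# Print every port set by a Port directive in an sshd_config-style file.
# Prints 22, the SSH default, if no uncommented Port directive is present.
# Usage: sshd_ports /etc/ssh/sshd_config
sshd_ports() {
  awk 'tolower($1) == "port" { print $2; found = 1 }
       END { if (!found) print 22 }' "$1"
}
```

If the port printed here differs from the port that ss reports, the SSH service has probably not been restarted since the file was changed.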
SSH service directory and file permissions
The SSH service checks the permissions, owner, and group of related directories and files at run time. Permissions that are too permissive or too restrictive can cause client logon to fail.
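A quick way to inspect one of these paths is to look at its group and other write bits. This is a minimal sketch with a hypothetical helper name, not_group_or_world_writable. Which paths the SSH service checks depends on the distribution, so pass the path named in the error message.

```shell
# Succeeds if PATH is not writable by group or others.
# Returns 1 if it is writable by either, and 2 if the path does not exist.
# Usage: not_group_or_world_writable PATH
not_group_or_world_writable() {
  [ -e "$1" ] || return 2
  # In "ls -ld" output, character 6 is the group write bit and character 9 is the other write bit.
  bits="$(ls -ld "$1" | cut -c6,9)"
  case "$bits" in
    *w*) return 1 ;;
    *)   return 0 ;;
  esac
}
```

The same ls -ld output also shows the owner and group, which you can compare against what the error message requires.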
| Error or symptom | Article |
|---|---|
| No supported key exchange algorithms | "No supported key exchange algorithms" error when you use the SSH command to log on to a Linux instance |
| must be owned by root and not group or world-writable | "must be owned by root and not group or world-writable" error when the SSH service starts |
SSH key configuration
The SSH service uses asymmetric key pairs for authentication. The client and server exchange and verify key information during connection setup.
| Error or symptom | Article |
|---|---|
| Host key verification failed | "Host key verification failed" error when you use SSH to log on to an ECS instance |