If a critical system user, such as the root account, is missing from a Linux Elastic Compute Service (ECS) instance, you may be unable to connect to the instance. You can use the instance health diagnostics feature to troubleshoot the issue.
Prerequisites
A Linux ECS instance cannot start and is diagnosed by the instance health diagnostics feature. The diagnostic report indicates that the root account of the instance failed the health check.
Background information
Problem description: In a Linux operating system, the /etc/passwd file contains basic information of all users on the system and the /etc/shadow file contains password information of the users. If information of a critical system user, such as the username and password of the root account, is missing from the files of a Linux ECS instance, you cannot connect to the instance.
Solution: Restore the /etc/passwd and /etc/shadow files. You must also restore the /etc/group file that contains basic information of groups on the system and information about relationships between users and groups.
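The relationship described above can be checked with a small script. The following is a minimal sketch, not part of the diagnostics feature: the function names has_entry and report_user are hypothetical, and the sketch assumes that you can read both files (reading /etc/shadow normally requires root permissions).

```shell
#!/bin/sh
# has_entry FILE NAME (hypothetical helper): succeed if FILE has a record
# whose first colon-separated field is exactly NAME.
has_entry() {
    awk -F: -v name="$2" '$1 == name { found = 1 } END { exit !found }' "$1"
}

# report_user ETC_DIR USER (hypothetical helper): print, for the passwd and
# shadow files under ETC_DIR, whether USER has a record.
report_user() {
    for f in passwd shadow; do
        if has_entry "$1/$f" "$2"; then
            echo "$f: $2 present"
        else
            echo "$f: $2 MISSING"
        fi
    done
}
```

For example, report_user /etc root prints one line for each of the two files.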
Procedure
Prepare reference configuration files that contain correct information of system users.
Critical system users vary by Linux distribution, and you may also have created system users of your own on the instance. Restore the missing user information based on the Linux distribution that the instance runs.
We recommend that you obtain the configuration files from a healthy Linux ECS instance. These files contain correct system user information and can serve as reference files for troubleshooting. Make sure that the healthy instance and the faulty instance run the same Linux distribution and have the same software installed. The required configuration files are at the following paths on the healthy instance:
/etc/passwd
/etc/shadow
/etc/group
You can view the correct information of system users in the configuration files on the healthy Linux ECS instance, or download the configuration files to your computer. In the following examples, an ECS instance that runs CentOS 7.5 is used. Sample configuration files:
/etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
nobody:x:99:99:Nobody:/:/sbin/nologin
systemd-network:x:192:192:systemd Network Management:/:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
polkitd:x:999:998:User for polkitd:/:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
chrony:x:998:996::/var/lib/chrony:/sbin/nologin
ntp:x:38:38::/etc/ntp:/sbin/nologin
tcpdump:x:72:72::/:/sbin/nologin
nscd:x:28:28:NSCD Daemon:/:/sbin/nologin
/etc/shadow
root:$6$Q9lA****/t1KPM$JLqO59UTxwGm****/rU7bHL0q5TVAij****/KeWAWPiO.6booVwpp7rdR9****.irQ6nso3YGVSqQqpyT****.:18668:0:99999:7:::
bin:*:17632:0:99999:7:::
daemon:*:17632:0:99999:7:::
adm:*:17632:0:99999:7:::
lp:*:17632:0:99999:7:::
sync:*:17632:0:99999:7:::
shutdown:*:17632:0:99999:7:::
halt:*:17632:0:99999:7:::
mail:*:17632:0:99999:7:::
operator:*:17632:0:99999:7:::
games:*:17632:0:99999:7:::
ftp:*:17632:0:99999:7:::
nobody:*:17632:0:99999:7:::
systemd-network:!!:17864::::::
dbus:!!:17864::::::
polkitd:!!:17864::::::
sshd:!!:17864::::::
postfix:!!:17864::::::
chrony:!!:17864::::::
ntp:!!:17864::::::
tcpdump:!!:17864::::::
nscd:!!:17864::::::
/etc/group
root:x:0:
bin:x:1:
daemon:x:2:
sys:x:3:
adm:x:4:
tty:x:5:
disk:x:6:
lp:x:7:
mem:x:8:
kmem:x:9:
wheel:x:10:
cdrom:x:11:
mail:x:12:postfix
man:x:15:
dialout:x:18:
floppy:x:19:
games:x:20:
tape:x:33:
video:x:39:
ftp:x:50:
lock:x:54:
audio:x:63:
nobody:x:99:
users:x:100:
utmp:x:22:
utempter:x:35:
input:x:999:
systemd-journal:x:190:
systemd-network:x:192:
dbus:x:81:
polkitd:x:998:
ssh_keys:x:997:
sshd:x:74:
postdrop:x:90:
postfix:x:89:
chrony:x:996:
ntp:x:38:
tcpdump:x:72:
nscd:x:28:
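If a copy of a reference file and the corresponding file from the faulty instance are readable from the same machine, the names that exist only in the reference file can be listed with awk. This is a rough sketch under that assumption: the function name missing_names and the paths in the usage note are hypothetical, and the health diagnostic report remains the authoritative list of missing users.

```shell
#!/bin/sh
# missing_names REFERENCE TARGET (hypothetical helper): print every name
# (first colon-separated field) that has a record in REFERENCE but not in
# TARGET. The same comparison works for passwd, shadow, and group files.
missing_names() {
    awk -F: '
        FILENAME == ARGV[1] { seen[$1] = 1; next }   # first file read: TARGET
        !($1 in seen)       { print $1 }             # second file read: REFERENCE
    ' "$2" "$1"
}
```

For example, missing_names /tmp/reference/passwd /etc/passwd compares a reference copy stored at a hypothetical path with the file in the current root directory.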
Connect to the faulty Linux ECS instance.
When a repair disk is attached to the ECS instance, you can use only Virtual Network Computing (VNC) to connect to the instance. For more information, see Connect to an instance by using VNC.
View the mount information of the original system disk on the faulty ECS instance.
On the repair disk that is temporarily attached to the ECS instance, the file systems of the original system disk of the instance are mounted to a temporary directory. You can use one of the following methods to view the temporary directory:
In the Associated Instances section on the disk details page of the original system disk, view the temporary directory. Example:
/tmp/ecs-offline-diagnose_disk-bp19bspzms79kqse****
bp19bspzms79kqse**** is the serial number of the original system disk of the ECS instance.
Run the mount command on the repair disk to view the temporary directory. For example, if the device name of the original system disk of the faulty ECS instance is /dev/vda, run the following command:
mount | grep /dev/vda
The following command output is returned:
/dev/vda1 on /tmp/ecs-offline-diagnose_disk-bp19bspzms79kqse**** type ext4 (rw,relatime)
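To capture only the temporary directory from this output, the third whitespace-separated field can be extracted. The following is a minimal sketch: the function name mount_point_of is hypothetical, and it assumes that the mount directory contains no spaces.

```shell
#!/bin/sh
# mount_point_of DEVICE_PREFIX (hypothetical helper): read `mount`-style lines
# ("<device> on <directory> type <fstype> (<options>)") from standard input and
# print the directory of every device whose name starts with DEVICE_PREFIX.
mount_point_of() {
    awk -v dev="$1" 'index($1, dev) == 1 && $2 == "on" { print $3 }'
}

# Typical use on the repair disk:
#   mount | mount_point_of /dev/vda
```

Unlike grep /dev/vda, which matches the string anywhere in a line, this sketch matches only the device column.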
Run the chroot command to change the root directory to the temporary directory to which the original system disk of the faulty ECS instance is mounted and enter the chroot environment.
You must go to the temporary directory in which the original system disk resides to restore configuration files. For example, if the temporary directory is /tmp/ecs-offline-diagnose_disk-bp19bspzms79kqse****, run the following command:
chroot /tmp/ecs-offline-diagnose_disk-bp19bspzms79kqse****
In the chroot environment, run the following commands to back up the /etc/passwd and /etc/shadow files:
cp /etc/passwd /etc/passwd.bak
cp /etc/shadow /etc/shadow.bak
Supplement missing information in the /etc/passwd file.
In the ECS console, view the health diagnostic report of the ECS instance to determine what information of critical system users is missing.
In the reference /etc/passwd file that you prepared, find the lines that contain the preceding missing information of critical system users and copy the lines.
Paste the copied lines to the correct positions in the /etc/passwd file in the chroot environment.
Note: You can connect to an ECS instance that is being repaired only by using VNC. To paste copied information, click the icon in the upper-left corner of the VNC logon interface.
Description of the data format in the /etc/passwd file:
Sample line of user information:
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
Each line of user information is separated by colons (:) into seven fields in the following format:
<Username>:<Password>:<UID>:<GID>:<User description>:<Home directory>:<Logon shell>
After you paste user information, check the pasted information.
The pasted UIDs must be different from the UIDs of other users in the file.
The pasted GIDs must be contained in the /etc/group file in the chroot environment. If a pasted GID is not contained in that file, copy the line that contains the GID from the reference /etc/group file to the /etc/group file in the chroot environment. Make sure that the copied GID is different from the GIDs of other groups in the /etc/group file in the chroot environment.
Examples:
If the pasted GID is 89 and the /etc/group file in the chroot environment contains the GID, you do not need to modify the /etc/group file in the chroot environment.
If the pasted GID is 89 and the /etc/group file in the chroot environment does not contain the GID, copy the line that contains the GID from the reference /etc/group file to the /etc/group file in the chroot environment. Make sure that the copied GID is different from the GIDs of other groups in that file.
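The checks above can also be run mechanically after you paste the lines. The following sketch is one possible way to do so with awk: the function name check_passwd is hypothetical, and the script only reports problems without modifying either file.

```shell
#!/bin/sh
# check_passwd PASSWD_FILE GROUP_FILE (hypothetical helper): report passwd
# records that do not have seven fields, UIDs that are used by more than one
# user, and primary GIDs that have no record in the group file.
check_passwd() {
    awk -F: '
        FILENAME == ARGV[1] { gid[$3] = 1; next }    # group file: remember each GID
        NF != 7             { print "bad field count: " $1 }
        ($3 in uid)         { print "duplicate UID " $3 ": " uid[$3] " and " $1 }
        !($3 in uid)        { uid[$3] = $1 }         # remember the first user of each UID
        !($4 in gid)        { print "GID " $4 " of " $1 " not in group file" }
    ' "$2" "$1"
}
```

In the chroot environment, check_passwd /etc/passwd /etc/group prints nothing if all three checks pass.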
If configurations in the reference /etc/group file indicate relationships between groups and users whose information is copied, you must copy the configurations to the /etc/group file in the chroot environment.
For example, if information of the postfix user is missing from the /etc/passwd file in the chroot environment, supplement the corresponding information copied from the reference /etc/passwd and /etc/group files. If a configuration, such as mail:x:12:postfix, in the reference /etc/group file indicates that the postfix user is a member of the mail group, you must also copy the configuration to the /etc/group file in the chroot environment.
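To find such membership lines in a reference /etc/group file, the fourth field of each record can be split on commas. This is a small sketch with a hypothetical function name, groups_listing, and it assumes that you run it where the reference file is available.

```shell
#!/bin/sh
# groups_listing GROUP_FILE USER (hypothetical helper): print every group
# record whose member list (the fourth, comma-separated field) names USER.
groups_listing() {
    awk -F: -v user="$2" '{
        n = split($4, members, ",")
        for (i = 1; i <= n; i++)
            if (members[i] == user) { print; break }
    }' "$1"
}
```

Applied to the sample reference file above with the user postfix, the sketch prints the line mail:x:12:postfix. The primary group record postfix:x:89: has an empty member list, so it is not printed and must be checked by GID instead.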
Supplement missing information in the /etc/shadow file.
In the ECS console, view the health diagnostic report of the ECS instance to determine what information of critical system users is missing.
In the reference /etc/shadow file that you prepared, find the lines that contain the preceding missing information of critical system users and copy the lines.
Paste the copied lines to the correct positions of the /etc/shadow file in the chroot environment.
Note: You can connect to an ECS instance that is being repaired only by using VNC. To paste copied information, click the icon in the upper-left corner of the VNC logon interface.
If no information is missing from the /etc/shadow file in the chroot environment, you do not need to modify the file.
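After you edit both files, you can cross-check them. The sketch below assumes the nine-field /etc/shadow record layout that the sample file above follows, uses a hypothetical function name, check_shadow, and must be run with permissions that allow reading /etc/shadow.

```shell
#!/bin/sh
# check_shadow PASSWD_FILE SHADOW_FILE (hypothetical helper): report shadow
# records that do not have nine colon-separated fields, and users that have a
# passwd record but no shadow record.
check_shadow() {
    awk -F: '
        FILENAME == ARGV[1] {                        # first file read: shadow
            has[$1] = 1
            if (NF != 9) print "bad field count in shadow: " $1
            next
        }
        !($1 in has) { print "no shadow record: " $1 }   # second file read: passwd
    ' "$2" "$1"
}
```

In the chroot environment, check_shadow /etc/passwd /etc/shadow prints nothing if every user has a well-formed shadow record.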
Exit the chroot environment and check the status of the faulty ECS instance.
Run the exit command to exit the chroot environment.
Go to the Troubleshooting page in the ECS console and click View History in the lower part of the page. On the Instance Health Diagnostics tab, detach the repair disk from the ECS instance and start the instance.
Connect to the ECS instance and confirm that the connection succeeds.
Other solutions
To repair a faulty ECS instance, attach the system disk of the faulty ECS instance to a healthy ECS instance. For more information, see The key system user does not exist in the Linux instance.