Elastic Compute Service: Known issues of public images

Last Updated: Apr 25, 2024

Public images may have known security vulnerabilities or configuration issues. This topic describes these known issues so that you can understand the potential risks and locate and resolve the issues at the earliest opportunity.

Known issues of Windows images

The KB5034439 patch fails to be installed in Windows Server 2022

  • Problem description

    The KB5034439 patch fails to be installed in the Windows Server 2022 operating system.

  • Cause

    The KB5034439 patch is a Windows Recovery Environment update that Microsoft released in January 2024. By default, images use the Alibaba Cloud internal Windows Server Update Services (WSUS) server as the update repository, and this server does not provide the patch. If you configure Microsoft Windows Update as the update repository and check for updates, the system finds and attempts to install the patch, but the installation fails. This behavior is expected and does not affect normal use of the operating system. For more information, see KB5034439: Windows Recovery Environment update for Windows Server 2022: January 9, 2024.

A patch released by Microsoft in June 2022 causes RRAS issues on servers that have NAT enabled

  • Problem description: According to an announcement from Microsoft on June 23, 2022, the installation of a security patch released by Microsoft in June 2022 may pose the following risks: A Windows server that is using the Routing and Remote Access Service (RRAS) might lose connection to the Internet, and devices that connect to the server might be unable to connect to the Internet.

  • Affected versions of Windows Server:

    • Windows Server 2022

    • Windows Server 2019

    • Windows Server 2016

    • Windows Server 2012 R2

    • Windows Server 2012

    When you check for system updates in Windows Server 2012 R2 or Windows Server 2012, select the Check for updates option that is marked ① in the following figure. Option ① is linked to the Alibaba Cloud internal WSUS server, and option ② is linked to the official Microsoft Windows Update server. Because security updates can occasionally introduce issues, Alibaba Cloud checks the Windows security updates from Microsoft and releases only the updates that pass the check to the internal WSUS server.

    [Figure: Check for updates]

  • Solution: The relevant patch has been removed from Alibaba Cloud WSUS. To prevent your Windows operating system from being affected by the issue, we recommend that you check whether the patch has been installed in your operating system. Run one of the following commands based on the version of your operating system:

    Windows Server 2012 R2: wmic qfe get hotfixid | find "5014738"
    Windows Server 2019: wmic qfe get hotfixid | find "5014692"
    Windows Server 2016: wmic qfe get hotfixid | find "5014702"
    Windows Server 2012: wmic qfe get hotfixid | find "5014747"
    Windows Server 2022: wmic qfe get hotfixid | find "5014678"

    If the command output indicates that the patch has been installed and you are experiencing RRAS issues on your Windows server, we recommend that you uninstall the patch to restore functionality to your server. Run one of the following commands based on the version of your operating system to uninstall the patch:

    Windows Server 2012 R2: wusa /uninstall /kb:5014738
    Windows Server 2019: wusa /uninstall /kb:5014692
    Windows Server 2016: wusa /uninstall /kb:5014702
    Windows Server 2012: wusa /uninstall /kb:5014747
    Windows Server 2022: wusa /uninstall /kb:5014678
    Note

    For further updates and operational guidance on the issue, follow the official Microsoft instructions. For more information, see RRAS Servers can lose connectivity if NAT is enabled on the public interface.

A patch released in January 2022 causes abnormal behavior on Windows Server domain controllers (DCs)

  • Problem description: According to an announcement from Microsoft on January 13, 2022, the installation of a security patch released by Microsoft in January 2022 may pose the following risks: Virtual machines in Hyper-V cannot start, Windows Server DCs cannot restart or fall into a restart loop, and IP security (IPSec) virtual private network (VPN) connections fail.

  • Affected versions of Windows Server:

    • Windows Server 2022

    • Windows Server, version 20H2

    • Windows Server 2019

    • Windows Server 2016

    • Windows Server 2012 R2

    • Windows Server 2012

    When you check for system updates in Windows Server 2012 R2 or Windows Server 2012, select the Check for updates option that is marked ① in the following figure. Option ① is linked to the Alibaba Cloud internal WSUS server, and option ② is linked to the official Microsoft Windows Update server. Because security updates can occasionally introduce issues, Alibaba Cloud checks the Windows security updates from Microsoft and releases only the updates that pass the check to the internal WSUS server.

    [Figure: Check for updates]

  • Solution: The relevant patch has been removed from Alibaba Cloud WSUS. To prevent your Windows operating system from being affected by the issue, we recommend that you check whether the patch has been installed in your operating system. Run one of the following commands based on the version of your operating system:

    Windows Server 2012 R2: wmic qfe get hotfixid | find "5009624"
    Windows Server 2019: wmic qfe get hotfixid | find "5009557"
    Windows Server 2016: wmic qfe get hotfixid | find "5009546"
    Windows Server 2012: wmic qfe get hotfixid | find "5009586"
    Windows Server 2022: wmic qfe get hotfixid | find "5009555"

    If the patch has been installed on your operating system and the DCs cannot be used or the virtual machines cannot start, we recommend that you uninstall the patch to restore functionality to your server. Run one of the following commands based on the version of your operating system to uninstall the patch:

    Windows Server 2012 R2: wusa /uninstall /kb:5009624
    Windows Server 2019: wusa /uninstall /kb:5009557
    Windows Server 2016: wusa /uninstall /kb:5009546
    Windows Server 2012: wusa /uninstall /kb:5009586
    Windows Server 2022: wusa /uninstall /kb:5009555
    Note

    For further updates and operational guidance on the issue, follow the official Microsoft instructions. For more information, see RRAS Servers can lose connectivity if NAT is enabled on the public interface.

.NET Framework 3.5 fails to be installed in Windows Server 2012 R2

  • Problem description: If the Windows Server 2012 R2 operating system uses the images that are mentioned in this section, you cannot install .NET Framework 3.5 in the operating system, because one of the following patches is installed in the images: the KB5027141 patch released in June 2023, KB5028872 patch released in July 2023, KB5028970 patch released in August 2023, or KB5029915 patch released in September 2023.

    Important

    If you still want to use the Windows Server 2012 R2 operating system, we recommend that you create Elastic Compute Service (ECS) instances in the ECS console by using one of the following Windows Server 2012 R2 community images that have .NET Framework 3.5 installed: win2012r2_9600_x64_dtc_zh-cn_40G_.Net3.5_alibase_20231204.vhd and win2012r2_9600_x64_dtc_en-us_40G_.Net3.5_alibase_20231204.vhd. For information about how to search for the image that you need, see Find an image.

    Windows Server 2012 R2 images in which the preceding patches are installed

    • Images in which the KB5027141 patch released in June 2023 is installed

      • win2012r2_9600_x64_dtc_en-us_40G_alibase_20230615.vhd

      • win2012r2_9600_x64_dtc_zh-cn_40G_alibase_20230615.vhd

    • Images in which the KB5028872 patch released in July 2023 is installed

      • win2012r2_9600_x64_dtc_en-us_40G_alibase_20230718.vhd

      • win2012r2_9600_x64_dtc_zh-cn_40G_alibase_20230718.vhd

    • Images in which the KB5028970 patch released in August 2023 is installed

      • win2012r2_9600_x64_dtc_en-us_40G_alibase_20230811.vhd

      • win2012r2_9600_x64_dtc_zh-cn_40G_alibase_20230811.vhd

    • Images in which the KB5029915 patch released in September 2023 is installed

      • win2012r2_9600_x64_dtc_en-us_40G_alibase_20230915.vhd

      • win2012r2_9600_x64_dtc_zh-cn_40G_alibase_20230915.vhd

  • Solution:

    1. On the control panel of your on-premises computer, find the KB5027141, KB5028872, KB5028970, or KB5029915 patch, right-click the patch, and then select Uninstall from the drop-down list to uninstall the patch. For example, uninstall the KB5029915 patch as shown in the following figure.

      [Figure: Uninstall the KB5029915 patch]

    2. Restart the ECS instance.

      For more information, see Restart an instance.

    3. Install .NET Framework 3.5 by using one of the following methods.

      Installation by using Server Manager

      1. In the Server Manager window, click Add roles and features.

      2. Keep the default settings on each wizard page until you reach Features in the left-side navigation pane, and then select .NET Framework 3.5 Features.

        Follow the wizard to confirm the settings until the installation is complete.

      Installation by running PowerShell commands

      Run one of the following commands:

      • Dism /Online /Enable-Feature /FeatureName:NetFX3 /All 

      • Install-WindowsFeature -Name NET-Framework-Features

Known issues of Linux images

CentOS

CentOS 8.0: The image version numbers of created instances change after the public image is updated

  • Problem description: After you connect to an instance created from the centos_8_0_x64_20G_alibase_20200218.vhd public image, you find that the operating system version of the instance is CentOS 8.1.

    testuser@ecshost:~$ lsb_release -a
    LSB Version:    :core-4.1-amd64:core-4.1-noarch
    Distributor ID:    CentOS
    Description:    CentOS Linux release 8.1.1911 (Core)
    Release:    8.1.1911
    Codename:    Core
  • Cause: The centos_8_0_x64_20G_alibase_20200218.vhd image is a public image that was updated by using the latest community update package. The version of CentOS in the image is upgraded to 8.1. Therefore, the actual operating system version is CentOS 8.1.

  • Affected image: centos_8_0_x64_20G_alibase_20200218.vhd.

  • Solution: You can call an API operation, such as the RunInstances operation, and set the ImageId value to centos_8_0_x64_20G_alibase_20191225.vhd to create an instance whose operating system version is CentOS 8.0.
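
To confirm which release an instance actually runs, the release string can be parsed in a script. The following is a minimal sketch and not part of the original procedure: the function names are hypothetical, and it assumes that the release string has the same form as the Description line in the lsb_release output above (for example, the content of /etc/centos-release).

```shell
# Print the dotted version number that follows the word "release".
centos_release_version() {
    printf '%s\n' "$1" | sed -n 's/.*release \([0-9][0-9.]*\).*/\1/p'
}

# Succeed if the release string belongs to the given major.minor version.
# Usage: release_is "<release string>" "<major.minor>"
release_is() {
    version=$(centos_release_version "$1")
    case $version in
        "$2"|"$2".*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `release_is "$(cat /etc/centos-release)" 8.0 || echo "not CentOS 8.0"` reports an instance that was created from the updated image.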

CentOS 7: An issue may be caused by the updates of some image IDs

  • Problem description: Some CentOS 7 public images have their image IDs updated, which may affect the policies for obtaining image IDs during automated O&M.

  • Affected images: CentOS 7.5 and CentOS 7.6.

  • Cause: The image IDs used by the latest versions of CentOS 7.5 and CentOS 7.6 public images are in the following format: %<OS type>%_%<Major version number>%_%<Minor version number>%_%<Special field>%_alibase_%<Date>%.%<Format>%. For example, the image ID prefix of CentOS 7.5 public images is updated from centos_7_05_64 to centos_7_5_x64. If your automated O&M policies depend on the image ID format, adjust them to match the new format. For information about image IDs, see Release notes for 2023.
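
An automation policy that must tolerate both ID styles could normalize the minor version before comparing it. The following is a minimal sketch under that assumption; the function name is hypothetical, and it relies only on the two prefixes quoted above (centos_7_05_64 and centos_7_5_x64).

```shell
# Print the minor version of a CentOS 7 image ID, accepting both the
# zero-padded old style (centos_7_05_...) and the new style (centos_7_5_...).
centos7_minor_version() {
    case $1 in
        centos_7_*) ;;
        *) return 1 ;;
    esac
    minor=${1#centos_7_}
    minor=${minor%%_*}
    # Drop one leading zero from a zero-padded value such as "05".
    case $minor in
        0?*) minor=${minor#0} ;;
    esac
    # Reject anything that is not a plain number.
    case $minor in
        ''|*[!0-9]*) return 1 ;;
    esac
    printf '%s\n' "$minor"
}
```

With this helper, an ID that starts with centos_7_05_64 and an ID that starts with centos_7_5_x64 both yield 5, so a policy can compare minor versions instead of raw prefixes.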

CentOS 7: The hostname changes from uppercase letters to lowercase letters after an instance is restarted

  • Problem description: The first time some instances that run CentOS 7 are restarted, the hostnames of these instances change from uppercase letters to lowercase letters. The following table describes some examples.

    | Hostname | Hostname after the instance is restarted for the first time | The hostname remains in lowercase after the instance restarts |
    | --- | --- | --- |
    | iZm5e1qe*****sxx1ps5zX | izm5e1qe*****sxx1ps5zx | Yes |
    | ZZHost | zzhost | Yes |
    | NetworkNode | networknode | Yes |

  • The following CentOS public images and custom images derived from these public images are affected:

    • centos_7_2_64_40G_base_20170222.vhd

    • centos_7_3_64_40G_base_20170322.vhd

    • centos_7_03_64_40G_alibase_20170503.vhd

    • centos_7_03_64_40G_alibase_20170523.vhd

    • centos_7_03_64_40G_alibase_20170625.vhd

    • centos_7_03_64_40G_alibase_20170710.vhd

    • centos_7_02_64_20G_alibase_20170818.vhd

    • centos_7_03_64_20G_alibase_20170818.vhd

    • centos_7_04_64_20G_alibase_201701015.vhd

  • Affected hostnames: If the hostnames of your applications deployed on the instances are case-sensitive, services may be affected when you restart these instances. The following table describes whether the hostname changes after an instance is restarted.

    | Current state of hostname | The hostname changes after an instance is restarted | Time when the hostname changes | Continue to read this section |
    | --- | --- | --- | --- |
    | The hostname contains uppercase letters when you create the instance in the ECS console or by calling ECS API operations. | Yes | The first time the instance restarts. | Yes |
    | The hostname contains only lowercase letters when you create the instance in the ECS console or by calling ECS API operations. | No | N/A | No |
    | The hostname contains uppercase letters, and you modify the hostname after you log on to the instance. | No | N/A | Yes |

  • Solution: To retain uppercase letters in the hostname of an instance after you restart the instance, perform the following operations:

    1. Connect to an instance.

      For more information, see the Connection methods section of the "Connection method overview" topic.

    2. View the existing hostname.

      [testuser@izbp193*****3i161uynzzx ~]# hostname
      izbp193*****3i161uynzzx
    3. Run the following command to make the hostname static:

      hostnamectl set-hostname --static iZbp193*****3i161uynzzX
    4. Run the following command to view the updated hostname:

      [testuser@izbp193*****3i161uynzzx ~]# hostname
      iZbp193*****3i161uynzzX
  • What to do next: If you use an affected custom image, we recommend that you update cloud-init to the latest version and then create another custom image. You can use the new custom image to create instances to prevent this issue. For more information, see Install cloud-init and Create a custom image from an instance.
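
Before you restart an affected instance, you may want to know whether its hostname would be altered by lowercasing. The following is a minimal sketch of such a check; the function names are hypothetical, and the check only detects uppercase letters, not whether the image is one of the affected images.

```shell
# Print the hostname as it would look after being converted to lowercase.
lowercase_hostname() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

# Succeed if the hostname contains at least one uppercase letter.
hostname_has_uppercase() {
    [ "$(lowercase_hostname "$1")" != "$1" ]
}
```

For example, `hostname_has_uppercase "$(hostname)" && echo "hostname contains uppercase letters"` flags hostnames such as ZZHost, which would become zzhost.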

CentOS 6.8: An instance installed with the NFS client does not respond

  • Problem description: A CentOS 6.8 instance installed with the NFS client does not respond and must be restarted.

  • Cause: When you use the NFS service on instances whose operating system kernel versions range from 2.6.32-696 to 2.6.32-696.10, the NFS client attempts to end a TCP connection if a glitch occurs due to communication latency. If the NFS server is slow in responding to NFS requests, the connection initiated by the NFS client may remain in the FIN_WAIT2 state for an extended period of time. In most cases, the connection times out and is closed 1 minute after the connection enters the FIN_WAIT2 state. Then, the NFS client can initiate a new connection. However, kernel versions 2.6.32-696 to 2.6.32-696.10 have issues with establishing TCP connections. As a result, the connection remains in the FIN_WAIT2 state, the NFS client is unable to recover the TCP connection, and a new TCP connection cannot be initiated. This causes the requests to freeze, and the only way to fix the issue is to restart the instance.

  • Affected images: centos_6_08_32_40G_alibase_20170710.vhd and centos_6_08_64_20G_alibase_20170824.vhd.

  • Solution: Run the yum update command to update the kernel to 2.6.32-696.11 or later.

    Important

    Before you perform operations on the instance, you must create a snapshot to back up your data. For more information, see Create a snapshot for a disk.
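
To decide whether an instance needs the kernel update, the running kernel release can be compared against the affected range. The following is a minimal sketch; the function name is hypothetical, and it assumes release strings in the form that uname -r typically prints on CentOS 6, such as 2.6.32-696.10.1.el6.x86_64.

```shell
# Succeed if the kernel release is in the range 2.6.32-696 to 2.6.32-696.10.
nfs_kernel_affected() {
    release=$1
    case $release in
        2.6.32-696|2.6.32-696.*) ;;
        *) return 1 ;;
    esac
    rest=${release#2.6.32-696}
    rest=${rest#.}
    minor=${rest%%.*}
    # A missing or non-numeric component (such as "el6") means the base 696 build.
    case $minor in
        ''|*[!0-9]*) minor=0 ;;
    esac
    [ "$minor" -le 10 ]
}
```

For example, `nfs_kernel_affected "$(uname -r)" && echo "update the kernel"` prints a reminder only on kernels in the affected range.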

Debian

Debian 9.6: Instances in the classic network have network configuration issues

  • Problem description: Instances in the classic network that were created from Debian 9 public images cannot be pinged.

  • Cause: By default, the systemd-networkd service is disabled in Debian 9. Instances in the classic network that were created from Debian 9 public images cannot be automatically assigned IP addresses by using the Dynamic Host Configuration Protocol (DHCP).

  • Affected image: debian_9_06_64_20G_alibase_20181212.vhd.

  • Solution: Run the following commands in sequence:

    systemctl enable systemd-networkd 
    systemctl start systemd-networkd

Fedora CoreOS

The hostnames of instances created from Fedora CoreOS custom images do not take effect

  • Problem description: After you use a Fedora CoreOS image to create Instance A, you create a Fedora CoreOS custom image from Instance A and use the custom image to create Instance B. The hostname of Instance B remains the same as that of Instance A and the hostname specified for Instance B does not take effect.

    For example, you create a Fedora CoreOS custom image from Instance A that runs a Fedora CoreOS operating system and set the hostname of Instance A to test001. Then, you create Instance B from the custom image and set the hostname of Instance B to test002. After Instance B is created and connected, the hostname of Instance B remains test001.

  • Cause: Fedora CoreOS public images provided by Alibaba Cloud use Ignition offered by Fedora CoreOS to initialize instance configurations. Ignition is a utility used by Fedora CoreOS and RHEL CoreOS to manage disks in the initramfs during startup. The first time a Fedora CoreOS instance starts, coreos-ignition-firstboot-complete.service in Ignition checks whether the /boot/ignition.firstboot file exists and determines whether to initialize instance configurations. If the /boot/ignition.firstboot file exists, the system initializes instance configurations (including the hostname configuration) and deletes the /boot/ignition.firstboot file.

    The Fedora CoreOS instance must have been started at least once before it is used to create a Fedora CoreOS custom image. The first time the instance starts, the system deletes the /boot/ignition.firstboot file from the image of the instance. Hence, the Fedora CoreOS custom image created from the instance does not contain the /boot/ignition.firstboot file. The first time instances created from the Fedora CoreOS custom image start, the system does not initialize the instance configurations. In this case, the hostnames of the instances remain unchanged.

  • Solution:

    Note

    To ensure the security of data stored in the Fedora CoreOS instance, we recommend that you create snapshots for the instance. If data exceptions occur on the instance, you can use snapshots to roll back the disks of the instance to the normal status. For more information, see Create a snapshot for a disk.

    Before you use the Fedora CoreOS instance to create custom images, use the root permissions (the administrator permissions) to create the /ignition.firstboot file in the /boot directory. Perform the following operations:

    1. Run the following command to re-mount /boot in read/write mode:

      sudo mount /boot -o rw,remount
    2. Run the following command to create the /boot/ignition.firstboot file:

      sudo touch /boot/ignition.firstboot
    3. Run the following command to re-mount /boot in read-only mode:

      sudo mount /boot -o ro,remount

    For information about how to configure Ignition, see Change /boot/ignition/config.ign permissions to 0600 and delete it after provisioning.
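
The three steps above can be wrapped in one function so that /boot is remounted in read-only mode even if creating the file fails. This is a minimal sketch, not an official tool: the function name and the RUN variable are hypothetical. RUN defaults to sudo and can be set to echo to preview the commands without running them.

```shell
# Create ignition.firstboot in the given boot directory (default: /boot).
mark_first_boot() {
    boot_dir=${1:-/boot}
    run=${RUN:-sudo}
    $run mount "$boot_dir" -o rw,remount || return 1
    $run touch "$boot_dir/ignition.firstboot"
    status=$?
    # Always try to restore read-only mode, even if touch failed.
    $run mount "$boot_dir" -o ro,remount || return 1
    return $status
}
```

Running `RUN=echo mark_first_boot` prints the three commands for review; running `mark_first_boot` performs them.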

openSUSE

openSUSE 15: Kernel updates may cause the system to freeze during startup

  • Problem description: When the openSUSE kernel is updated to 4.12.14-lp151.28.52-default, instances that have specific CPU types may freeze during startup. The known affected CPU type is Intel® Xeon® CPU E5-2682 v4 @ 2.50 GHz. The following output shows the call trace:

    [    0.901281] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [    0.901281] CR2: ffffc90000d68000 CR3: 000000000200a001 CR4: 00000000003606e0
    [    0.901281] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [    0.901281] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [    0.901281] Call Trace:
    [    0.901281]  cpuidle_enter_state+0x6f/0x2e0
    [    0.901281]  do_idle+0x183/0x1e0
    [    0.901281]  cpu_startup_entry+0x5d/0x60
    [    0.901281]  start_secondary+0x1b0/0x200
    [    0.901281]  secondary_startup_64+0xa5/0xb0
    [    0.901281] Code: 6c 01 00 0f ae 38 0f ae f0 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00 00 00 00 90 31 d2 65 48 8b 34 25 40 6c 01 00 48 89 d1 48 89 f0 <0f> 01 c8 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00 00 00 00 ** **
  • Cause: The new kernel version is incompatible with the CPU microcode. For more information, see Issues of freezing during startup.

  • Affected image: opensuse_15_1_x64_20G_alibase_20200520.vhd.

  • Solution: In the /boot/grub2/grub.cfg file, add the idle kernel parameter to the line that starts with linux and set this parameter to nomwait. The following example shows how to modify the file:

    menuentry 'openSUSE Leap 15.1'  --class opensuse --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-20f5f35a-fbab-4c9c-8532-bb6c66ce****' {
            load_video
            set gfxpayload=keep
            insmod gzio
            insmod part_msdos
            insmod ext2
            set root='hd0,msdos1'
            if [ x$feature_platform_search_hint = xy ]; then
              search --no-floppy --fs-uuid --set=root --hint='hd0,msdos1'  20f5f35a-fbab-4c9c-8532-bb6c66ce****
            else
              search --no-floppy --fs-uuid --set=root 20f5f35a-fbab-4c9c-8532-bb6c66ce****
            fi
            echo    'Loading Linux 4.12.14-lp151.28.52-default ...'
            linux   /boot/vmlinuz-4.12.14-lp151.28.52-default root=UUID=20f5f35a-fbab-4c9c-8532-bb6c66ce****  net.ifnames=0 console=tty0 console=ttyS0,115200n8 splash=silent mitigations=auto quiet idle=nomwait
            echo    'Loading initial ramdisk ...'
            initrd  /boot/initrd-4.12.14-lp151.28.52-default
    }
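
The manual edit shown above could also be scripted. The following is a minimal sketch; the function name is hypothetical, and it assumes that every line whose first word is linux should receive the parameter. Lines that already carry an idle= parameter are left unchanged, so the function can be run more than once. Back up the file before you modify it.

```shell
# Append idle=nomwait to each "linux" line of a GRUB configuration file.
add_idle_nomwait() {
    cfg=$1
    tmp=$(mktemp) || return 1
    if awk '
        /^[ \t]*linux[ \t]/ && $0 !~ /[ \t]idle=/ { $0 = $0 " idle=nomwait" }
        { print }
    ' "$cfg" > "$tmp"; then
        # Write back through the original file to keep its permissions.
        cat "$tmp" > "$cfg"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return $status
}
```

A typical call would be `add_idle_nomwait /boot/grub2/grub.cfg`, run with root permissions.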

Red Hat Enterprise Linux

Red Hat Enterprise Linux 8 64-bit: The kernel version cannot be updated by running the yum update command

  • Problem description: After you run the yum update command on an ECS instance that runs a RHEL 8 64-bit operating system to update its kernel version, the kernel version of the instance operating system remains unchanged even after the instance is restarted.

  • Cause: In the RHEL 8 64-bit operating system, the size of the /boot/grub2/grubenv file that stores GRUB2 environment variables is not 1,024 bytes. As a result, the kernel version cannot be updated.

  • Solution: After you update the kernel version, set the new kernel version to the default startup version. Perform the following operations:

    1. Run the following command to update the kernel version:

      yum update kernel -y
    2. Run the following command to obtain the kernel startup parameter of the operating system:

      grub2-editenv list | grep kernelopts
    3. Run the following command to back up the old /boot/grub2/grubenv file:

      mv /boot/grub2/grubenv /home/grubenv.bak
    4. Run the following command to create a new /boot/grub2/grubenv file:

      grub2-editenv /boot/grub2/grubenv create
    5. Run the following command to set the new kernel version to the default startup version.

      In this example, the new kernel version is /boot/vmlinuz-4.18.0-305.19.1.el8_4.x86_64.

      grubby --set-default /boot/vmlinuz-4.18.0-305.19.1.el8_4.x86_64
    6. Run the following command to set the kernel startup parameter.

      In this example, run the grub2-editenv - set kernelopts command to set kernelopts to the value of the kernel startup parameter that you obtained in Step 2.

      grub2-editenv - set kernelopts="root=UUID=0dd6268d-9bde-40e1-b010-0d3574b4**** ro crashkernel=auto net.ifnames=0 vga=792 console=tty0 console=ttyS0,115200n8 noibrs nosmt"
    7. Run the following command to restart the instance for the new kernel version to take effect:

      reboot
      Warning

      The restart operation stops the instance for a short period of time and may interrupt services that are running on the instance. We recommend that you restart instances during off-peak hours.
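
Because the cause is a grubenv file whose size is not 1,024 bytes, a quick size check can tell you whether an instance is affected before you update the kernel. The following is a minimal sketch; the function name is hypothetical.

```shell
# Succeed if the given grubenv file is exactly 1,024 bytes in size.
grubenv_size_ok() {
    [ -f "$1" ] || return 2
    size=$(wc -c < "$1")
    [ "$((size))" -eq 1024 ]
}
```

For example, `grubenv_size_ok /boot/grub2/grubenv || echo "grubenv must be recreated"` prints a message when the file is missing or has an unexpected size.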

SUSE Linux Enterprise Server

SUSE Linux Enterprise Server: The SMT server cannot be connected

  • Problem description: When you use a paid Alibaba Cloud image for SUSE Linux Enterprise Server or SUSE Linux Enterprise Server for SAP, connections to the Subscription Management Tool (SMT) server may fail with errors such as connection timeouts. When you download or update components from the SMT server, error messages similar to the following ones are returned:

    • Registration server returned 'This server could not verify that you are authorized to access this service.' (500)

    • Problem retrieving the repository index file for service 'SMT-http_mirrors_cloud_aliyuncs_com' location ****

  • Affected images: SUSE Linux Enterprise Server and SUSE Linux Enterprise Server for SAP.

  • Solution: Register and activate SMT again.

    1. Run the following commands in sequence to register and activate SMT:

      SUSEConnect -d
      SUSEConnect --cleanup
      systemctl restart guestregister
    2. Run the following command to verify whether SMT is activated:

      SUSEConnect -s

      If SMT is activated, a command output similar to the following one is returned:

      [{"identifier":"SLES_SAP","version":"12.5","arch":"x86_64","status":"Registered"}]
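
In a script, the activation check can be reduced to a test on the status field of that output. The following is a minimal sketch; the function name is hypothetical, and it assumes that the output is JSON with a status field, as in the sample above.

```shell
# Succeed if the given SUSEConnect status output reports "Registered".
smt_registered() {
    printf '%s' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"Registered"'
}
```

For example, `smt_registered "$(SUSEConnect -s)" || echo "SMT is not activated"` prints a message when registration did not succeed.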

SLES 12 SP5: Kernel updates may cause the system to freeze during startup

  • Problem description: When you upgrade an earlier version to SLES 12 SP5 or update the kernel of SLES 12 SP5, instances that have specific CPU types may freeze during startup. The known affected CPU types are Intel® Xeon® CPU E5-2682 v4 @ 2.50 GHz and Intel® Xeon® CPU E7-8880 v4 @ 2.20 GHz. The following output shows the call trace:

    [    0.901281] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [    0.901281] CR2: ffffc90000d68000 CR3: 000000000200a001 CR4: 00000000003606e0
    [    0.901281] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [    0.901281] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [    0.901281] Call Trace:
    [    0.901281]  cpuidle_enter_state+0x6f/0x2e0
    [    0.901281]  do_idle+0x183/0x1e0
    [    0.901281]  cpu_startup_entry+0x5d/0x60
    [    0.901281]  start_secondary+0x1b0/0x200
    [    0.901281]  secondary_startup_64+0xa5/0xb0
    [    0.901281] Code: 6c 01 00 0f ae 38 0f ae f0 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00 00 00 00 90 31 d2 65 48 8b 34 25 40 6c 01 00 48 89 d1 48 89 f0 <0f> 01 c8 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00 00 00 00 ** **
  • Cause: The new kernel version is incompatible with the CPU microcode.

  • Solution: In the /boot/grub2/grub.cfg file, add the idle kernel parameter to the line that starts with linux and set this parameter to nomwait. The following example shows how to modify the file:

    menuentry 'SLES 12-SP5'  --class sles --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-fd7bda55-42d3-4fe9-a2b0-45efdced****' {
            load_video
            set gfxpayload=keep
            insmod gzio
            insmod part_msdos
            insmod ext2
            set root='hd0,msdos1'
            if [ x$feature_platform_search_hint = xy ]; then
              search --no-floppy --fs-uuid --set=root --hint='hd0,msdos1'  fd7bda55-42d3-4fe9-a2b0-45efdced****
            else
              search --no-floppy --fs-uuid --set=root fd7bda55-42d3-4fe9-a2b0-45efdced****
            fi
            echo    'Loading Linux 4.12.14-122.26-default ...'
            linux   /boot/vmlinuz-4.12.14-122.26-default root=UUID=fd7bda55-42d3-4fe9-a2b0-45efdced****  net.ifnames=0 console=tty0 console=ttyS0,115200n8 mitigations=auto splash=silent quiet showopts idle=nomwait
            echo    'Loading initial ramdisk ...'
            initrd  /boot/initrd-4.12.14-122.26-default
    }

Other issues

For specific instance types, a call trace may occur when instances that run operating systems with more recent kernel versions are started

  • Problem description: If an instance of a specific instance type such as ecs.i2.4xlarge runs an operating system with a more recent kernel version, such as Red Hat Enterprise Linux (RHEL) 8.3 or CentOS 8.3 with the 4.18.0-240.1.1.el8_3.x86_64 kernel version, a call trace may occur when the instance is started. Call trace example:

    Dec 28 17:43:45 localhost SELinux:  Initializing.
    Dec 28 17:43:45 localhost kernel: Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes)
    Dec 28 17:43:45 localhost kernel: Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes)
    Dec 28 17:43:45 localhost kernel: Mount-cache hash table entries: 131072 (order: 8, 1048576 bytes)
    Dec 28 17:43:45 localhost kernel: Mountpoint-cache hash table entries: 131072 (order: 8, 1048576 bytes)
    Dec 28 17:43:45 localhost kernel: unchecked MSR access error: WRMSR to 0x3a (tried to write 0x000000000000****) at rIP: 0xffffffff8f26**** (native_write_msr+0x4/0x20)
    Dec 28 17:43:45 localhost kernel: Call Trace:
    Dec 28 17:43:45 localhost kernel:  init_ia32_feat_ctl+0x73/0x28b
    Dec 28 17:43:45 localhost kernel:  init_intel+0xdf/0x400
    Dec 28 17:43:45 localhost kernel:  identify_cpu+0x1f1/0x510
    Dec 28 17:43:45 localhost kernel:  identify_boot_cpu+0xc/0x77
    Dec 28 17:43:45 localhost kernel:  check_bugs+0x28/0xa9a
    Dec 28 17:43:45 localhost kernel:  ? __slab_alloc+0x29/0x30
    Dec 28 17:43:45 localhost kernel:  ? kmem_cache_alloc+0x1aa/0x1b0
    Dec 28 17:43:45 localhost kernel:  start_kernel+0x4fa/0x53e
    Dec 28 17:43:45 localhost kernel:  secondary_startup_64+0xb7/0xc0
    Dec 28 17:43:45 localhost kernel: Last level iTLB entries: 4KB 64, 2MB 8, 4MB 8
    Dec 28 17:43:45 localhost kernel: Last level dTLB entries: 4KB 64, 2MB 0, 4MB 0, 1GB 4
    Dec 28 17:43:45 localhost kernel: FEATURE SPEC_CTRL Present
    Dec 28 17:43:45 localhost kernel: FEATURE IBPB_SUPPORT Present
  • Cause: The kernel version is updated by using the latest community update package to include the patches for writes to Model-Specific Registers (MSRs). However, some instance types such as ecs.i2.4xlarge do not support writes to MSRs due to the limits imposed by virtualization.

  • Solution: The call trace does not affect system operation or stability. You can ignore this issue.

Compatibility issues between specific Linux kernel versions and the hfg6 general-purpose instance family with high clock speeds may cause kernel panic

  • Problem description: When the kernels of some open source Linux distributions such as CentOS 8, SUSE Linux Enterprise Server (SLES) 15 SP2, and openSUSE 15.2 are updated to the latest versions in hfg6 instances, a kernel panic error may occur. The following figure shows an example of the call trace.

    [Figure: kernel panic]

  • Cause: Some Linux kernel versions are incompatible with the hfg6 general-purpose instance family with high clock speeds.

  • Solution:

    • The compatibility issue is fixed for the latest kernel versions of SLES 15 SP2 and openSUSE 15.2. The following code shows the information of the change commit. If your latest kernel version contains this information, the kernel version is compatible with the hfg6 instance family.

      commit 1e33d5975b49472e286bd7002ad0f689af33fab8
      Author: Giovanni Gherdovich <ggherdovich@suse.cz>
      Date:   Thu Sep 24 16:51:09 2020 +0200
      
          x86, sched: Bail out of frequency invariance if
          turbo_freq/base_freq gives 0 (bsc#1176925).
      
          suse-commit: a66109f44265ff3f3278fb34646152bc2b3224a5
          
          
      commit dafb858aa4c0e6b0ce6a7ebec5e206f4b3cfc11c
      Author: Giovanni Gherdovich <ggherdovich@suse.cz>
      Date:   Thu Sep 24 16:16:50 2020 +0200
      
          x86, sched: Bail out of frequency invariance if turbo frequency
          is unknown (bsc#1176925).
      
          suse-commit: 53cd83ab2b10e7a524cb5a287cd61f38ce06aab7
      
      commit 22d60a7b159c7851c33c45ada126be8139d68b87
      Author: Giovanni Gherdovich <ggherdovich@suse.cz>
      Date:   Thu Sep 24 16:10:30 2020 +0200
      
          x86, sched: check for counters overflow in frequency invariant
          accounting (bsc#1176925).
    • If you run the yum update command to update the kernel of CentOS 8 to kernel-4.18.0-240 or later in hfg6 instances, a kernel panic error may occur. If this error occurs, roll the kernel back to the previous version.
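
The commits listed above all reference bsc#1176925, so searching changelog text for that reference is one way to look for the fix. The following is a minimal sketch that reads the text from standard input; the function name is hypothetical, and whether the changelog of your kernel package carries this reference is an assumption that you should verify.

```shell
# Succeed if the changelog text on standard input mentions bsc#1176925.
changelog_has_hfg6_fix() {
    grep -Fq 'bsc#1176925'
}
```

On SLES or openSUSE, a call might look like `rpm -q --changelog kernel-default | changelog_has_hfg6_fix && echo "fix present"`.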

Pip requests time out

  • Problem description: Pip requests occasionally time out or fail.

  • Affected images: CentOS, Debian, Ubuntu, SUSE, openSUSE, and Alibaba Cloud Linux.

  • Cause: Alibaba Cloud provides three pip repository addresses. The default address is mirrors.aliyun.com. To access this address, instances must be able to access the Internet. If your instance is not assigned a public IP address, pip requests time out.

    • Default public repository address: mirrors.aliyun.com

    • Internal repository address in virtual private clouds (VPCs): mirrors.cloud.aliyuncs.com

    • Internal repository address in the classic network: mirrors.aliyuncs.com

  • Solution: Use one of the following methods to resolve the issue:

    • Method 1

      Assign a public IP address to your instance by associating an elastic IP address (EIP) with the instance. For more information, see Associate an EIP with an ECS instance.

      You can also re-assign a public IP address to a subscription instance when you change the instance configurations. For more information, see Upgrade the instance types of subscription instances.

    • Method 2

      If a pip request fails, you can run the fix_pypi.sh script in your instance and retry the pip operation. Perform the following steps:

      1. Connect to an instance.

        For more information, see Connect to an instance by using VNC.

      2. Run the following command to obtain the script file:

        wget http://image-offline.oss-cn-hangzhou.aliyuncs.com/fix/fix_pypi.sh
      3. Run one of the following commands based on the network type of the instance:

        • If your instance resides in a VPC, run bash fix_pypi.sh "mirrors.cloud.aliyuncs.com".

        • If your instance resides in the classic network, run bash fix_pypi.sh "mirrors.aliyuncs.com".

      4. Retry the pip operation.

      The fix_pypi.sh script contains the following content:

      #!/bin/bash
      
      function config_pip() {
          pypi_source=$1
      
          if [[ ! -f ~/.pydistutils.cfg ]]; then
      cat > ~/.pydistutils.cfg << EOF
      [easy_install]
      index-url=http://$pypi_source/pypi/simple/
      EOF
          else
              sed -i "s#index-url.*#index-url=http://$pypi_source/pypi/simple/#" ~/.pydistutils.cfg
          fi
      
          if [[ ! -f ~/.pip/pip.conf ]]; then
          mkdir -p ~/.pip
      cat > ~/.pip/pip.conf << EOF
      [global]
      index-url=http://$pypi_source/pypi/simple/
      [install]
      trusted-host=$pypi_source
      EOF
          else
              sed -i "s#index-url.*#index-url=http://$pypi_source/pypi/simple/#" ~/.pip/pip.conf
              sed -i "s#trusted-host.*#trusted-host=$pypi_source#" ~/.pip/pip.conf
          fi
      }
      
      config_pip $1
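
The choice of repository address by network type, as listed in the Cause section, can be expressed as a small lookup. The following is a minimal sketch; the function name and the network type keywords (public, vpc, classic) are hypothetical, while the addresses are the ones given in this topic.

```shell
# Print the pip repository address for the given network type.
pip_mirror_for() {
    case $1 in
        public)  echo "mirrors.aliyun.com" ;;
        vpc)     echo "mirrors.cloud.aliyuncs.com" ;;
        classic) echo "mirrors.aliyuncs.com" ;;
        *)       echo "unknown network type: $1" >&2; return 1 ;;
    esac
}
```

For example, `bash fix_pypi.sh "$(pip_mirror_for vpc)"` configures pip on an instance that resides in a VPC.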