If a local disk on a physical machine that hosts an Elastic Compute Service (ECS) instance is damaged, the instance remains on the physical machine after the local disk is isolated. This topic describes how to isolate damaged local disks in the ECS console. The procedure applies only to the local disk-related system events of ECS instances.
Background information
System events for the isolation of damaged local disks include the following:
- Disk:ErrorDetected: triggered when a damage alert is generated for a local disk.
- SystemMaintenance.IsolateErrorDisk: triggered when a damaged local disk needs to be isolated due to system maintenance.
- SystemMaintenance.RebootAndIsolateErrorDisk: triggered when an instance needs to be restarted and a damaged local disk used by the instance needs to be isolated due to system maintenance.
- SystemMaintenance.ReInitErrorDisk: triggered when a damaged local disk needs to be re-initialized due to system maintenance.
- SystemMaintenance.RebootAndReInitErrorDisk: triggered when an instance needs to be restarted and a damaged local disk used by the instance needs to be re-initialized due to system maintenance.
Only damaged local disks used by instances of big data instance types can be isolated. For more information, see O&M scenarios and system events for instances equipped with local disks.
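The event types described above can be summarized as a small lookup table, which may help when triaging events in your own tooling. The following Python sketch is illustrative only: the event names come from this topic, but the DiskEventAction class, the LOCAL_DISK_EVENTS table, and the describe_event function are hypothetical names and not part of any Alibaba Cloud SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskEventAction:
    """Hypothetical summary of what a local disk event asks for."""
    action: str            # "alert", "isolate", or "reinitialize"
    mentions_restart: bool  # True if the event description includes an instance restart


# Event names are taken from this topic; the table itself is an illustration.
LOCAL_DISK_EVENTS = {
    "Disk:ErrorDetected": DiskEventAction("alert", False),
    "SystemMaintenance.IsolateErrorDisk": DiskEventAction("isolate", False),
    "SystemMaintenance.RebootAndIsolateErrorDisk": DiskEventAction("isolate", True),
    "SystemMaintenance.ReInitErrorDisk": DiskEventAction("reinitialize", False),
    "SystemMaintenance.RebootAndReInitErrorDisk": DiskEventAction("reinitialize", True),
}


def describe_event(event_type: str) -> str:
    """Return a one-line, human-readable summary for an event type."""
    info = LOCAL_DISK_EVENTS.get(event_type)
    if info is None:
        return f"{event_type}: not a local disk damage event"
    restart = "with instance restart" if info.mentions_restart else "restart not mentioned"
    return f"{event_type}: {info.action} ({restart})"


if __name__ == "__main__":
    for name in LOCAL_DISK_EVENTS:
        print(describe_event(name))
```

Keeping the mapping in a single table means that a new event type needs only one added entry.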
Procedure
- Log on to the ECS console.
- In the left-side navigation pane, click Events.
- In the left-side navigation pane of the Events page, click Local Disk-based Instance Events.
- On the Local Disk-based Instance Events page, click the Local Disk Damaged Events tab.
- Find the instance whose damaged local disk event you want to handle, and click Repair in the Actions column.
- In the Configurations Modification step, modify the configuration file of the instance. Then, click Next.
The Configurations Modification step is displayed only for some Linux instances. If it is displayed, follow the on-screen instructions to modify the configuration file. In this topic, a damaged local disk named /dev/vdd is used as an example.
- In the Damaged Disk Isolation step, click OK. Refresh the page if the next step is not displayed.
- Optional: If the Instance Restart step is displayed, click Restart to restart the instance.
Note: After the instance is restarted, the isolated damaged local disk is temporarily converted to a 1 MiB dummy hard disk to facilitate subsequent operations. At the application layer, you must continue to keep read and write operations away from the damaged local disk, and you must configure the nofail parameter for the disk in the /etc/fstab file.
- After the instance is restarted, click OK in the New Disk Inserting step. Wait for Alibaba Cloud to replace the damaged local disk on the physical machine that hosts the instance. Typically, five business days are required to replace a damaged local disk. After the local disk is replaced, you receive an event that requires you to restore the disk.
- After you receive the event, click Restore in the Disk Restoration step. Refresh the page if the next step is not displayed.
- Optional: If the Instance Restart step is displayed, click Restart to restart the instance.
- After the instance is restarted, click Complete in the Complete step.
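The note in the restart step asks you to configure the nofail parameter in the /etc/fstab file. The following Python sketch shows one possible way to add that option to the entry of a given device. It is a minimal illustration under stated assumptions: the add_nofail function is hypothetical, the /mnt/disk3 mount point and ext4 file system in the usage example are made up, and the sketch assumes that the entry identifies the disk by device name (such as /dev/vdd) rather than by UUID. It works on text only and does not write to the real file. Back up /etc/fstab before you change it.

```python
def add_nofail(fstab_text: str, device: str) -> str:
    """Return fstab_text with 'nofail' added to the mount options of `device`.

    Assumes entries use the device name in the first field. Lines for other
    devices, comments, and blank lines are returned unchanged.
    """
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        is_comment = line.lstrip().startswith("#")
        # An fstab entry has at least four fields:
        # device, mount point, file system type, options.
        if not is_comment and len(fields) >= 4 and fields[0] == device:
            options = fields[3].split(",")
            if "nofail" not in options:
                options.append("nofail")
            fields[3] = ",".join(options)
            line = " ".join(fields)  # fstab accepts any whitespace between fields
        out.append(line)
    trailing = "\n" if fstab_text.endswith("\n") else ""
    return "\n".join(out) + trailing


if __name__ == "__main__":
    # Hypothetical fstab content used only for illustration.
    sample = (
        "/dev/vda1 / ext4 defaults 1 1\n"
        "/dev/vdd /mnt/disk3 ext4 defaults 0 0\n"
    )
    print(add_nofail(sample, "/dev/vdd"))
```

With the nofail option set, a Linux instance can continue to boot even if the listed device cannot be mounted, which is why the option matters while the damaged disk is isolated.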
Result
A few minutes after the damaged local disk is replaced, the local disk damaged event disappears.
What to do next
After the damaged disk is isolated, check the status of the instance and local disk. The replaced local disk is restored to its original capacity, and you can reformat data disks. For more information, see Initialize a data disk up to 2 TiB in size on a Windows instance or Initialize a data disk whose size does not exceed 2 TiB on a Linux instance.