Local disks do not provide high data availability. To improve the experience of using local disks, Alibaba Cloud provides O&M capabilities that help you stay informed of and handle exceptions that occur on your local disks. This topic describes common O&M scenarios and system events for Elastic Compute Service (ECS) instances equipped with local disks.

Common O&M scenarios

For ECS bare metal instances, you can install the xdragon_hardware_detect_plugin plug-in to check the health status of local disks on the instances on a regular basis. For more information, see Install the monitoring plug-in.

For more information about the system events triggered in the scenarios shown in the preceding figure, see the following sections in this topic.
Note To ensure that your business is not affected, we recommend that you back up data for affected ECS instances and switch over to other instances before you execute O&M tasks on the instances. For example, you can divert traffic away from the affected ECS instances, disassociate the ECS instances from Server Load Balancer (SLB) instances, and back up disk data of the ECS instances.
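
As a quick orientation, the four system event names covered in the following sections can be written as a lookup table. The Python sketch below is illustrative only: `SCENARIOS` and `scenario_for` are hypothetical names, and the summaries paraphrase this topic. Scenario ⑤ is left out because this topic does not list the names of its events.

```python
from typing import Optional

# Event names come from this topic; the summaries are paraphrases.
SCENARIOS = {
    "SystemMaintenance.Reboot": "Scenario 1: scheduled restart",
    "SystemMaintenance.Redeploy": "Scenario 2: scheduled redeployment, local disk data is lost",
    "SystemFailure.Reboot": "Scenario 3: automatic restart after a system error",
    "SystemFailure.Redeploy": "Scenario 4: redeployment, local disk data is lost",
}


def scenario_for(event_type: str) -> Optional[str]:
    """Return the scenario summary for an event name, or None if it is not listed."""
    return SCENARIOS.get(event_type)


if __name__ == "__main__":
    print(scenario_for("SystemFailure.Reboot"))
```

A `None` result means the event name is not one of the four listed here, for example a disk-related event that belongs to Scenario ⑤.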

Scenario ①

Procedure to handle a SystemMaintenance.Reboot system event:
  1. Receive an event notification when an instance is scheduled to be restarted.
  2. Use one of the following methods to handle the event:
    • If you do not want the instance to be restarted within the scheduled time period, specify a different time at which to automatically restart the instance. For more information, see Modify the scheduled restart time.
    • Restart the instance within the user operation window. For more information, see Reboot the instance.
      Note You must restart the instance by using the ECS console or by calling the RebootInstance operation. You cannot restart the instance from within the instance.
    • Wait for the instance to be automatically restarted.
  3. Check whether the instance and applications continue to work as expected.

For information about the event states supported by SystemMaintenance.Reboot, see Summary. For the figure that shows the typical transitions between event states, see States and windows of system events.
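
If you choose to restart the instance yourself by calling the RebootInstance operation, one option is Alibaba Cloud CLI. The following Python sketch only builds the command line and does not run it. It assumes that Alibaba Cloud CLI is installed and configured as `aliyun`; the helper name `reboot_command` and the instance ID are placeholders.

```python
import shlex
from typing import List


def reboot_command(instance_id: str) -> List[str]:
    """Build the Alibaba Cloud CLI arguments that call RebootInstance."""
    if not instance_id.startswith("i-"):
        raise ValueError("ECS instance IDs start with 'i-'")
    return ["aliyun", "ecs", "RebootInstance", "--InstanceId", instance_id]


if __name__ == "__main__":
    # "i-exampleid" is a placeholder, not a real instance ID.
    print(shlex.join(reboot_command("i-exampleid")))
```

Returning the arguments as a list keeps the sketch free of side effects. You can review the printed command before you run it within the user operation window.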

Scenario ②

Procedure to handle a SystemMaintenance.Redeploy system event:
  1. Receive an event notification when an instance equipped with local disks is scheduled to be redeployed.
  2. Make preparations such as modifying the /etc/fstab configuration file and backing up data.

    For more information about preparations that you must make, see the "Prerequisites" section in Redeploy an instance equipped with local disks.

  3. Use one of the following methods to handle the event:
    Note When an instance equipped with local disks is redeployed, the instance is migrated to a different physical machine, and the local disks of the instance are re-initialized and lose all their data.
  4. Check whether the instance and applications continue to work as expected. If so, synchronize data based on your business requirements.

For information about the event states supported by SystemMaintenance.Redeploy, see Summary. For the figure that shows the typical transitions between event states, see States and windows of system events.
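
This topic does not spell out the /etc/fstab change required in step 2; see the "Prerequisites" section referenced above for the authoritative instructions. A common preparation of this kind is to add the `nofail` mount option to local disk entries so that the instance can still boot when a disk is missing or uninitialized. The following Python sketch assumes that this is the intended change. The function name `add_nofail` and the device names are hypothetical, and the function returns the new text instead of writing the file.

```python
from typing import Set


def add_nofail(fstab_text: str, devices: Set[str]) -> str:
    """Return fstab text with 'nofail' added to the options of the given devices."""
    result = []
    for line in fstab_text.splitlines():
        fields = line.split()
        is_entry = len(fields) >= 4 and not fields[0].startswith("#")
        if is_entry and fields[0] in devices:
            options = fields[3].split(",")
            if "nofail" not in options:
                options.append("nofail")
                fields[3] = ",".join(options)
                # Rewritten lines use single spaces between fields.
                line = " ".join(fields)
        result.append(line)
    return "\n".join(result) + "\n"


if __name__ == "__main__":
    # Hypothetical entries: /dev/vda1 is the system disk, /dev/vdb1 a local disk.
    sample = (
        "/dev/vda1 / ext4 defaults 1 1\n"
        "/dev/vdb1 /mnt/local ext4 defaults 0 0\n"
    )
    print(add_nofail(sample, {"/dev/vdb1"}), end="")
```

Only entries whose first field matches one of the given devices are changed. Comments, blank lines, and other entries pass through unmodified. Back up /etc/fstab before you apply any change to the real file.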

Scenario ③

Procedure to handle a SystemFailure.Reboot system event:
  1. The system restarts an instance due to a system error.
  2. Receive an event notification when the instance is being restarted.

    Wait until the instance is restarted without manual intervention.

  3. Check whether the instance and applications continue to work as expected.

For information about the event states supported by SystemFailure.Reboot, see Summary. For the figure that shows the typical transitions between event states, see States and windows of system events.
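
Because this scenario needs no manual intervention, the only work you might automate is waiting for the restart to finish before you run your own checks. The following Python sketch polls a status source until it reports `Running`. The `get_status` callable is a hypothetical hook that you would back with your own status query, and the timeout and interval values are arbitrary.

```python
import time
from typing import Callable


def wait_until_running(
    get_status: Callable[[], str],
    timeout_s: float = 600.0,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_status() until it returns 'Running' or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        if get_status() == "Running":
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated status sequence in place of a real status query.
    statuses = iter(["Stopped", "Starting", "Running"])
    ok = wait_until_running(lambda: next(statuses), interval_s=0.0)
    print("instance is running" if ok else "timed out")
```

A `True` result only shows that the instance is running. Application-level checks in step 3 remain your responsibility.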

Scenario ④

Procedure to handle a SystemFailure.Redeploy system event:
  1. Receive an event notification when an instance equipped with local disks is scheduled to be redeployed.
  2. Make preparations such as modifying the /etc/fstab configuration file and backing up data.

    For more information about preparations that you must make, see the "Prerequisites" section in Redeploy an instance equipped with local disks.

  3. Use one of the following methods to handle the event:
    Note When an instance equipped with local disks is redeployed, the instance is migrated to a different physical machine, and the local disks of the instance are re-initialized and lose all their data.
  4. Check whether the instance and applications continue to work as expected. If so, synchronize data based on your business requirements.

For information about the event states supported by SystemFailure.Redeploy, see Summary. For the figure that shows the typical transitions between event states, see States and windows of system events.

Scenario ⑤

In Scenario ⑤, a local disk is damaged on the host of an instance. You can redeploy the instance to another host or replace the disk.

  • When the instance is redeployed, its local disks are restored but lose all their data. For information about how to redeploy an instance equipped with local disks, see Redeploy an instance equipped with local disks.
  • When the damaged local disk is replaced, only the data on the replaced local disk is lost. Data on the other local disks of the instance is retained. Procedure to replace a damaged local disk on an instance:
    1. Receive an event notification when a local disk on an instance is damaged and scheduled to be isolated.
    2. Make preparations such as modifying the /etc/fstab configuration file and backing up data.
    3. Respond to the notification and authorize Alibaba Cloud to isolate the damaged local disk.
    4. If the name of the system event contains Reboot, you must restart the instance.
    5. Alibaba Cloud removes the damaged local disk from the host on which your instance resides, inserts a new disk, and then sends you a disk restoration notification.
    6. After you receive the notification, authorize Alibaba Cloud to restore the disk.
    7. If the name of the system event contains Reboot, you must restart the instance.
    Note To replace a damaged local disk, you must work together with Alibaba Cloud. For more information, see Isolate damaged local disks in the ECS console and Isolate damaged local disks by using Alibaba Cloud CLI.
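
The replacement procedure above branches on whether the event name contains Reboot. The following Python sketch encodes that rule as an ordered checklist. The function name `replacement_steps` and the event names in the usage example are hypothetical placeholders; use the names from the notifications that you actually receive.

```python
from typing import List


def replacement_steps(event_name: str) -> List[str]:
    """Return the ordered checklist for replacing a damaged local disk."""
    needs_restart = "Reboot" in event_name
    steps = [
        "Receive the notification that the damaged disk is scheduled to be isolated",
        "Prepare: modify /etc/fstab and back up data",
        "Authorize Alibaba Cloud to isolate the damaged local disk",
    ]
    if needs_restart:
        steps.append("Restart the instance")
    steps.append("Wait for the disk restoration notification")
    steps.append("Authorize Alibaba Cloud to restore the disk")
    if needs_restart:
        steps.append("Restart the instance")
    return steps


if __name__ == "__main__":
    # Hypothetical event name used only to exercise the Reboot branch.
    for number, step in enumerate(replacement_steps("Example.RebootAndIsolateDisk"), 1):
        print(f"{number}. {step}")
```

An event name without Reboot yields the same checklist minus the two restart steps.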