Elastic Compute Service: Summary of ECS system events

Last Updated: Apr 07, 2025

System events record and report information about cloud resources, such as O&M task executions, resource exceptions, and resource status changes. You can use system events to learn about risks and anomalies that affect Elastic Compute Service (ECS) resources. For example, a system event is generated when an instance must be migrated due to underlying upgrades or when an instance is restarted due to system maintenance. Responding to a system event promptly helps prevent your business from being affected by ECS resource unavailability or performance degradation. This topic summarizes the system events supported by ECS, including scheduled O&M events, unexpected O&M events, instance billing events, and instance status change events, and provides suggestions on how to handle them.

Formats of ECS event codes and CloudMonitor event names

ECS system events are synchronized to CloudMonitor. This allows you to set up an automated O&M mechanism based on system events. ECS event codes and CloudMonitor events follow specific naming conventions.

  • ECS event codes indicate the event causes and impacts on resources and are in the <Event cause>.<Event impact> format.

  • CloudMonitor event names indicate the resource types, event causes, event impacts on resources, and event status and are in the <Resource type>:<Event cause>.<Event impact>:<Event status> format.

Note

ECS event codes and CloudMonitor event names may include only some of the preceding information. For example, a CloudMonitor event name of Disk:ErrorDetected:Executing indicates that a disk is damaged and excludes information about the impacts on resources.

The following table describes some examples of ECS event codes and CloudMonitor event names.

Note

An ECS event code of Undefined indicates that the event is not displayed in the ECS console and cannot be handled in the ECS console or by calling API operations.

Category

Sample ECS event code

Sample CloudMonitor event name

Description

Scheduled O&M events

SystemMaintenance.Reboot

Instance:SystemMaintenance.Reboot:Inquiring

  • Resource type: Instance, which indicates an ECS instance.

  • Event cause: SystemMaintenance, which indicates that Alibaba Cloud proactively initiates a system O&M task.

  • Event impact: Reboot, which indicates that the instance is restarted while the O&M task is being executed.

  • Event status: Inquiring, which indicates that the O&M task related to the event is pending authorization and the instance can be restarted only after you authorize the task to be executed.

Unexpected O&M events

ErrorDetected

Disk:ErrorDetected:Executing

  • Resource type: Disk, which indicates a disk.

  • Event cause: ErrorDetected, which indicates that the local disk is damaged.

  • Event status: Executing, which indicates that the damaged local disk has not been repaired.

Lifecycle status change events

Snapshot:CreateSnapshotCompleted

Snapshot:CreateSnapshotCompleted

  • Resource type: Snapshot, which indicates a snapshot.

  • Event cause: CreateSnapshotCompleted, which indicates that the snapshot is created.
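
As a minimal sketch of the naming convention described in this section, the following Python function splits a CloudMonitor event name into resource type, event cause, event impact, and event status. The names parse_event_name and EventName are hypothetical and are not part of any Alibaba Cloud SDK. Parts that a name omits are returned as None, and names that deviate from the colon-separated pattern are not special-cased.

```python
from typing import NamedTuple, Optional


class EventName(NamedTuple):
    """Parts of a CloudMonitor event name (hypothetical helper type)."""
    resource_type: str
    cause: str
    impact: Optional[str]
    status: Optional[str]


def parse_event_name(name: str) -> EventName:
    """Split '<Resource type>:<Event cause>.<Event impact>:<Event status>'.

    The impact and the status are optional, as noted in this section.
    """
    parts = name.split(":")
    if not 2 <= len(parts) <= 3 or not all(parts):
        raise ValueError(f"unexpected event name: {name!r}")
    resource_type, middle = parts[0], parts[1]
    status = parts[2] if len(parts) == 3 else None
    # The middle part is '<cause>' or '<cause>.<impact>'.
    cause, _, impact = middle.partition(".")
    return EventName(resource_type, cause, impact or None, status)


print(parse_event_name("Instance:SystemMaintenance.Reboot:Inquiring"))
print(parse_event_name("Disk:ErrorDetected:Executing"))
print(parse_event_name("Snapshot:CreateSnapshotCompleted"))
```

The three calls use the sample names from the preceding table. A parser like this can be a first step when you route CloudMonitor notifications to different handlers.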

Scheduled O&M events

Important

If you perform a restart operation within the operating system of an instance on which a system event occurred, the maintenance action corresponding to the event cannot take effect. All instance restart operations in this topic are performed in the ECS console or by calling an API operation. For more information, see Restart an instance or RebootInstance.

Event code

Event name

Event severity level

CloudMonitor event name

Event description and impact

Handling suggestion

SystemMaintenance.Reboot

Instance Restart Due to System Maintenance

Critical

  • Instance:SystemMaintenance.Reboot:Inquiring

  • Instance:SystemMaintenance.Reboot:Scheduled

  • Instance:SystemMaintenance.Reboot:Executing

  • Instance:SystemMaintenance.Reboot:Executed

  • Instance:SystemMaintenance.Reboot:Avoided

  • Instance:SystemMaintenance.Reboot:Failed

  • Instance:SystemMaintenance.Reboot:Canceled

This system event is triggered 24 to 48 hours before the scheduled time of system maintenance when Alibaba Cloud detects a potential risk of hardware or software failure on the underlying host of an instance and the risk can cause instance restarts.

Note

Take note of the following risks:

  • Type 1: risks in hosts

  • Type 2: risks that the GPUs on an instance are unavailable

We recommend that you perform one of the following actions to handle the event:

Note

  • Check the event status after you restart the instance. If the event status remains unchanged, the event is not handled and the risk is not mitigated. To mitigate the risk, we recommend that you restart the instance again at a point in time that is at least 12 hours after the current operation.

  • You can modify the maintenance attributes of the instance to specify the default action to take when an O&M event occurs on the instance. For more information, see Modify instance maintenance attributes.

SystemMaintenance.Stop

Instance Stopped Due to System Maintenance

Critical

  • Instance:SystemMaintenance.Stop:Scheduled

  • Instance:SystemMaintenance.Stop:Executing

  • Instance:SystemMaintenance.Stop:Executed

  • Instance:SystemMaintenance.Stop:Avoided

  • Instance:SystemMaintenance.Stop:Failed

  • Instance:SystemMaintenance.Stop:Canceled

This system event is triggered 24 to 48 hours before the scheduled time of system maintenance when Alibaba Cloud detects a potential risk of hardware or software failure on the underlying host of an instance and the risk can cause instance stops.

We recommend that you perform one of the following actions to handle the event:

  • Redeploy the instance. For more information, see Redeploy an instance equipped with local disks.

  • Wait for the instance to be automatically stopped and then perform instance operations, such as redeployment, based on your business requirements.

Note

You can modify the maintenance attributes of the instance to specify the default action to take when an O&M event occurs on the instance. For more information, see Modify instance maintenance attributes.

SystemMaintenance.Redeploy

Instance Redeployment Due to System Maintenance

Critical

  • Instance:SystemMaintenance.Redeploy:Inquiring

  • Instance:SystemMaintenance.Redeploy:Scheduled

  • Instance:SystemMaintenance.Redeploy:Executing

  • Instance:SystemMaintenance.Redeploy:Executed

  • Instance:SystemMaintenance.Redeploy:Avoided

  • Instance:SystemMaintenance.Redeploy:Canceled

This system event is triggered 24 to 48 hours before the scheduled time of system maintenance when Alibaba Cloud detects a potential risk of hardware or software failure on the underlying host of an instance and the risk can cause instance redeployment.

Important

If the instance is equipped with local SSDs or local HDDs, the data disks on the instance are re-initialized and the data stored on the local disks is cleared.

We recommend that you make preparations, such as modifying the /etc/fstab configuration file and backing up data, and then perform one of the following actions to handle the event:

Note

  • Check the event status after you redeploy the instance. If the event status remains unchanged, the event is not handled and the risk is not mitigated. To mitigate the risk, we recommend that you redeploy the instance again at a point in time that is at least 12 hours after the current operation.

  • You can modify the maintenance attributes of the instance to specify the default action to take when an O&M event occurs on the instance. For more information, see Modify instance maintenance attributes.

SystemMaintenance.IsolateErrorDisk

Damaged Disk Isolation Due to System Maintenance

Critical

  • Instance:SystemMaintenance.IsolateErrorDisk:Inquiring

  • Instance:SystemMaintenance.IsolateErrorDisk:Executing

  • Instance:SystemMaintenance.IsolateErrorDisk:Executed

  • Instance:SystemMaintenance.IsolateErrorDisk:Avoided

  • Instance:SystemMaintenance.IsolateErrorDisk:Failed

  • Instance:SystemMaintenance.IsolateErrorDisk:Canceled

This system event is immediately triggered when Alibaba Cloud detects hardware or software damage on a local disk of an instance.

Important

The procedure for handling a damaged local disk of an instance varies based on the instance type. For specific instance types, the instance must be restarted and the damaged local disk must be isolated. For other instance types, the damaged local disk can be isolated online and then repaired.

We recommend that you make preparations, such as modifying the /etc/fstab configuration file and backing up data, and then select an appropriate point in time to authorize the damaged disk to be isolated. Then, the local disk is isolated online without the need to restart the associated instance.

Note

For more information, see the Scenario ③ section of the "O&M scenarios and system events for instances equipped with local disks" topic.

SystemMaintenance.ReInitErrorDisk

Damaged Disk Re-initialization Due to System Maintenance

Critical

  • Instance:SystemMaintenance.ReInitErrorDisk:Inquiring

  • Instance:SystemMaintenance.ReInitErrorDisk:Executing

  • Instance:SystemMaintenance.ReInitErrorDisk:Executed

  • Instance:SystemMaintenance.ReInitErrorDisk:Avoided

  • Instance:SystemMaintenance.ReInitErrorDisk:Failed

  • Instance:SystemMaintenance.ReInitErrorDisk:Canceled

This system event is immediately triggered when Alibaba Cloud isolates and replaces a local disk on the host of an instance after Alibaba Cloud detects hardware or software damage on the local disk. In most cases, Alibaba Cloud isolates and replaces a damaged local disk within five business days after you authorize Alibaba Cloud to isolate the local disk.

Important

The procedure for handling a damaged local disk of an instance varies based on the instance type. For specific instance types, the instance must be restarted and the damaged local disk must be isolated. For other instance types, the damaged local disk can be isolated online and then repaired.

We recommend that you select an appropriate point in time to authorize the local disk to be repaired. Then, the local disk is repaired online without the need to restart the associated instance.

Note

For more information, see the Scenario ③ section of the "O&M scenarios and system events for instances equipped with local disks" topic.

SystemMaintenance.RebootAndIsolateErrorDisk

Damaged Disk Isolation and Instance Restart Due to System Maintenance

Critical

  • Instance:SystemMaintenance.RebootAndIsolateErrorDisk:Inquiring

  • Instance:SystemMaintenance.RebootAndIsolateErrorDisk:Executing

  • Instance:SystemMaintenance.RebootAndIsolateErrorDisk:Executed

  • Instance:SystemMaintenance.RebootAndIsolateErrorDisk:Avoided

  • Instance:SystemMaintenance.RebootAndIsolateErrorDisk:Canceled

This system event is immediately triggered when Alibaba Cloud detects hardware or software damage on a local disk of an instance and fails to isolate the local disk online.

Important

The procedure for handling a damaged local disk of an instance varies based on the instance type. For specific instance types, the instance must be restarted and the damaged local disk must be isolated. For other instance types, the damaged local disk can be isolated online and then repaired.

We recommend that you select an appropriate point in time to authorize the damaged disk to be isolated and restart the associated instance after the disk is isolated. In this case, the local disk is isolated offline. You must restart the associated instance for the isolation operation to take effect.

Note

For more information, see the Scenario ③ section of the "O&M scenarios and system events for instances equipped with local disks" topic.

SystemMaintenance.RebootAndReInitErrorDisk

Damaged Disk Re-initialization and Instance Restart Due to System Maintenance

Critical

  • Instance:SystemMaintenance.RebootAndReInitErrorDisk:Inquiring

  • Instance:SystemMaintenance.RebootAndReInitErrorDisk:Executing

  • Instance:SystemMaintenance.RebootAndReInitErrorDisk:Executed

  • Instance:SystemMaintenance.RebootAndReInitErrorDisk:Avoided

  • Instance:SystemMaintenance.RebootAndReInitErrorDisk:Canceled

This system event is immediately triggered when Alibaba Cloud detects hardware or software damage on a local disk of an instance and fails to repair the local disk online.

Important

The procedure for handling a damaged local disk of an instance varies based on the instance type. For specific instance types, the instance must be restarted and the damaged local disk must be isolated. For other instance types, the damaged local disk can be isolated online and then repaired.

We recommend that you select an appropriate point in time to authorize the local disk to be repaired and restart the associated instance after the disk is repaired. In this case, the local disk is repaired offline. You must restart the associated instance for the restoration operation to take effect.

Note

For more information, see the Scenario ③ section of the "O&M scenarios and system events for instances equipped with local disks" topic.

SystemMaintenance.StopAndRepair

In-place Repair of Instance Equipped with Local Disks

Critical

  • Instance:SystemMaintenance.StopAndRepair:Inquiring

  • Instance:SystemMaintenance.StopAndRepair:Scheduled

  • Instance:SystemMaintenance.StopAndRepair:Executing

  • Instance:SystemMaintenance.StopAndRepair:Executed

  • Instance:SystemMaintenance.StopAndRepair:Avoided

This system event is triggered 48 to 168 hours before the scheduled time of system maintenance when Alibaba Cloud detects a risk of hardware failure on the underlying host of an instance.

We recommend that you select an appropriate point in time to authorize Alibaba Cloud to repair or redeploy the instance that is equipped with local disks.

SystemMaintenance.CleanReleasedDisks

Disk Cleanup After EBS Disk Hot Swapping Failure

Warning

  • Instance:SystemMaintenance.CleanReleasedDisks.Inquiring

  • Instance:SystemMaintenance.CleanReleasedDisks.Executing

  • Instance:SystemMaintenance.CleanReleasedDisks.Executed

  • Instance:SystemMaintenance.CleanReleasedDisks.Failed

This system event is triggered when Alibaba Cloud detects that the operating system of an instance still contains the configurations of one or more cloud disks that were released due to overdue payments.

We recommend that you select an appropriate point in time to authorize Alibaba Cloud to clear the configurations of the released cloud disks.

Important

Alibaba Cloud stops the instance at the specified point in time and then clears the configurations of the cloud disks. After the cloud disk configurations are cleared, the instance is restarted.
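
The notes for the restart and redeployment events in the preceding table say that an unchanged event status after your operation means the event is not handled, and that the next attempt should be at least 12 hours later. The following Python sketch expresses that rule. The function name next_attempt_time is hypothetical, and how you obtain the status before and after the operation (for example, from the ECS console or an API response) is outside the scope of this sketch.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Minimum gap recommended in this topic between two handling attempts.
MIN_RETRY_GAP = timedelta(hours=12)


def next_attempt_time(status_before: str, status_after: str,
                      operated_at: datetime) -> Optional[datetime]:
    """Return the earliest recommended time for another attempt.

    Returns None when the event status changed, which this sketch
    treats as "the event was handled".
    """
    if status_after != status_before:
        return None
    return operated_at + MIN_RETRY_GAP


operated_at = datetime(2025, 4, 7, 9, 0, tzinfo=timezone.utc)
print(next_attempt_time("Scheduled", "Scheduled", operated_at))  # 2025-04-07 21:00:00+00:00
print(next_attempt_time("Scheduled", "Avoided", operated_at))    # None
```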

Unexpected O&M events

Event code

Event name

Event severity level

CloudMonitor event name

Event description and impact

Handling suggestion

SystemFailure.Reboot

Instance Restart Due to System Error

Critical

  • Instance:SystemFailure.Reboot:Executing

  • Instance:SystemFailure.Reboot:Executed

  • Instance:SystemFailure.Reboot:Failed

This system event is immediately triggered when Alibaba Cloud detects that an instance is restarted due to hardware or software failure on the underlying host, such as CPU or memory hardware damage.

We recommend that you wait until the instance is automatically restarted and then check whether the instance and applications work as expected.

When the instance is being restarted, Alibaba Cloud migrates the instance to a healthy host.

Note

You can modify the maintenance attributes of the instance to specify the default action to take when an O&M event occurs on the instance. For more information, see Modify instance maintenance attributes.

InstanceFailure.Reboot

Instance Restart Due to OS Error

Critical

  • Instance:InstanceFailure.Reboot:Scheduled

  • Instance:InstanceFailure.Reboot:Executing

  • Instance:InstanceFailure.Reboot:Executed

  • Instance:InstanceFailure.Reboot:Avoided

This system event is immediately triggered when Alibaba Cloud detects that an instance operating system is down due to issues, such as out-of-memory (OOM), blue screen, freeze, continuous printing of serial port logs, and kernel panic.

We recommend that you wait until the instance is automatically restarted and then check whether the instance and applications work as expected.

You can enable the kdump service of the operating system to troubleshoot the issue and prevent the issue from recurring. For more information, see How to enable the Kdump service for Linux instances and Enable the Kernel Memory Dump feature for a Windows instance.

SystemFailure.Stop

Instance Stop Due to System Error

Critical

  • Instance:SystemFailure.Stop:Executing

  • Instance:SystemFailure.Stop:Executed

This system event is immediately triggered when Alibaba Cloud detects that an instance is stopped due to hardware or software failure on the underlying host, such as CPU or memory hardware damage.

We recommend that you wait until the instance is automatically stopped and then start the instance.

When the instance is being started, Alibaba Cloud migrates the instance to a healthy host.

Note

You can modify the maintenance attributes of the instance to specify the default action to take when an O&M event occurs on the instance. For more information, see Modify instance maintenance attributes.

SystemFailure.Redeploy

Instance Redeployment Due to System Error

Critical

  • Instance:SystemFailure.Redeploy:Inquiring

  • Instance:SystemFailure.Redeploy:Executing

  • Instance:SystemFailure.Redeploy:Executed

  • Instance:SystemFailure.Redeploy:Avoided

  • Instance:SystemFailure.Redeploy:Canceled

This system event is immediately triggered when Alibaba Cloud detects hardware or software failure on the underlying host of an instance equipped with local disks and the instance must be redeployed.

Note

Only instances that depend on host hardware support this event, such as instances that are equipped with local disks or support Software Guard Extensions (SGX) confidential computing.

We recommend that you make preparations, such as modifying the /etc/fstab configuration file and backing up data, and then perform one of the following actions to handle the event:

Note

You can modify the maintenance attributes of the instance to specify the default action to take when an O&M event occurs on the instance. For more information, see Modify instance maintenance attributes.

SystemFailure.Delete

Automatic Cancellation of Bills Due to Instance Creation Failures

Critical

  • Instance:SystemFailure.Delete:Executing

  • Instance:SystemFailure.Delete:Executed

  • Instance:SystemFailure.Delete:Avoided

This system event is immediately triggered when Alibaba Cloud detects that an instance creation order is placed but the instance fails to be created.

We recommend that you wait for the instance to be automatically released. In most cases, an instance is automatically released within 5 minutes after the instance fails to be created.

Note

If you already paid for the order, the payment is refunded after the instance is released.

To ensure that instances can be created, we recommend that you perform the following actions:

  • Before you create an instance, query the availability of ECS instance resources in the destination region and zone and the number of available private IP addresses of the vSwitch that resides in the zone. For example, you can call the DescribeAvailableResource operation to query the resources.

  • Use Auto Provisioning or Auto Scaling to flexibly create instances from larger resource pools.

ErrorDetected

Local Disk Damage

Critical

  • Disk:ErrorDetected:Executing

  • Disk:ErrorDetected:Executed

This system event is immediately triggered when Alibaba Cloud detects hardware or software failure on the local disk of an instance and data cannot be read from the disk or written to the disk.

We recommend that you make preparations, such as modifying the /etc/fstab configuration file and backing up data. Then, select a point in time to isolate and repair the damaged local disk.

The supported operations vary based on the instance type.

  • d1, d1ne, d2s, and d2c instance types: support online isolation, offline isolation, online repair, and redeployment.

  • d3c, d2c, i2, i2g, i2ne, i2gne, i3, and i3g instance types: support online isolation, offline isolation, and redeployment.

  • i1 instance types: support redeployment.

  • ebmi2g instance types: support authorized repair and redeployment.

Note

For more information, see the Scenario ③ section of the "O&M scenarios and system events for instances equipped with local disks" topic.

Stalled

Significant Block Storage Performance Impact

Critical

  • Disk:Stalled:Executing

  • Disk:Stalled:Executed

This system event is immediately triggered when Alibaba Cloud detects that an I/O hang occurs on a cloud disk of the instance. This significantly affects the disk performance and prevents the disk from processing read and write requests.

We recommend that you isolate reads and writes on the cloud disk at the application layer or disassociate the ECS instance from the associated Server Load Balancer (SLB) instance.
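
If you forward these events to your own tooling, the handling suggestions in the preceding table can be kept in a simple lookup table. The following Python sketch is one possible shape. The dictionary name, the function name, and the shortened suggestion texts are illustrative paraphrases of the table and are not an official mapping.

```python
# Condensed handling suggestions keyed by ECS event code.
# The texts paraphrase the table above; adjust them to your own runbooks.
SUGGESTIONS = {
    "SystemFailure.Reboot": "Wait for the automatic restart, then verify the instance and applications.",
    "InstanceFailure.Reboot": "Wait for the automatic restart, then verify the instance and applications.",
    "SystemFailure.Stop": "Wait for the automatic action to complete, then start the instance.",
    "SystemFailure.Redeploy": "Update /etc/fstab and back up data, then handle the redeployment.",
    "SystemFailure.Delete": "Wait for the instance to be released automatically.",
    "ErrorDetected": "Update /etc/fstab and back up data, then isolate and repair the local disk.",
    "Stalled": "Isolate disk reads and writes in the application or remove the instance from SLB.",
}

DEFAULT = "No suggestion recorded; check the event details in the ECS console."


def suggestion_for(cloudmonitor_name: str) -> str:
    """Look up a suggestion from a name such as 'Disk:Stalled:Executing'."""
    parts = cloudmonitor_name.split(":")
    # The second colon-separated part carries '<cause>' or '<cause>.<impact>'.
    code = parts[1] if len(parts) >= 2 else cloudmonitor_name
    return SUGGESTIONS.get(code, DEFAULT)


print(suggestion_for("Disk:Stalled:Executing"))
print(suggestion_for("Instance:SystemFailure.Stop:Executed"))
```

Keying the table by event code rather than by the full CloudMonitor event name means that one entry covers every status of the same event.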

Instance migration events due to upgrades at the underlying layer

Event code

Event name

Event severity level

CloudMonitor event name

Event description and impact

Handling suggestion

SystemUpgrade.Migrate

Instance Migration Due to Upgrades at Underlying Layer

Critical

Undefined

This system event is triggered when instances are affected by the upgrades and improvements of physical infrastructure in regions and zones where the instances reside.

We recommend that you view event details in the ECS console and migrate affected instances as prompted. For more information, see Instance migration due to upgrades at the underlying layer.

Burstable instance performance degradation events

Event code

Event name

Event severity level

CloudMonitor event name

Event description and impact

Handling suggestion

Instance:BurstablePerformanceRestricted

Burstable Instance Performance Degradation

Warning

Instance:BurstablePerformanceRestricted

This system event is triggered when all accrued CPU credits of a burstable instance are consumed.

We recommend that you perform one of the following actions to handle the event:

  • If you want the burstable instance to run at a CPU utilization higher than the baseline for a short period of time, enable the unlimited mode for the instance for that period. For more information, see Enable or disable the unlimited mode.

  • If you want the burstable instance to run at a CPU utilization higher than the baseline for an extended period of time, upgrade the instance to a higher-specification instance type or change the instance to a non-burstable instance. For more information, see the Change the instance type section of the "Overview of instance configuration changes" topic.

If you want to specify thresholds for triggering notifications about this event, you can configure event-triggered alert rules for the event in the CloudMonitor console. For example, you can have a notification sent when the accrued CPU credits remain below 10 for 10 consecutive minutes. For more information, see Monitor burstable instances.
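
To illustrate the threshold condition mentioned above (accrued CPU credits below 10 for 10 consecutive minutes), the following Python sketch evaluates the same condition over a series of samples. It assumes one credit sample per minute and does not describe how CloudMonitor evaluates alert rules internally. The function name credits_low_for is hypothetical.

```python
from typing import Iterable


def credits_low_for(samples: Iterable[float], threshold: float = 10,
                    minutes: int = 10) -> bool:
    """Return True if `minutes` consecutive samples are below `threshold`.

    Assumes `samples` holds one accrued-credit reading per minute,
    in chronological order.
    """
    run = 0  # length of the current streak of low readings
    for credits in samples:
        run = run + 1 if credits < threshold else 0
        if run >= minutes:
            return True
    return False


print(credits_low_for([9] * 10))                # True
print(credits_low_for([9] * 9 + [12] + [9] * 9))  # False
```

The second call returns False because a single reading at or above the threshold resets the streak before it reaches 10 minutes.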

Status change events

Event code

Event name

Event severity level

CloudMonitor event name

Event description and impact

Handling suggestion

Instance:PreemptibleInstanceInterruption

Preemptible Instance Interruption

Warning

Instance:PreemptibleInstanceInterruption

This system event is triggered 5 minutes before a preemptible instance is reclaimed.

We recommend that you take one of the following actions:

  • Use preemptible instances for stateless applications, such as scalable web services and big data analytics applications.

  • Use Auto Provisioning to provision instances and mitigate the impacts of reclaimed preemptible instances on your business. You can also implement automated O&M based on this event. For example, you can configure notifications about this event in the CloudMonitor console and have preemptible instances automatically purchased when a notification is sent.

Instance:ModifyInstanceSpec.Reboot

Instance Restart Due to Instance Type Change

Critical

  • Instance:ModifyInstanceSpec.Reboot:Scheduled

  • Instance:ModifyInstanceSpec.Reboot:Executing

  • Instance:ModifyInstanceSpec.Reboot:Executed

  • Instance:ModifyInstanceSpec.Reboot:Avoided

This system event is triggered after the instance type of an instance is changed and the instance must be restarted for the new instance type to take effect. If you do not restart the instance within seven days after the new order takes effect, the system forcefully restarts the instance for the new instance type to take effect.

We recommend that you take one of the following actions:

  • Change the scheduled restart time of the instance. For more information, see Modify the scheduled restart time. You can select a time to schedule a restart within seven days after the new order takes effect.

  • Restart the instance to avoid risks. For more information, see Restart an instance.

Instance:PerformanceModeChange

Performance Mode Switchover of Burstable Instance

Warning

Instance:PerformanceModeChange

This system event is triggered when a burstable instance switches between the unlimited mode and the standard mode.

We recommend that you determine whether to monitor the event. If you want to monitor the event, you can configure notifications for the event in the CloudMonitor console. For more information, see Subscribe to ECS system event notifications.

Instance:StateChange

Instance Status Change

Notification

Instance:StateChange

This system event is triggered when the status of an instance changes, such as from Running to Stopping or from Stopping to Stopped.

We recommend that you determine whether to monitor the event. If you want to monitor the event, you can configure notifications for the event in the CloudMonitor console. For more information, see Subscribe to ECS system event notifications.

Instance:AutoReactivateCompleted

Automatic Reactivation Completed

Notification

Instance:AutoReactivateCompleted

This system event is triggered when you complete overdue payments in your account and an instance is automatically reactivated.

We recommend that you determine whether to monitor the event. If you want to monitor the event, you can configure notifications for the event in the CloudMonitor console. For more information, see Subscribe to ECS system event notifications.

Instance:LiveMigrationAcrossDDH

Instance Hot Migration Between Dedicated Hosts

Notification

Instance:LiveMigrationAcrossDDH

This system event is triggered when an instance is hot migrated between dedicated hosts.

We recommend that you determine whether to monitor the event. If you want to monitor the event, you can configure notifications for the event in the CloudMonitor console. For more information, see Subscribe to ECS system event notifications.

Disk:DiskOperationCompleted

Disk Operations Completed

Notification

Disk:DiskOperationCompleted

This system event is triggered when a pay-as-you-go disk is manually attached or detached.

We recommend that you determine whether to monitor the event. If you want to monitor the event, you can configure notifications for the event in the CloudMonitor console. For more information, see Subscribe to ECS system event notifications.

Disk:ConvertToPostpaidCompleted

Billing Method of Disks Switched to Pay-as-you-go

Notification

Disk:ConvertToPostpaidCompleted

This system event is triggered when a subscription disk is changed to a pay-as-you-go disk.

We recommend that you determine whether to monitor the event. If you want to monitor the event, you can configure notifications for the event in the CloudMonitor console. For more information, see Subscribe to ECS system event notifications.

Snapshot:CreateSnapshotCompleted

Disk Snapshot Created

Notification

Snapshot:CreateSnapshotCompleted

This system event is triggered when a snapshot is created for a disk.

We recommend that you determine whether to monitor the event. If you want to monitor the event, you can configure notifications for the event in the CloudMonitor console. For more information, see Subscribe to ECS system event notifications.

Snapshot:SnapshotDeleted

Snapshot Deletion Completed

Notification

Snapshot:SnapshotDeleted

This system event is triggered when a manual or automatic snapshot is deleted.

None.
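
The automated O&M idea described for the Instance:PreemptibleInstanceInterruption event in the preceding table can be sketched as a small event handler. In the following Python example, the drain and replace callbacks are hypothetical placeholders for your own logic, such as removing the instance from a load balancer and requesting a new instance. The sketch assumes that the notification time is the time at which the event was triggered, so the returned reclaim time is approximate.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional

# This topic states that the event is triggered 5 minutes before reclamation.
RECLAIM_NOTICE = timedelta(minutes=5)
INTERRUPTION_EVENT = "Instance:PreemptibleInstanceInterruption"


def handle_event(name: str, instance_id: str, notified_at: datetime,
                 drain: Callable[[str], None],
                 replace: Callable[[str], None]) -> Optional[datetime]:
    """React to an interruption notice and return the expected reclaim time.

    Returns None and does nothing for any other event name.
    """
    if name != INTERRUPTION_EVENT:
        return None
    drain(instance_id)    # stop sending new work to the instance
    replace(instance_id)  # request replacement capacity
    return notified_at + RECLAIM_NOTICE


actions = []
deadline = handle_event(
    INTERRUPTION_EVENT, "i-example",
    datetime(2025, 4, 7, 9, 0, tzinfo=timezone.utc),
    drain=lambda i: actions.append(("drain", i)),
    replace=lambda i: actions.append(("replace", i)),
)
print(actions, deadline)
```

Passing the callbacks as parameters keeps the handler independent of any particular SDK and makes it straightforward to test.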