ECS system events notify you of O&M tasks, resource exceptions, and status changes, such as instance migrations or maintenance restarts, so that you can respond before availability is affected.
Formats of ECS event codes and CloudMonitor event names
ECS system events sync to CloudMonitor, enabling automated O&M workflows. Event codes and CloudMonitor event names follow these formats:
- ECS event codes: <Event cause>.<Impact on resource>.
- CloudMonitor event names: <Resource type>:<Event cause>.<Impact on resource>:<Event status>.
Not all event codes include every field. For example, Disk:ErrorDetected:Executing omits the impact field because the disk damage itself is the impact.
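The naming format can be illustrated with a small parser. This is a minimal sketch, not an official API: the `EventName` class and `parse_event_name` function are hypothetical names introduced here, and the sketch assumes a bare event name with no trailing description text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EventName:
    resource_type: str
    cause: str
    impact: Optional[str]  # absent in names such as Disk:ErrorDetected:Executing
    status: Optional[str]  # absent in names such as Snapshot:CreateSnapshotCompleted


def parse_event_name(name: str) -> EventName:
    """Split a CloudMonitor event name into the fields described above."""
    parts = name.split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected event name: {name!r}")
    resource_type, code = parts[0], parts[1]
    status = parts[2] if len(parts) == 3 else None
    # The ECS event code is <Event cause>.<Impact on resource>; impact is optional.
    cause, _, impact = code.partition(".")
    return EventName(resource_type, cause, impact or None, status)
```

For example, `parse_event_name("Instance:SystemMaintenance.Reboot:Inquiring")` yields the resource type `Instance`, the cause `SystemMaintenance`, the impact `Reboot`, and the status `Inquiring`.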
The following table lists example ECS event codes and CloudMonitor event names.
If the ECS event code is Undefined, the event does not appear in the ECS console and cannot be handled via the console or OpenAPI.
| Category | Sample ECS event code | Sample CloudMonitor event name | Description |
| --- | --- | --- | --- |
| Scheduled O&M events | SystemMaintenance.Reboot | Instance:SystemMaintenance.Reboot:Inquiring | |
| Unexpected O&M events | ErrorDetected | Disk:ErrorDetected:Executing | |
| Lifecycle change events | Snapshot:CreateSnapshotCompleted | Snapshot:CreateSnapshotCompleted | |
Scheduled O&M events
Restarting an instance from within its OS does not apply the maintenance action. All restart operations in this topic refer to restarts performed in the ECS console or by calling an OpenAPI operation. See Restart an instance or RebootInstance.
|
Event code |
Event name |
Event severity level |
CloudMonitor event name |
Event description and impact |
Recommendations for users |
|
SystemMaintenance.Reboot |
Instance restart because of system maintenance |
Critical |
|
Alibaba Cloud detects a potential host failure risk that may cause the instance to restart. The risk has not yet become an actual failure. This event is sent 24 to 48 hours before the scheduled maintenance.
Note: Failure risks include:
|
Response options:
Note
|
|
SystemMaintenance.Stop |
Instance stop because of system maintenance |
Critical |
|
Sent 24 to 48 hours before scheduled maintenance when Alibaba Cloud detects a potential host failure risk that may shut down the instance. The risk has not yet become an actual failure. |
Response options:
Note: You can modify instance maintenance properties to specify the default action for O&M events. See Modify instance maintenance properties. |
|
SystemMaintenance.Redeploy |
Instance redeployment because of system maintenance |
Critical |
|
Sent 24 to 48 hours before scheduled maintenance when Alibaba Cloud detects a potential host failure risk that may require instance redeployment. The risk has not yet become an actual failure.
Important: For an instance that uses local SSDs or local HDDs, the data disks are re-initialized and the data on the local disks is cleared. |
Back up data and modify /etc/fstab. Then respond as needed:
Note
|
|
SystemMaintenance.IsolateErrorDisk |
Damaged disk isolation because of system maintenance |
Critical |
|
Sent immediately when Alibaba Cloud detects software or hardware damage on a local disk of an ECS instance.
Important: The procedure for handling a damaged local disk varies based on the instance type. For some instance types, the instance must be restarted to isolate the damaged disk. For other instance types, the damaged disk can be isolated and repaired online. |
Back up data and modify /etc/fstab. Then authorize disk isolation at an appropriate time. The disk is isolated online without restarting the instance. |
|
SystemMaintenance.ReInitErrorDisk |
Damaged disk re-initialization because of system maintenance |
Critical |
|
Sent immediately after Alibaba Cloud detects software or hardware damage on a local disk and replaces it. This typically occurs within five business days after you authorize disk isolation.
Important: The procedure for handling a damaged local disk varies based on the instance type. For some instance types, the instance must be restarted to isolate the damaged disk. For other instance types, the damaged disk can be isolated and repaired online. |
Authorize disk restoration at an appropriate time. The disk is restored online without restarting the instance. |
|
SystemMaintenance.RebootAndIsolateErrorDisk |
Instance restart and damaged disk isolation because of system maintenance |
Critical |
|
Sent immediately when Alibaba Cloud detects software or hardware damage on a local disk and fails to isolate the disk online.
Important: The procedure for handling a damaged local disk varies based on the instance type. For some instance types, the instance must be restarted to isolate the damaged disk. For other instance types, the damaged disk can be isolated and repaired online. |
Authorize disk isolation at an appropriate time and restart the instance. The disk is isolated offline, requiring a restart. |
|
SystemMaintenance.RebootAndReInitErrorDisk |
Instance restart and damaged disk re-initialization because of system maintenance |
Critical |
|
Sent immediately when Alibaba Cloud detects software or hardware damage on a local disk and fails to restore it online.
Important: The procedure for handling a damaged local disk varies based on the instance type. For some instance types, the instance must be restarted to isolate the damaged disk. For other instance types, the damaged disk can be isolated and repaired online. |
Authorize disk restoration at an appropriate time and restart the instance. The disk is restored offline, requiring a restart. |
|
SystemMaintenance.StopAndRepair |
In-place repair event for an instance with local disks |
Critical |
|
Sent 48 to 168 hours before the scheduled maintenance when Alibaba Cloud detects a hardware failure risk on the host. |
Authorize the repair or redeployment of the instance with local disks at an appropriate time. |
|
SystemMaintenance.CleanReleasedDisks |
Cleanup event after EBS hot-plug failure |
Warning |
|
Sent when Alibaba Cloud detects configuration information for released cloud disks (released due to overdue payments) in the OS of an ECS instance. |
Authorize Alibaba Cloud to clear the configuration information of the released cloud disks at an appropriate time.
Important: Alibaba Cloud shuts down the instance at the time you specify, cleans up the disks, and then starts the instance again. |
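The advance-notice periods in the table above can be turned into a rough planning window. The following is a minimal sketch under stated assumptions: the `NOTICE_WINDOW_HOURS` mapping and the `maintenance_window` function are hypothetical names, the hour ranges are copied from the table, and the result is only an estimate derived from when the notification was sent.

```python
from datetime import datetime, timedelta

# Notice windows in hours, copied from the table above. Codes that are not
# listed here have no advance-notice window stated in the table.
NOTICE_WINDOW_HOURS = {
    "SystemMaintenance.Reboot": (24, 48),
    "SystemMaintenance.Stop": (24, 48),
    "SystemMaintenance.Redeploy": (24, 48),
    "SystemMaintenance.StopAndRepair": (48, 168),
}


def maintenance_window(event_code: str, sent_at: datetime):
    """Return (earliest, latest) estimated maintenance time, or None if no window is stated."""
    hours = NOTICE_WINDOW_HOURS.get(event_code)
    if hours is None:
        return None
    low, high = hours
    return sent_at + timedelta(hours=low), sent_at + timedelta(hours=high)
```

A notification for SystemMaintenance.StopAndRepair sent at midnight on January 1 would therefore map to a window from January 3 to January 8, which can help when choosing a time to authorize the operation.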
Unexpected O&M events
|
Event code |
Event name |
Event severity level |
CloudMonitor event name |
Event description and impact |
Recommendations for users |
|
SystemFailure.Reboot |
Instance restart due to system error |
Critical |
|
An ECS instance restarted due to an unexpected software or hardware failure on its host. Common causes:
|
Wait for the instance to automatically restart, then verify that the instance and its applications run correctly. During the restart, Alibaba Cloud migrates the instance to a healthy host.
Note: You can modify instance maintenance properties to specify the default action for O&M events. See Modify instance maintenance properties. |
|
InstanceFailure.Reboot |
Instance restart required due to an operating system error |
Critical |
|
Sent immediately when Alibaba Cloud detects that an ECS instance is down due to an internal OS issue, such as an out-of-memory (OOM) error, blue screen, freeze, continuous serial port log printing, or kernel panic. |
Wait for the instance to automatically restart, then verify that the instance and its applications run correctly. You can enable the Kdump service for the operating system to identify the cause of the crash and prevent similar issues from recurring. For more information, see Enable the Kdump service for a Linux instance or Enable the Kernel Memory Dump feature for a Windows instance. |
|
SystemFailure.Stop |
Instance stop due to system error |
Critical |
|
Sent immediately when Alibaba Cloud detects that an ECS instance is shut down due to a host failure, such as CPU or memory hardware damage. |
Wait for the instance to stop, then start it. When you start the instance, Alibaba Cloud migrates the instance to a healthy host.
Note: You can modify instance maintenance properties to specify the default action for O&M events. See Modify instance maintenance properties. |
|
SystemFailure.Redeploy |
Instance redeployment due to system error |
Critical |
|
Sent immediately when Alibaba Cloud detects that an instance with local disks must be redeployed due to a host failure.
Note: This type of event is supported only for instances that depend on host hardware, such as instances that have local disks attached or support SGX-based confidential computing. |
Back up data and modify /etc/fstab. Then respond as needed:
Note: You can modify instance maintenance properties to specify the default action for O&M events. See Modify instance maintenance properties. |
|
SystemFailure.Delete |
Automatic bill cancellation due to instance creation failure |
Critical |
|
Sent immediately when an instance creation order succeeds but the instance fails to be created. |
Wait for the system to release the instance. The instance is typically released within five minutes after the creation failure.
Note: If you have paid for the order, you receive a refund after the instance is released.
To increase the success rate of instance creation:
|
|
ErrorDetected |
Alert for local disk damage |
Critical |
|
Sent immediately when Alibaba Cloud detects unexpected software or hardware damage on a local disk, preventing reads and writes. |
Back up data and modify /etc/fstab. Then isolate and restore the damaged disk at an appropriate time. Supported operations vary by instance type:
|
|
Stalled |
Disk performance is severely affected |
Critical |
|
Sent immediately when Alibaba Cloud detects an I/O hang on a cloud disk attached to an ECS instance, severely affecting disk performance and preventing reads and writes. |
Isolate read/write operations on the cloud disk at the application layer, or temporarily remove the instance from Server Load Balancer (SLB). |
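Several rows above ask you to modify /etc/fstab but do not say which change to make. One common precaution, assumed here rather than taken from this topic, is the `nofail` mount option, which tells the system not to treat a missing device as an error. The sketch below only reports entries that lack the option. The function name `entries_missing_nofail` and the device paths in the usage note are hypothetical, and the check matches the first fstab field literally, so entries written as `UUID=...` must be passed in that form.

```python
def entries_missing_nofail(fstab_text: str, devices: set) -> list:
    """Return mount points of the given devices whose options lack `nofail`."""
    missing = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed entries
        device, mount_point, _fstype, options = fields[:4]
        if device in devices and "nofail" not in options.split(","):
            missing.append(mount_point)
    return missing
```

A possible use is to read `/etc/fstab` with `open("/etc/fstab").read()` and pass the set of local disk devices, for example `{"/dev/vdb1"}`, then review the reported mount points before authorizing isolation or redeployment.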
Instance migration events due to underlying upgrades
|
Event code |
Event name |
Event severity level |
CloudMonitor event name |
Event description and impact |
Recommendations for users |
|
SystemUpgrade.Migrate |
Instance migration required due to underlying upgrades |
Critical |
Undefined |
When Alibaba Cloud upgrades the physical infrastructure, instances in the affected region and zone may need migration. This event is sent in advance. |
View event details in the ECS console and migrate the instance as prompted. See Instance migration due to underlying upgrades. |
Burstable instance performance restriction events
|
Event code |
Event name |
Event severity level |
CloudMonitor event name |
Event description and impact |
Recommendations for users |
|
Instance:BurstablePerformanceRestricted |
Burstable instance performance is restricted |
Warning |
Instance:BurstablePerformanceRestricted: Burstable instance performance is restricted |
Sent immediately when the accrued CPU credits of a burstable instance are depleted. |
Response options:
To customize the notification threshold (for example, alert when CPU credits are less than 10 for 10 consecutive minutes), set a threshold-based alert rule in CloudMonitor. See Monitor burstable instances. |
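The example threshold above (CPU credits less than 10 for 10 consecutive minutes) can be expressed as a simple check. This is a minimal sketch of the rule's logic only, not CloudMonitor configuration. The function name `credits_alert` is hypothetical, and the sketch assumes one CPU credit sample per minute, ordered from oldest to newest.

```python
def credits_alert(samples, threshold=10, minutes=10):
    """Return True if the most recent `minutes` samples are all below `threshold`."""
    if len(samples) < minutes:
        return False  # not enough data to cover the full period
    return all(value < threshold for value in samples[-minutes:])
```

With this logic, a single sample at or above the threshold inside the most recent ten minutes resets the condition, which is what "consecutive" implies.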
Status change events
|
Event code |
Event name |
Event severity level |
CloudMonitor event name |
Event description and impact |
Recommendations for users |
|
Instance:PreemptibleInstanceInterruption |
Spot instance interruption notification |
Warning |
Instance:PreemptibleInstanceInterruption: Spot instance interruption notification |
Sent 5 minutes before a spot instance is reclaimed. |
Recommendations:
|
|
Instance:ModifyInstanceSpec.Reboot |
Instance restart required for instance type change to take effect |
Critical |
|
After an instance type change, the instance must restart for the new configuration to take effect. If you do not restart within seven days, the system forcibly restarts the instance. |
Recommendations:
|
|
Instance:PerformanceModeChange |
Performance mode switchover of burstable instance |
Warning |
Instance:PerformanceModeChange: Performance mode switchover of burstable instance |
Generated when a burstable instance switches between unlimited mode and standard mode. |
To follow this event, subscribe to system event notifications in CloudMonitor. |
|
Instance:StateChange |
Instance status change notification |
Information |
Instance:StateChange: Instance status change notification |
Generated when the instance status changes, for example, from Running to Stopping or from Stopping to Stopped. |
To follow this event, subscribe to system event notifications in CloudMonitor. |
|
Instance:AutoReactivateCompleted |
Automatic reactivation completed |
Information |
Instance:AutoReactivateCompleted: Automatic reactivation completed |
Generated when you pay the overdue bill and the instance automatically restarts. |
To follow this event, subscribe to system event notifications in CloudMonitor. |
|
Instance:LiveMigrationAcrossDDH |
Instance hot migration between dedicated hosts |
Information |
Instance:LiveMigrationAcrossDDH: Instance hot migration between dedicated hosts |
Generated when an instance is hot migrated. |
To follow this event, subscribe to system event notifications in CloudMonitor. |
|
Disk:DiskOperationCompleted |
Disk operation completed |
Information |
Disk:DiskOperationCompleted: Disk operation completed |
Generated when a pay-as-you-go disk is manually attached or detached. |
To follow this event, subscribe to system event notifications in CloudMonitor. |
|
Disk:ConvertToPostpaidCompleted |
Disk converted to pay-as-you-go |
Information |
Disk:ConvertToPostpaidCompleted: Disk converted to pay-as-you-go |
Generated when a subscription disk is converted to pay-as-you-go. |
To follow this event, subscribe to system event notifications in CloudMonitor. |
|
Snapshot:CreateSnapshotCompleted |
Disk snapshot created |
Information |
Snapshot:CreateSnapshotCompleted: Disk snapshot created |
Generated when a disk snapshot is created. |
To follow this event, subscribe to system event notifications in CloudMonitor. |
|
Snapshot:SnapshotDeleted |
Snapshot deletion completed event |
Information |
Snapshot:SnapshotDeleted: Snapshot deletion completed event |
Generated when a manual or automatic snapshot is deleted. |
None |
Instance performance risk events
|
Event code |
Event name |
Event severity level |
CloudMonitor event name |
Event description and impact |
Recommendations for users |
|
Instance:CPUPerformanceReachLimit |
Instance CPU performance reaches the upper limit of the instance type |
Warning |
Instance:CPUPerformanceReachLimit:Executed: Instance CPU performance reaches the upper limit of the instance type |
Alibaba Cloud detects that the CPU utilization of the instance has reached 100% or the upper limit of its instance type.
Note: The event is sent if the CPU upper limit defined for the instance type is reached twice within the last three minutes. |
Sustained CPU usage at the instance type limit may affect your business. Adjust your configuration as needed. See Discover and troubleshoot instance issues. |
|
Instance:StoragePerformanceReachLimit |
Instance storage performance reaches the upper limit of the instance type |
Warning |
Instance:StoragePerformanceReachLimit:Executed: Instance storage performance reaches the upper limit of the instance type |
Alibaba Cloud detects that the disk bandwidth or IOPS of the instance has reached the upper limit of its instance type. Examples:
Note: This event is not supported for ECS instances of generations earlier than Generation 6. The event is sent if the storage performance upper limit defined for the instance type is reached twice within the last three minutes. |
Sustained storage performance at the instance type limit may affect your business. Adjust your configuration as needed. See Discover and troubleshoot instance issues. |
|
Instance:NetworkPerformanceReachLimit |
Instance network performance reaches the upper limit of the instance type |
Warning |
Instance:NetworkPerformanceReachLimit:Executed: Instance network performance reaches the upper limit of the instance type |
Alibaba Cloud detects that the network performance of the instance has reached the upper limit of its instance type. Examples:
Note: The event is sent if the network performance upper limit defined for the instance type is reached twice within the last three minutes. |
Sustained network performance at the instance type limit may affect your business. Adjust your configuration as needed. See Discover and troubleshoot instance issues. |
|
Instance:StatusCheckFailed |
Instance status check failed |
Warning |
|
Alibaba Cloud detects a connectivity exception for the instance. Examples:
|
Troubleshoot the connectivity exception promptly. See Diagnose network connectivity. |
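The trigger condition stated for the performance limit events (the limit is reached twice within the last three minutes) can be read as a sliding-window rule. The sketch below is one interpretation for use in your own monitoring code, not a description of how Alibaba Cloud implements detection. The class name `LimitHitDetector` is hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque
from datetime import datetime, timedelta


class LimitHitDetector:
    """Reports when the limit was reached `hits` times within `window`."""

    def __init__(self, hits=2, window=timedelta(minutes=3)):
        self.hits = hits
        self.window = window
        self._times = deque()

    def record(self, reached_at: datetime) -> bool:
        """Record one limit hit; return True if the rule is now satisfied."""
        self._times.append(reached_at)
        # Drop hits that fall outside the window ending at the latest hit.
        while reached_at - self._times[0] > self.window:
            self._times.popleft()
        return len(self._times) >= self.hits
```

Two hits that are two minutes apart satisfy the rule, while two hits that are five minutes apart do not, because the first hit has left the window by the time the second arrives.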