Alibaba Cloud defines system events to record resource information and notify you of it, such as the execution states of O&M tasks, resource exceptions, and resource state changes.

System event categories

System events are classified into the following categories based on their causes:
  • Unexpected O&M events: This category of system event is triggered when Elastic Compute Service (ECS) instances restart or go down due to specific issues, such as kernel panic, out-of-memory errors, or failures in underlying hosts caused by hardware or software. Unexpected O&M events are displayed in the ECS console. Alibaba Cloud sends these events in real time and restores affected ECS resources as soon as possible. At the same time, Alibaba Cloud notifies you of the execution states of system O&M tasks related to the events.
  • Scheduled O&M events: Alibaba Cloud may plan to upgrade host software for security reasons or take action against foreseeable failure risks in underlying host hardware and software. If the O&M tasks to be executed may affect the availability or performance of your ECS resources, Alibaba Cloud sends scheduled O&M events in advance to notify you of task details such as the execution time, affected objects, and impacts. After you receive a scheduled O&M event, you can select an off-peak period within the event execution window to handle the event with minimal impact on your business. This category of system events is displayed in the ECS console.
  • Instance billing events: This category of system events is triggered due to the upcoming stop or release of instances. For example, instance billing events are triggered when instances expire or when payments for instances become overdue. Instance billing events are displayed in the ECS console.
  • Security events: This category of system events is triggered when instances face security threats. For example, security events are triggered when instances suffer DDoS attacks or when blackhole filtering is triggered. Security events are displayed in the ECS console.
  • State change events: This category of system events is triggered when your operations on instances, such as manually starting or stopping instances, cause instance lifecycle states to change, or when changes to instance attributes cause lifecycle states or other states to change. State change events are classified into the following categories:
    • Lifecycle state change events: For example, lifecycle state change events are triggered when instances enter a different state, when preemptible instances are interrupted, and when snapshots are created. This category of system events is not displayed in the ECS console.
    • Other attributes change events: For example, other attributes change events are triggered when the performance mode of burstable instances is changed or when subscription disks are changed to pay-as-you-go disks. Only some of these events are displayed in the ECS console.
System events are classified into the following categories based on their severity:
  • Critical events: This category of system events may result in instance unavailability and must be handled at your earliest opportunity. For example, a critical system event is triggered when an instance is released due to an overdue payment or when an instance is redeployed due to an instance error.
  • Warning events: This category of system events affects your business. For example, a warning event is triggered when a burstable instance cannot burst above its performance baseline. You must pay close attention to these events or handle them when appropriate.
  • Notification events: You can choose to pay attention to this category of system events. For example, a notification event is triggered when a snapshot is created for a disk.
For information about system events that ECS supports, see Summary. ECS event types and CloudMonitor events follow specific naming conventions for easy understanding. For more information about the naming conventions of ECS event types and CloudMonitor events, see the Formats of ECS event type and CloudMonitor event names section in this topic.
Note Many Alibaba Cloud services support system events, such as ECS, ApsaraDB RDS, and Server Load Balancer (SLB). This topic describes ECS system events. For information about system events of other Alibaba Cloud services, see the corresponding documentation.

Use scenarios of system events

  • Notification of risks and exceptions

    After system events that can be displayed in the ECS console are triggered, Alibaba Cloud pushes the events to the ECS console. These events include those that affect the availability and performance of ECS resources, such as SystemMaintenance.Reboot events among scheduled O&M events and InstanceExpiration.Stop events among instance billing events. For some critical system events, Alibaba Cloud sends additional emails or internal messages. You can handle these events by using the ECS console or by calling API operations. We recommend that you handle the system events at your earliest opportunity to ensure resource availability and performance. For more information, see Query and handle ECS system events.

    For example, when a subscription instance is about to expire, the ECS console prompts you to renew the instance within a specified period of time to ensure service continuity.

  • Automated O&M
    Event states are defined for system events displayed in the ECS console to help you understand the execution states of system O&M tasks. Meanwhile, new system events and changes in system event states are reported to CloudMonitor so that you can build an event-driven automated O&M system based on your business requirements. For more information about event states, see States and windows of system events.
    Note Each event state corresponds to a CloudMonitor event. For example, the Executing and Executed states that the InstanceFailure.Reboot ECS event type supports correspond to the Instance:InstanceFailure.Reboot:Executing and Instance:InstanceFailure.Reboot:Executed CloudMonitor events.

    Some state change events are not displayed in the ECS console and cannot be handled by using the ECS console or by calling API operations. Examples: events that indicate instance state changes or interruptions of preemptible instances. No event states are defined for these system events. However, these events are still reported to CloudMonitor when they are triggered so that you can build an event-triggered automated O&M system based on your business requirements.

    For example, state change events are triggered when you manually start or stop instances. These events do not indicate risks or exceptions. If you want to log your operations to your system, you can configure event notifications for state change events and use the alert callback feature to write the startup and stop information of instances to operation logs.
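
The logging scenario above can be sketched in Python as follows. This is only an illustration under stated assumptions: the JSON shape of the callback body (a top-level name field and a content object carrying resourceId and state) is hypothetical and is not taken from this topic, and the function names are invented for the sketch. Check the actual callback payload before relying on any field.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def state_change_log_line(body: str, received_at: Optional[datetime] = None) -> Optional[str]:
    """Turn one alert callback body into an operation-log line.

    Assumed (hypothetical) body shape:
    {"name": "Instance:StateChange",
     "content": {"resourceId": "i-example", "state": "Running"}}
    """
    event = json.loads(body)
    if event.get("name") != "Instance:StateChange":
        return None  # Not a state change event; nothing to log.
    content = event.get("content") or {}
    instance_id = content.get("resourceId", "unknown")
    state = content.get("state", "unknown")
    when = (received_at or datetime.now(timezone.utc)).isoformat()
    return f"{when} instance={instance_id} state={state}"


def append_operation_log(path: str, line: str) -> None:
    """Append one line to a local operation log file."""
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(line + "\n")
```

A web endpoint that receives the callback could pass the request body to state_change_log_line and, if the result is not None, hand it to append_operation_log.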

States and windows of system events

The following table describes the event states defined for system events that are displayed in the ECS console.
Note For information about the event states that different system events support, see the "CloudMonitor event" columns of tables in Summary.
Event state | Attribute | Description
Inquiring | Intermediate | The O&M task related to the system event is pending authorization. After you authorize the task to be executed, the event enters the Executing state.
Scheduled | Intermediate | The O&M task related to the system event is scheduled and pending execution. When the O&M task is executed, the event enters the Executing state.
Executing | Intermediate | The O&M task related to the system event is being executed.
Executed | Stable | The O&M task related to the system event is completed.
Avoided | Stable | The impacts of the system event are prevented because you have migrated the affected instance within the user operation window.
Failed | Stable | The O&M task related to the system event failed.
Canceled | Stable | The O&M task related to the system event is automatically canceled.
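
For automation code, the states and their attributes listed above can be captured in a small lookup. The following Python sketch is an assumption-laden illustration: the EventState class and the needs_attention helper are hypothetical names, and treating every intermediate state as "still needs attention" is a design choice of the sketch, not a rule stated in this topic.

```python
from enum import Enum


class EventState(Enum):
    """Event states and whether each one is stable (True) or intermediate (False)."""

    INQUIRING = ("Inquiring", False)
    SCHEDULED = ("Scheduled", False)
    EXECUTING = ("Executing", False)
    EXECUTED = ("Executed", True)
    AVOIDED = ("Avoided", True)
    FAILED = ("Failed", True)
    CANCELED = ("Canceled", True)

    def __init__(self, label: str, stable: bool) -> None:
        self.label = label
        self.stable = stable

    @classmethod
    def from_label(cls, label: str) -> "EventState":
        """Look up a state by the name used in the event, such as "Executing"."""
        for state in cls:
            if state.label == label:
                return state
        raise ValueError(f"unknown event state: {label!r}")


def needs_attention(label: str) -> bool:
    """Return True while the event is still in an intermediate state."""
    return not EventState.from_label(label).stable
```

For example, needs_attention("Inquiring") is True because the task is waiting for authorization, whereas needs_attention("Executed") is False.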
System events have the following windows:
  • User operation window
    The user operation window of a system event starts when the event is sent and ends when the O&M task related to the event is executed as scheduled. You can manually handle the event within the user operation window or wait for the O&M task to be executed automatically. Take note of the following items about the lengths of user operation windows:
    • In most cases, the user operation window of a scheduled O&M event ranges from 24 to 48 hours.
      Note The lengths of user operation windows are unlimited for system events in the Inquiring state. The O&M tasks related to the events can start only after you authorize the tasks to be executed.
    • Typically, unexpected O&M system events caused by failures or invalid operations do not have a user operation window.
    • For system events indicating that subscription instances are about to expire, the window is three days.
    • For system events indicating that pay-as-you-go instances are to be stopped due to overdue payments, the window is less than 1 hour.
  • Event execution window
    The execution window of a system event starts when the O&M task related to the event is executed and ends when the task is completed. Take note of the following items about the lengths of event execution windows:
    • For system events such as failure recovery events, the window lasts up to 10 minutes.
    • Unexpected O&M events caused by failures or invalid operations have a short event execution window.
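
The user operation window described above can be modeled as the interval between the time an event is sent and the time its O&M task is scheduled to start. The Python sketch below is illustrative only: the SystemEvent class and its field names are hypothetical, and a missing scheduled time is used here to represent an event in the Inquiring state, whose window is unlimited.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class SystemEvent:
    # Hypothetical field names, chosen for this sketch only.
    event_type: str
    published_at: datetime            # when the event was sent
    scheduled_at: Optional[datetime]  # planned start of the O&M task; None while Inquiring


def in_user_operation_window(event: SystemEvent, now: datetime) -> bool:
    """True if `now` falls between the send time and the scheduled execution."""
    if now < event.published_at:
        return False
    if event.scheduled_at is None:
        # Inquiring: the window stays open until the task is authorized.
        return True
    return now < event.scheduled_at


def time_left(event: SystemEvent, now: datetime) -> Optional[timedelta]:
    """Remaining time in the window, or None if the window has no fixed end."""
    if event.scheduled_at is None:
        return None
    return max(event.scheduled_at - now, timedelta(0))
```

An event whose scheduled time equals its send time has an empty window, which matches the case of unexpected O&M events that offer no user operation window.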

Formats of ECS event type and CloudMonitor event names

ECS event types and CloudMonitor events follow specific naming conventions for easy understanding.
  • ECS event types are named in the <Event cause>.<Event impact> format to indicate event causes and impacts on resources.
  • CloudMonitor events are named in the <Resource type>:<Event cause>.<Event impact>:<Event state> format to indicate resource types, event causes, event impacts on resources, and event states.
Note ECS event types and CloudMonitor events may include only some of the preceding information in their names. For example, a CloudMonitor event name of Disk:ErrorDetected:Executing indicates that a disk is damaged, and excludes information about impacts on resources.
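
As a rough illustration of these naming conventions, the following Python sketch splits a CloudMonitor event name into its parts and builds a name from an ECS event type and an event state. The function and class names are hypothetical, and the parsing rules are inferred only from the formats described above, so names that deviate from those formats are not handled.

```python
from typing import NamedTuple, Optional


class CloudMonitorEventName(NamedTuple):
    resource_type: str
    cause: str
    impact: Optional[str]  # None when the name carries no impact part
    state: Optional[str]   # None when the name carries no state part


def parse_event_name(name: str) -> CloudMonitorEventName:
    """Split <Resource type>:<Event cause>.<Event impact>:<Event state>."""
    parts = name.split(":")
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"unexpected CloudMonitor event name: {name!r}")
    resource_type, event_type = parts[0], parts[1]
    state = parts[2] if len(parts) == 3 else None
    # The ECS event type is <Event cause>.<Event impact>; the impact may be absent.
    cause, _, impact = event_type.partition(".")
    return CloudMonitorEventName(resource_type, cause, impact or None, state)


def format_event_name(resource_type: str, ecs_event_type: str,
                      state: Optional[str] = None) -> str:
    """Build a CloudMonitor event name from an ECS event type and a state."""
    name = f"{resource_type}:{ecs_event_type}"
    return f"{name}:{state}" if state else name
```

For instance, parse_event_name("Disk:ErrorDetected:Executing") yields a result whose impact is None, which reflects the case where a name includes only some of the information.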
The following table describes some examples of ECS event types and CloudMonitor events.
Note An ECS event type of Undefined indicates that the event is not displayed in the ECS console and cannot be handled by using the ECS console or by calling API operations. Example: the Instance:StateChange event.
Category | Example ECS event type | Example CloudMonitor event | Description
Scheduled O&M events | SystemMaintenance.Reboot | Instance:SystemMaintenance.Reboot:Inquiring |
  • Resource type: Instance indicates an ECS instance.
  • Event cause: SystemMaintenance indicates that Alibaba Cloud proactively initiates a system O&M task.
  • Event impact: Reboot indicates that the instance is to be restarted while the O&M task is being executed.
  • Event state: Inquiring indicates that the O&M task related to the event is pending authorization and the instance can be restarted only after you authorize the task to be executed.
Scheduled O&M events | SystemMaintenance.Reboot | Instance:SystemMaintenance.Reboot:Executed |
  • Resource type: Instance indicates an ECS instance.
  • Event cause: SystemMaintenance indicates that Alibaba Cloud proactively initiates a system O&M task.
  • Event impact: Reboot indicates that the instance is to be restarted while the O&M task is being executed.
  • Event state: Executed indicates that the instance is restarted.
Unexpected O&M events | ErrorDetected | Disk:ErrorDetected:Executing |
  • Resource type: Disk indicates a disk.
  • Event cause: ErrorDetected indicates that the local disk is damaged.
  • Event state: Executing indicates that the damaged local disk has not been repaired.
Unexpected O&M events | SystemFailure.Redeploy | Instance:SystemFailure.Redeploy:Executed |
  • Resource type: Instance indicates an ECS instance.
  • Event cause: SystemFailure indicates that the O&M task is caused by a system error.
  • Event impact: Redeploy indicates that the instance is to be deployed to another host while the O&M task is being executed.
  • Event state: Executed indicates that the instance is redeployed.
Lifecycle state change events | Undefined | Instance:StateChange |
  • Resource type: Instance indicates an ECS instance.
  • Event cause: StateChange indicates that the instance state changes.
Lifecycle state change events | Undefined | Snapshot:CreateSnapshotCompleted |
  • Resource type: Snapshot indicates a snapshot.
  • Event cause: CreateSnapshotCompleted indicates that the snapshot is created.
Instance billing events | InstanceExpiration.Stop | Instance:InstanceExpiration.Stop:Scheduled |
  • Resource type: Instance indicates an ECS instance.
  • Event cause: InstanceExpiration indicates that the subscription instance expires.
  • Event impact: Stop indicates that the instance is to be stopped on expiration.
  • Event state: Scheduled indicates that the instance is pending a scheduled stop.
Instance billing events | AccountUnbalanced.Stop | Instance:AccountUnbalanced.Stop:Avoided |
  • Resource type: Instance indicates an ECS instance.
  • Event cause: AccountUnbalanced indicates that you have overdue payments.
  • Event impact: Stop indicates that the instance is to be stopped due to an overdue payment.
  • Event state: Avoided indicates that you have added funds to your account and the scheduled stop operation on the instance is canceled.

Limits

Retired instance families do not support the system event feature. For more information, see Retired instance types.