Elasticsearch:Event hub

Last Updated:Nov 28, 2025

You can use the Event Center to view system O&M events for Alibaba Cloud Elasticsearch (ES). This helps you promptly detect service anomalies and quickly analyze and locate issues. This topic describes the event categories for ES and how to view and handle events.

Event categories

ES events are categorized by cause and impact as follows.

Note

For more information, see Appendix: Event details.

System change

  Definition: System change events are initiated by Alibaba Cloud. You are notified of these events and must check if your cluster is affected.

  Cause and impact: System change events caused by infrastructure changes or faults may affect cluster access. When this type of event is triggered, the system sends a notification. Check the notification and your cluster status promptly.

  Examples:

  • Kibana feature upgrade causes a brief service suspension.

  • AMD instance families are upgraded to the latest generation.

Cluster health

  Definition: The system regularly inspects cluster health based on actual usage. It displays unexpected diagnostic results as events.

  Cause and impact: To ensure the sustainability of the Alibaba Cloud service, the system automatically triggers a cluster health event when it detects a cluster resource anomaly or risk. This minimizes the impact.

  Note

  During the execution of an O&M event, the cluster may experience brief jitter but normal access is not affected. If automatic execution fails, you can manually trigger a node restart on the Event Center page. The manual intervention window is 24 to 48 hours. For specific execution times, see View and handle events.

  Examples:

  • An inspection finds that an ES node is offline.

Cluster change

  Definition: These are operation events that you initiate to change a cluster. Failures or blocks can occur during the change process.

  Cause and impact: Cluster change events caused by instance type changes or kernel upgrades trigger a restart of the corresponding nodes. During the execution of an O&M event, the cluster may experience brief jitter but normal access is not affected.

  Examples:

  • Scale-in

  • Restart a node

View and handle events

On the Event Center page, you can view information about events generated under the current account and handle them as needed.

  1. Go to the Event Center.

    1. Log on to the Alibaba Cloud Elasticsearch console.

    2. In the navigation pane on the left, click Event Center.

  2. View event information.

    On the Event Center page, you can filter events by instance, event type, and time period, and then perform operations based on the event details.

    Note

    You can view all event information in the Event Center. You can also subscribe to events and set notifications for critical alerts that require prompt handling. When an alert is triggered, the system automatically sends an alert notification to the specified alert contacts by phone, text message, or email.

    The event information and related handling operations are described in the following table.

    Event information

    Description

    Cluster ID

    The ID of the Alibaba Cloud ES instance that generated the event.

    Node ID

    The ID of the instance node that generated the event.

    Event Level

    The severity of the event. Levels include the following:

    • Info: Records the status or operations of the system during normal operation. Often used for system status observation or debugging.

    • Warning: A potential issue or anomaly exists in the system but does not affect the current operation. Continuous monitoring is required.

    • Critical: A serious error or fault has occurred in the system. Immediate handling is required. Otherwise, service unavailability or data loss may occur.

    Event Status

    The execution status of the event. Statuses include To Be Handled, In Progress, Handled, Handling Failed, Handling Interrupted, Canceled, Execution to be confirmed, and Ready to continue. The following statuses need further explanation:

    • To Be Handled: The event is waiting to be executed at the system-set time or your scheduled time.

    • Execution to be confirmed: You can decide whether to execute the event immediately or create a snapshot backup for the event based on the event details.

      Note
      • Only some events related to local disks in system change events support this status.

      • Only deployment events, such as an ES cluster upgrade or deploying a new version to a specified node, support snapshot backups.

    • Ready to continue: The change task has completed its grayscale (phased) rollout on a subset of nodes. Confirm that the changed nodes and the cluster are stable, and then decide whether to execute the subsequent tasks. For example, a change is first applied to some nodes and, after it is verified in that small scope, is executed on all nodes.

    For events in the Handling Failed or Handling Interrupted state, find the cause and handle them promptly to avoid affecting normal business operations.

    Event Description

    The cause and impact of the event.

    Occurred At and Ended At

    The start and end time of the event execution.

    Scheduled Handling Time and Execution End Time

    The scheduled start time and estimated end time of the event.

    Note

    Only system change events support this setting.

    Source

    The source of the event. Sources include the following:

    • Proactive Notification: ES proactively pushes events to Event Center after they are generated.

    • Event Subscription: You subscribe to listen for specified events. When an event occurs, the system receives a corresponding notification.

    Suggestion

    You can handle events based on the recommended operations. The supported handling operations vary by event. The operations shown in the console take precedence.

    • Contact Technical Support: If you have questions about an event, you can contact technical support for consultation.

    • Restart: Immediately restart the specified node of the related instance.

    • Schedule Restart: You must specify a restart time. The system will restart the specified node of the related instance at the scheduled time. The node restart time must be at least 5 minutes later than the scheduled time. The system will restart the node for you within 5 minutes of the scheduled time.

    Note

    When you restart, forcibly restart, or perform a grayscale restart on the current instance or node, the system automatically triggers the execution of a restart event for that instance or node. However, for redeployment events, such as an ES version upgrade, you still need to submit a ticket to contact technical support.
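The level and status values described above can be used to decide which events to look at first. The following is a minimal sketch, not an Alibaba Cloud API: the `Event` record, the `triage` and `sort_for_review` helpers, and the action labels are hypothetical names introduced for illustration. Only the level and status strings are taken from the table.

```python
from dataclasses import dataclass

# Status and level strings are copied from the table above. Everything else
# (the Event record, the action labels) is a hypothetical illustration.
NEEDS_FIX = {"Handling Failed", "Handling Interrupted"}
NEEDS_DECISION = {"Execution to be confirmed", "Ready to continue"}
LEVEL_RANK = {"Critical": 0, "Warning": 1, "Info": 2}


@dataclass
class Event:
    cluster_id: str
    level: str
    status: str


def triage(event: Event) -> str:
    """Return a coarse action hint for one event."""
    if event.status in NEEDS_FIX:
        return "investigate"   # find the cause and handle it promptly
    if event.status in NEEDS_DECISION:
        return "confirm"       # waiting for your decision in the console
    if event.status == "To Be Handled":
        return "scheduled"     # will run at the system-set or scheduled time
    return "none"


def sort_for_review(events):
    """Order events so that actionable and more severe ones come first."""
    action_rank = {"investigate": 0, "confirm": 1, "scheduled": 2, "none": 3}
    return sorted(
        events,
        key=lambda e: (
            action_rank[triage(e)],
            LEVEL_RANK.get(e.level, len(LEVEL_RANK)),  # unknown levels sort last
        ),
    )
```

Sorting by required action first and severity second reflects the guidance above that events in the Handling Failed or Handling Interrupted state should be handled promptly. Adjust the ordering if your own process weighs severity more heavily.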

Appendix: Event details

Each event is listed with its event type, event code and name, event level, CloudMonitor event names, and description and impact.

System change event

Event code: SystemUpdate.InfraDiskError

Event name: System change event due to infrastructure disk failure

Event level: Critical

CloudMonitor event names:

  • Instance:SystemUpdate.InfraDiskError:Executing: System change event in progress due to infrastructure disk failure

  • Instance:SystemUpdate.InfraDiskError:Executed: System change event completed due to infrastructure disk failure

Description and impact: An infrastructure failure makes the local disk unavailable.

Event code: SystemUpdate.InfraDiskStalled

Event name: System change event due to infrastructure disk performance issues

Event level: Critical

CloudMonitor event names:

  • Instance:SystemUpdate.InfraDiskstalled:Executing: System change event in progress due to infrastructure disk performance issues

  • Instance:SystemUpdate.InfraDiskstalled:Executed: System change event completed due to infrastructure disk performance issues

Description and impact: The performance of the cloud disk is degraded due to an infrastructure failure.

Event code: SystemUpdate.InfraFailureStop

Event name: System change event due to an infrastructure-related instance stop

Event level: Critical

CloudMonitor event names:

  • Instance:SystemUpdate.InfraFailureStop:Scheduled: Scheduled system change event due to an infrastructure-related instance stop

  • Instance:SystemUpdate.InfraFailureStop:Executing: System change event in progress due to an infrastructure-related instance stop

  • Instance:SystemUpdate.InfraFailureStop:Executed: System change event completed due to an infrastructure-related instance stop

  • Instance:SystemUpdate.InfraFailureStop:Failed: System change event failed due to an infrastructure-related instance stop

Description and impact: The instance may stop due to a potential infrastructure failure.

Event code: SystemUpdate.InfraMigrate

Event name: System change event due to infrastructure maintenance

Event level: Critical

CloudMonitor event names:

  • Instance:SystemUpdate.InfraMigrate:Scheduled: Scheduled system change event due to infrastructure maintenance

  • Instance:SystemUpdate.InfraMigrate:Executing: System change event in progress due to infrastructure maintenance

  • Instance:SystemUpdate.InfraMigrate:Executed: System change event completed due to infrastructure maintenance

  • Instance:SystemUpdate.InfraMigrate:Failed: System change event failed due to infrastructure maintenance

Description and impact:

  • The instance node restarts due to infrastructure maintenance.

  • The instance node is redeployed due to infrastructure maintenance.

Event code: SystemUpdate.SoftwareRepair

Event name: System change event due to a software update

Event level: Warning

CloudMonitor event names:

  • Instance:SystemUpdate.SoftwareRepair:Scheduled: Scheduled system change event due to a software update

  • Instance:SystemUpdate.SoftwareRepair:Executing: System change event in progress due to a software update

  • Instance:SystemUpdate.SoftwareRepair:Executed: System change event completed due to a software update

Description and impact:

  • Description: The cluster control system restarts due to an upgrade. This upgrade involves changes to the Alibaba Cloud instance architecture, where the control deployment mode is upgraded from Basic Control (v2) to Cloud-native Control (v3).

    Note

    You can view the control deployment mode on the instance's Basic Information page.

  • Impact:

    • The upgrade is performed through a blue-green deployment within a scheduled time period. During this process, the number of cluster nodes doubles, but no extra fees are incurred.

    • The upgrade process takes several hours, depending on the data volume. The old nodes are taken offline during the O&M window you set. This process involves a service interruption of about 1 to 2 seconds. Instance change operations are not supported during the upgrade. Please make the necessary business preparations in advance.

    • Clusters of version 6.8.6 are upgraded to version 6.8.23. The engine is fully compatible, and your services are not affected.

    • After the upgrade, the Kibana private network is disabled. You need to log on to the Kibana console to enable it.

Cluster health event

Event code: HealthCheck.ClusterAbnormal

Event name: Cluster health event due to an abnormal cluster status

Event level: Critical

CloudMonitor event names:

  • Instance:HealthCheck.ClusterAbnormal:Executed: Cluster health event completed due to an abnormal cluster status

  • Instance:HealthCheck.ClusterAbnormal:Failed: Cluster health event failed due to an abnormal cluster status

Description and impact: The instance restarts due to an abnormal cluster status.

Cluster change event

Event code: UserOperator.InstanceSpecModify

Event name: Cluster change event due to an instance type change

Event level: Info

CloudMonitor event names:

  • Instance:UserOperator.InstanceSpecModify:Executing: Cluster change event in progress due to an instance type change

  • Instance:UserOperator.InstanceSpecModify:Executed: Cluster change event completed due to an instance type change

Description and impact:

  • The instance restarts due to an instance type change.

  • The instance node restarts due to an instance node change.

Event code: UserOperator.InstanceUpdate

Event name: Cluster change event due to an instance change operation

Event level: Info

CloudMonitor event names:

  • Instance:UserOperator.InstanceUpdate:Executing: Cluster change event in progress due to an instance change operation

  • Instance:UserOperator.InstanceUpdate:Executed: Cluster change event completed due to an instance change operation

Description and impact:

  • The instance restarts due to an instance configuration change.

  • The instance plugin is updated.

  • The IK dictionary plugin for the instance is hot-updated.

Event code: UserOperator.InstanceCoreUpdate

Event name: Cluster change event due to an instance kernel upgrade

Event level: Info

CloudMonitor event names:

  • Instance:UserOperator.InstanceCoreUpdate:Executing: Cluster change event in progress due to an instance kernel upgrade

  • Instance:UserOperator.InstanceCoreUpdate:Executed: Cluster change event completed due to an instance kernel upgrade

Description and impact: The instance restarts due to a kernel version update.
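The CloudMonitor event names in this appendix appear to follow the pattern `Instance:<event code>:<phase>`, and the prefix of the event code matches the event type (`SystemUpdate`, `HealthCheck`, or `UserOperator`). The following is a minimal sketch of how a notification handler might split such a name. It assumes that pattern holds for every event. The `parse_event_name` function and the `ParsedEventName` type are hypothetical helpers, not part of any Alibaba Cloud SDK.

```python
from typing import NamedTuple

# Prefix-to-type mapping inferred from the event codes listed in this appendix.
CATEGORY_BY_PREFIX = {
    "SystemUpdate": "System change event",
    "HealthCheck": "Cluster health event",
    "UserOperator": "Cluster change event",
}


class ParsedEventName(NamedTuple):
    resource: str   # for example "Instance"
    code: str       # for example "SystemUpdate.InfraMigrate"
    phase: str      # for example "Scheduled", "Executing", "Executed", "Failed"
    category: str   # event type looked up from the code prefix


def parse_event_name(name: str) -> ParsedEventName:
    """Split 'Instance:<code>:<phase>' and look up the event type."""
    parts = name.split(":")
    if len(parts) != 3:
        raise ValueError(f"unexpected event name format: {name!r}")
    resource, code, phase = (p.strip() for p in parts)
    prefix = code.split(".", 1)[0]
    # Unknown prefixes are reported as "unknown" instead of raising,
    # because new event codes may be added over time.
    category = CATEGORY_BY_PREFIX.get(prefix, "unknown")
    return ParsedEventName(resource, code, phase, category)
```

For example, `parse_event_name("Instance:SystemUpdate.InfraMigrate:Failed")` yields the code `SystemUpdate.InfraMigrate`, the phase `Failed`, and the type `System change event`, which is enough to route failed system changes to an on-call contact.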