Elasticsearch: Configure cluster alerts

Last Updated: Mar 26, 2026

When cluster status turns yellow or red, or when node disk or JVM heap usage climbs toward unsafe levels, you need to know before performance degrades or data is lost. Alibaba Cloud Elasticsearch integrates with Cloud Monitor to provide that visibility. Two paths are available: one-click alerts cover the most critical conditions with no per-instance setup, while Cloud Monitor alerts let you customize thresholds, combine multiple metrics, and route notifications to the right contacts.

Prerequisites

Before you begin, make sure you have:

Enable one-click alerts

One-click alerts are powered by Cloud Monitor and disabled by default. Enabling this feature automatically creates alert rules for every Elasticsearch instance under your account — no per-instance configuration needed. The preconfigured rules cover:

  • Abnormal cluster status

  • Node disk usage exceeding 75%

  • Node JVM heap usage exceeding 85%

To enable one-click alerts:

  1. Log on to the Alibaba Cloud Elasticsearch console.

  2. In the left-side navigation pane, click Elasticsearch Clusters.

  3. On the Elasticsearch Clusters page, click Initiative Alert.

  4. In the Initiative Alert dialog box, click Enable Now.

    If the button displays Disable Now, the one-click alert feature is already enabled. No further action is needed.
  5. In the Cloud Monitor console, enable the Proactive Alerting switch for the Elasticsearch service.

To verify that the feature is active:

  1. In the Elasticsearch console, click the target instance ID on the Elasticsearch Clusters page.

  2. In the left navigation pane, choose Monitoring and Logs > Cluster Monitoring.

  3. Click the Basic Monitoring tab and check the Initiative Alert status in the top-right corner. If Initiative Alert shows Enabled, the feature is active.

Configure Cloud Monitor alerts

Use Cloud Monitor alerts when you need custom thresholds, multi-metric conditions, or fine-grained notification routing. The steps below show an example that combines cluster status, node disk usage, and node heap memory usage into a single combined-metric rule.

Recommended metrics and thresholds

Configure alert rules based on the metrics and thresholds in the following table. For detailed metric descriptions and troubleshooting guidance, see Metric descriptions and troubleshooting suggestions.

| Metric | Required | Alert threshold | Notes |
| --- | --- | --- | --- |
| Cluster Status | Yes | >= 2 | Green = 0, Yellow = 1, Red = 2. With this encoding, >= 2 matches Red only; use >= 1 to also alert on Yellow. |
| NodeDiskUtilization(%) | Yes | >= 75% | Do not set the threshold above 80% |
| NodeHeapMemoryUtilization(%) | Yes | >= 85% | Do not set the threshold above 90% |
| NodeCPUUtilization(%) | Optional | < 95% | |
| Node Workload Within One Minute | Optional | 80% of CPU core count | Use as a reference value |
| ClusterQueryQPS (Count/Second) | Optional | Based on test results | |
| ClusterIndexQPS (Count/Second) | Optional | Based on test results | |
| Full GC Count | Optional | Any value > 0 indicates an issue | |
| Exception Count | Optional | Any value > 0 indicates an issue | |
| Snapshot Status | Optional | -1 or 0 = normal; 2 = abnormal | |
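
To make the recommended thresholds concrete, the following minimal Python sketch checks one metric reading against the required rows of the table and derives the load reference value from the CPU core count. The `NodeSample` class, its field names, and the `breached` and `load_reference` functions are hypothetical names chosen for illustration; they are not part of Cloud Monitor or any Elasticsearch API. Treating the load reference as a ">=" condition is also an assumption.

```python
from dataclasses import dataclass

# Numeric encoding from the table: Green = 0, Yellow = 1, Red = 2.
CLUSTER_STATUS_NAMES = {0: "green", 1: "yellow", 2: "red"}


@dataclass
class NodeSample:
    """One hypothetical metric reading; field names are illustrative only."""
    cluster_status: int   # 0, 1, or 2
    disk_pct: float       # NodeDiskUtilization(%)
    heap_pct: float       # NodeHeapMemoryUtilization(%)
    load_1m: float        # Node Workload Within One Minute
    cpu_cores: int        # number of CPU cores on the node


def load_reference(cpu_cores: int) -> float:
    """Reference load value from the table: 80% of the CPU core count."""
    return 0.8 * cpu_cores


def breached(sample: NodeSample, status_threshold: int = 2) -> list[str]:
    """Return labels for every recommended threshold the sample meets or exceeds."""
    hits = []
    if sample.cluster_status >= status_threshold:
        hits.append(f"cluster_status={CLUSTER_STATUS_NAMES[sample.cluster_status]}")
    if sample.disk_pct >= 75:
        hits.append("disk")
    if sample.heap_pct >= 85:
        hits.append("heap")
    # Assumption: alert when the one-minute load reaches the reference value.
    if sample.load_1m >= load_reference(sample.cpu_cores):
        hits.append("load")
    return hits
```

The `status_threshold` parameter defaults to the table's value of 2. Under the Green = 0, Yellow = 1, Red = 2 encoding, passing `status_threshold=1` would also flag a yellow cluster.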

Create an alert rule

  1. Go to the Cloud Monitor console.

  2. In the left navigation pane, choose Alert > Alert Rules.

  3. Click Create Alert Rule.

  4. On the Create Alert Rule page, configure the following parameters. For parameters not listed here, use the default values. For a full parameter reference, see Create an alert rule.

    | Parameter | Value |
    | --- | --- |
    | Product | Select Elasticsearch |
    | Resource Range | Select Instances |
    | Associated Resources | Add the instance to monitor |
    | Rule Description | See the combined-metric example below |
    | Alert Contact Group | Select a contact group |

    To configure the rule description with combined metrics, click Add Rule > Combined Metrics. In the Configure Rule Description panel, enter an Alert Rule name and set the following:

    • Metric Type: Combined Metrics

    • Alert Level: Warning (Warn)

    • Multi-metric Alert Condition: click Add Metric to add each metric:

      • Metric 1: Cluster ID > ClusterStatus, value >= 2

      • Metric 2: nodeName > Node Disk Usage, average >= 75%

      • Metric 3: nodeName > Node Heap Memory Usage_ES Business, average >= 85%

    • Relationship Between Metrics: Generate alert If one of the conditions is met (||)

    • Alert Threshold Triggers: 3 Consecutive Cycles (1 Cycle = 1 Minutes)

    To route alerts to an external system, expand Advanced Settings and enter a publicly accessible URL in the Alert Callback section. Cloud Monitor sends alert data to this URL via HTTP POST requests. Only HTTP is supported. For details, see Use alert callbacks.
  5. Click OK. When an alert triggers, the configured contact group receives a notification via the configured channel.
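
The combined-metric rule in step 4 can be read as "any condition met, for 3 consecutive one-minute cycles". The following Python sketch models that reading locally so you can reason about when the rule would fire. It is an illustration, not Cloud Monitor's implementation: the `CombinedRule` class and the metric keys are hypothetical, and it assumes the OR across metrics is evaluated once per cycle before the consecutive-cycle count is applied.

```python
from collections import deque
from typing import Callable, Deque, Dict

# Hypothetical metric keys mapped to the conditions from the example rule.
CONDITIONS: Dict[str, Callable[[float], bool]] = {
    "cluster_status": lambda v: v >= 2,
    "node_disk_usage_avg": lambda v: v >= 75,
    "node_heap_usage_avg": lambda v: v >= 85,
}


class CombinedRule:
    """Alert when any condition holds in each of N consecutive cycles."""

    def __init__(self, conditions: Dict[str, Callable[[float], bool]],
                 consecutive_cycles: int = 3) -> None:
        self.conditions = conditions
        # Keep only the most recent N per-cycle results.
        self.history: Deque[bool] = deque(maxlen=consecutive_cycles)

    def observe(self, cycle: Dict[str, float]) -> bool:
        """Record one cycle of readings; return True if the rule fires."""
        # "If one of the conditions is met (||)": logical OR across metrics.
        met = any(check(cycle[name]) for name, check in self.conditions.items())
        self.history.append(met)
        # Fire only when the window is full and every cycle in it matched.
        return len(self.history) == self.history.maxlen and all(self.history)
```

Feeding the rule three cycles in which a different metric breaches each time still fires on the third cycle under this reading, because the OR is applied per cycle. A single cycle with no breach resets the streak.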

Example: Configure disk alerts

To monitor disk usage at multiple severity levels, follow the steps in Create an alert rule and use a Simple Metric rule with the following configuration:

| Parameter | Example value |
| --- | --- |
| Alert Rule | Disk Usage Alert |
| Metric Type | Simple Metric |
| Metric | nodeName > Node Disk Usage |
| Threshold and alert level | Critical: average >= 80% for 3 consecutive cycles |
| | Warning: average >= 75% for 3 consecutive cycles |
| | Info: average >= 70% for 3 consecutive cycles |
| Chart Preview | Preview of the monitoring chart for the selected metric |
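
As a rough illustration of how the three severity levels relate, the sketch below picks the highest level whose threshold was met by every one of the last three per-cycle averages. The function name `disk_alert_level` and the `LEVELS` list are hypothetical, and the selection logic is an assumption about how tiered thresholds behave, not a description of Cloud Monitor internals.

```python
from typing import List, Optional, Sequence, Tuple

# Levels from the example configuration, ordered from most to least severe.
LEVELS: List[Tuple[str, float]] = [
    ("Critical", 80.0),
    ("Warning", 75.0),
    ("Info", 70.0),
]


def disk_alert_level(averages: Sequence[float], cycles: int = 3) -> Optional[str]:
    """Return the most severe level met by all of the last `cycles` averages."""
    if len(averages) < cycles:
        return None  # not enough data to cover the required window
    recent = averages[-cycles:]
    for level, threshold in LEVELS:
        # A level applies only if every recent average reached its threshold.
        if all(value >= threshold for value in recent):
            return level
    return None
```

For example, averages of 81, 76, and 82 would yield Warning rather than Critical under this reading, because one of the three cycles stayed below 80%.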

What's next