You may need to restart a cluster or its nodes to apply configuration changes or resolve cluster exceptions. To perform this operation safely and efficiently, it is important to understand the scenarios and risks associated with different restart methods.
Preparations
To ensure a smooth restart, complete the following health checks and preparations before you begin.
Check the cluster health status
Connect to the cluster through Kibana and run the GET _cluster/health command. Ensure that the value of the status field is green.
Exception: You can perform a forced restart only if the cluster status is yellow or red.
Ensure data redundancy
Run the GET _cat/indices?v command to check the value of the rep (number of replicas) field for all critical indices.
Ensure that the number of replicas is at least 1. Indices without replicas become inaccessible during the restart.
For multi-zone instances, ensure that the number of replicas for any index is less than the number of zones.
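The health and replica checks above can be scripted once the responses are available as JSON. The following is a minimal sketch, not an official tool. It assumes you have already fetched the body of GET _cluster/health and of GET _cat/indices?format=json (the JSON form of the ?v listing, where rep is returned as a string). The function names, the return labels, and the sample index names are hypothetical.

```python
from typing import Dict, List, Optional


def allowed_restart_family(cluster_health: Dict) -> str:
    """Map the cluster status to the restart family described in this topic.

    green      -> "non_forced" (blue-green, standard, or grayscale restart)
    yellow/red -> "forced"
    """
    status = cluster_health["status"]
    if status == "green":
        return "non_forced"
    if status in ("yellow", "red"):
        return "forced"
    raise ValueError(f"unexpected cluster status: {status!r}")


def _replica_count(row: Dict) -> Optional[int]:
    # Some rows (for example closed indices) may carry no replica value.
    rep = row.get("rep")
    return int(rep) if rep not in (None, "") else None


def replica_problems(cat_indices: List[Dict], zone_count: int = 1) -> Dict[str, str]:
    """Return {index name: problem} for indices that break the replica rules."""
    problems: Dict[str, str] = {}
    for row in cat_indices:
        rep = _replica_count(row)
        if rep is None:
            continue  # nothing to evaluate for this row
        if rep < 1:
            problems[row["index"]] = "no replica"
        elif zone_count > 1 and rep >= zone_count:
            problems[row["index"]] = "replicas not below zone count"
    return problems


# Hypothetical sample data shaped like the two API responses.
sample_health = {"status": "green"}
sample_rows = [
    {"index": "orders", "rep": "1"},
    {"index": "logs-tmp", "rep": "0"},
]
print(allowed_restart_family(sample_health))
print(replica_problems(sample_rows))
```

Keeping the functions free of network calls means they can be run against saved responses before you touch the console.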
Check and handle closed indices
Run the GET _cat/indices?v command to check for any indices that have a status of close.
Reason: Closed indices cause the cluster health check to fail and prevent shards from being allocated. This blocks the restart process.
Action: If closed indices exist, run the POST /<index_name>/_open command to open them.
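As an illustration of the action above, the sketch below turns a parsed GET _cat/indices?format=json response into the list of open requests to run in Kibana. It only builds the request lines and sends nothing. The function name and the sample index names are hypothetical.

```python
from typing import Dict, List


def open_requests_for_closed(cat_indices: List[Dict]) -> List[str]:
    """Build one 'POST /<index_name>/_open' line per closed index."""
    return [
        f"POST /{row['index']}/_open"
        for row in cat_indices
        if row.get("status") == "close"
    ]


# Hypothetical sample rows shaped like the _cat/indices JSON output.
sample_rows = [
    {"index": "orders", "status": "open"},
    {"index": "archive-2023", "status": "close"},
]
for request_line in open_requests_for_closed(sample_rows):
    print(request_line)
```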
Assess the cluster load
On the Cluster Monitoring page of the instance, check the following core metrics. Ensure that resource usage is within the required limits to reserve sufficient resources for shard migration during the restart.
Node CPU usage: Must be below 80%.
Node HeapMemory usage: Must be around 50%.
Node load_1m: Must be below the number of CPU cores of the data node.
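The three thresholds above can be expressed as a small pre-restart check. This is a sketch under stated assumptions: the metric values are read manually from the Cluster Monitoring page and passed in as numbers, and "around 50%" is interpreted as an upper bound of 50% through the hypothetical heap_ceiling_percent parameter, which you can adjust.

```python
from typing import List


def load_violations(
    cpu_percent: float,
    heap_percent: float,
    load_1m: float,
    cpu_cores: int,
    heap_ceiling_percent: float = 50.0,  # assumption: reading of "around 50%"
) -> List[str]:
    """Return the list of thresholds that a data node currently exceeds."""
    violations: List[str] = []
    if cpu_percent >= 80.0:
        violations.append("Node CPU usage is not below 80%")
    if heap_percent > heap_ceiling_percent:
        violations.append("Node HeapMemory usage is above the assumed ceiling")
    if load_1m >= cpu_cores:
        violations.append("Node load_1m is not below the number of CPU cores")
    return violations


# Hypothetical readings for a 4-core data node.
print(load_violations(cpu_percent=45.0, heap_percent=48.0, load_1m=1.5, cpu_cores=4))
```

An empty list means that all three metrics are within the limits used by this sketch.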
Restart
After you complete the health checks, follow these steps to restart the instance.
Log on to the Alibaba Cloud Elasticsearch console. In the navigation pane on the left, click Elasticsearch Clusters.
In the top menu bar, select the region where the target instance is located. Click the ID of the target instance. On the Basic Information page, click Restart in the upper-right corner.

In the Restart dialog box that appears, configure the following parameters as required.

Object
Cluster: Restarts all nodes in the instance. This option is suitable for cluster-level changes.
Node Restart: Restarts one or more specified nodes that you select. This option is suitable for resolving issues with individual nodes.
Node Role (For basic management cluster v2 only): Restarts nodes of a specific role that you select, such as data nodes or Kibana nodes.
Blue-green Update and Restart Mode
A restart operation can affect the stability and availability of your cluster. Before you restart the cluster, select a restart method that is suitable for your specific scenario, cluster status, and risk tolerance.
Blue-green change
Required cluster status: Normal (green)
Use cases: This operation adds new nodes to the cluster, migrates data from the original nodes to the new nodes, and then deletes the original nodes. This method is suitable for scenarios where a single node in the cluster has poor performance, such as consistently high CPU usage, and you have high requirements for cluster availability but are not sensitive to the change duration.
Important: A blue-green change cannot be used with a forced restart.
Service impact: The node IP addresses change. The cluster performance may experience brief fluctuations.
Applicable instance versions: Not supported for 1-core 2 GB specifications.

Restart (Standard)
Required cluster status: Normal (green)
Use cases: Planned maintenance and regular cluster configuration.
Service impact: The node IP addresses do not change. The restart takes a long time. If replica shards exist, the service remains available but may experience brief fluctuations.
Applicable instance versions: All versions.

Grayscale restart
Required cluster status: Normal (green)
Use cases: Use this method in a production environment to verify the restart effect in batches and reduce overall risks. If you select this option, you must first select the nodes for the grayscale restart. After the first batch of nodes is restarted and the cluster is stable, manually trigger the subsequent change to restart the remaining nodes.
Service impact: The node IP addresses do not change. Some nodes are restarted first for observation, and then the remaining nodes are restarted.
Applicable instance versions: For cloud-native new management (v3) clusters only.

Forced restart
Required cluster status: Abnormal (yellow/red)
Use cases: When the instance is in an unhealthy state (yellow or red), other restart operations are disabled. You must perform a forced restart.
Important: When disk usage exceeds the cluster.routing.allocation.disk.watermark.low threshold, the cluster may enter an unhealthy state (yellow or red). During this period, avoid the following operations:
Node scale-out
Disk scale-out
Restart (standard or forced)
Password change
Other configuration changes
Perform these operations only after the instance returns to a healthy state (green).
Service impact: The node IP addresses do not change.
Increasing the concurrency can significantly speed up the forced restart, but it also has a greater impact:
High concurrency risk: If set to 100%, all nodes are restarted at the same time. This causes a complete service interruption and may lead to the loss of cached data that is not persisted.
Recommendation: Use a high concurrency setting when the cluster is abnormal and needs to be recovered urgently.
Concurrency: The percentage of nodes that are restarted at the same time. The default value is 10% of the total number of nodes in the cluster, rounded up to at least 1 node. For example, if the concurrency is set to 10%, 10% of the nodes in the cluster are restarted at a time.
This parameter is displayed only in forced restart mode.
Applicable instance versions: All versions.
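To make the concurrency setting of a forced restart concrete, the sketch below estimates how many nodes restart at a time and how many batches a restart takes. It assumes that fractional results are rounded up to a whole node, which is one reading of "rounded up to at least 1 node". The console may round differently, and the function names are hypothetical.

```python
def nodes_per_batch(total_nodes: int, concurrency_percent: int = 10) -> int:
    """Nodes restarted at the same time for a given concurrency percentage."""
    if total_nodes < 1:
        raise ValueError("total_nodes must be at least 1")
    if not 1 <= concurrency_percent <= 100:
        raise ValueError("concurrency_percent must be between 1 and 100")
    # Integer ceiling of total_nodes * percent / 100 (assumed rounding rule),
    # with a floor of one node so that the restart always makes progress.
    return max(1, (total_nodes * concurrency_percent + 99) // 100)


def batch_count(total_nodes: int, concurrency_percent: int = 10) -> int:
    """Number of sequential batches needed to restart every node."""
    per_batch = nodes_per_batch(total_nodes, concurrency_percent)
    return (total_nodes + per_batch - 1) // per_batch


# Hypothetical cluster sizes.
for total in (5, 25):
    print(total, nodes_per_batch(total), batch_count(total))
print(6, nodes_per_batch(6, 100), batch_count(6, 100))
```

Under this assumption, a concurrency of 100% always yields a single batch, which matches the complete service interruption described for that setting.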
After you confirm the parameters, click OK.
If you perform a forced restart, you must also select Restart Cluster Forcibly. After the operation starts, the instance status changes to Applying. You can view the restart progress in the task list in the upper-right corner of the page. After the restart is complete, the instance status changes back to Normal.