When an underlying host fails or is at risk, Alibaba Cloud sends a system event to the affected Elastic Compute Service (ECS) instances. By default, an affected instance automatically restarts in response to such events. To prevent the instance from restarting automatically, modify its maintenance attribute.
Background information
The instance maintenance attribute defines the default behavior of an ECS instance during both scheduled maintenance and unexpected system events. You can modify this attribute to specify whether the instance should, for example, automatically restart or remain in a stopped state after an event.
A change to the maintenance attribute does not affect a maintenance process that is already in progress. For example, if an instance is already undergoing an automatic restart for recovery, modifying its attribute does not stop the restart or alter the current recovery procedure. The following table describes the supported attributes.
Instance maintenance attribute | Related system events | Applicable instance types | Description |
Automatic restart for recovery (default) | | All instances that support system events. | The instance returns to the state it was in before the operations and maintenance (O&M) task was executed. |
Disable recovery on restart | | All instances that support system events. | The instance enters the Stopped state. This is useful if you have implemented disaster recovery mechanisms at the application level, such as node failover, and want to prevent conflicts caused by multiple active nodes. |
Automatic redeployment | | Only instances that depend on host hardware, such as instances that have local disks attached or support Software Guard Extensions (SGX) encrypted computing. For information about the related instance families, see Instance family. Note: After an instance is redeployed, data on its local disks is cleared, and the SGX feature is reset. | The instance is automatically redeployed to another host and then continues to provide services. |
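
The Description column can be read as a small decision rule. The following Python sketch is illustrative only: the enum member names and the `state_after_event` helper are hypothetical and are not part of any Alibaba Cloud SDK, and the "Running" result for redeployment is an inference from the statement that the instance continues to provide services.

```python
from enum import Enum


class MaintenanceAttribute(Enum):
    """Hypothetical names for the three attributes in the table."""
    AUTO_RECOVER = "auto_recover"    # Automatic restart for recovery (default)
    STOP = "stop"                    # Recovery on restart is disabled
    AUTO_REDEPLOY = "auto_redeploy"  # Automatic redeployment


def state_after_event(attr: MaintenanceAttribute, state_before: str) -> str:
    """Return the instance state expected once the system event is handled."""
    if attr is MaintenanceAttribute.AUTO_RECOVER:
        # The instance returns to whatever state it was in before the O&M task.
        return state_before
    if attr is MaintenanceAttribute.STOP:
        # The instance stays stopped so that application-level failover can act.
        return "Stopped"
    # Redeployment moves the instance to another host. Per the note in the
    # table, data on local disks is cleared and the SGX feature is reset.
    # "Running" is an assumption based on "continues to provide services".
    return "Running"
```

For example, `state_after_event(MaintenanceAttribute.STOP, "Running")` yields `"Stopped"`, which is the behavior you would want when another node has already taken over the workload.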
To minimize the impact of maintenance, combine these attributes with measures that increase your workload's fault tolerance. For example:
- Configure your core applications, such as SAP HANA, to start automatically at boot so that they come back up after the instance restarts.
- Enable automatic reconnection features in your applications. For example, let applications automatically reconnect to services like MySQL, SQL Server, or Apache Tomcat.
- If you use Server Load Balancer (SLB), deploy multiple ECS instances in a cluster. While one ECS instance recovers automatically, the others can continue to handle service requests.
- Regularly back up data from your local disks. Redeployment clears local disk data, so backups ensure that you still have the necessary files after an instance is redeployed.
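
As an illustration of the automatic reconnection advice above, the following is a minimal Python sketch of retrying a connection with exponential backoff. It is not tied to any particular database driver: `connect_with_retry`, its parameters, and the choice of `ConnectionError` as the retryable exception are assumptions for illustration. A real client would pass its own connect function and catch the exceptions that its driver raises.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds or the attempts are used up.

    The wait doubles after each failure (base_delay, 2x, 4x, ...) and is
    capped at max_delay, which gives a restarting instance time to recover.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
```

The `sleep` parameter exists only so that the waiting behavior can be replaced in tests. In production code, consider adding random jitter to the delay so that many clients do not reconnect at the same moment.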
Procedure
- Go to the Instances page in the ECS console.
- In the top navigation bar, select the region and resource group of the resource that you want to manage.
- Click the ID of the target instance. On the instance details page, click All Actions. Then, search for and click the action that opens the Modify Instance Maintenance Attribute dialog box.
- In the Modify Instance Maintenance Attribute dialog box, select the attribute that you want to apply. Then, click OK.
  - If the instance has only cloud disks attached, select one of the following options:
    - Automatic restart for recovery
    - Disable recovery on restart
  - If the instance has local disks attached, select one of the following options:
    - Automatic restart for recovery
    - Disable recovery on restart
    - Automatic redeployment
- On the Instance Details page, in the Other Information section, verify the updated Maintenance Attribute.
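
If you script this change instead of using the console, the same rule about which options are available applies. The Python sketch below only validates a choice and assembles request parameters; it does not call any SDK. The operation name `ModifyInstanceMaintenanceAttributes`, the parameter names, and the values `AutoRecover`, `Stop`, and `AutoRedeploy` are assumptions about the ECS API that this page does not confirm, so check them against the API reference before use. The `build_request` helper and the sample IDs are hypothetical.

```python
from typing import Dict

# Assumed API values for the three console options (verify before use).
ACTIONS_CLOUD_DISKS_ONLY = ("AutoRecover", "Stop")
ACTIONS_WITH_LOCAL_DISKS = ("AutoRecover", "Stop", "AutoRedeploy")


def build_request(
    region_id: str, instance_id: str, action: str, has_local_disks: bool
) -> Dict[str, str]:
    """Validate the chosen action and return request parameters as a dict."""
    allowed = ACTIONS_WITH_LOCAL_DISKS if has_local_disks else ACTIONS_CLOUD_DISKS_ONLY
    if action not in allowed:
        raise ValueError(
            f"{action!r} is not available for this instance; choose one of {allowed}"
        )
    return {
        "Action": "ModifyInstanceMaintenanceAttributes",  # assumed operation name
        "RegionId": region_id,
        "InstanceId.1": instance_id,                      # assumed parameter name
        "ActionOnMaintenance": action,                    # assumed parameter name
    }


# Hypothetical IDs, for illustration only.
params = build_request("cn-hangzhou", "i-example", "Stop", has_local_disks=False)
```

Rejecting `AutoRedeploy` for instances without local disks mirrors the console, which offers automatic redeployment only when local disks are attached.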
