Restarting an Elastic Algorithm Service (EAS) service or updating its parameters triggers a rolling update. This release policy gradually replaces old instances with new ones, which lets you complete version upgrades without downtime and ensures high availability (HA).
Rolling update
During an update, the system creates new instances and gradually replaces old ones based on the configured parameters. If a new instance fails to start, the update is aborted. The failed instance does not receive traffic, and the remaining old instances continue to serve requests, so your service is not affected. You can then roll back or restart the update. When a new update starts, the system first removes any failed instances left over from previous incomplete updates.
The rolling update behavior is controlled by the following two key parameters:
Max Surge Instances (JSON parameter: rolling_strategy.max_surge)
Description: The maximum number of extra instances that can be created during an update. This can be a positive integer or a percentage. A larger value results in a faster update.
Example: If you have 100 instances and set this parameter to 20, the system creates 20 new instances when the update starts.
Default value: 2% of the total number of instances. If the result is less than 1, the value is set to 1.
Important: If the value of Max Surge Instances is too large, many new instances are created at once and replace an equal number of old instances. If the new instances are not warmed up, the sudden shift of traffic to them may affect service stability.
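The sizing rule above, including the default of 2% with a floor of 1, can be sketched as a small helper. The resolve_max_surge function below is purely illustrative (not an EAS API), and rounding percentages up is an assumption:

```python
import math

def resolve_max_surge(total_instances, max_surge=None):
    """Resolve an integer or percentage max_surge value.

    When max_surge is None, apply the documented default:
    2% of the total instance count, with a floor of 1.
    Rounding up is assumed here; EAS may round differently.
    """
    if max_surge is None:
        return max(1, math.ceil(total_instances * 0.02))
    if isinstance(max_surge, str) and max_surge.endswith("%"):
        pct = int(max_surge.rstrip("%")) / 100
        return max(1, math.ceil(total_instances * pct))
    return int(max_surge)

# With 100 instances and a value of 20, 20 new instances are created.
print(resolve_max_surge(100, 20))     # 20
print(resolve_max_surge(100, "20%"))  # 20
print(resolve_max_surge(10))          # 1 (default: 2% of 10 < 1, so 1)
```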
Max Unavailable Instances (JSON parameter: rolling_strategy.max_unavailable)
Description: The maximum number of instances that can be unavailable during an update. Stopping old instances early releases their resources, which prevents the update from stalling when resources are insufficient to create new instances.
Example: If you set this parameter to N, the system immediately stops N old instances when the update starts.
Default value:
Dedicated resource group: For services created before September 1, 2025, the default value is 1. For services created on or after September 1, 2025, the default value is 0 if the elastic resource pool is enabled, and 1 if it is not enabled.
Public resource group: 0.
Lingjun AI Computing Service Quota: For services created before September 1, 2025, the default value is 0. For services created on or after September 1, 2025, the default value is 2% of the total number of instances. If the result is less than 1, the value is set to 1.
Important: For a single-instance service, if you set Max Unavailable Instances to 1, the old instance exits before the new one starts during a rolling update. The service then has no active instances and is temporarily unavailable.
If the value of Max Unavailable Instances is too large, too many instances may go offline at the same time. The remaining instances may not be able to handle the traffic, which affects service availability.
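The two parameters are set together in the service JSON. A minimal fragment, assuming the dotted names rolling_strategy.max_surge and rolling_strategy.max_unavailable map directly to nested objects (verify the exact nesting against your full service configuration):

```json
{
  "rolling_strategy": {
    "max_surge": 20,
    "max_unavailable": 0
  }
}
```

With max_unavailable set to 0, no old instance is stopped until a new one is ready, which trades slower updates for maximum availability.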
Graceful exit
The graceful exit parameters affect the stability of instance termination during a rolling update.
Graceful Exit Time (JSON parameter: eas.termination_grace_period)
Description: The time, in seconds, that the system waits for an instance to exit gracefully. After an instance enters the Terminating state, traffic is no longer routed to it. The system waits for the specified period to allow the instance to finish processing any ongoing requests before it goes offline. If the request processing time is long, you should increase this value.
Default value: 30.
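For example, to give long-running requests up to 120 seconds to finish, you could set the parameter as follows (a sketch, assuming the dotted name eas.termination_grace_period maps to nested JSON objects; verify against your full service configuration):

```json
{
  "eas": {
    "termination_grace_period": 120
  }
}
```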
Send SIGTERM (JSON parameter: rpc.enable_sigterm)
Description: SIGTERM is a signal to terminate a process. The JSON parameter value can be true or false.
false: The system does not send a SIGTERM signal when the instance exits.
true: The system immediately sends a SIGTERM signal when the instance exits. The main process of the service must implement custom graceful exit logic in the signal handler. Otherwise, the process may terminate immediately, which causes the graceful exit procedure to fail.
Default value: Do not send (false).
By default, the system does not send a SIGTERM signal, because most application containers do not handle SIGTERM. A process that does not handle the signal exits immediately upon receiving it, which causes the graceful exit procedure to fail and may disrupt the service.
For services with large variations in request processing time, enable SIGTERM. For example, if processing times range from a few seconds to 30 minutes, setting a fixed graceful exit time of 30 minutes slows down every update. Instead, configure the application container to exit only after it receives the SIGTERM signal and finishes processing all in-flight requests. This gives you more flexible control over the exit process.
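With rpc.enable_sigterm set to true, the main process must catch SIGTERM and drain in-flight requests itself. A minimal sketch of such a handler, assuming a simple threaded server; the in-flight counter, handle_request, and wait_for_drain helpers are illustrative, not EAS APIs:

```python
import signal
import threading

shutting_down = threading.Event()
in_flight = 0                      # number of requests currently being processed
lock = threading.Condition()

def handle_sigterm(signum, frame):
    # Stop accepting new work; let in-flight requests finish.
    shutting_down.set()
    with lock:
        lock.notify_all()

signal.signal(signal.SIGTERM, handle_sigterm)

def handle_request(process):
    """Run one request, refusing new work once draining has started."""
    global in_flight
    if shutting_down.is_set():
        raise RuntimeError("draining: no new requests accepted")
    with lock:
        in_flight += 1
    try:
        process()
    finally:
        with lock:
            in_flight -= 1
            lock.notify_all()

def wait_for_drain():
    # Exit only after every in-flight request has completed.
    with lock:
        while in_flight > 0:
            lock.wait()

# Typical main loop (illustrative):
#   while not shutting_down.is_set(): accept and dispatch requests
#   wait_for_drain()  # then let the process exit
```

The key point is that the process exits only after wait_for_drain returns, rather than after a fixed grace period.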
You do not need to enable SIGTERM for asynchronous inference services. When an instance exits, the EAS control layer automatically responds to the SIGTERM signal: it stops accepting new requests and waits for in-flight requests to finish before the instance exits.