Configure and observe graceful shutdown - Microservices Engine

During deployments, rollbacks, scale-in events, and restarts, a microservice instance that is shutting down can still receive requests from consumers that have not yet detected the shutdown. This causes request errors and traffic loss. Graceful shutdown in Microservices Engine (MSE) Microservices Governance solves this by draining in-transit requests and notifying consumers before the instance stops.

How it works

In a typical microservice architecture, a provider instance (Provider A) registers with Microservices Registry so that consumer instances (Consumer B) can discover and call it. Without graceful shutdown, a race condition occurs when Provider A shuts down:

Provider A begins to shut down and Microservices Registry is notified of the shutdown event.
Consumer B has a cached copy of the provider list and does not immediately detect the change.
Consumer B continues to send requests to Provider A.
Requests fail because Provider A is no longer available.

Graceful shutdown eliminates this gap through a two-phase process:

Drain phase: After Provider A receives a shutdown command, it continues to process in-transit requests but adds a special tag to each response. When Consumer B receives a tagged response, it refreshes its provider list from Microservices Registry and stops routing new requests to Provider A.
Wait phase: Provider A waits for all remaining in-transit requests to complete, then shuts down.

Note

Graceful shutdown is enabled automatically when MSE Microservices Governance is enabled for an application, so the basic capability needs no manual setup. MSE also exposes observability data for graceful start and shutdown, which helps you confirm whether an instance shut down gracefully. To use the optional proactive notification feature, see Enable proactive notification.

Kubernetes implementation details

In Kubernetes (ACK) clusters, MSE implements graceful shutdown by injecting a lifecycle.preStop hook into the pod. This hook runs before the kubelet sends SIGTERM to the application container, giving MSE time to deregister the instance and drain traffic.

`preStop` hook injection

The injection behavior depends on whether your business container already has a custom preStop hook:

Scenario	Injection behavior
No custom `preStop` hook	MSE injects a `preStop` hook directly into the business container
Custom `preStop` hook exists	MSE injects a sidecar container named gracefulshutdown with its own `preStop` hook. The sidecar shares the pod's network namespace with the business container, so its `preStop` hook can also trigger graceful shutdown for the business container
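
To see which of the two injection behaviors applied to a running pod, you can list its container names and look for the gracefulshutdown sidecar. The sketch below assumes kubectl access to the cluster; the helper name injection_mode and its output strings are hypothetical and not part of MSE.

```shell
# Hypothetical helper: prints "sidecar" if the pod has a container named
# gracefulshutdown, otherwise "no-sidecar".
# Usage: injection_mode <pod-name> [namespace]
injection_mode() {
  pod="$1"
  ns="${2:-default}"
  # jsonpath prints the container names separated by spaces.
  names=$(kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{.spec.containers[*].name}') || return 2
  case " $names " in
    *" gracefulshutdown "*) echo "sidecar" ;;
    *) echo "no-sidecar" ;;
  esac
}
```

A result of no-sidecar only means that no sidecar was added; it does not by itself confirm that the hook was injected into the business container.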

Important

If your custom preStop hook only deregisters the application from Microservices Registry, remove it entirely and rely on the MSE-injected hook instead.

Sidecar container resource usage

The gracefulshutdown sidecar container uses minimal resources: 0.05 CPU cores and 50 MiB of memory.

Configure `terminationGracePeriodSeconds`

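The preStop hook execution time counts toward the pod's terminationGracePeriodSeconds budget. The default value of 30 seconds is often insufficient, because the MSE preStop hook alone takes about 30 seconds and leaves little or no time for the application's own shutdown hook. Here is how the time budget breaks down: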

Phase	Duration	Description
`preStop` hook	~30 s	MSE deregisters the instance and drains traffic
Application shutdown hook	Varies	The application releases resources and connections
SIGKILL (forced termination)	0 s	The kubelet force-kills the container when the total time exceeds `terminationGracePeriodSeconds`

If the combined time of the preStop hook and the application shutdown hook exceeds terminationGracePeriodSeconds, the kubelet sends SIGKILL and the application is force-terminated. Resources may not be released properly.

Set terminationGracePeriodSeconds to at least 90 in your pod spec:

apiVersion: v1
kind: Pod
spec:
  terminationGracePeriodSeconds: 90
  containers:
    - name: your-app
      # ...
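
The time budget above comes down to simple arithmetic. As a rough sketch, the following shell functions add the preStop time (about 30 s, from the table) to an application shutdown time that you measure yourself. The function names and the 10-second safety margin are assumptions for illustration, not MSE defaults.

```shell
# Hypothetical helpers for sizing terminationGracePeriodSeconds.

# Prints preStop + application shutdown + margin (all in seconds).
# The default margin of 10 s is an assumption.
required_grace_period() {
  prestop_s="$1"
  app_shutdown_s="$2"
  margin_s="${3:-10}"
  echo $(( prestop_s + app_shutdown_s + margin_s ))
}

# Succeeds if the configured grace period covers both hooks, that is,
# the kubelet would not need to send SIGKILL.
fits_grace_period() {
  grace_s="$1"
  prestop_s="$2"
  app_shutdown_s="$3"
  [ "$grace_s" -ge $(( prestop_s + app_shutdown_s )) ]
}
```

For example, `required_grace_period 30 20` prints 60, and `fits_grace_period 30 30 20` fails because a 30-second grace period is used up by the preStop hook alone.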

Important

If your business container has a custom preStop hook and the sidecar approach is used, specify a sleep duration of at least 30 seconds in your custom preStop hook. This allows the sidecar's preStop hook enough time to complete the graceful shutdown process.

Enable graceful shutdown

ACK clusters

No action is required. Graceful shutdown is automatically enabled when MSE Microservices Governance is enabled for an application in a Container Service for Kubernetes (ACK) cluster.

ECS instances

Add the following command to the beginning of your application's shutdown script:

curl http://127.0.0.1:54199/offline 2>/dev/null; sleep 30;

This command notifies MSE to begin the graceful shutdown process and waits 30 seconds for in-transit requests to drain.
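
In a longer shutdown script, the same two steps might be wrapped in a function so that a failed offline call does not prevent the application from stopping. This is a minimal sketch: the endpoint and the 30-second wait come from the command above, while the function name graceful_stop, the PID_FILE path, and the variable names are hypothetical.

```shell
#!/bin/sh
# The URL is taken from the documented command; PID_FILE and
# DRAIN_SECONDS are hypothetical, overridable settings.
OFFLINE_URL="${OFFLINE_URL:-http://127.0.0.1:54199/offline}"
DRAIN_SECONDS="${DRAIN_SECONDS:-30}"
PID_FILE="${PID_FILE:-/var/run/your-app.pid}"

graceful_stop() {
  # Ask MSE to begin graceful shutdown. Keep going on failure so the
  # application is still stopped afterwards.
  if ! curl -s --max-time 5 "$OFFLINE_URL" >/dev/null 2>&1; then
    echo "warning: offline call failed, continuing with shutdown" >&2
  fi

  # Give in-transit requests time to drain.
  sleep "$DRAIN_SECONDS"

  # Stop the application itself (assumes it wrote its PID to PID_FILE).
  if [ -f "$PID_FILE" ]; then
    kill -TERM "$(cat "$PID_FILE")"
  fi
}
```

Calling graceful_stop as the first step of the shutdown script keeps the drain ahead of any other cleanup.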

Verify graceful shutdown

After graceful shutdown is enabled and triggered, verify that traffic drains correctly on the application governance page.

Log on to the MSE console, and select a region in the top navigation bar.
In the left-side navigation pane, choose Microservices Governance > Application Governance. Click the resource card of the target application.
In the left-side navigation pane of the application details page, click Traffic management, and then click the Graceful Start/Shutdown tab.
On the Start and Shutdown Overview subtab, find and click the target application instance. The right-side pane shows the shutdown event timeline and QPS graph.

A successful graceful shutdown shows QPS dropping to zero before the instance stops. No traffic reaches the instance after the shutdown process completes.
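
As a complement to the console view, you could send a steady stream of requests through a consumer while the provider shuts down and count the failures. This is a minimal sketch, not an MSE feature: the function name count_failures and the URL in the usage note are hypothetical, and the sketch treats any response other than HTTP 200 as a failure.

```shell
# Hypothetical probe: sends <total> requests to <url>, pausing <interval>
# seconds between them, and prints how many did not return HTTP 200.
# Usage: count_failures <url> <total> [interval]
count_failures() {
  url="$1"
  total="$2"
  interval="${3:-1}"
  failures=0
  i=0
  while [ "$i" -lt "$total" ]; do
    # -w '%{http_code}' prints only the status code; treat a failed
    # curl invocation (timeout, connection refused) as a failure too.
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 2 "$url") || code=000
    [ "$code" = "200" ] || failures=$((failures + 1))
    i=$((i + 1))
    sleep "$interval"
  done
  echo "$failures"
}
```

For instance, `count_failures http://consumer.example.internal/hello 60 1` probes for at least a minute; a printed value of 0 is consistent with a graceful shutdown.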

Important

If QPS does not drop to zero after a shutdown event, check for non-microservice calls (such as local calls) that bypass Microservices Registry.

Note

Shutdown events are reported only when the MSE agent version is later than 4.2.0. If shutdown events are not visible, upgrade the agent.
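
If you already know the agent version string, a comparison like the following tells you whether it is later than 4.2.0. How to obtain the version is not covered here; the function name version_gt is hypothetical, and the sketch assumes a sort that supports -V, such as the one in GNU coreutils.

```shell
# Hypothetical helper: succeeds if version $1 is strictly later than $2.
version_gt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Example check against the minimum stated above.
agent_reports_shutdown_events() {
  version_gt "$1" "4.2.0"
}
```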

Enable proactive notification

Proactive notification is an advanced graceful shutdown capability that is disabled by default. It addresses a specific issue with Spring Cloud applications: even after a provider deregisters, a Spring Cloud consumer may continue to route requests to the provider due to its internal caching behavior. With proactive notification enabled, the provider explicitly notifies consumers during shutdown, and consumers immediately stop sending requests.

When to use proactive notification

Use proactive notification when both of the following conditions are met:

Your application uses the Spring Cloud framework
Consumer call errors occur during provider shutdown even though basic graceful shutdown is enabled

For most other scenarios, the default graceful shutdown behavior is sufficient.

Prerequisites

Before you begin, make sure that you have:

Microservices Governance activated. For more information, see Activate Microservices Governance
Microservices Governance enabled for applications in an ACK cluster. For more information, see Enable Microservices Governance for microservice applications in an ACK cluster

Enable proactive notification in the console

Log on to the MSE console, and select a region in the top navigation bar.
In the left-side navigation pane, choose Microservices Governance > Application Governance. Click the resource card of the target application.
In the left-side navigation pane of the application details page, click Traffic management, and then click the Graceful Start/Shutdown tab.
In the Settings section, click Revised. In the Graceful Start and Shutdown Settings panel, expand the Graceful Shutdown block, turn on the Proactive Notification switch, and then click OK.

Limitations

Graceful shutdown

In Kubernetes scenarios, graceful shutdown is implemented based on the preStop hook. Graceful shutdown is supported only when pods stop normally, including these scenarios:

Scale-in
Restart
Rolling upgrade

Graceful shutdown does not apply when pods stop abnormally, such as during an OOM kill.
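
To check whether a container was stopped abnormally, and therefore bypassed the preStop hook, you can inspect the last termination reason recorded in the pod status. The sketch below assumes kubectl access to the cluster; the helper name was_oom_killed is hypothetical.

```shell
# Hypothetical helper: succeeds if any container in the pod was last
# terminated with reason OOMKilled.
# Usage: was_oom_killed <pod-name> [namespace]
was_oom_killed() {
  pod="$1"
  ns="${2:-default}"
  # Prints the last termination reason of each container, space-separated.
  reasons=$(kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}') || return 2
  case " $reasons " in
    *" OOMKilled "*) return 0 ;;
    *) return 1 ;;
  esac
}
```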

Proactive notification

Microservices Governance does not support proactive notification for the following applications:

Non-Java applications
Applications that do not use WebFlux or Spring MVC
Provider applications whose consumers are not microservice applications
Applications where Microservices Governance is disabled on either the consumer or provider side
