Transient failures -- momentary network issues, temporary service overloads, connection resets -- can cause individual requests to fail even when backend services are healthy. A retry policy on your cloud-native gateway automatically resends failed requests, so your application recovers without client-side changes. Configure retry policies at the route level in Microservices Engine (MSE).
Retries increase the total number of requests to backend services. Set retry limits conservatively (0, 1, or 2) to avoid overwhelming an already-degraded service.
How it works
When a request matches a retry condition, the gateway resends the request up to the configured number of times.
Default behavior (no custom policy): The gateway retries failed requests up to 2 times using these conditions: connect-failure, refused-stream, unavailable, cancelled, and retriable-status-codes. This default applies even when a custom retry policy is disabled.
Disabling a custom retry policy does not stop all retries. The gateway falls back to the default behavior described above. To stop retries entirely, enable a custom policy and set Retry Times to 0.
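The fallback rule above can be sketched as a small model. This is only an illustration of the behavior described in this section, not gateway code; `RetryPolicy`, `effective_policy`, and `max_attempts` are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Defaults as described in this section (illustrative constants, not an MSE API).
DEFAULT_RETRY_TIMES = 2
DEFAULT_CONDITIONS = frozenset({
    "connect-failure",
    "refused-stream",
    "unavailable",
    "cancelled",
    "retriable-status-codes",
})


@dataclass(frozen=True)
class RetryPolicy:
    retry_times: int
    conditions: frozenset
    enabled: bool = True


def effective_policy(custom: Optional[RetryPolicy]) -> RetryPolicy:
    """Return the policy that applies: the custom one if enabled, else the default."""
    if custom is not None and custom.enabled:
        return custom
    # No custom policy, or custom policy disabled: fall back to the default.
    return RetryPolicy(DEFAULT_RETRY_TIMES, DEFAULT_CONDITIONS)


def max_attempts(custom: Optional[RetryPolicy]) -> int:
    """Upper bound on backend attempts per request: the original try plus retries."""
    return 1 + effective_policy(custom).retry_times
```

Under this model, an enabled policy with Retry Times set to 0 yields a single attempt, while a disabled policy with the same value still allows up to three attempts because the default applies.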
Retry conditions
The gateway supports separate retry conditions for HTTP and gRPC traffic.
HTTP retry conditions
| Condition | Triggers a retry when... |
|---|---|
| 5xx | The backend returns any 5xx status code, or a disconnect, reset, or read timeout occurs. This is a superset that includes connect-failure and refused-stream. |
| reset | A disconnect, reset, or read timeout occurs (no response from the backend). |
| connect-failure | The request failed due to a disconnection. |
| refused-stream | The backend returns the REFUSED_STREAM error code. |
| retriable-status-codes | The backend returns one of the status codes specified in Retry Status Code. |
Specify status codes in Retry Status Code only when retriable-status-codes is selected as a retry condition.
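As a rough illustration of how the HTTP conditions in the table combine, the following sketch evaluates one request outcome against a set of selected conditions. It is a simplified model of the table, not the gateway's implementation; `should_retry` and its parameters are hypothetical, and a failure without a response is represented here by one of the strings `"reset"`, `"connect-failure"`, or `"refused-stream"`.

```python
from typing import Iterable, Optional

# Failure kinds that produce no HTTP status (modeling assumption for this sketch).
NO_RESPONSE_FAILURES = {"reset", "connect-failure", "refused-stream"}


def should_retry(
    conditions: Iterable[str],
    status: Optional[int] = None,
    failure: Optional[str] = None,
    retriable_codes: Iterable[int] = (),
) -> bool:
    """Return True if the outcome matches at least one selected retry condition."""
    conds = set(conditions)

    # 5xx covers any 5xx status plus the no-response failures.
    if "5xx" in conds:
        if status is not None and 500 <= status <= 599:
            return True
        if failure in NO_RESPONSE_FAILURES:
            return True

    # reset, connect-failure, and refused-stream match their own failure kind.
    if failure in NO_RESPONSE_FAILURES and failure in conds:
        return True

    # Retry Status Code is consulted only when retriable-status-codes is selected.
    if "retriable-status-codes" in conds and status in set(retriable_codes):
        return True

    return False
```

For example, `should_retry({"connect-failure", "refused-stream"}, status=503)` is false in this model, which matches the "connection-level failures only" use case.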
gRPC retry conditions
| Condition | gRPC status code | Typical cause |
|---|---|---|
| cancelled | CANCELLED (1) | The caller cancelled the operation. |
| deadline-exceeded | DEADLINE_EXCEEDED (4) | The operation timed out. |
| internal | INTERNAL (13) | An internal server error occurred. |
| resource-exhausted | RESOURCE_EXHAUSTED (8) | A resource limit was reached (for example, rate limiting). |
| unavailable | UNAVAILABLE (14) | The service is temporarily unavailable. |
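
The gRPC table can be read as a lookup from condition name to numeric status code. The sketch below encodes that mapping; the dictionary and the `grpc_should_retry` helper are hypothetical names used only to show how a response status would be checked against the selected conditions.

```python
from typing import Iterable

# Condition name -> gRPC status code, taken from the table in this section.
GRPC_RETRY_CONDITIONS = {
    "cancelled": 1,
    "deadline-exceeded": 4,
    "resource-exhausted": 8,
    "internal": 13,
    "unavailable": 14,
}


def grpc_should_retry(conditions: Iterable[str], grpc_status: int) -> bool:
    """Return True if the gRPC status code matches any selected condition."""
    return any(
        name in GRPC_RETRY_CONDITIONS and GRPC_RETRY_CONDITIONS[name] == grpc_status
        for name in conditions
    )
```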
Recommended conditions by traffic type
| Traffic type | Recommended conditions | Use case |
|---|---|---|
| HTTP | 5xx | General-purpose retry for server errors and connection issues. Covers connect-failure and refused-stream. |
| HTTP | connect-failure, refused-stream | Retry only on connection-level failures, not on 5xx responses. |
| HTTP | retriable-status-codes | Retry on specific status codes such as 429 (rate limited) or 503 (service unavailable). |
| gRPC | cancelled, unavailable | Standard retry pair for gRPC services. |
Prerequisites
Before you begin, make sure that you have:
A cloud-native gateway created in MSE
At least one route configured on the gateway
Configure the retry policy
Log on to the MSE console. In the top navigation bar, select a region.
In the left-side navigation pane, choose Cloud-native Gateway > Gateways. Click the name of the gateway.
In the left-side navigation pane, click Routes, and then click the Routes tab.
Find the route to modify and click Policies in the Actions column.
Click the Retry tab.
Configure the following parameters and click OK.
| Parameter | Description |
|---|---|
| Retry Times | Maximum number of retry attempts. Valid values: 0 to 10. Recommended: 0, 1, or 2. A value of 0 disables retries entirely. |
| Retry Condition | One or more conditions that trigger a retry. See Retry conditions. |
| Retry Status Code | One or more HTTP status codes that trigger a retry. Takes effect only when retriable-status-codes is selected for Retry Condition. |
| Enable | Turn on the switch to activate the retry policy. Turn off the switch to deactivate it; the gateway then falls back to the default retry behavior. |
Verify the retry policy
After you enable the retry policy, verify that it works as expected:
Simulate a backend failure that matches your configured retry condition (for example, return a 503 response or drop the connection).
Send a request through the gateway to the affected route.
Check backend access logs to confirm the gateway retried the request the expected number of times.
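One way to simulate the failure in the first step is a throwaway test backend that fails a fixed number of times and counts every request it receives. The sketch below uses only the Python standard library; `FlakyCounter`, `make_handler`, `serve`, and port 8080 are hypothetical choices for this example, and the backend is assumed to be reachable from the gateway as the route's destination.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class FlakyCounter:
    """Returns 503 for the first `fail_first` calls, then 200, and counts calls."""

    def __init__(self, fail_first: int) -> None:
        self.fail_first = fail_first
        self.hits = 0
        self._lock = threading.Lock()

    def next_status(self) -> int:
        with self._lock:
            self.hits += 1
            return 503 if self.hits <= self.fail_first else 200


def make_handler(counter: FlakyCounter):
    """Build a request handler class bound to the given counter."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self) -> None:
            status = counter.next_status()
            body = f"backend hit {counter.hits}, status {status}\n".encode()
            self.send_response(status)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return Handler


def serve(port: int = 8080, fail_first: int = 2) -> None:
    """Serve until interrupted; the base handler logs each request to stderr."""
    counter = FlakyCounter(fail_first)
    HTTPServer(("0.0.0.0", port), make_handler(counter)).serve_forever()
```

Calling `serve()` starts the backend. If the route retries on 503 with Retry Times set to 2, a single client request through the gateway should show up as three lines in the backend's log, with the last one returning 200.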
Idempotency and retry storms
Idempotency: Enable retries only for idempotent operations -- requests that produce the same result when repeated. Retrying non-idempotent operations such as payment submissions can cause duplicate side effects.
Retry storms: In high-traffic scenarios, aggressive retry settings can amplify load on degraded backend services. Combine retry policies with circuit breaker or rate limiting policies to protect backend services.
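The load amplification behind a retry storm can be estimated with simple arithmetic. The sketch below assumes that every attempt fails independently with the same probability and that every failure matches a configured retry condition; both are simplifying assumptions, and the function names are hypothetical.

```python
def expected_attempts(failure_rate: float, retry_times: int) -> float:
    """Expected backend attempts per client request.

    Attempt k (0-based) happens only if the previous k attempts all failed,
    so the expectation is the sum of failure_rate**k for k = 0..retry_times.
    """
    return sum(failure_rate ** k for k in range(retry_times + 1))


def backend_load(incoming_rps: float, failure_rate: float, retry_times: int) -> float:
    """Requests per second that reach the backend after retries are included."""
    return incoming_rps * expected_attempts(failure_rate, retry_times)
```

With a fully failing backend (`failure_rate=1.0`) and Retry Times set to 2, `backend_load(1000, 1.0, 2)` gives 3000.0: the degraded service receives three times the incoming traffic, which is why low retry limits and a circuit breaker are recommended together.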
Related operations
Configure a timeout policy -- Control the maximum time the gateway waits for a backend response.
Configure a circuit breaker policy -- Automatically stop forwarding requests to unhealthy backend services.