Session affinity is an advanced routing mechanism in Function Compute that ensures requests from the same session are always routed to the same function instance. This feature maintains state consistency and is ideal for applications that must preserve context, process long-running tasks, or support real-time interactions.
Session affinity types
Cookie affinity: Identifies sessions using the value of the Cookie field in an HTTP request.
HeaderField affinity: Identifies sessions using the value of a specified field in the HTTP request header.
MCP SSE affinity: Identifies sessions using the SessionId in the MCP SSE protocol.
MCP Streamable HTTP affinity: Identifies sessions using the Mcp-Session-Id field in the HTTP header.
General limitations
If you enable session isolation, session affinity is automatically enabled and cannot be disabled. You must also select a suitable affinity type.
If request isolation is enabled, session affinity is unavailable.
Asynchronous requests do not support session capabilities, including session affinity and session isolation.
If you select a built-in runtime, the concurrency per instance is limited to 1. In this case, the number of concurrent sessions per instance is also limited to 1.
The number of concurrent sessions per instance cannot exceed the maximum number of concurrent requests per instance.
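The configuration rules above can be checked mechanically before deployment. The following is a minimal sketch only: the class and every field name (`session_affinity`, `session_isolation`, `request_isolation`, `instance_concurrency`, `session_concurrency`) are hypothetical names chosen for illustration, not Function Compute API fields.

```python
from dataclasses import dataclass


@dataclass
class SessionSettings:
    # All field names are hypothetical, chosen only for this illustration.
    session_affinity: bool
    session_isolation: bool
    request_isolation: bool
    instance_concurrency: int  # maximum concurrent requests per instance
    session_concurrency: int   # maximum concurrent sessions per instance


def validate(s: SessionSettings) -> list[str]:
    """Return the violated limitations (empty if the settings are consistent)."""
    problems = []
    if s.session_isolation and not s.session_affinity:
        problems.append("session isolation requires session affinity")
    if s.request_isolation and s.session_affinity:
        problems.append("session affinity is unavailable with request isolation")
    if s.session_concurrency > s.instance_concurrency:
        problems.append("concurrent sessions exceed concurrent requests per instance")
    return problems
```

An empty result means none of the limitations modeled here is violated; it does not replace the validation that the service itself performs.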
Limitations specific to each affinity type
1. Limitations specific to Cookie affinity
Only server-side cookie injection is supported:
When a client makes its first request, Function Compute automatically injects a cookie into the response using the Set-Cookie header. The client must parse and save this cookie, and include it in subsequent requests.
SessionAPI management is supported:
You can use the SessionAPI for session lifecycle management, such as creating, updating, querying, and terminating sessions.
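On the client side, the cookie handling described above might look like the following sketch, which uses Python's standard `http.cookies` module. The cookie name `fc_session` and its value are assumptions for illustration; the actual name is whatever Function Compute injects.

```python
from http.cookies import SimpleCookie


def cookies_from_response(set_cookie_header: str) -> dict[str, str]:
    """Parse a Set-Cookie header value into a name -> value mapping."""
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    return {name: morsel.value for name, morsel in jar.items()}


def cookie_request_header(cookies: dict[str, str]) -> str:
    """Build the Cookie header value to send on subsequent requests."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


# Hypothetical first response; the cookie name "fc_session" is an assumption.
saved = cookies_from_response("fc_session=abc123; Path=/; HttpOnly")
next_request_headers = {"Cookie": cookie_request_header(saved)}
```

Most HTTP client libraries can do this automatically through a cookie jar or session object; the manual version is shown only to make the two steps explicit.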
2. Limitations specific to HeaderField affinity
Session ID source:
If the client passes a session ID in a predefined header, this value is used as the session identity.
If no ID is passed, the server generates a globally unique session ID and returns it in the response, using the header field predefined in CreateFunction.
Header field definition:
When you create the function, specify the header field name for transmitting the session ID in SessionAffinityConfig.
SessionAPI management is supported:
You can use the SessionAPI to monitor and control sessions.
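The session ID source rule can be illustrated with a small simulation of the server-side decision. This is a sketch under assumptions, not the service's implementation: the header name `x-session-id` is hypothetical (the real name is the one you configure), and using a UUID as the generated ID is also an assumption.

```python
import uuid

# Hypothetical header name; in practice it is the field you configured.
SESSION_HEADER = "x-session-id"


def resolve_session_id(request_headers: dict[str, str]) -> tuple[str, dict[str, str]]:
    """Return (session_id, response_headers) following the rules above."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {k.lower(): v for k, v in request_headers.items()}
    session_id = normalized.get(SESSION_HEADER)
    if session_id:
        # Client supplied the ID. Whether the service echoes it back is not
        # stated in the text, so this sketch returns no extra header.
        return session_id, {}
    # No ID supplied: generate one and return it in the same header field.
    session_id = str(uuid.uuid4())
    return session_id, {SESSION_HEADER: session_id}
```

A client would read the header from the first response and send the same value in that header on every later request.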
3. Limitations specific to MCP Streamable HTTP affinity
Protocol version requirements:
This affinity type supports MCP protocol versions 2025-03-26 and 2025-06-18. The client and function must follow the transport layer specifications of the corresponding version.
Compatibility note: If a function has MCP Streamable HTTP affinity enabled, do not use MCP HTTP with Server-Sent Events (SSE) to invoke it. The session management mechanisms are incompatible and will cause invocations to fail.
Access method limitations:
Access is supported only through an HTTP trigger or a custom domain name.
HTTP trigger configuration requirements:
The HTTP trigger must support at least the GET, POST, and DELETE methods.
Necessity of the DELETE method:
A client can actively terminate a session by sending a DELETE request. Function Compute then reclaims the resources for that session, including the instance concurrency quota. If the DELETE method is not enabled, the system rejects the request, and the session cannot be released properly.
SessionAPI management is not supported.
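The method requirement and the termination request can be sketched as follows. The helper names and the example URL are hypothetical; the request is only constructed here, not sent.

```python
import urllib.request

REQUIRED_METHODS = {"GET", "POST", "DELETE"}


def missing_trigger_methods(enabled: set[str]) -> set[str]:
    """Return the required methods that the HTTP trigger does not allow."""
    return REQUIRED_METHODS - {m.upper() for m in enabled}


def build_terminate_request(url: str, session_id: str) -> urllib.request.Request:
    """Build (but do not send) the DELETE request that ends an MCP session."""
    # urllib normalizes the stored header name to "Mcp-session-id";
    # HTTP header names are case-insensitive, so this is equivalent.
    return urllib.request.Request(
        url, method="DELETE", headers={"Mcp-Session-Id": session_id}
    )


# Hypothetical endpoint and session ID, for illustration only.
request = build_terminate_request("https://example.com/mcp", "session-1")
```

Passing `request` to `urllib.request.urlopen` would send it; an MCP client SDK normally issues this request for you when the session is closed.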
4. Limitations specific to MCP SSE affinity
Runtime limitations:
If you use a built-in runtime, MCP SSE affinity is not supported.
If you use an MCP runtime, only MCP affinity (including SSE) is supported.
Other runtimes do not have this limitation.
Client requirements:
Requests must be initiated using an official MCP standard client or software development kit (SDK). Otherwise, a valid affinity connection cannot be established.
Session lifecycle:
The maximum session lifecycle is equal to the function's maximum timeout period. After this period, the server disconnects. Reconnecting generates a new session ID and does not guarantee routing to the original instance.
Access method limitations:
Access is supported only through an HTTP trigger or a custom domain name.
Request limitations:
The first SSE request currently cannot include query parameters.
The maximum concurrency per instance is 200.
SessionAPI management is not supported.
Core principles
1. Core flow
Client sends request → Identify/Generate Session ID → Attach to an active instance → Route subsequent requests to that instance
2. Resource model (unified frame)
Each session consumes one session concurrency quota.
Each request (including POST, GET, and Message) consumes one request concurrency quota.
Total concurrency quota per instance: 200 (not adjustable)
Multiple sessions share this quota of 200.
Formula:
TotalQuota = Σ(Concurrency consumed by each Session) ≤ 200
3. Lifecycle management
Phase | Trigger condition | Behavior |
Creation | The first request does not carry a valid session ID. | A unique ID is generated and a binding is established. |
Creation | The first request carries a valid session ID. | The server establishes a binding between this session ID and an instance. |
Active | A request is received. | The last active time is updated. |
Idle timeout | The idle time exceeds the | The session is automatically destroyed. |
Expiration | The session duration exceeds the | The session is automatically destroyed. |
Manual termination | A DELETE request is sent (MCP Streamable) or the connection is disconnected (SSE). | Resources are actively released. |
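The idle timeout and expiration phases in the table can be sketched as a simple check. The parameter names `idle_timeout` and `max_lifetime` are hypothetical stand-ins for the two configured limits, and the numeric values in the usage note are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Session:
    created_at: float   # seconds, on any monotonic clock
    last_active: float  # time of the most recent request


def touch(session: Session, now: float) -> None:
    """Active phase: a received request updates the last active time."""
    session.last_active = now


def should_destroy(session: Session, now: float,
                   idle_timeout: float, max_lifetime: float) -> bool:
    """True if the session has hit the idle timeout or its maximum duration."""
    idle_for = now - session.last_active
    age = now - session.created_at
    return idle_for > idle_timeout or age > max_lifetime
```

For example, with `idle_timeout=60` and `max_lifetime=300`, a session that keeps receiving requests every few seconds survives the idle check but is still destroyed once its age passes 300 seconds.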
Concurrency management mechanism
Concurrency parameter descriptions
Parameter type | Meaning | Adjustable | Limit | Consumption rule | Concurrency release mechanism |
Maximum concurrency per instance | The maximum number of concurrent requests that a single instance can process simultaneously. | Not adjustable | 200 | Each request or persistent connection consumes 1. | Released asynchronously after the request is complete. |
Concurrent sessions per instance | The maximum number of sessions that a single instance can process at the same time. | Configurable | [1, 200] | Each session consumes 1. | Depends on the affinity type. |
Concurrency resource model for a single session
Type | Formula | Description |
Cookie, HeaderField, or MCP Streamable HTTP affinity | | Includes only sync requests. Each request consumes one unit of concurrency. |
MCP SSE affinity | | Includes one SSE persistent connection and N Message requests. |
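Based on the descriptions in the table, the concurrency consumed by one session, and whether a set of sessions fits within the per-instance quota of 200, can be computed as in the sketch below. The affinity type string `"MCP_SSE"` and the function names are hypothetical labels used only in this example.

```python
INSTANCE_QUOTA = 200  # total concurrency quota per instance (not adjustable)


def session_concurrency(affinity: str, in_flight_requests: int) -> int:
    """Concurrency units consumed by one session."""
    if affinity == "MCP_SSE":
        # One persistent SSE connection plus N Message requests.
        return 1 + in_flight_requests
    # Cookie, HeaderField, MCP Streamable HTTP: one unit per sync request.
    return in_flight_requests


def fits_on_instance(sessions: list[tuple[str, int]]) -> bool:
    """True if the summed consumption of all sessions stays within the quota."""
    total = sum(session_concurrency(kind, n) for kind, n in sessions)
    return total <= INSTANCE_QUOTA
```

Two MCP SSE sessions with 99 in-flight Message requests each consume exactly 200 units, so one more request from either session would exceed the quota.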
Configuration recommendations for concurrent sessions per instance:
Isolation scenario: Set the value to 1. A single session exclusively uses the computing resources for better security and reliability.
Multi-tenant sharing scenario: Set the value to a number in the range (1, 200]. Multiple sessions share the resources of a single instance to improve resource utilization.
Detailed instance scheduling rules
Scenario 1: Session and instance binding rules and scale-out mechanism
Assume the function is configured with two concurrent sessions per instance.
Client1 sends a request. Instance1 is allocated, consuming one session quota.
Client2 sends a request. The scheduler determines that Instance1 has an available session quota and successfully binds the session.
Client3 sends a request. Because Instance1 is full, a new instance, Instance2, is created, and the session is successfully bound to it.
Key points:
The scheduling system tries to reuse existing instances, but this is not guaranteed.
A scale-out occurs only when all existing instances are full.
This implements dynamic load balancing and elastic scaling.
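Scenario 1 can be reproduced with a toy scheduler. This is a simplified model under one stated assumption: it uses a first-fit policy that always reuses an existing instance with free session quota, whereas the real scheduler only tries to do so. The class and method names are hypothetical.

```python
class Scheduler:
    """Toy model: binds sessions to instances with a per-instance session quota."""

    def __init__(self, sessions_per_instance: int):
        self.sessions_per_instance = sessions_per_instance
        self.instances: list[set[str]] = []  # each instance holds bound session IDs
        self.bindings: dict[str, int] = {}   # session ID -> instance index

    def route(self, session_id: str) -> int:
        """Return the index of the instance that serves this session."""
        if session_id in self.bindings:
            # Affinity: an existing session always goes to its bound instance.
            return self.bindings[session_id]
        for index, bound in enumerate(self.instances):
            if len(bound) < self.sessions_per_instance:
                break  # first instance with free session quota
        else:
            # All instances are full (or none exist yet): scale out.
            self.instances.append(set())
            index = len(self.instances) - 1
        self.instances[index].add(session_id)
        self.bindings[session_id] = index
        return index
```

With `Scheduler(2)`, routing sessions for Client1, Client2, and Client3 in turn places the first two on instance 0 and the third on a newly created instance 1, and a later request from Client1 returns to instance 0.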
Scenario 2: Resource throttling mechanism for multiple sessions on a single instance
Assume the function is configured with two concurrent sessions per instance.
Client1 sends a request, which consumes one session quota and one unit of request concurrency.
Client2 sends a request, which consumes one session quota and one unit of request concurrency.
Before the first two requests are complete, both clients concurrently send another 198 requests, consuming a total of 200 units of request concurrency.
The next concurrent request exceeds the limit of 200, and the system returns a `429 Too Many Requests` error.
Key points:
The maximum concurrency per instance is fixed at 200 and is not adjustable.
Multiple sessions share this quota.
When the concurrency quota is exhausted, the system rejects new requests.
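The throttling behavior in Scenario 2 can be modeled with a simple counter. This is an illustrative sketch with hypothetical names, not the service's implementation; it ignores sessions and tracks only request concurrency on one instance.

```python
class Instance:
    """Toy model of per-instance request concurrency."""

    MAX_CONCURRENCY = 200  # fixed quota shared by all sessions on the instance

    def __init__(self) -> None:
        self.in_flight = 0

    def begin_request(self) -> int:
        """Admit a request and return 200, or return 429 if the quota is used up."""
        if self.in_flight >= self.MAX_CONCURRENCY:
            return 429
        self.in_flight += 1
        return 200

    def end_request(self) -> None:
        """Release one unit of concurrency when a request completes."""
        self.in_flight = max(0, self.in_flight - 1)
```

In this model the 201st request that arrives while 200 are still in flight receives 429, regardless of which session sent it; once any request completes, a new one is admitted again.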
Fault handling mechanism
Scenario | System behavior | Status code | Client-side response strategy |
Concurrent session quota per instance is exhausted | If the number of existing instances is below the regional limit, the system automatically scales out a new instance to handle the session request. | 200 | - |
The number of existing instances reaches the maximum limit for the region | The system throttles and rejects the request. | 429 | 1. Use a backoff retry strategy. |
The concurrency quota of 200 per instance is exhausted | The system throttles and rejects the request. | 429 | If concurrent sessions per instance is set to a value greater than 1, consider reducing it. |
Session is invalid or expired | The system rejects the request. | 401 | Send a new request to generate a new session. |
Invalid HeaderField value | The system rejects the request. | 400 | Check the header name and format. |
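The client-side strategies in the table can be combined into one retry loop, sketched below. The function `call_with_recovery` and the shape of the `send` callback (it takes a session ID or `None` and returns a status code plus any session ID issued by the server) are assumptions made for this example; the backoff constants are arbitrary.

```python
import random
import time
from typing import Callable, Optional


def call_with_recovery(
    send: Callable[[Optional[str]], tuple[int, Optional[str]]],
    session_id: Optional[str],
    max_attempts: int = 5,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, Optional[str]]:
    """Send a request, backing off on 429 and starting a new session on 401."""
    status = 0
    for attempt in range(max_attempts):
        status, returned_id = send(session_id)
        if returned_id:
            session_id = returned_id  # remember an ID issued by the server
        if status == 429:
            # Exponential backoff with jitter before the next attempt.
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
        elif status == 401:
            # Session invalid or expired: drop the ID so a new one is created.
            session_id = None
        else:
            break  # success, or an error such as 400 that a retry will not fix
    return status, session_id
```

A 400 response is returned to the caller unchanged, because an invalid header value has to be corrected rather than retried.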