Session affinity is an advanced routing mechanism in Function Compute that ensures requests from the same session are always routed to the same function instance. This feature maintains state consistency and is ideal for applications that must preserve context, process long-running tasks, or support real-time interactions.

Session affinity types

Cookie affinity: Identifies sessions using the value of the Cookie field in an HTTP request.
HeaderField affinity: Identifies sessions using the value of a specified field in the HTTP request header.
MCP SSE affinity: Identifies sessions using the SessionId in the MCP SSE protocol.
MCP Streamable HTTP affinity: Identifies sessions using the Mcp-Session-Id field in the HTTP header.

General limitations

If you enable session isolation, session affinity is automatically enabled and cannot be disabled. You must also select a suitable affinity type.
If request isolation is enabled, session affinity is unavailable.
Asynchronous requests do not support session capabilities, including session affinity and session isolation.
If you select a built-in runtime, the concurrency per instance is limited to 1. In this case, the number of concurrent sessions per instance is also limited to 1.
The number of concurrent sessions per instance cannot exceed the maximum number of concurrent requests per instance.

Limitations specific to each affinity type

1. Limitations specific to Cookie affinity

Only server-side cookie injection is supported:
When a client makes its first request, Function Compute automatically injects a cookie into the response using the Set-Cookie header. The client must parse and save this cookie, and include it in subsequent requests.
SessionAPI management is supported:
You can use the SessionAPI for session lifecycle management, such as creating, updating, querying, and terminating sessions.
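
The client-side half of this exchange can be sketched as follows. This is a minimal illustration that uses only the Python standard library; the cookie name `fc-session` and its value are hypothetical placeholders, because the actual cookie name is chosen by Function Compute and is not specified here.

```python
from http.cookies import SimpleCookie


def cookie_header_from_set_cookie(set_cookie_values):
    """Build a Cookie request header value from Set-Cookie response values."""
    pairs = []
    for raw in set_cookie_values:
        jar = SimpleCookie()
        jar.load(raw)  # parses "name=value; Path=/; HttpOnly"
        for name, morsel in jar.items():
            # Only the name=value pair is sent back; attributes are dropped.
            pairs.append(f"{name}={morsel.value}")
    return "; ".join(pairs)


# Hypothetical Set-Cookie value taken from the first response.
first_response_set_cookie = ["fc-session=abc123; Path=/; HttpOnly"]

# Header to attach to every subsequent request of the same session.
headers_for_next_request = {
    "Cookie": cookie_header_from_set_cookie(first_response_set_cookie)
}
```

Most HTTP client libraries offer a cookie jar that performs this step automatically; the explicit version only shows what has to be preserved between requests.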

2. Limitations specific to HeaderField affinity

Session ID source:
- If the client passes a session ID in a predefined header, this value is used as the session identity.
- If no ID is passed, the server generates a globally unique session ID and returns it in the response, in the header field that was configured when the function was created (CreateFunction).
Header field definition:
When you create the function, specify the header field name for transmitting the session ID in SessionAffinityConfig.
SessionAPI management is supported:
You can use the SessionAPI to monitor and control sessions.
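
The session ID rules above can be sketched from the client's point of view. The header name `x-session-id` is a hypothetical example of the field configured in SessionAffinityConfig, and the helper name is illustrative only. The sketch assumes that a client-supplied ID takes precedence and that a server-generated ID is read from the response otherwise.

```python
def resolve_session_id(header_name, request_headers, response_headers):
    """Return the session ID the client should send on its next request."""

    def lookup(headers):
        # HTTP header names are case-insensitive.
        for key, value in headers.items():
            if key.lower() == header_name.lower():
                return value
        return None

    # Prefer the ID the client supplied; otherwise use the generated one.
    return lookup(request_headers) or lookup(response_headers)


# Hypothetical header name configured for the function.
HEADER = "x-session-id"

# First request carried no ID, so the server generated one.
generated = resolve_session_id(HEADER, {}, {"X-Session-Id": "gen-1"})

# The client chose its own ID up front.
supplied = resolve_session_id(HEADER, {"x-session-id": "client-1"}, {})
```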

3. Limitations specific to MCP Streamable HTTP affinity

Protocol version requirements:
This affinity type supports MCP protocol versions 2025-03-26 and 2025-06-18. The client and function must follow the transport layer specifications of the corresponding version.
Compatibility note: If a function has MCP Streamable HTTP affinity enabled, do not use MCP HTTP with Server-Sent Events (SSE) to invoke it. The session management mechanisms are incompatible and will cause invocations to fail.
Access method limitations:
Access is supported only through an HTTP trigger or a custom domain name.
HTTP trigger configuration requirements:
The HTTP trigger must support at least the GET, POST, and DELETE methods.

Necessity of the DELETE method:
A client can actively terminate a session by sending a DELETE request. Function Compute then revokes the resources for that session, including the instance concurrency quota. If the DELETE method is not enabled, the system rejects the request, and the session cannot be released properly.
SessionAPI management is not supported.
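
As a rough pre-deployment check, the trigger requirement and the session termination request could be expressed as follows. The function names and the example URL are hypothetical; only the three required methods and the Mcp-Session-Id header come from the text above.

```python
REQUIRED_METHODS = {"GET", "POST", "DELETE"}


def missing_trigger_methods(configured_methods):
    """Return the required HTTP methods that the trigger does not allow."""
    configured = {method.upper() for method in configured_methods}
    return sorted(REQUIRED_METHODS - configured)


def build_session_delete(url, session_id):
    """Describe the DELETE request a client sends to end its session."""
    return {
        "method": "DELETE",
        "url": url,
        "headers": {"Mcp-Session-Id": session_id},
    }


# A trigger without DELETE cannot release sessions on client request.
problems = missing_trigger_methods(["GET", "POST"])

# Hypothetical endpoint and session ID.
terminate = build_session_delete("https://example.com/mcp", "sess-42")
```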

4. Limitations specific to MCP SSE affinity

Runtime limitations:
- If you use a built-in runtime, MCP SSE affinity is not supported.
- If you use an MCP runtime, only MCP affinity (including SSE) is supported.
- Other runtimes do not have this limitation.
Client requirements:
Requests must be initiated using an official MCP standard client or software development kit (SDK). Otherwise, a valid affinity connection cannot be established.
Session lifecycle:
The maximum session lifecycle is equal to the function's maximum timeout period. After this period, the server closes the connection. Reconnecting generates a new session ID and does not guarantee routing to the original instance.
Access method limitations:
Access is supported only through an HTTP trigger or a custom domain name.
Request limitations:
- The first SSE request cannot currently include query parameters.
- The maximum concurrency per instance is 200.
SessionAPI management is not supported.

Core principles

1. Core flow

Client sends request → Identify/Generate Session ID → Attach to an active instance → Route subsequent requests to that instance

2. Resource model (unified framework)

Each session consumes one session concurrency quota.
Each request (including POST, GET, and Message) consumes one request concurrency quota.
Total concurrency quota per instance: 200 (not adjustable)
Multiple sessions share this quota of 200.

Formula:

TotalQuota = Σ(Concurrency consumed by each Session) ≤ 200

3. Lifecycle management

Phase	Trigger condition	Behavior
Creation	The first request does not carry a valid session ID.	A unique ID is generated and a binding is established.
Creation	The first request carries a valid session ID.	The server establishes a binding between this session ID and an instance.
Active	A request is received.	The last active time is updated.
Idle timeout	The idle time exceeds the `Session Idle Time` (default: 1800 seconds).	The session is automatically destroyed.
Expiration	The session duration exceeds the `Single Session Lifecycle` (default: 21600 seconds).	The session is automatically destroyed.
Manual termination	A DELETE request is sent (MCP Streamable) or the connection is disconnected (SSE).	Resources are actively released.
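
The two timeout rules in the table can be read as a simple state check. The sketch below is one interpretation, not the service's implementation: it treats "exceeds" as a strict comparison, measures time in seconds, and uses hypothetical state names.

```python
SESSION_IDLE_TIME = 1800           # default idle timeout, in seconds
SINGLE_SESSION_LIFECYCLE = 21600   # default maximum session duration, in seconds


def session_state(created_at, last_active_at, now,
                  idle_time=SESSION_IDLE_TIME,
                  lifecycle=SINGLE_SESSION_LIFECYCLE):
    """Classify a session from its timestamps (all values in seconds)."""
    if now - created_at > lifecycle:
        return "expired"        # total duration exceeded
    if now - last_active_at > idle_time:
        return "idle_timeout"   # no request arrived within the idle window
    return "active"
```

For example, a session that keeps receiving requests every few minutes stays active until 21600 seconds after creation, at which point it is destroyed regardless of activity.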

Concurrency management mechanism

Concurrency parameter descriptions

Parameter type	Meaning	Adjustable	Limit	Consumption rule	Concurrency release mechanism
Maximum concurrency per instance	The maximum number of concurrent requests that a single instance can process simultaneously.	Not adjustable	200	Each request or persistent connection consumes 1.	Released asynchronously after the request is complete.
Concurrent sessions per instance	The maximum number of sessions that a single instance can process at the same time.	Configurable	[1, 200]	Each session consumes 1.	Depends on the affinity type.

Concurrency resource model for a single session

Type	Formula	Description
Cookie, HeaderField, or MCP Streamable HTTP affinity	`TotalQuota(s) = N (N ≥ 1)`	Includes only sync requests. Each request consumes one unit of concurrency.
MCP SSE affinity	`TotalQuota(s) = 1 + N (N ≥ 1)`	Includes one SSE persistent connection and N Message requests.
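
The per-session formulas and the instance-wide limit of 200 can be combined into a small calculation. The affinity labels used as strings below are hypothetical names chosen for this sketch, not API values.

```python
INSTANCE_REQUEST_LIMIT = 200  # maximum concurrency per instance, not adjustable


def session_quota(affinity, in_flight_requests):
    """Concurrency units one session consumes on its instance."""
    if affinity == "MCP_SSE":
        # One persistent SSE connection plus N Message requests.
        return 1 + in_flight_requests
    # Cookie, HeaderField, and MCP Streamable HTTP: N sync requests.
    return in_flight_requests


def instance_usage(sessions):
    """Sum the quota of every (affinity, in_flight_requests) pair."""
    return sum(session_quota(affinity, n) for affinity, n in sessions)


def fits_on_instance(sessions):
    """Check the combined usage against the per-instance limit."""
    return instance_usage(sessions) <= INSTANCE_REQUEST_LIMIT


usage = instance_usage([("MCP_SSE", 3), ("COOKIE", 2)])  # (1 + 3) + 2
```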

Configuration recommendations for concurrent sessions per instance:

Isolation scenario: Set the value to 1. A single session exclusively uses the computing resources for better security and reliability.
Multi-tenant sharing scenario: Set the value to a number in the range (1, 200]. Multiple sessions share the resources of a single instance to improve resource utilization.

Detailed instance scheduling rules

Scenario 1: Session and instance binding rules and scale-out mechanism

Assume the function is configured with two concurrent sessions per instance.

Client1 sends a request. Instance1 is allocated, consuming one session quota.
Client2 sends a request. The scheduler determines that Instance1 has an available session quota and successfully binds the session.
Client3 sends a request. Because Instance1 is full, a new instance, Instance2, is created, and the session is successfully bound to it.

Key points:

The scheduling system tries to reuse existing instances, but this is not guaranteed.
A scale-out occurs only when all existing instances are full.
This implements dynamic load balancing and elastic scaling.
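
Scenario 1 can be reproduced with a toy first-fit placement. First-fit is an assumption made for this sketch: the actual scheduler only tries to reuse existing instances and does not guarantee it, so real placements may differ.

```python
def bind_sessions(session_ids, sessions_per_instance):
    """Place sessions on instances, scaling out when every instance is full."""
    instances = []   # instances[i] is the list of sessions bound to instance i
    placement = {}   # session ID -> instance index
    for session_id in session_ids:
        for index, bound in enumerate(instances):
            if len(bound) < sessions_per_instance:
                bound.append(session_id)
                placement[session_id] = index
                break
        else:
            # No existing instance has a free session slot: create one.
            instances.append([session_id])
            placement[session_id] = len(instances) - 1
    return placement


# Two concurrent sessions per instance, three clients.
placement = bind_sessions(["Client1", "Client2", "Client3"], 2)
```

With these inputs, Client1 and Client2 share the first instance (index 0) and Client3 lands on a new one (index 1), matching the walkthrough above.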

Scenario 2: Resource throttling mechanism for multiple sessions on a single instance

Assume the function is configured with two concurrent sessions per instance.

Client1 sends a request, which consumes one session quota and one unit of request concurrency.
Client2 sends a request, which consumes one session quota and one unit of request concurrency.
Before the first two requests are complete, the two clients together send another 198 concurrent requests, bringing the total on the instance to 200 units of request concurrency.
The next concurrent request exceeds the limit of 200, and the system returns a `429 Too Many Requests` error.

Key points:

The maximum concurrency per instance is fixed at 200 and is not adjustable.
Multiple sessions share this quota.
When the concurrency quota is exhausted, the system rejects new requests.
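
The throttling behavior in Scenario 2 can be modeled with a simple counter. This is a simplified sketch: the class and method names are hypothetical, and the quota is released synchronously here, whereas the service releases it asynchronously after a request completes.

```python
class InstanceGate:
    """Track in-flight requests on one instance against a fixed limit."""

    def __init__(self, limit=200):
        self.limit = limit
        self.in_flight = 0

    def admit(self):
        """Return 200 if the request is accepted, 429 if it is throttled."""
        if self.in_flight >= self.limit:
            return 429
        self.in_flight += 1
        return 200

    def complete(self):
        """Release one unit of request concurrency."""
        if self.in_flight > 0:
            self.in_flight -= 1


gate = InstanceGate()
# 2 initial requests plus 198 more fill the quota; the 201st is rejected.
statuses = [gate.admit() for _ in range(201)]
```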

Fault handling mechanism

Scenario	System behavior	Status code	Client-side response strategy
Concurrent session quota per instance is exhausted	If the number of existing instances is below the regional limit, the system automatically scales out a new instance to handle the session request.	200	-
The number of existing instances reaches the maximum limit for the region	The system throttles and rejects the request.	429	1. Use a backoff retry strategy. 2. Request a quota increase in Quota Center.
The concurrency quota of 200 per instance is exhausted	The system throttles and rejects the request.	429	If concurrent sessions per instance is set to a value greater than 1, consider reducing it so that fewer sessions share the quota.
Session is invalid or expired	The system rejects the request.	401	Send a new request to generate a new session.
Invalid HeaderField value	The system rejects the request.	400	Check the header name and format.
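
The client-side strategies in the table can be sketched as a status-code dispatch plus an exponential backoff schedule. The action names, base delay, growth factor, and cap are illustrative assumptions; the source only recommends "a backoff retry strategy" without specifying parameters.

```python
def backoff_delays(attempts, base=0.5, factor=2.0, cap=8.0):
    """Delays in seconds before each retry, growing exponentially up to cap."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


def client_action(status_code):
    """Map a response status code to the suggested client behavior."""
    actions = {
        200: "proceed",
        429: "retry_with_backoff",   # throttled: wait, then retry
        401: "start_new_session",    # session invalid or expired
        400: "fix_header",           # check the header name and format
    }
    return actions.get(status_code, "surface_error")


delays = backoff_delays(5)
```

In practice, adding random jitter to each delay helps avoid many clients retrying at the same moment.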