MCP has officially introduced a new Streamable HTTP transport, which is a significant improvement over the original HTTP+SSE transport.
Relevant project links are as follows:
● Complete runnable example: https://github.com/springaialibaba/spring-ai-alibaba-examples
● Spring AI Alibaba official blog article: https://java2ai.com/
● Spring AI Alibaba open-source project address: https://github.com/alibaba/spring-ai-alibaba
● Higress official website: https://higress.ai/
In the original MCP implementation, communication between the client and server occurs through two primary channels:
● HTTP request/response: The client sends messages to the server via standard HTTP requests.
● Server-Sent Events (SSE): The server pushes messages to the client through a dedicated /sse endpoint.
While this design is simple and intuitive, there are several key issues:
1. No support for reconnecting/recovery:
When the SSE connection drops, all session state is lost, and the client must re-establish the connection and re-initialize the entire session. For example, a large document analysis task that is already running could be interrupted completely by unstable WiFi, forcing the user to restart the whole process.
2. The server must maintain long connections:
The server must maintain a long-lived SSE connection for each client, leading to a significant increase in resource consumption with a large number of concurrent users. When the server needs to restart or scale, all connections are interrupted, negatively affecting user experience and system reliability.
3. Server messages can only be transmitted via SSE:
Even for simple request-response interactions, the server must return information through the SSE channel, creating unnecessary complexity and overhead. This approach is unsuitable for certain environments (such as cloud functions) due to the need to maintain long-lived SSE connections.
4. Infrastructure compatibility limitations:
Many existing web infrastructure components, such as CDNs, load balancers, and API gateways, may not handle long-lived SSE connections correctly. Corporate firewalls might forcibly close connections that exceed a timeout, making the service unreliable.
Compared to the original HTTP+SSE mechanism, Streamable HTTP introduces several key improvements:
The workflow of Streamable HTTP is as follows:
1. Session Initialization (Optional, Suitable for Stateful Implementation Scenarios):
2. Client Communication with the Server:
3. Server Response Methods:
4. Active Establishment of SSE Stream:
5. Connection Recovery:
Scenario: Simple tool API services, such as mathematical calculations, text processing, etc.
Implementation:
Client                                        Server
  |                                              |
  |-- POST /message (Calculation Request) ------>|
  |                                              |-- Perform Calculation
  |<------ HTTP 200 (Calculation Result) --------|
  |                                              |
Advantages: Extremely simple deployment, no state management required, suitable for serverless architecture and microservices.
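The stateless round trip above can be sketched with the JDK's built-in java.net.http.HttpClient. This is a minimal sketch, not the SDK's implementation: the /message path follows the diagram, the `add` tool and its payload are illustrative assumptions, and a throwaway in-process server (the hypothetical `startStubServer`) stands in for a real MCP server.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

final class StatelessPostSketch {

    /** Starts a stand-in server on an ephemeral port that answers every request with one JSON body. */
    static HttpServer startStubServer(String cannedJson) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/message", exchange -> {
                exchange.getRequestBody().readAllBytes(); // drain the request body
                byte[] body = cannedJson.getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, body.length);
                try (var out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sends one JSON-RPC message and returns the response body; no session or stream is kept. */
    static String postOnce(URI endpoint, String jsonRpc) {
        HttpRequest request = HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "application/json")
                .header("Accept", "application/json, text/event-stream")
                .POST(HttpRequest.BodyPublishers.ofString(jsonRpc))
                .build();
        try {
            return HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString())
                    .body();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Runs one request/response round trip against the stub and returns the response body. */
    static String demo() {
        // Illustrative payloads: an assumed "add" tool and its canned text result.
        String canned = "{\"jsonrpc\":\"2.0\",\"id\":1,\"result\":"
                + "{\"content\":[{\"type\":\"text\",\"text\":\"3\"}]}}";
        HttpServer server = startStubServer(canned);
        try {
            URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/message");
            return postOnce(uri, "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"tools/call\","
                    + "\"params\":{\"name\":\"add\",\"arguments\":{\"a\":1,\"b\":2}}}");
        } finally {
            server.stop(0);
        }
    }
}
```

Because each call is an independent HTTP exchange, `postOnce` could run unchanged behind any load balancer or inside a serverless function.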
Scenario: Long-running tasks, such as large file processing, complex AI generation, etc.
Implementation:
Client                                        Server
  |                                              |
  |-- POST /message (Processing Request) ------->|
  |                                              |-- Start Processing Task
  |<------ HTTP 200 (SSE Starts) ----------------|
  |                                              |
  |<------ SSE: Progress 10% --------------------|
  |<------ SSE: Progress 30% --------------------|
  |<------ SSE: Progress 70% --------------------|
  |<------ SSE: Completion + Result -------------|
  |                                              |
Advantages: Provides real-time feedback without needing to maintain a permanent connection state.
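A client that receives such a response has to split the SSE body into individual events. The following is a minimal parser sketch, assuming only the `id:` and `data:` fields matter and ignoring `event:`, `retry:`, and comment lines. The `SseParserSketch` and `SseEvent` names are hypothetical and are not part of the MCP Java SDK.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class SseParserSketch {

    /** One dispatched event: the most recently seen id (may be null) and the joined data lines. */
    static final class SseEvent {
        final String id;
        final String data;

        SseEvent(String id, String data) {
            this.id = id;
            this.data = data;
        }
    }

    /** Splits an SSE body into events; a blank line terminates each event. */
    static List<SseEvent> parse(String body) {
        List<SseEvent> events = new ArrayList<>();
        String id = null;
        StringBuilder data = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new StringReader(body))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    // Blank line: dispatch the pending event, if it carries any data.
                    if (data.length() > 0) {
                        events.add(new SseEvent(id, data.toString()));
                        data.setLength(0);
                    }
                } else if (line.startsWith("id:")) {
                    id = line.substring(3).trim();
                } else if (line.startsWith("data:")) {
                    // Multiple data lines in one event are joined with newlines.
                    if (data.length() > 0) {
                        data.append('\n');
                    }
                    // Simplification: trims all surrounding whitespace, not just one leading space.
                    data.append(line.substring(5).trim());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // An event not followed by a blank line is incomplete and is not dispatched.
        return events;
    }
}
```

In a real transport the same loop would read from the HTTP response stream, so each progress event can be handled as soon as its terminating blank line arrives.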
Scenario: Multi-turn dialogue AI assistants that require context maintenance.
Implementation:
Client                                        Server
  |                                              |
  |-- POST /message (Initialization) ----------->|
  |<-- HTTP 200 (Session ID: abc123) ------------|
  |                                              |
  |-- GET /message (Session ID: abc123) -------->|
  |<------ SSE Stream Established ---------------|
  |                                              |
  |-- POST /message (Question 1, abc123) ------->|
  |<------ SSE: Thinking... ---------------------|
  |<------ SSE: Answer 1 ------------------------|
  |                                              |
  |-- POST /message (Question 2, abc123) ------->|
  |<------ SSE: Thinking... ---------------------|
  |<------ SSE: Answer 2 ------------------------|
Advantages: Maintains session context, supports complex interactions, while allowing for horizontal scaling.
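On the client side, session handling comes down to remembering the ID returned at initialization and attaching it to every later request. The sketch below assumes the `Mcp-Session-Id` header defined by the Streamable HTTP specification; the `SessionHeaderSketch` class and its method names are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpHeaders;
import java.net.http.HttpRequest;
import java.util.concurrent.atomic.AtomicReference;

final class SessionHeaderSketch {

    static final String SESSION_HEADER = "Mcp-Session-Id";

    // Stays null until the server assigns a session (stateless servers never do).
    private final AtomicReference<String> sessionId = new AtomicReference<>();

    /** Captures the session ID from the headers of the initialization response, if present. */
    void rememberFrom(HttpHeaders responseHeaders) {
        responseHeaders.firstValue(SESSION_HEADER).ifPresent(sessionId::set);
    }

    /** Builds a POST that carries the session ID once one has been assigned. */
    HttpRequest post(URI endpoint, String json) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "application/json")
                .header("Accept", "application/json, text/event-stream")
                .POST(HttpRequest.BodyPublishers.ofString(json));
        String id = sessionId.get();
        if (id != null) {
            builder.header(SESSION_HEADER, id);
        }
        return builder.build();
    }
}
```

A client talking to a stateless server never receives the header, so the same code path sends plain requests without any session bookkeeping.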
Scenario: AI applications used in unstable network environments.
Implementation:
Client                                        Server
  |                                              |
  |-- POST /message (Initialization) ----------->|
  |<-- HTTP 200 (Session ID: xyz789) ------------|
  |                                              |
  |-- GET /message (Session ID: xyz789) -------->|
  |<------ SSE Stream Established ---------------|
  |                                              |
  |-- POST /message (Long Task, xyz789) -------->|
  |<------ SSE: Progress 30% --------------------|
  |                                              |
  |             [Network Disruption]             |
  |                                              |
  |-- GET /message (Session ID: xyz789) -------->|
  |<------ SSE Stream Re-established ------------|
  |<------ SSE: Progress 60% --------------------|
  |<------ SSE: Completion ----------------------|
Advantages: Increases reliability in weak network environments, improving user experience.
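For recovery to work, the server has to be able to replay the events a client missed. One possible shape for that is sketched below, assuming sequential numeric event IDs and an unbounded in-memory buffer per session; the `ReplayBufferSketch` class is hypothetical, and a production server would also need eviction and persistence.

```java
import java.util.ArrayList;
import java.util.List;

final class ReplayBufferSketch {

    private final List<String> events = new ArrayList<>();

    /** Stores an event and returns its ID, which is its 1-based position in the stream. */
    synchronized String append(String data) {
        events.add(data);
        return Integer.toString(events.size());
    }

    /**
     * Returns the events sent after the given Last-Event-ID value.
     * A null or unparsable ID replays the whole stream.
     */
    synchronized List<String> replayAfter(String lastEventId) {
        int from = 0;
        if (lastEventId != null) {
            try {
                from = Integer.parseInt(lastEventId.trim());
            } catch (NumberFormatException e) {
                from = 0;
            }
        }
        // Clamp so that out-of-range IDs cannot cause an exception.
        from = Math.max(0, Math.min(from, events.size()));
        return new ArrayList<>(events.subList(from, events.size()));
    }
}
```

In the scenario above, a client that reconnects with the ID of the "Progress 30%" event would receive only the later events.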
The previous sections compared the HTTP+SSE and Streamable HTTP modes in theory. In practice, the split request and response channels of the HTTP+SSE model create a thorny problem for architecture and scalability: the client and server are forced to maintain sticky sessions. Even for stateless communication, a session ID must be maintained, and every request carrying that session ID must be routed to the same server machine. This places a heavy burden on both client and server implementations.
For the Streamable mode, if the goal is merely to maintain stateless communication, there is no need to manage sticky sessions at all. Considering that over 90% of MCP services may be stateless, this presents a significant improvement in the overall architecture's scalability.
Of course, if stateful communication needs to be implemented, the Streamable HTTP mode still requires maintaining a session ID.
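The routing difference can be illustrated with a toy load balancer. This is only a sketch under assumptions: the `RoutingSketch` class, its method names, and the hash-based sticky strategy are hypothetical and stand in for whatever a real gateway does.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class RoutingSketch {

    private final List<String> backends;
    private final AtomicInteger next = new AtomicInteger();

    RoutingSketch(List<String> backends) {
        this.backends = List.copyOf(backends);
    }

    /** Stateful traffic: the same session ID must always land on the same backend. */
    String routeSticky(String sessionId) {
        return backends.get(Math.floorMod(sessionId.hashCode(), backends.size()));
    }

    /** Stateless traffic: any backend will do, so plain round-robin is enough. */
    String routeStateless() {
        return backends.get(Math.floorMod(next.getAndIncrement(), backends.size()));
    }
}
```

With sticky routing, adding or removing a backend changes where existing sessions land, which is exactly the scaling burden that stateless Streamable HTTP avoids.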
Currently, neither the official MCP Java SDK nor Spring AI provides a Streamable HTTP implementation. This article provides only a Streamable HTTP client implementation, which supports stateless mode only and can connect to the official TypeScript server implementation and the Higress community server implementation.
A complete runnable example can be found at: https://github.com/springaialibaba/spring-ai-alibaba-examples
Due to the ongoing development of the MCP Java SDK implementation for the Streamable HTTP solution, this example repository contains customized source code from the following two repositories:
- MCP Java SDK, located in the io.modelcontextprotocol package.
- Spring AI, located in the org.springframework.ai.mcp.client.autoconfigure package.
The examples integrate a Higress gateway that supports the MCP Streamable HTTP protocol implementation. This implementation still has many limitations, such as not supporting GET requests, not supporting session-id management, etc.
The client can actively establish an SSE connection by sending a GET request to the /mcp endpoint, which will serve as the subsequent request-response channel.
// Open the server-to-client SSE stream with a GET request.
return Mono.defer(() -> Mono.fromFuture(() -> {
    final HttpRequest.Builder builder = requestBuilder.copy().GET().uri(uri);
    // If events were received before, ask the server to resume after the last one.
    final String lastId = lastEventId.get();
    if (lastId != null) {
        builder.header("Last-Event-ID", lastId);
    }
    return httpClient.sendAsync(builder.build(), HttpResponse.BodyHandlers.ofInputStream());
}).flatMap(response -> {
    // The server may not offer a GET stream (for example, 405 Method Not Allowed).
    if (response.statusCode() == 405 || response.statusCode() == 404) {
        // .....
    }
    return handleStreamingResponse(response, handler);
})
    // Retry up to 3 times with backoff, but only for IllegalStateException.
    .retryWhen(Retry.backoff(3, Duration.ofSeconds(3)).filter(err -> err instanceof IllegalStateException))
    .doOnSuccess(v -> state.set(TransportState.CONNECTED))
    .doOnTerminate(() -> state.set(TransportState.CLOSED))
    .onErrorResume(e -> {
        // A failed GET connection is ignored: the error is logged and the
        // transport is still marked CONNECTED.
        LOGGER.error("Streamable transport connection error", e);
        state.set(TransportState.CONNECTED);
        return Mono.empty();
    }));
The equivalent raw HTTP request for initialization is shown below; the listTool and callTool requests are similar.
curl -X POST -H "Content-Type: application/json" -H "Accept: application/json" -H "Accept: text/event-stream" -d '{
  "jsonrpc" : "2.0",
  "method" : "initialize",
  "id" : "9afdedcc-0",
  "params" : {
    "protocolVersion" : "2024-11-05",
    "capabilities" : {
      "roots" : {
        "listChanged" : true
      }
    },
    "clientInfo" : {
      "name" : "Java SDK MCP Client",
      "version" : "1.0.0"
    }
  }
}' -i http://localhost:3000/mcp
You can start and test with the Streamable Server provided by the official TypeScript SDK in conjunction with the current client implementation.
// Send a POST request to the /mcp endpoint
public Mono<Void> sendMessage(final McpSchema.JSONRPCMessage message,
        final Function<Mono<McpSchema.JSONRPCMessage>, Mono<McpSchema.JSONRPCMessage>> handler) {
    // ...
    return sentPost(message, handler).onErrorResume(e -> {
        LOGGER.error("Streamable transport sendMessage error", e);
        return Mono.error(e);
    });
}

// Actually send the POST request and process the response
private Mono<Void> sentPost(final Object msg,
        final Function<Mono<McpSchema.JSONRPCMessage>, Mono<McpSchema.JSONRPCMessage>> handler) {
    return serializeJson(msg).flatMap(json -> {
        final HttpRequest request = requestBuilder.copy()
            .POST(HttpRequest.BodyPublishers.ofString(json))
            .uri(uri)
            .build();
        return Mono.fromFuture(httpClient.sendAsync(request, HttpResponse.BodyHandlers.ofInputStream()))
            .flatMap(response -> {
                // If the response is 202 Accepted, there's no body to process
                if (response.statusCode() == 202) {
                    return Mono.empty();
                }
                if (response.statusCode() == 405 || response.statusCode() == 404) {
                    // ...
                }
                if (response.statusCode() >= 400) {
                    // ...
                }
                return handleStreamingResponse(response, handler);
            });
    });
}

// Handle the different types of responses that the server might return,
// dispatching on the Content-Type header
private Mono<Void> handleStreamingResponse(final HttpResponse<InputStream> response,
        final Function<Mono<McpSchema.JSONRPCMessage>, Mono<McpSchema.JSONRPCMessage>> handler) {
    final String contentType = response.headers().firstValue("Content-Type").orElse("");
    // "application/json-seq" must be checked before "application/json",
    // because contains("application/json") would match both.
    if (contentType.contains("application/json-seq")) {
        return handleJsonStream(response, handler);
    }
    else if (contentType.contains("text/event-stream")) {
        return handleSseStream(response, handler);
    }
    else if (contentType.contains("application/json")) {
        return handleSingleJson(response, handler);
    }
    else {
        return Mono.error(new UnsupportedOperationException("Unsupported Content-Type: " + contentType));
    }
}
@AutoConfiguration
@ConditionalOnClass({ McpSchema.class, McpSyncClient.class })
@EnableConfigurationProperties({ McpStreamableClientProperties.class, McpClientCommonProperties.class })
@ConditionalOnProperty(prefix = McpClientCommonProperties.CONFIG_PREFIX, name = "enabled", havingValue = "true",
        matchIfMissing = true)
public class StreamableHttpClientTransportAutoConfiguration {

    @Bean
    public List<NamedClientMcpTransport> mcpHttpClientTransports(McpStreamableClientProperties streamableProperties,
            ObjectProvider<ObjectMapper> objectMapperProvider) {
        ObjectMapper objectMapper = objectMapperProvider.getIfAvailable(ObjectMapper::new);
        List<NamedClientMcpTransport> sseTransports = new ArrayList<>();
        // Create one Streamable HTTP transport per configured connection.
        for (Map.Entry<String, McpStreamableClientProperties.StreamableParameters> serverParameters
                : streamableProperties.getConnections().entrySet()) {
            var transport = StreamableHttpClientTransport.builder(serverParameters.getValue().url())
                .withObjectMapper(objectMapper)
                .build();
            sseTransports.add(new NamedClientMcpTransport(serverParameters.getKey(), transport));
        }
        return sseTransports;
    }
}
The following configuration enables the Streamable HTTP transport. It points to an MCP server address provided by Higress, which offers a limited Streamable HTTP server implementation.
spring:
  ai:
    mcp:
      client:
        toolcallback:
          enabled: true
        streamable:
          connections:
            server1:
              url: http://env-cvpjbjem1hkjat42sk4g-ap-southeast-1.alicloudapi.com/mcp-quark
@SpringBootApplication(exclude = {
        org.springframework.ai.mcp.client.autoconfigure.SseHttpClientTransportAutoConfiguration.class
})
@ComponentScan(basePackages = "org.springframework.ai.mcp.client")
public class Application {

    @Bean
    public CommandLineRunner predefinedQuestions(ChatClient.Builder chatClientBuilder, ToolCallbackProvider tools,
            ConfigurableApplicationContext context) {
        return args -> {
            var chatClient = chatClientBuilder
                .defaultTools(tools)
                .build();
            System.out.println("\n>>> QUESTION: " + "Alibaba Xixi Park");
            System.out.println("\n>>> ASSISTANT: " + chatClient.prompt("Alibaba Xixi Park").call().content());
            System.out.println("\n>>> QUESTION: " + "Gold price trend");
            System.out.println("\n>>> ASSISTANT: " + chatClient.prompt("Gold price trend").call().content());
        };
    }
}
After running the example, you should see the client connect to the MCP server successfully and retrieve the list of tools. The Higress example has two built-in tools.
{
  "jsonrpc": "2.0",
  "id": "32124bd9-1",
  "result": {
    "nextCursor": "",
    "tools": [{
      "description": "Performs a web search using the Quark Search API, ideal for general queries, news, articles, and online content.\nUse this for broad information gathering, recent events, or when you need diverse web sources.\nBecause Quark search performs poorly for English searches, please use Chinese for the query parameters.",
      "inputSchema": {
        "additionalProperties": false,
        "properties": {
          "contentMode": {
            "default": "summary",
            "description": "Return the level of content detail, choose to use summary or full text",
            "enum": ["full", "summary"],
            "type": "string"
          },
          "number": {
            "default": 5,
            "description": "Number of results",
            "type": "integer"
          },
          "query": {
            "description": "Search query, please use Chinese",
            "examples": ["Gold price trend"],
            "type": "string"
          }
        },
        "required": ["query"],
        "type": "object"
      },
      "name": "web_search"
    }]
  }
}
The example initiates a chat session, and the model will guide the agent to call the web_search tool and return results.
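The request that such a tool call produces on the wire can be sketched as follows. This is an illustration only: it assumes the standard MCP `tools/call` method and the `web_search` input schema listed above, while the `ToolCallSketch` class and its hand-rolled JSON escaping are hypothetical (the real client serializes messages through its ObjectMapper).

```java
final class ToolCallSketch {

    /** Minimal JSON string escaping for quotes, backslashes, and control characters. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    /** Builds a JSON-RPC tools/call request for the web_search tool. */
    static String webSearchCall(String requestId, String query, int number) {
        return "{\"jsonrpc\":\"2.0\",\"id\":\"" + escape(requestId) + "\","
                + "\"method\":\"tools/call\","
                + "\"params\":{\"name\":\"web_search\",\"arguments\":{"
                + "\"query\":\"" + escape(query) + "\","
                + "\"number\":" + number + "}}}";
    }
}
```

The resulting string would be sent as the body of a POST to the MCP endpoint, in the same way as the initialize request shown earlier.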
The current implementation builds on the official Java SDK and adds a Streamable HTTP mode to McpClientTransport. It does not yet fully support Streamable HTTP: the Streamable HTTP workflow differs from HTTP+SSE in many respects, and many code paths in the original Java SDK are tailored closely to the HTTP+SSE design, so full support requires some structural changes to the SDK.
For example, here are a few points where the current implementation is limited:
Currently, several core contributors from the Spring AI Alibaba community are actively involved in the development of the official MCP SDK, including bug fixes and the implementation of the Streamable HTTP solution. We have already submitted the relevant pull requests (PRs) to the official community. Below are the PRs for the community's Streamable solution:
1. Spring AI Alibaba official website: https://java2ai.com/
2. Spring AI Alibaba open-source project source repository: https://github.com/alibaba/spring-ai-alibaba
3. mcp-streamable-http: https://www.claudemcp.com/blog/mcp-streamable-http
4. MCP java-sdk: https://github.com/modelcontextprotocol/java-sdk
5. Streamable HTTP: https://modelcontextprotocol.io/specification/2025-03-26/basic/transports#streamable-http