By Wu Tong
The MCP specification released its latest version on 2025-03-26. This article introduces and explains the main changes in detail.
Comparison table of the main updates between the 2025-03-26 version and the 2024-11-05 version:
| Category | 2024-11-05 Version | 2025-03-26 Version | Significance and Impact of Updates |
|---|---|---|---|
| Authorization Mechanism | Based on OAuth 2.0, supports implicit authorization flow and basic permission control | Upgraded to OAuth 2.1, deprecated implicit authorization flow, enforces PKCE and HTTPS | Increased security, reduced token leakage risks, adapts to public client scenarios (such as mobile and local applications). |
| Transport Protocol | Uses HTTP + SSE (dual endpoints), supports unidirectional stream communication | Replaced with Streamable HTTP (single endpoint), supports bidirectional communication and disconnection recovery | Simplifies deployment complexity, supports flexible communication modes (one-time response or stream push), optimizes network stability. |
| JSON-RPC Batching | Not specified; optional in some implementations | Batching defined at the protocol level: implementations MUST support receiving batches, while sending them is optional | Reduces network overhead, supports parallel task processing, enhances batch operation efficiency (e.g., atomic transactions). |
| Tool Metadata | Only inputSchema and description provided | Added Tool Annotations (operational and display metadata) | Explicitly marks tool risks (e.g., destructive), supports automatic permission control and frontend UI adaptation, enhances security compliance. |
| Progress Notifications | Supports only percentage or numerical progress | New message field added, supports dynamic status descriptions | Enhances user interaction experience (e.g., displays "Data loading, 50% remaining"). |
| Multimodal Support | Supports text and images | New audio data stream support added | Expands capabilities for voice assistants, real-time audio processing, and other scenarios. |
| Parameter Completion | Not explicitly supported | New completions capability declaration, supports automatic parameter completion suggestions | Increases developer efficiency, reduces manual input errors. |
| Session Management | No explicit session identification | Introduces Mcp-Session-Id header, supports reconnection and state recovery | Enhances reliability for long-running tasks (e.g., voice interactions), reduces the impact of network fluctuations. |
| Security Requirements | Relies on recommended practices of OAuth 2.0 | Mandatory HTTPS, token binding and storage encryption, supports short-lived token rotation | Reduces risk of man-in-the-middle attacks, minimizes the effective window after token leakage. |
Key Difference Summary:
1. Security
2. Communication Efficiency
3. Tool Controllability
4. Multimodal Extension
5. Developer Friendliness
OAuth 2.0 has long carried three serious security weaknesses:
| Risk Type | Specific Vulnerability | OAuth 2.1 Fix |
|---|---|---|
| Access Token Leakage | Implicit flow returns the access token in the URL fragment | Implicit flow removed entirely |
| Man-in-the-Middle Attack | Authorization codes issued to public clients can be intercepted in transit and redeemed by an attacker | Mandatory PKCE (Proof Key for Code Exchange) |
| Redirect Hijacking | Open redirect vulnerabilities lead to phishing attacks | Exact-match validation against pre-registered redirect URIs |
In the context of AI tools, these vulnerabilities could lead to catastrophic consequences. For example, by intercepting unencrypted authorization codes, attackers could forge legitimate call requests for a "database cleanup tool".
PKCE defends against authorization code interception through a cryptographic challenge-response mechanism: an intercepted code cannot be redeemed without the matching code_verifier.
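# Example of a client generating PKCE parameters (S256 method)
import base64
import hashlib
import os
# code_verifier: 32 random bytes, base64url-encoded without padding (43 characters)
code_verifier = base64.urlsafe_b64encode(os.urandom(32)).decode('utf-8').rstrip('=')
# code_challenge: base64url(SHA-256(code_verifier)), also without padding
digest = hashlib.sha256(code_verifier.encode('ascii')).digest()
code_challenge = base64.urlsafe_b64encode(digest).decode('utf-8').rstrip('=')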
Traditional OAuth 2.0:
Client → Authorization Server: Request authorization code
Authorization Server → Client: Return raw authorization code
OAuth 2.1 + PKCE:
Client → Authorization Server: Request authorization code + code_challenge
Authorization Server → Client: Return authorization code bound to the code_challenge
Client → Token endpoint: code_verifier + authorization code
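The last step of the PKCE flow is where the protection takes effect: the token endpoint recomputes the challenge from the submitted code_verifier and compares it with the one stored at authorization time. The following is a minimal Python sketch of both sides, assuming the S256 method; the function names `make_pkce_pair` and `verify_pkce` are illustrative and not taken from any SDK.

```python
import base64
import hashlib
import hmac
import secrets


def make_pkce_pair():
    """Client side: return (code_verifier, code_challenge) for the S256 method."""
    verifier = secrets.token_urlsafe(32)  # 32 random bytes -> 43 URL-safe characters
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    return verifier, challenge


def verify_pkce(code_verifier, stored_challenge):
    """Server side: recompute the challenge and compare in constant time."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    expected = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    return hmac.compare_digest(expected, stored_challenge)
```

An attacker who intercepts only the authorization code cannot produce a verifier that hashes to the stored challenge, so the token request is rejected.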
In response to the fragmented nature of the AI tool ecosystem, MCP recommends (SHOULD) that clients and servers support the RFC 7591 dynamic client registration protocol:

This mechanism allows:
MCP implements a self-describing protocol through a standardized discovery endpoint (OAuth 2.0 Authorization Server Metadata, RFC 8414):
GET /.well-known/oauth-authorization-server HTTP/1.1
Host: api.example.com
MCP-Protocol-Version: 2025-03-26

HTTP/1.1 200 OK

{
  "issuer": "https://api.example.com",
  "authorization_endpoint": "https://auth.example.com/authorize",
  "token_endpoint": "https://auth.example.com/token",
  "capabilities": ["PKCE", "TOKEN_ROTATION"]
}
In case of discovery failure, the client automatically falls back to preset endpoint paths to ensure compatibility.
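The discovery-then-fallback logic can be sketched in a few lines of Python. This sketch assumes the fallback paths are /authorize, /token, and /register relative to the authorization base URL (the server URL with its path removed). The `fetch_metadata` callable is hypothetical and is injected so the logic can be exercised without a network.

```python
from urllib.parse import urlsplit, urlunsplit

WELL_KNOWN_PATH = "/.well-known/oauth-authorization-server"


def authorization_base_url(server_url):
    """Drop path, query, and fragment so only scheme://host[:port] remains."""
    parts = urlsplit(server_url)
    return urlunsplit((parts.scheme, parts.netloc, "", "", ""))


def resolve_endpoints(server_url, fetch_metadata):
    """Use discovered metadata when available, otherwise fall back to default paths.

    fetch_metadata is a hypothetical callable: it receives the metadata URL and
    returns a dict, or None when discovery fails (for example on HTTP 404).
    """
    base = authorization_base_url(server_url)
    metadata = fetch_metadata(base + WELL_KNOWN_PATH)
    if metadata:
        return {
            "authorization_endpoint": metadata["authorization_endpoint"],
            "token_endpoint": metadata["token_endpoint"],
            "registration_endpoint": metadata.get("registration_endpoint"),
        }
    # Discovery failed: assumed default paths relative to the base URL
    return {
        "authorization_endpoint": base + "/authorize",
        "token_endpoint": base + "/token",
        "registration_endpoint": base + "/register",
    }
```

For example, `resolve_endpoints("https://api.example.com/v1/mcp", lambda url: None)` yields endpoints under `https://api.example.com`, ignoring the `/v1/mcp` path.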
| Token Type | Recommended Lifespan | Refresh Rules |
|---|---|---|
| Access Token | ≤15 minutes | Invalidated immediately after single use |
| Refresh Token | ≤24 hours | A new token is generated with each refresh |
// Example of token metadata
{
  "token": "eyJhbGciOi...",
  "binding": {
    "client_id": "mcp-client-xyz",
    "ip_range": "192.168.1.0/24",
    "device_fingerprint": "SHA3-256(hardware features)"
  }
}
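A server that stores binding metadata like this could check it on every request. The Python sketch below covers only the client_id and ip_range fields from the example; the function name `binding_matches` is illustrative, and device fingerprint comparison is deliberately left out because its format is not defined here.

```python
import ipaddress


def binding_matches(binding, client_id, remote_ip):
    """Return True only if the caller matches the token's recorded binding."""
    if binding.get("client_id") != client_id:
        return False
    network = ipaddress.ip_network(binding["ip_range"])
    try:
        address = ipaddress.ip_address(remote_ip)
    except ValueError:
        return False  # malformed address: reject rather than guess
    return address in network
```

A token presented from outside the bound range, or by a different client, is treated as invalid even if its signature and expiry are fine.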
// Secure token validation pseudocode
public boolean verifyToken(String token) {
    try {
        JWT jwt = decode(token);
        // Verify the signature first: claims must not be trusted before this check
        if (!jwt.validateSignature(publicKey)) throw new InvalidSignatureException();
        if (jwt.isExpired()) throw new TokenExpiredException();
        if (jwt.getClaim("scope").contains("destructive")) {
            requireMfa(); // High-risk operations trigger multi-factor authentication
        }
        return true;
    } catch (JWTException e) {
        // Log the failure reason, never the raw token itself
        auditLog.logSecurityEvent("INVALID_TOKEN", e.getMessage());
        return false;
    }
}
The metadata defined by the **ToolAnnotations** interface (see code block) allows developers to give clients non-binding hints about tool behavior. These annotations affect the toolchain ecosystem in the following ways:
1. Increased Interaction Transparency
**title** provides semantic naming; **readOnlyHint/destructiveHint** indicates whether the operation is read-only or destructive; **openWorldHint** distinguishes between internal and external scopes (e.g., search engines vs. memory access). The frontend can dynamically render operation confirmation pop-ups or risk warning icons based on these annotations.
2. Optimized Call Strategy
**idempotentHint** allows clients to automatically retry idempotent requests (e.g., query operations).
3. Ensuring Ecosystem Compatibility
All annotations are intended only as behavioral hints, and clients must not use them as a substitute for security controls. For example:
if (tool.annotations.destructiveHint) {
  showDestructiveWarningDialog(); // Frontend prompt
}
await enforceRBACPolicy(); // Actual permissions verified by RBAC engine
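On the client side, missing hints need defaults before any UI decision can be made. The Python sketch below assumes the defaults given in the published schema (readOnlyHint false, destructiveHint true, idempotentHint false, openWorldHint true), so an unannotated tool is treated cautiously. `ToolHints`, `parse_hints`, `needs_confirmation`, and `may_auto_retry` are illustrative names, and the results drive presentation only, never authorization.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolHints:
    read_only: bool = False    # readOnlyHint
    destructive: bool = True   # destructiveHint
    idempotent: bool = False   # idempotentHint
    open_world: bool = True    # openWorldHint


def parse_hints(annotations: Optional[dict]) -> ToolHints:
    """Fill in the assumed defaults for any hint the server did not send."""
    a = annotations or {}
    return ToolHints(
        read_only=bool(a.get("readOnlyHint", False)),
        destructive=bool(a.get("destructiveHint", True)),
        idempotent=bool(a.get("idempotentHint", False)),
        open_world=bool(a.get("openWorldHint", True)),
    )


def needs_confirmation(hints: ToolHints) -> bool:
    """UI-only decision: confirm anything that may modify state destructively."""
    return not hints.read_only and hints.destructive


def may_auto_retry(hints: ToolHints) -> bool:
    """Retry automatically only when repeating the call cannot add side effects."""
    return hints.read_only or hints.idempotent
```

With these defaults, a tool that ships no annotations at all triggers a confirmation prompt and is never retried automatically.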
| Feature Item | 2024-11-05 Version | 2025-03-26 Version |
|---|---|---|
| Authorization Endpoint Discovery | Manual Configuration | Automatic Discovery + Fallback Mechanism |
| PKCE Support | Optional | Mandatory Enablement |
| Token Storage | Allows Memory Cache | Must Use Secure Storage |
| Error Handling | Basic HTTP Status Codes | Refined OAuth Error Codes (e.g., invalid_scope) |
Old code snippet:
// OAuth 2.0 Implicit Flow
const token = getTokenFromURLFragment();
callMCPService(token);
New version secure implementation:
// OAuth 2.1 PKCE Flow
const { verifier, challenge } = generatePKCE();
startAuthFlow(challenge);
// Callback Handling
function handleCallback(code) {
  fetchToken(code, verifier).then(token => {
    secureStorage.save('mcp_token', token);
    callMCPService(token);
  });
}
The HTTP+SSE dual-channel scheme adopted by the 2024-11-05 version has three structural flaws:
| Problem Type | Specific Manifestation | Technical Consequences |
|---|---|---|
| Complex Connection Management | Needs to maintain dual channels of POST request and SSE event stream | Clients need to implement dual connection keep-alive mechanisms |
| Difficult Disconnection Recovery | SSE stream interruptions require rebuilding a complete session | Long task scenarios may lose contextual data |
| Protocol Redundancies | Simple requests are forced to use streaming transmission | Extra 30% network resource consumption (based on MCP working group benchmark tests) |
Typical case: When the AI assistant simultaneously performs "speech-to-text + real-time translation", the old solution needs to establish 4 independent connections (2 tools × 2 protocols), leading to an average latency increase of 400ms on mobile.
The new protocol transforms the communication paradigm through three major innovations:

Key Technical Features
1. Intelligent Protocol Negotiation
2. Bidirectional Communication Tunnel
3. Breakpoint Resume Mechanism
Servers can choose to:
Network Efficiency Comparison Test
Data from the MCP official testing platform shows:
| Metric | Old Protocol (HTTP+SSE) | Streamable HTTP | Improvement Rate |
|---|---|---|---|
| Connection Establishment Time | 320ms ±50ms | 180ms ±20ms | 43.75% |
| Data Transmission Redundancy | 18% | 5% | 72.2% |
| Disconnection Recovery Success Rate | 68% | 93% | 36.8% |
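To make the single-endpoint model concrete, here is a minimal Python sketch of how a client might decode a reply. It assumes the client advertises both application/json and text/event-stream in its Accept header and that the server answers a POST with one or the other. The function names are illustrative, and the SSE parsing is simplified (it handles only id and data fields).

```python
import json

# Header value a client would send so the server may pick either response style
ACCEPT_HEADER = "application/json, text/event-stream"


def parse_sse(body):
    """Yield (event_id, data) pairs from a text/event-stream body."""
    event_id, data_lines = None, []
    for line in body.splitlines():
        if line == "":                      # a blank line terminates one event
            if data_lines:
                yield event_id, "\n".join(data_lines)
            event_id, data_lines = None, []
        elif line.startswith("id:"):
            event_id = line[3:].strip()
        elif line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):       # one optional space after the colon
                value = value[1:]
            data_lines.append(value)


def decode_response(content_type, body):
    """Return the JSON documents carried by either response style."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        return [json.loads(body)]
    if media_type == "text/event-stream":
        return [json.loads(data) for _, data in parse_sse(body)]
    raise ValueError(f"unexpected Content-Type: {content_type}")
```

The caller does not need to know in advance which style the server chose; a one-time response and a stream push both come back as a list of decoded messages.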
The new specification clearly states in section 4.2:
All MCP implementations must support receiving JSON-RPC 2.0 batches. For a batch that consists solely of notifications or responses, the server returns an HTTP 202 Accepted status code with no body once it has accepted the input.
Example of a valid request (the third entry has no id, so it is a notification):
[
  {"jsonrpc":"2.0","id":1,"method":"text_analyze","params":{"text":"Hello"}},
  {"jsonrpc":"2.0","id":2,"method":"image_tag","params":{"url":"img.jpg"}},
  {"jsonrpc":"2.0","method":"log_event"}
]
Response handling rules:
Assuming processing 100 independent requests:
| Metric | Single Request Mode | Batch Mode | Optimization Ratio |
|---|---|---|---|
| TCP Handshake Count | 100 | 1 | 99% |
| Total Header Size | ~150KB | ~2KB | 98.7% |
| Total Time (3G Network) | 12.3s | 1.8s | 85.4% |
// Parallel batch processing in Go
// Requires: import ("context"; "sync")
func HandleBatch(ctx context.Context, batch []RPCRequest) []RPCResponse {
	var wg sync.WaitGroup
	// Buffered to len(batch) so that no goroutine blocks on send
	resChan := make(chan RPCResponse, len(batch))
	for _, req := range batch {
		wg.Add(1)
		go func(r RPCRequest) {
			defer wg.Done()
			resChan <- processSingle(r)
		}(req)
	}
	wg.Wait()
	close(resChan)
	// Responses arrive in completion order; callers match them to requests by id
	responses := make([]RPCResponse, 0, len(batch))
	for res := range resChan {
		responses = append(responses, res)
	}
	return responses
}
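The Go snippet focuses on parallelism. The response rules themselves (no reply for notifications, and an empty 202 when nothing needs answering) can be sketched sequentially in Python. The `handlers` registry and the function name `handle_batch` are hypothetical and exist only for this illustration.

```python
def handle_batch(batch, handlers):
    """Return (http_status, body) for one JSON-RPC batch.

    handlers maps method names to callables taking the params dict; it is a
    hypothetical registry used only for this illustration.
    """
    responses = []
    for msg in batch:
        is_notification = "id" not in msg
        handler = handlers.get(msg.get("method"))
        if handler is None:
            if not is_notification:
                responses.append({
                    "jsonrpc": "2.0",
                    "id": msg["id"],
                    "error": {"code": -32601, "message": "Method not found"},
                })
            continue
        result = handler(msg.get("params", {}))
        if not is_notification:            # notifications never get a response
            responses.append({"jsonrpc": "2.0", "id": msg["id"], "result": result})
    if not responses:
        return 202, None                   # nothing to answer: accept with empty body
    return 200, responses
```

Applied to the three-entry request shown earlier, this returns two responses (ids 1 and 2) and silently executes the log_event notification.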
Points to consider:
tools:
  - name: database_backup
    annotations:
      # Standard behavior hints (following the ToolAnnotations interface definition)
      title: "Database Backup"   # Semantic title
      readOnlyHint: false        # Non-read-only operation
      destructiveHint: false     # Non-destructive operation
      idempotentHint: true       # Idempotent operation (repeated execution has no additional effect)
      openWorldHint: false       # Closed scope (limited to local database)

When destructiveHint: true is detected, the following actions occur:
Audit log example:
{
  "action": "data_purge",
  "user": "ai_agent_123",
  "riskLevel": "critical",
  "annotations": {"destructiveHint": true},
  "timestamp": "2025-03-27T08:15:30Z",
  "mfaUsed": true
}
Policy engine based on metadata:
def generate_policy(tool):
    policy = {
        "effect": "allow" if tool.requiredScopes else "deny",
        "conditions": []
    }
    if tool.annotations.get('destructiveHint'):
        policy['conditions'].append({
            "type": "mfa",
            "required": True
        })
    return policy
New message field supports structured status descriptions:
{
  "type": "ProgressNotification",
  "progress": 65,
  "message": {
    "phase": "Data Cleaning",
    "detail": "Processed 12000/20000 records",
    "next_step": "Feature extraction is about to begin"
  }
}
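A client can turn such a notification into a single status line. The Python sketch below assumes the notification params carry progress, an optional total, and an optional message. The published schema describes message as a plain string, so the structured object shown above is flattened here as a convenience. The function name `render_progress` is illustrative.

```python
def render_progress(params):
    """Format one progress notification's params as a single status line."""
    progress = params["progress"]
    total = params.get("total")
    message = params.get("message")
    if total:
        head = f"{progress / total:.0%}"
    else:
        head = str(progress)                # total unknown: show the raw counter
    if isinstance(message, dict):           # structured form: join the values in order
        message = " | ".join(str(v) for v in message.values())
    return f"{head} {message}" if message else head
```

When total is absent, the raw counter is shown unchanged, since a percentage cannot be derived from it.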
Application value:
New audio/* content type support:
POST /voice-process
Content-Type: audio/webm
Transfer-Encoding: chunked

<Binary audio stream>
Key technical features:
| Function | Parameters |
|---|---|
| Encoding Format | WebM/MP3/WAV |
| Streaming | Supports chunked uploads and real-time transcription |
| Metadata Binding | Parameters such as sampling rate passed via X-Audio-Metadata header |
Example scenario: an intelligent customer service system can receive user voice streams and respond with text in real time.
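Inside MCP messages themselves, audio travels as a content item rather than as a raw HTTP body. The Python sketch below assumes the item shape mirrors image content: a type of "audio", base64-encoded data, and a mimeType. The helper name `make_audio_content` is illustrative.

```python
import base64


def make_audio_content(raw, mime_type="audio/wav"):
    """Wrap raw audio bytes as a base64-encoded content item."""
    if not mime_type.startswith("audio/"):
        raise ValueError(f"not an audio MIME type: {mime_type}")
    return {
        "type": "audio",
        "data": base64.b64encode(raw).decode("ascii"),
        "mimeType": mime_type,
    }
```

The resulting dict can be placed alongside text and image items in a tool result or prompt message.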
1. The client discovers the server's declaration of the completions capability.
2. A completion request is triggered as the user types:
GET /completions?prefix=dat
Response: ["date_format", "data_source", "dataset"]
3. The client dynamically generates a list of parameter suggestions.
Design advantages:
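The three-step completion flow above can be sketched on the server side as a simple prefix match in Python. The result fields (values, total, hasMore) and the cap of 100 entries are modeled on the completion result shape in the specification, and should be read as assumptions. The function name `complete` and the candidate list are illustrative.

```python
def complete(prefix, candidates, limit=100):
    """Return candidates that start with prefix, capped at limit entries."""
    matches = [c for c in candidates if c.startswith(prefix)]
    return {
        "values": matches[:limit],
        "total": len(matches),
        "hasMore": len(matches) > limit,
    }
```

Called with the prefix "dat" over a list containing date_format, data_source, dataset, and an unrelated name, it returns the three matching suggestions in their original order.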
Core identification:
Mcp-Session-Id: sess_XYZ123 (UUIDv7 format)
Disconnection recovery process:
1. The client caches the last received Event-ID (e.g., 159).
2. When reconnecting, the client sends:
Last-Event-ID: 159
Mcp-Session-Id: sess_XYZ123
3. The server can either resume from the breakpoint or return incremental updates.
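The client half of this recovery process amounts to remembering two values and replaying them as headers. The following minimal Python sketch shows that bookkeeping; the class name `StreamCursor` and its methods are illustrative and do not come from any SDK.

```python
class StreamCursor:
    """Tracks what a client needs in order to resume an interrupted stream."""

    def __init__(self, session_id):
        self.session_id = session_id
        self.last_event_id = None

    def observe(self, event_id):
        """Record the id of each event as it is received."""
        if event_id is not None:
            self.last_event_id = event_id

    def reconnect_headers(self):
        """Build the headers for the reconnecting request."""
        headers = {
            "Accept": "text/event-stream",
            "Mcp-Session-Id": self.session_id,
        }
        if self.last_event_id is not None:
            headers["Last-Event-ID"] = str(self.last_event_id)
        return headers
```

Before any event has been seen, Last-Event-ID is omitted, so the first connection and a reconnection share the same code path.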
Technical adaptation challenges
Experience upgrade opportunities
Architectural transformation requirements
| Transformation Item | Implementation Cost | Benefit Level |
|---|---|---|
| Session State Management | High | ★★★★☆ |
| Streamable HTTP Gateway (e.g., Higress) | Low | ★★★★★ |
| Batch Atomic Transactions | Medium | ★★★☆☆ |
Key upgrades in SDK:
# New Generation SDK Pseudocode Example
class MCPClient:
    def __init__(self):
        self.session = ResilientSession()        # Automatic reconnection and checkpoint resuming
        self.annotator = ToolAnnotationParser()  # Metadata parsing engine
        self.auditor = SecurityAuditHook()       # Security audit hook

    def call_tool(self, tool_name):
        if self.annotator.risk_level(tool_name) == 'critical':
            self.auditor.log_operation(tool_name)  # Automatically trigger auditing
Toolchain upgrades lead to:
Higress was among the first to support the Streamable HTTP transport and continues to prioritize alignment with the features of MCP 2025-03-26, such as session management via the Mcp-Session-Id header, support for batch requests, responses, and notifications, and SSE stream recoverability.
See API is MCP | Higress Releases MCP Marketplace, Accelerating Legacy API into the MCP Era
On the commercial product side, the cloud-native API gateway will later align with the capabilities of open-source Higress and provide the full set of enterprise-level MCP features. We welcome your inquiries and attention.