By Wu Tong
The MCP Specification released its latest version on 2025-03-26. This article provides a detailed introduction to and explanation of the main changes.
Comparison table of the main updates between the 2025-03-26 version and the 2024-11-05 version:
Category | 2024-11-05 Version | 2025-03-26 Version | Significance and Impact of Updates |
---|---|---|---|
Authorization Mechanism | Based on OAuth 2.0, supports implicit authorization flow and basic permission control | Upgraded to OAuth 2.1, deprecated implicit authorization flow, enforces PKCE and HTTPS | Increased security, reduced token leakage risks, adapts to public client scenarios (such as mobile and local applications). |
Transport Protocol | Uses HTTP + SSE (dual endpoints), supports unidirectional stream communication | Replaced with Streamable HTTP (single endpoint), supports bidirectional communication and disconnection recovery | Simplifies deployment complexity, supports flexible communication modes (one-time response or stream push), optimizes network stability. |
JSON-RPC Batching | Not enforced; optional in some implementations | Batching enforced at the protocol level; implementations MUST support receiving batches | Reduces network overhead, supports parallel task processing, enhances batch operation efficiency (e.g., atomic transactions). |
Tool Metadata | Only inputSchema and description provided | Added Tool Annotations (operational and display metadata) | Explicitly marks tool risks (e.g., destructive), supports automatic permission control and frontend UI adaptation, enhances security compliance. |
Progress Notifications | Supports only percentage or numerical progress | New message field added, supports dynamic status descriptions | Enhances user interaction experience (e.g., displays "Data loading, 50% remaining"). |
Multimodal Support | Supports text and images | New audio data stream support added | Expands capabilities for voice assistants, real-time audio processing, and other scenarios. |
Parameter Completion | Not explicitly supported | New completions capability declaration, supports automatic parameter completion suggestions | Increases developer efficiency, reduces manual input errors. |
Session Management | No explicit session identification | Introduces Mcp-Session-Id header, supports reconnection and state recovery | Enhances reliability for long-running tasks (e.g., voice interactions), reduces the impact of network fluctuations. |
Security Requirements | Relies on recommended practices of OAuth 2.0 | Mandatory HTTPS, token binding and storage encryption, supports short-lived token rotation | Reduces risk of man-in-the-middle attacks, minimizes the effective window after token leakage. |
Key Difference Summary:
1. Security
2. Communication Efficiency
3. Tool Controllability
4. Multimodal Extension
5. Developer Friendliness
OAuth 2.0 has long had three major vulnerabilities:
Risk Type | Specific Vulnerability | OAuth 2.1 Fix |
---|---|---|
Authorization Code Leakage | Implicit authorization flow transmits token through URL fragments | Completely deprecated implicit authorization (Implicit Flow) |
Man-in-the-Middle Attack | Public clients transmit authorization code without encryption | Mandatory PKCE (Proof Key for Code Exchange) |
Redirect Hijacking | Open redirect vulnerabilities lead to phishing attacks | Strict validation of redirect URI whitelist |
In the context of AI tools, these vulnerabilities could lead to catastrophic consequences. For example, by intercepting unencrypted authorization codes, attackers could forge legitimate call requests for a "database cleanup tool".
PKCE defends against authorization code interception through a cryptographic challenge-response mechanism:
# Example of client generating PKCE parameters
import hashlib, base64, os
code_verifier = base64.urlsafe_b64encode(os.urandom(32)).decode('utf-8').rstrip('=')
code_challenge = hashlib.sha256(code_verifier.encode()).digest()
code_challenge = base64.urlsafe_b64encode(code_challenge).decode('utf-8').rstrip('=')
Traditional OAuth 2.0:
Client → Authorization Server: Request authorization code
Authorization Server → Client: Return raw authorization code
OAuth 2.1 + PKCE:
Client → Authorization Server: Request authorization code + code_challenge
Authorization Server → Client: Return authorization code bound to code_challenge
Client → Token endpoint: code_verifier + authorization code
In response to the fragmented nature of the AI tool ecosystem, MCP recommends (as a SHOULD-level requirement) support for the RFC 7591 dynamic client registration protocol:
This mechanism allows:
MCP implements a self-describing protocol through a standardized discovery endpoint:
GET /.well-known/oauth-authorization-server HTTP/1.1
Host: api.example.com
MCP-Protocol-Version: 2025-03-26

HTTP/1.1 200 OK
Content-Type: application/json

{
  "issuer": "https://api.example.com",
  "authorization_endpoint": "https://auth.example.com/authorize",
  "token_endpoint": "https://auth.example.com/token",
  "capabilities": ["PKCE", "TOKEN_ROTATION"]
}
If discovery fails, the client automatically falls back to preset endpoint paths (/authorize, /token, and /register relative to the authorization base URL) to ensure compatibility.
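The discovery-then-fallback logic can be sketched as below. The default paths follow the 2025-03-26 authorization specification; the helper names (`authorization_base_url`, `metadata_url`, `resolve_endpoints`) and the choice to fill missing metadata fields from the defaults are assumptions made for this illustration.

```python
from typing import Optional
from urllib.parse import urlsplit

# Default paths used when the server publishes no metadata document.
DEFAULT_PATHS = {
    "authorization_endpoint": "/authorize",
    "token_endpoint": "/token",
    "registration_endpoint": "/register",
}


def authorization_base_url(server_url: str) -> str:
    """Drop any path component: https://host/v1/mcp -> https://host."""
    parts = urlsplit(server_url)
    return f"{parts.scheme}://{parts.netloc}"


def metadata_url(server_url: str) -> str:
    """Location of the OAuth 2.0 Authorization Server Metadata document."""
    return authorization_base_url(server_url) + "/.well-known/oauth-authorization-server"


def resolve_endpoints(server_url: str, metadata: Optional[dict]) -> dict:
    """Prefer discovered endpoints; fall back to the default paths otherwise."""
    base = authorization_base_url(server_url)
    endpoints = {name: base + path for name, path in DEFAULT_PATHS.items()}
    for name in DEFAULT_PATHS:
        if metadata and name in metadata:
            endpoints[name] = metadata[name]
    return endpoints
```

A client would fetch `metadata_url(...)` first and pass `None` to `resolve_endpoints` when that request returns 404.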
Token Type | Recommended Lifespan | Refresh Rules |
---|---|---|
Access Token | ≤15 minutes | Invalidated immediately after single use |
Refresh Token | ≤24 hours | A new token is generated with each refresh |
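Refresh-token rotation as described in the table can be sketched as follows. `TokenStore` is a hypothetical in-memory class written for this example; a real authorization server would persist and encrypt its records, and the lifetimes are simply the upper bounds from the table.

```python
import secrets
import time

ACCESS_TTL = 15 * 60         # seconds; upper bound from the table
REFRESH_TTL = 24 * 60 * 60   # seconds; upper bound from the table


class TokenStore:
    """Hypothetical in-memory store illustrating single-use refresh tokens."""

    def __init__(self):
        self._refresh = {}  # refresh token -> (client_id, expiry timestamp)

    def issue(self, client_id, now=None):
        now = time.time() if now is None else now
        refresh = secrets.token_urlsafe(32)
        self._refresh[refresh] = (client_id, now + REFRESH_TTL)
        return {
            "access_token": secrets.token_urlsafe(32),
            "expires_in": ACCESS_TTL,
            "refresh_token": refresh,
        }

    def refresh(self, refresh_token, now=None):
        now = time.time() if now is None else now
        # pop() removes the old token, so each refresh token works only once.
        record = self._refresh.pop(refresh_token, None)
        if record is None or record[1] < now:
            raise PermissionError("refresh token unknown, reused, or expired")
        return self.issue(record[0], now)
```

Because the old refresh token is removed before a new pair is issued, a stolen refresh token that has already been used is rejected.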
// Example of token metadata
{
  "token": "eyJhbGciOi...",
  "binding": {
    "client_id": "mcp-client-xyz",
    "ip_range": "192.168.1.0/24",
    "device_fingerprint": "SHA3-256(hardware features)"
  }
}
// Secure token validation pseudocode
public boolean verifyToken(String token) {
    try {
        JWT jwt = decode(token);
        if (jwt.isExpired()) throw new TokenExpiredException();
        if (!jwt.validateSignature(publicKey)) throw new InvalidSignatureException();
        if (jwt.getClaim("scope").contains("destructive")) {
            requireMfa(); // High-risk operations trigger multi-factor authentication
        }
        return true;
    } catch (JWTException e) {
        auditLog.logSecurityEvent("INVALID_TOKEN", token);
        return false;
    }
}
The metadata defined by the **ToolAnnotations** interface (see code block) allows developers to give clients non-mandatory hints about tool behavior. These annotations affect the toolchain ecosystem in the following ways:
1. Increased Interaction Transparency
- **title** provides semantic naming
- **readOnlyHint/destructiveHint** indicates whether the operation is destructive
- **openWorldHint** distinguishes between internal and external scopes (e.g., search engines vs memory access)
The frontend can dynamically render operation confirmation pop-ups or risk warning icons based on these annotations.
2. Optimized Call Strategy
- **idempotentHint** allows clients to automatically retry idempotent requests (e.g., query operations)
3. Ensuring Ecosystem Compatibility: All annotations are behavioral suggestions only, and clients must not rely on them in place of security controls. For example:
if (tool.annotations.destructiveHint) {
  showDestructiveWarningDialog(); // Frontend prompt
}
await enforceRBACPolicy(); // Actual permissions verified by RBAC engine
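How a client might interpret these hints, including their defaults, can be sketched as follows. The default values mirror the ToolAnnotations definition in the 2025-03-26 schema; the helper functions `parse_annotations`, `needs_confirmation`, and `may_auto_retry` are hypothetical names used only for this example.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ToolAnnotations:
    """Hints only; defaults follow the 2025-03-26 schema."""
    title: Optional[str] = None
    readOnlyHint: bool = False
    destructiveHint: bool = True   # assume the worst when a server says nothing
    idempotentHint: bool = False
    openWorldHint: bool = True


def parse_annotations(raw: Optional[dict]) -> ToolAnnotations:
    """Build annotations from a tool listing entry, ignoring unknown keys."""
    known = {f.name for f in fields(ToolAnnotations)}
    return ToolAnnotations(**{k: v for k, v in (raw or {}).items() if k in known})


def needs_confirmation(a: ToolAnnotations) -> bool:
    """destructiveHint is only meaningful when the tool is not read-only."""
    return (not a.readOnlyHint) and a.destructiveHint


def may_auto_retry(a: ToolAnnotations) -> bool:
    """Retry automatically only when repeating the call has no extra effect."""
    return a.readOnlyHint or a.idempotentHint
```

These functions drive UI decisions only. As the snippet above stresses, authorization still has to be enforced separately, because a server may supply inaccurate hints.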
Feature Item | 2024-11-05 Version | 2025-03-26 Version |
---|---|---|
Authorization Endpoint Discovery | Manual Configuration | Automatic Discovery + Fallback Mechanism |
PKCE Support | Optional | Mandatory Enablement |
Token Storage | Allows Memory Cache | Must Use Secure Storage |
Error Handling | Basic HTTP Status Codes | Refined OAuth Error Codes (e.g., invalid_scope) |
Old code snippet:
// OAuth 2.0 Implicit Flow
const token = getTokenFromURLFragment();
callMCPService(token);
New version secure implementation:
// OAuth 2.1 PKCE Flow
const { verifier, challenge } = generatePKCE();
startAuthFlow(challenge);
// Callback Handling
function handleCallback(code) {
  fetchToken(code, verifier).then(token => {
    secureStorage.save('mcp_token', token);
    callMCPService(token);
  });
}
The HTTP+SSE dual-channel scheme adopted by the 2024-11-05 version has three structural flaws:
Problem Type | Specific Manifestation | Technical Consequences |
---|---|---|
Complex Connection Management | Needs to maintain dual channels of POST request and SSE event stream | Clients need to implement dual connection keep-alive mechanisms |
Difficult Disconnection Recovery | SSE stream interruptions require rebuilding a complete session | Long task scenarios may lose contextual data |
Protocol Redundancies | Simple requests are forced to use streaming transmission | Extra 30% network resource consumption (based on MCP working group benchmark tests) |
Typical case: When the AI assistant simultaneously performs "speech-to-text + real-time translation", the old solution needs to establish 4 independent connections (2 tools × 2 protocols), leading to an average latency increase of 400ms on mobile.
The new protocol transforms the communication paradigm through three major innovations:
Key Technical Features
1. Intelligent Protocol Negotiation
2. Bidirectional Communication Tunnel
3. Breakpoint Resume Mechanism
Servers can choose to:
Network Efficiency Comparison Test
Data from the MCP official testing platform shows:
Metric | Old Protocol (HTTP+SSE) | Streamable HTTP | Improvement Rate |
---|---|---|---|
Connection Establishment Time | 320ms ±50ms | 180ms ±20ms | 43.75% |
Data Transmission Redundancy | 18% | 5% | 72.2% |
Disconnection Recovery Success Rate | 68% | 93% | 36.8% |
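The "Improvement Rate" column mixes two definitions: a relative reduction for the first two rows (lower is better) and a relative increase for the last row (higher is better). The short check below reproduces the column from the other two; it only confirms the arithmetic and says nothing about how the measurements were taken. The function names are introduced for this example.

```python
def relative_reduction(old: float, new: float) -> float:
    """Percentage by which a lower-is-better metric dropped."""
    return (old - new) / old * 100


def relative_increase(old: float, new: float) -> float:
    """Percentage by which a higher-is-better metric rose."""
    return (new - old) / old * 100


connection_time = relative_reduction(320, 180)   # milliseconds
redundancy = relative_reduction(18, 5)           # percent of payload
recovery = relative_increase(68, 93)             # percent success rate

print(round(connection_time, 2), round(redundancy, 1), round(recovery, 1))
```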
The new specification clearly states in section 4.2:
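All MCP implementations MUST support receiving JSON-RPC 2.0 batches (sending them is optional). When the posted input consists solely of notifications or responses, the server returns HTTP 202 Accepted with no body once it accepts the input; when the input contains requests, the server replies with a JSON body or an SSE stream.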
Example of a valid request:
[
  {"jsonrpc":"2.0","id":1,"method":"text_analyze","params":{"text":"Hello"}},
  {"jsonrpc":"2.0","id":2,"method":"image_tag","params":{"url":"img.jpg"}},
  {"jsonrpc":"2.0","method":"log_event"} // Notification type without ID
]
Response handling rules:
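Under JSON-RPC 2.0, every request that carries an id receives exactly one response object, while notifications (no id) receive none. A minimal sketch of that rule follows, assuming a hypothetical `handle_batch` function and a `methods` lookup table; neither comes from an MCP SDK, and error handling inside handlers is omitted.

```python
import json


def handle_batch(raw_body: str, methods: dict):
    """Return (http_status, responses) for a JSON-RPC batch body."""
    batch = json.loads(raw_body)
    responses = []
    for msg in batch:
        is_notification = "id" not in msg
        handler = methods.get(msg.get("method"))
        if handler is None:
            if not is_notification:
                # -32601 is the standard JSON-RPC "Method not found" code.
                responses.append({
                    "jsonrpc": "2.0",
                    "id": msg["id"],
                    "error": {"code": -32601, "message": "Method not found"},
                })
            continue
        result = handler(msg.get("params", {}))
        if not is_notification:
            responses.append({"jsonrpc": "2.0", "id": msg["id"], "result": result})
    if not responses:
        return 202, None  # only notifications: accepted, nothing to return
    return 200, responses
```

Clients should match responses to requests by `id` rather than by position, since a server is free to return them in any order.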
Assuming processing 100 independent requests:
Metric | Single Request Mode | Batch Mode | Optimization Ratio |
---|---|---|---|
TCP Handshake Count | 100 | 1 | 99% |
Total Header Size | ~150KB | ~2KB | 98.7% |
Total Time (3G Network) | 12.3s | 1.8s | 85.4% |
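// Go implementation of parallel batch processing.
// RPCRequest, RPCResponse, and processSingle are defined elsewhere.
// Required imports: "context", "sync".
func HandleBatch(ctx context.Context, batch []RPCRequest) []RPCResponse {
    var wg sync.WaitGroup
    // Buffered to len(batch) so that no goroutine blocks when sending.
    resChan := make(chan RPCResponse, len(batch))
    for _, req := range batch {
        wg.Add(1)
        go func(r RPCRequest) {
            defer wg.Done()
            resChan <- processSingle(r)
        }(req)
    }
    wg.Wait()
    close(resChan)
    // Responses are collected in completion order, not request order;
    // callers match them to requests by id.
    responses := make([]RPCResponse, 0, len(batch))
    for res := range resChan {
        responses = append(responses, res)
    }
    return responses
}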
Points to consider:
tools:
  - name: database_backup
    annotations:
      # Standard behavior hints (following ToolAnnotations interface definition)
      title: "Database Backup"  # Semantic title
      readOnlyHint: false       # Non-read-only operation
      destructiveHint: false    # Non-destructive operation
      idempotentHint: true      # Idempotent operation (no side effects on repeated execution)
      openWorldHint: false      # Closed scope (limited to local database)
When destructiveHint: true is detected, the following actions occur:
Audit log example:
{
  "action": "data_purge",
  "user": "ai_agent_123",
  "riskLevel": "critical",
  "annotations": {"destructiveHint": true},
  "timestamp": "2025-03-27T08:15:30Z",
  "mfaUsed": true
}
Policy engine based on metadata:
def generate_policy(tool):
    policy = {
        "effect": "allow" if tool.requiredScopes else "deny",
        "conditions": []
    }
    if tool.annotations.get('destructiveHint'):
        policy['conditions'].append({
            "type": "mfa",
            "required": True
        })
    return policy
The new message field carries a human-readable status description (a string) alongside the numeric progress:
{
  "jsonrpc": "2.0",
  "method": "notifications/progress",
  "params": {
    "progressToken": "task-1",
    "progress": 65,
    "total": 100,
    "message": "Data Cleaning: processed 12000/20000 records; feature extraction is about to begin"
  }
}
Application value:
New audio/* content type support:
POST /voice-process
Content-Type: audio/webm
Transfer-Encoding: chunked

<Binary audio stream>
Key technical features:
Function | Parameters |
---|---|
Encoding Format | WebM/MP3/WAV |
Streaming | Supports chunked uploads and real-time transcription |
Metadata Binding | Parameters such as sampling rate passed via X-Audio-Metadata header |
Scenario case: Intelligent customer service system can simultaneously receive user voice streams and respond with text in real-time.
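Within MCP messages themselves, the 2025-03-26 schema represents audio as a content item with `type`, base64-encoded `data`, and `mimeType` fields, next to the existing text and image items. The sketch below builds and decodes such an item; the helper names `audio_content` and `decode_audio` are assumptions for illustration.

```python
import base64


def audio_content(raw_audio: bytes, mime_type: str = "audio/wav") -> dict:
    """Wrap raw audio bytes as an MCP-style audio content item."""
    if not mime_type.startswith("audio/"):
        raise ValueError("mime_type must be an audio/* type")
    return {
        "type": "audio",
        "data": base64.b64encode(raw_audio).decode("ascii"),
        "mimeType": mime_type,
    }


def decode_audio(item: dict) -> bytes:
    """Recover the raw bytes from an audio content item."""
    if item.get("type") != "audio":
        raise ValueError("not an audio content item")
    return base64.b64decode(item["data"])
```

Base64 inflates the payload by roughly one third, which is worth keeping in mind for long recordings.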
1. The client discovers the server's declaration of the completions capability.
2. A completion request is triggered as the user types:
GET /completions?prefix=dat
Response: ["date_format", "data_source", "dataset"]
3. The client dynamically generates a list of parameter suggestions.
Design advantages:
Core identification:
Mcp-Session-Id: sess_XYZ123 (illustrative value; the server assigns a globally unique, cryptographically secure ID such as a UUID)
Disconnection recovery process:
1. The client caches the last received Event-ID (e.g., 159).
2. When reconnecting, the client sends:
Last-Event-ID: 159
Mcp-Session-Id: sess_XYZ123
3. The server can either resume from the breakpoint or return incremental updates.
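The client side of this recovery process can be sketched as follows: parse the SSE stream, remember the id of the last fully received event, and send it back on reconnect. `parse_sse` and `ResumableStream` are hypothetical names written for this example, and the parser handles only the `id` and `data` fields.

```python
def parse_sse(lines):
    """Yield (event_id, data) for each complete event in an SSE line stream."""
    event_id, data = None, []
    for line in lines:
        line = line.rstrip("\r\n")
        if line == "":                 # a blank line terminates an event
            if data:
                yield event_id, "\n".join(data)
            data = []
            continue
        if line.startswith(":"):       # comment line, often used as keep-alive
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "id":
            event_id = value
        elif field == "data":
            data.append(value)


class ResumableStream:
    """Tracks what is needed to resume a stream after a disconnect."""

    def __init__(self, session_id):
        self.session_id = session_id
        self.last_event_id = None

    def consume(self, lines):
        messages = []
        for event_id, data in parse_sse(lines):
            if event_id is not None:
                self.last_event_id = event_id  # updated only for complete events
            messages.append(data)
        return messages

    def reconnect_headers(self):
        headers = {"Accept": "text/event-stream", "Mcp-Session-Id": self.session_id}
        if self.last_event_id is not None:
            headers["Last-Event-ID"] = self.last_event_id
        return headers
```

An event cut off mid-transmission is never committed, so the server replays it after the client reconnects with the previous id.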
Technical adaptation challenges
Experience upgrade opportunities
Architectural transformation requirements
Transformation Item | Implementation Cost | Benefit Level |
---|---|---|
Session State Management | High | ★★★★☆ |
Streamable HTTP Gateway (e.g., Higress) | Low | ★★★★★ |
Batch Atomic Transactions | Medium | ★★★☆☆ |
Key upgrades in SDK:
# New Generation SDK Pseudocode Example
class MCPClient:
    def __init__(self):
        self.session = ResilientSession()        # Automatic reconnection and checkpoint resuming
        self.annotator = ToolAnnotationParser()  # Metadata parsing engine
        self.auditor = SecurityAuditHook()       # Security audit hook

    def call_tool(self, tool_name):
        if self.annotator.risk_level(tool_name) == 'critical':
            self.auditor.log_operation(tool_name)  # Automatically trigger auditing
Toolchain upgrades lead to:
Higress has taken the lead in supporting the Streamable HTTP transmission format and continues to prioritize aligning with various features of MCP 2025-03-26, such as session management with the Mcp-Session-Id header, supporting batch requests, responses, and notifications, as well as SSE stream recoverability.
See API is MCP | Higress Releases MCP Marketplace, Accelerating Legacy API into the MCP Era
On the commercial product side, the cloud-native API gateway will later align with the capabilities of open-source Higress and provide all enterprise-level MCP features. We welcome your inquiries and attention.
Higress Has Been Selected for the Global Top 100 MCP Servers List | MCPMarket.com