By Chengtan
Recently, the AI community has been a bit too lively.
On February 11, Zhipu released GLM-5: an MoE architecture with 744B total parameters and only 44B activated, a 202K context window, and open-source SOTA in coding and agent capabilities, officially positioned as "the domestic open-source alternative to Opus 4.6 and GPT-5.3".
A day later, MiniMax released M2.5: 80.2% on SWE-Bench Verified and first place on Multi-SWE-Bench at 51.3%, at roughly 1/10 the cost of Opus: running at 100 TPS continuously for an hour costs only 1 dollar. Officially it is billed as "the first frontier model that can be used without worrying about usage costs".
Domestic large models reaching this level should be good news for users. In reality, though, I feel a bit anxious.
I wanted to try these new models in OpenClaw, but it gave me an error:
Error: Unknown model: zai/glm-5
After digging in, I found that the default models for each provider in OpenClaw are essentially hard-coded. Someone in the community has already filed a related issue, but the maintainers are swamped with issues and PRs, and support has been a long time coming.
The problem is:
● GLM-5 is not supported
● MiniMax M2.5 is not supported
● If Qwen/DeepSeek releases a new model, it is likely still not supported
New models cannot be enabled through configuration after release; you have to wait for an official version upgrade. That is the pain point.
And how fast is model iteration now? MiniMax shipped three versions, M2, M2.1, and M2.5, within 108 days, averaging more than one per month. Zhipu, DeepSeek, and Qwen move at a similar pace. If you wait for OpenClaw to catch up officially each time, the window has already closed.
In contrast, Higress's design philosophy is completely different: model configuration is decoupled from the gateway, new models do not require upgrades, and hot updates take effect immediately.
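To make the decoupling concrete, here is an illustrative sketch of what a gateway-side provider configuration could look like. The field names are a simplified approximation, not Higress's exact schema, and the token is a placeholder:

```yaml
# Illustrative gateway-side provider config (simplified, not the exact schema).
# The gateway hot-reloads this file; OpenClaw itself never needs an upgrade.
provider:
  type: zhipuai        # upstream vendor
  apiTokens:
    - "sk-xxx"         # placeholder credential
  modelMapping:
    "glm-5": "glm-5"   # exposing a new model is one added line
```

Because the client only ever sees the gateway, adding a vendor or a model is purely a gateway-side config change.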
Through Higress's OpenClaw Integration Skill, the entire access process only requires saying one sentence to OpenClaw:
Download and install this skill for me: https://higress.cn/skills/higress-openclaw-integration.zip
Then use it to help me configure the Higress AI Gateway
OpenClaw will then complete the setup automatically.
Once the configuration is complete, you can use whatever model you want:
# Use GLM-5
model: "higress/glm-5"
# Or MiniMax M2.5
model: "higress/minimax-m25"
# Or use automatic routing (intelligently selects based on the message content)
model: "higress/auto"
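Since AI gateways like Higress typically expose an OpenAI-compatible API, any standard client can talk to it once configured. The sketch below just assembles a chat-completion payload by hand; the endpoint URL is a hypothetical local deployment, not a verified value:

```python
import json

# Hypothetical gateway endpoint; adjust to your Higress deployment.
GATEWAY_URL = "http://localhost:8080/v1/chat/completions"

def build_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload for the gateway.

    Switching models is a one-string change on the client side;
    keys, vendors, and routing all live in the gateway config.
    """
    return {
        "model": model,  # e.g. "higress/glm-5" or "higress/auto"
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_request("higress/glm-5", "Review this function for data races.")
print(json.dumps(payload, indent=2))
```

POSTing this payload to `GATEWAY_URL` with your usual HTTP client is all the integration a consumer needs.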
Assuming DeepSeek releases V4 next week or Qwen launches QwQ-Max-2, you only need to say:
Add DeepSeek’s API Key for me: sk-xxx
Or:
Help me switch the default model to deepseek-v4
No need to restart OpenClaw or the gateway, no need to upgrade any component; the configuration hot-loads and takes effect immediately.
This is the core value of Higress as an AI gateway: it turns model access into a conversation problem rather than a development problem.
GLM-5 uses an MoE architecture, activating only 44B of its 744B total parameters at a time, combined with a DeepSeek-style sparse attention mechanism, which cuts deployment costs significantly while preserving capability. Officially, GLM-5 is described as strong at "complex system engineering and long-horizon agents", with real-world coding performance approaching Claude Opus 4.5.
M2.5 is positioned as "born for real-world productivity". In programming scenarios it "thinks and builds like an architect": before writing code, it actively breaks down the features, structure, and UI design. It supports 10+ languages, including Go, C, C++, TypeScript, Rust, and Python, and covers all major platforms: Web, Android, iOS, Windows, and Mac.
Most important is the cost: the output price of the 50 TPS version is 1/10 to 1/20 that of Opus, Gemini 3 Pro, or GPT-5. By the official math, $10,000 keeps 4 agents working continuously for a year.
These two models have different orientations—GLM-5 has strong architecture capabilities, while M2.5 offers high cost-performance. Higress's automatic routing can intelligently schedule based on task types:
Help me configure automatic routing rules:
- Use glm-5 for "deep thinking", "complex problems", and "architecture design"
- Use minimax-m25-lite for "simple", "fast", and "translation" tasks
- Use minimax-m25 for everyday coding tasks (cost-effective and strong)
In use, you simply specify higress/auto, and the system automatically picks the most suitable model for inference based on the message content.
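Conceptually, this kind of auto-routing can be pictured as keyword-driven dispatch. The sketch below is a simplified illustration of the idea only, not Higress's actual routing implementation; the keyword lists mirror the rules above:

```python
# Simplified keyword router illustrating the "higress/auto" idea.
# A real gateway uses richer signals than plain keyword matching.
ROUTES = [
    (("deep thinking", "complex", "architecture"), "glm-5"),
    (("simple", "fast", "translation"), "minimax-m25-lite"),
]
DEFAULT_MODEL = "minimax-m25"  # everyday coding tasks

def route(message: str) -> str:
    """Pick a model name based on keywords in the user message."""
    text = message.lower()
    for keywords, model in ROUTES:
        if any(keyword in text for keyword in keywords):
            return model
    return DEFAULT_MODEL
```

For example, a request mentioning "architecture design" routes to glm-5, while an ordinary bug-fix request falls through to the cost-effective default.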
| Comparison Item | OpenClaw Native | OpenClaw + Higress |
|---|---|---|
| New model support | Wait for a version upgrade | One sentence of dialogue |
| Model switching | Edit config and restart | A single chat message |
| Vendor management | Hard-coded | Added via dialogue, hot-updated |
| Maintenance cost | Wait for official updates | Self-controlled, immediate |
Competition among domestic large models keeps intensifying, and new models arrive endlessly. Tying model access to release cycles is inherently an anti-pattern. Higress's design philosophy is to let the architecture of AI applications keep pace with the evolution of AI models.
If you are also an OpenClaw user stuck on model support, give the Higress OpenClaw Integration Skill a try; it may solve exactly this problem.
💡 Friendly Reminder: If your current model's capabilities are weak and cannot automatically complete the configuration, you can refer to the Skill documentation from the link above and follow the steps to manually configure the Higress AI Gateway.
P.S. As I write this, MiniMax M2.5 has been out for just one day, and I am already using it through Higress. Waiting for OpenClaw's official support? By then the next new model may already be out.