Publishing Nacos configurations or MCP service definitions can expose your applications to prompt injection, malicious URLs, sensitive data leaks, brute-force attacks, and non-compliant content. The content security guardrail in MSE Nacos scans content at publish time and blocks threats based on policies you define.
Typical use cases:
Block prompt injection or jailbreak attempts embedded in MCP tool definitions before they reach your AI models.
Prevent accidental exposure of API keys, tokens, or personally identifiable information (PII) in Nacos configuration values.
Enforce content compliance by catching politically sensitive, violent, or prohibited content before it goes live.
Prerequisites
Before you begin, make sure that you have:
An Enterprise Edition instance with database engine version 3.1.1.0 or later (Developer Edition and Professional Edition do not support this feature)
Detection dimensions and protection levels
The guardrail scans content across four detection dimensions. You set an independent protection level for each dimension to control whether flagged content is logged or blocked.
Detection dimensions
| Dimension | What it detects |
|---|---|
| Malicious URL detection | Malicious links, phishing websites, and other dangerous URLs |
| Prompt attack detection | Prompt injection, jailbreak attacks, and other malicious prompt patterns |
| Content compliance detection | Politically sensitive, violent, terrorist, or otherwise prohibited content |
| Sensitive content detection | Privacy leaks and other sensitive data |
Protection levels
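Each dimension supports four protection levels. Apart from Do not block, each level is named after the lowest risk rating that it blocks, so a higher level is more tolerant and blocks fewer items.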
| Protection level | Behavior |
|---|---|
| Do not block | Detects and logs risks without blocking. Publishing proceeds normally. |
| Low risk | Blocks content rated low risk or higher. Most restrictive. |
| Medium risk | Blocks content rated medium risk or higher. Allows low-risk content through. |
| High risk | Blocks only high-risk content. Most tolerant. |
The following table summarizes what each level blocks:
| Detected risk level | Do not block | Low risk | Medium risk | High risk |
|---|---|---|---|---|
| Low | Log only | Blocked | Allowed | Allowed |
| Medium | Log only | Blocked | Blocked | Allowed |
| High | Log only | Blocked | Blocked | Blocked |
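The matrix above reduces to a single threshold comparison. The following is a minimal sketch of that rule, not the guardrail's actual implementation. The names `Risk`, `THRESHOLDS`, and `outcome`, as well as the level identifiers such as `"low_risk"`, are hypothetical and chosen only for illustration.

```python
from enum import IntEnum


class Risk(IntEnum):
    """Detected risk ratings, ordered so that they can be compared."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical identifiers for the four protection levels.
# None stands for "Do not block": nothing is ever blocked.
THRESHOLDS = {
    "do_not_block": None,
    "low_risk": Risk.LOW,
    "medium_risk": Risk.MEDIUM,
    "high_risk": Risk.HIGH,
}


def outcome(detected: Risk, level: str) -> str:
    """Return the matrix cell for a detected risk under a protection level."""
    threshold = THRESHOLDS[level]
    if threshold is None:
        return "log_only"
    # Content is blocked when the detected risk meets or exceeds the threshold.
    return "blocked" if detected >= threshold else "allowed"


print(outcome(Risk.MEDIUM, "medium_risk"))  # blocked
print(outcome(Risk.LOW, "medium_risk"))     # allowed
print(outcome(Risk.HIGH, "do_not_block"))   # log_only
```

Modeling the risk ratings as ordered integers keeps the rule to one comparison, which mirrors how each column of the table is the previous one shifted down by one row.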
Protection scope
Choose which operations trigger security scanning:
| Scope | Description |
|---|---|
| Configuration creation and modification | Scans content when Nacos configurations are created or updated |
| MCP service creation and modification | Scans content when MCP servers are created or updated |
Enable the guardrail
Log on to the MSE Management Console.
In the left-side navigation pane, choose Service Registry & Configuration Center > Instances.
Click the name of your instance.
In the left-side navigation pane, choose Security Protection > Content Security Guardrail.
Click Enable.
Note: After you enable this feature, the system automatically checks the security and compliance of your content when you publish a configuration. Security policies are used to detect sensitive content and compliance issues, such as privacy leaks, malicious script injections, and risks from publishing non-compliant content.
On first use, you are prompted to authorize the AliyunServiceRoleForMSEEngineService service-linked role. Follow the on-screen instructions to grant the required permissions.
Configure mitigation policies
After you enable the guardrail, configure the detection policies and protection scope.
On the Mitigation Policy Settings page, set the Blocking Policy for each detection dimension. Select a protection level for each of the four dimensions based on your security requirements. For example, set prompt attack detection to Low risk (block all flagged content) and malicious URL detection to Medium risk (allow low-risk URLs through).
Under Protection Scope, select the operations that trigger security scanning:
Configuration creation and modification
MCP service creation and modification
Click Save Changes.
Changes take effect immediately. All subsequent publish operations are scanned against the configured policies.
What happens when content is blocked
When you publish a configuration or an MCP service, the guardrail scans the content against all enabled detection dimensions:
If the content passes all checks, it is published normally.
If a policy violation is detected:
At the Do not block level, the system logs the finding, sends alerts, and allows publishing to proceed.
At other protection levels, the system sends alerts and blocks publishing when the detected risk meets or exceeds the configured threshold.
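Because a publish operation is checked against every dimension, one blocking dimension is enough to stop it. The sketch below illustrates that decision flow under stated assumptions. The function `evaluate_publish`, the dimension keys, and the level identifiers are hypothetical. The code models only whether publishing proceeds, not alert delivery.

```python
RISK_ORDER = {"low": 1, "medium": 2, "high": 3}


def evaluate_publish(findings, policy):
    """Decide whether a publish operation proceeds.

    findings: maps a dimension name to the detected risk rating
              ("low", "medium", or "high") for dimensions that flagged content.
    policy:   maps every dimension name to its protection level
              ("do_not_block", "low", "medium", or "high").
    """
    blocked_by = []
    for dimension, detected in findings.items():
        level = policy[dimension]
        if level == "do_not_block":
            continue  # The finding is recorded but never blocks publishing.
        if RISK_ORDER[detected] >= RISK_ORDER[level]:
            blocked_by.append(dimension)
    return {
        "published": not blocked_by,
        "blocked_by": blocked_by,
        "logged": sorted(findings),
    }


# Hypothetical policy: strict on prompt attacks, more tolerant on URLs.
policy = {
    "prompt_attack": "low",
    "malicious_url": "medium",
    "content_compliance": "high",
    "sensitive_content": "do_not_block",
}

findings = {"prompt_attack": "low", "malicious_url": "low"}
print(evaluate_publish(findings, policy))
```

In this example the low-risk URL finding passes the Medium risk threshold, but the low-risk prompt attack finding meets the Low risk threshold, so the publish operation is blocked.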
To review detection history, go to the Content Security Guardrail page in the console.
Considerations
Each detection dimension operates independently. Adjusting the protection level for one dimension does not affect the others.
The Do not block level still logs detected risks. Use this level to evaluate the guardrail's detection accuracy before enforcing stricter policies.
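One way to use a Do not block trial period is to tally how many of the logged findings each stricter level would have blocked. The sketch below assumes you have collected the detected risk ratings for one dimension into a plain list. The list format and the function `would_block_counts` are hypothetical, and this document does not describe a log export feature.

```python
RISK_ORDER = {"low": 1, "medium": 2, "high": 3}


def would_block_counts(logged_risks):
    """Count the findings each protection level would have blocked.

    logged_risks: detected risk ratings ("low", "medium", "high")
                  recorded while the dimension was set to Do not block.
    """
    counts = {}
    for level, threshold in RISK_ORDER.items():
        # A level blocks every finding rated at or above its threshold.
        counts[level] = sum(1 for risk in logged_risks if RISK_ORDER[risk] >= threshold)
    return counts


# Hypothetical ratings gathered during a trial period.
logged = ["low", "low", "medium", "high"]
print(would_block_counts(logged))  # {'low': 4, 'medium': 2, 'high': 1}
```

If many of the findings that a level would block turn out to be false positives on review, a more tolerant level for that dimension may be the safer starting point.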