Microservices Engine: Enable the content security guardrail

Last Updated: Mar 11, 2026

Publishing Nacos configurations or MCP service definitions can expose your applications to prompt injection, malicious URLs, sensitive data leaks, and non-compliant content such as violent material. The content security guardrail in MSE Nacos scans content at publish time and blocks threats based on policies you define.

Typical use cases:

  • Block prompt injection or jailbreak attempts embedded in MCP tool definitions before they reach your AI models.

  • Prevent accidental exposure of API keys, tokens, or personally identifiable information (PII) in Nacos configuration values.

  • Enforce content compliance by catching politically sensitive, violent, or prohibited content before it goes live.

Prerequisites

Before you begin, make sure that you have:

  • An MSE Nacos instance

  • Enterprise Edition with database engine version 3.1.1.0 or later (Developer Edition and Professional Edition do not support this feature)

Detection dimensions and protection levels

The guardrail scans content across four detection dimensions. You set an independent protection level for each dimension to control whether flagged content is logged or blocked.

Detection dimensions

| Dimension | What it detects |
| --- | --- |
| Malicious URL detection | Malicious links, phishing websites, and other dangerous URLs |
| Prompt attack detection | Prompt injection, jailbreak attacks, and other malicious prompt patterns |
| Content compliance detection | Politically sensitive, violent, terrorist, or otherwise prohibited content |
| Sensitive content detection | Privacy leaks and other sensitive data |

Protection levels

Each dimension supports four protection levels. Higher tolerance means fewer items are blocked.

| Protection level | Behavior |
| --- | --- |
| Do not block | Detects and logs risks without blocking. Publishing proceeds normally. |
| Low risk | Blocks content rated low risk or higher. Most restrictive. |
| Medium risk | Blocks content rated medium risk or higher. Allows low-risk content through. |
| High risk | Blocks only high-risk content. Most tolerant. |

The following table summarizes what each level blocks:

| Detected risk level | Do not block | Low risk | Medium risk | High risk |
| --- | --- | --- | --- | --- |
| Low | Log only | Blocked | Allowed | Allowed |
| Medium | Log only | Blocked | Blocked | Allowed |
| High | Log only | Blocked | Blocked | Blocked |
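
The table above reduces to a threshold comparison: content is blocked when its detected risk is at or above the configured level, and never blocked at Do not block. The following Python sketch only models that rule for illustration. The names `RISK_ORDER` and `is_blocked` and the lowercase level strings are hypothetical, not part of any MSE or Nacos API.

```python
# Illustrative model of the blocking table; not an MSE or Nacos API.
RISK_ORDER = {"low": 1, "medium": 2, "high": 3}

def is_blocked(detected_risk: str, protection_level: str) -> bool:
    """Return True if content with the detected risk is blocked at the given level."""
    if protection_level == "do_not_block":
        return False  # findings are logged only, publishing proceeds
    return RISK_ORDER[detected_risk] >= RISK_ORDER[protection_level]

print(is_blocked("medium", "high"))  # → False (High risk blocks only high-risk content)
print(is_blocked("medium", "low"))   # → True  (Low risk blocks low risk or higher)
```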

Protection scope

Choose which operations trigger security scanning:

| Scope | Description |
| --- | --- |
| Configuration creation and modification | Scans content when Nacos configurations are created or updated |
| MCP service creation and modification | Scans content when MCP servers are created or updated |

Enable the guardrail

  1. Log on to the MSE Management Console.

  2. In the left-side navigation pane, choose Service Registry & Configuration Center > Instances.

  3. Click the name of your instance.

  4. In the left-side navigation pane, choose Security Protection > Content Security Guardrail.

  5. Click Enable.

    Note
    • After you enable this feature, the system automatically checks the security and compliance of your content at publish time. The security policies detect issues such as privacy leaks, malicious script injections, and non-compliant content.

    • On first use, you are prompted to authorize the AliyunServiceRoleForMSEEngineService service-linked role. Follow the on-screen instructions to grant the required permissions.

Configure mitigation policies

After you enable the guardrail, configure the detection policies and protection scope.

  1. On the Mitigation Policy Settings page, set the Blocking Policy for each detection dimension. Select a protection level for each of the four dimensions based on your security requirements. For example, set prompt attack detection to Low risk (block all flagged content) and malicious URL detection to Medium risk (allow low-risk URLs through).

  2. Under Protection Scope, select the operations that trigger security scanning:

    • Configuration creation and modification

    • MCP service creation and modification

  3. Click Save Changes.

Important

Changes take effect immediately. All subsequent publish operations are scanned against the configured policies.
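
Because changes take effect immediately, it can help to write down the intended settings and sanity-check them before you click Save Changes. The sketch below is a planning aid under stated assumptions: the dictionary layout, the key names, and `validate_policy` are hypothetical and do not correspond to console fields or API parameters. The levels for content compliance and sensitive content are arbitrary example values.

```python
# Hypothetical worksheet format; these keys are not console fields or API parameters.
DIMENSIONS = ("malicious_url", "prompt_attack", "content_compliance", "sensitive_content")
LEVELS = ("do_not_block", "low", "medium", "high")
SCOPES = ("config", "mcp_service")

def validate_policy(policy: dict) -> list:
    """Return a list of problems found in a planned policy (empty if none)."""
    problems = []
    for dim in DIMENSIONS:
        level = policy.get("levels", {}).get(dim)
        if level not in LEVELS:
            problems.append(f"{dim}: missing or unknown level {level!r}")
    for scope in policy.get("scope", []):
        if scope not in SCOPES:
            problems.append(f"unknown scope {scope!r}")
    if not policy.get("scope"):
        problems.append("no protection scope selected")
    return problems

planned = {
    "levels": {
        "prompt_attack": "low",             # block all flagged content
        "malicious_url": "medium",          # allow low-risk URLs through
        "content_compliance": "medium",     # example value only
        "sensitive_content": "do_not_block" # example value only
    },
    "scope": ["config", "mcp_service"],
}
print(validate_policy(planned))  # → []
```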

What happens when content is blocked

When you publish a configuration or an MCP service, the guardrail scans the content against all enabled detection dimensions:

  1. If the content passes all checks, it is published normally.

  2. If a policy violation is detected:

    • At the Do not block level, the system logs the finding, sends alerts, and allows publishing to proceed.

    • At other protection levels, the system sends alerts and blocks publishing when the detected risk meets or exceeds the configured threshold.

To review detection history, go to the Content Security Guardrail page in the console.
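
The flow in this section can be modeled as follows. This is a minimal sketch, assuming each dimension reports at most one detected risk per publish and that operations outside the selected protection scope are not scanned. The function name, the data shapes, and the lowercase identifiers are hypothetical and are not an MSE API.

```python
# Illustrative model only; names and data shapes are assumptions, not an MSE API.
RISK_ORDER = {"low": 1, "medium": 2, "high": 3}

def evaluate_publish(operation, findings, levels, scope):
    """Return (allowed, blocking_dimensions) for one publish operation.

    operation: "config" or "mcp_service"
    findings:  {dimension: detected risk} for dimensions that flagged the content
    levels:    {dimension: protection level}
    scope:     operations selected under Protection Scope
    """
    if operation not in scope:
        return True, []  # outside the protection scope, so not scanned
    blocking = [
        dim for dim, risk in findings.items()
        if levels[dim] != "do_not_block"
        and RISK_ORDER[risk] >= RISK_ORDER[levels[dim]]
    ]
    return not blocking, blocking

levels = {"prompt_attack": "low", "malicious_url": "medium"}
findings = {"prompt_attack": "low", "malicious_url": "low"}
print(evaluate_publish("mcp_service", findings, levels, {"config", "mcp_service"}))
# → (False, ['prompt_attack'])
```

In this example the low-risk URL finding is below the Medium risk threshold, so only the prompt attack finding blocks the publish.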

Considerations

  • Each detection dimension operates independently. Adjusting the protection level for one dimension does not affect the others.

  • The Do not block level still logs detected risks. Use this level to evaluate the guardrail's detection accuracy before enforcing stricter policies.
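
One way to use an observation period at the Do not block level is to tally how many logged findings each stricter level would have blocked. The sketch below assumes you have already collected the detected risk levels for one dimension into a list. How you obtain them from the console is not covered here, and `blocked_counts` and the sample data are hypothetical.

```python
# Illustrative tally over made-up data; not an MSE API or log format.
from collections import Counter

RISK_ORDER = {"low": 1, "medium": 2, "high": 3}

def blocked_counts(logged_risks):
    """Count how many logged findings each enforcing level would have blocked."""
    counts = Counter()
    for risk in logged_risks:
        for level in RISK_ORDER:
            if RISK_ORDER[risk] >= RISK_ORDER[level]:
                counts[level] += 1
    return {level: counts[level] for level in RISK_ORDER}

# Made-up sample of detected risk levels for one dimension.
sample = ["low", "low", "medium", "high", "low"]
print(blocked_counts(sample))  # → {'low': 5, 'medium': 2, 'high': 1}
```

If most low-risk findings in your sample turn out to be false positives, that suggests starting at Medium risk rather than Low risk for that dimension.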