AI Guardrails: Configure check items

Last Updated: Mar 31, 2026

Use the Guardrails console to control which content risks are detected for your AI applications. Enable or disable detection services, configure blocklists and allowlists, and fine-tune risk tag detection scopes to match your application's needs.

Prerequisites

Before you begin, ensure that you have:

Configure detection policies

  1. Log on to the Guardrails console.

  2. In the navigation pane on the left, choose Protection Configuration > Configuration.

    Two policies are available:

    Policy                            Identifier
    AI input content moderation       query_security_check_intl
    AI-generated content moderation   response_security_check_intl

  3. Enable or disable detection services as needed:

    • Sensitive content detection

    • Prompt injection detection

    To review the rule details for a service before enabling it, click Management in the Actions column of the relevant policy, such as AI input content moderation (query_security_check_intl).

    Note: Enabling Sensitive content detection or Prompt injection detection triggers a billing notification. These services are billed separately. For details, see Activation and billing overview.

  4. Configure vocabularies: Select a vocabulary to add to the blocklist or allowlist. For details, see Vocabulary management.

  5. Manage risk tags: Enable or disable each risk tag in the Guardrails console. For certain risk tags, configure a more specific detection scope. The following steps use the AI input content moderation (query_security_check_intl) policy as an example:

    1. On the Rule Management tab, click Management in the Actions column.

    2. Select the detection type to configure, such as Undesirable Content.

    3. Click Edit to enter edit mode, then modify the detection status.

    4. Click Save.

    Note: Changes take effect in about 2 to 5 minutes.
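
The two policy identifiers and the propagation delay noted above can be captured in application code. The following is a minimal Python sketch and not part of any Guardrails SDK: the names POLICY_IDS, policy_for, and safe_to_verify_after are hypothetical helpers introduced only for illustration. It assumes your application distinguishes user input from model output and that you want to wait for the upper bound of the stated 2 to 5 minute window before verifying a saved change.

```python
from datetime import datetime, timedelta, timezone

# Policy identifiers as listed in the policy table in step 2.
POLICY_IDS = {
    "input": "query_security_check_intl",      # AI input content moderation
    "output": "response_security_check_intl",  # AI-generated content moderation
}

# The console note states that changes take effect in about 2 to 5 minutes.
# This sketch uses the upper bound as a conservative wait (an assumption).
PROPAGATION_MAX = timedelta(minutes=5)


def policy_for(direction: str) -> str:
    """Return the policy identifier for 'input' (prompts) or 'output' (responses)."""
    try:
        return POLICY_IDS[direction]
    except KeyError:
        raise ValueError(
            f"direction must be 'input' or 'output', got {direction!r}"
        ) from None


def safe_to_verify_after(saved_at: datetime) -> datetime:
    """Return the time after which a saved console change can be assumed live."""
    return saved_at + PROPAGATION_MAX


if __name__ == "__main__":
    saved = datetime(2026, 3, 31, 12, 0, tzinfo=timezone.utc)
    print(policy_for("input"))
    print(safe_to_verify_after(saved).isoformat())
```

Keeping the identifiers in one mapping avoids scattering the strings through the codebase, and the wait helper makes the propagation delay explicit when you test a configuration change.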

What's next