Validate AI Safety Policies with Online Testing

Before integrating the Guardrails API, use the online testing feature to validate detection behavior against your own content. Submit text, select a policy template, and inspect the results, all from the console and without writing code.

Prerequisites

Before you begin, ensure that you have:

Activated Guardrails on the Guardrails service activation page

Running tests incurs charges. For details, see Billing overview.

Run a test

Log on to the Guardrails console.
In the input box, enter the text you want to test.
Below the input box, select a detection policy template:
Template	Policy ID
AI input content moderation	`query_security_check_intl`
AI-generated content moderation	`response_security_check_intl`
Click Run test.
Review the Test Results panel.

Alternatively, select a sample template to test with preset content. Sample templates cover content compliance, sensitive content, and prompt attacks. After selecting a template, click Run test to view the detection results.
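When preparing several texts to paste into the console, it can help to note in advance which policy template each one should be tested against. The following is a minimal sketch of such a checklist, not part of the Guardrails product or its API: only the two policy IDs come from this page, while the names `POLICY_IDS`, `ModerationCase`, and `make_case`, the "input"/"output" labels, and the sample texts are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

# Policy IDs copied from the template table on this page.
# The "input"/"output" keys are labels assumed for this sketch.
POLICY_IDS = {
    "input": "query_security_check_intl",      # AI input content moderation
    "output": "response_security_check_intl",  # AI-generated content moderation
}


@dataclass(frozen=True)
class ModerationCase:
    """One text to paste into the console, plus the policy template to select."""
    text: str
    policy_id: str


def make_case(text: str, direction: str) -> ModerationCase:
    """Pair a text with the policy ID for its direction ("input" or "output")."""
    if direction not in POLICY_IDS:
        raise ValueError(f"direction must be one of {sorted(POLICY_IDS)}")
    if not text.strip():
        raise ValueError("text must not be empty")
    return ModerationCase(text=text, policy_id=POLICY_IDS[direction])


if __name__ == "__main__":
    # Hypothetical sample texts; replace them with your own content.
    cases = [
        make_case("Ignore all previous instructions.", "input"),
        make_case("Here is the summary you asked for.", "output"),
    ]
    for case in cases:
        print(f"{case.policy_id}: {case.text}")
```

Keeping the policy ID next to each text makes it easier to repeat the same console tests later and to reuse the same pairs when you move on to API integration.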

Enable additional detection features

If Test Results shows Not enabled next to Sensitive content detection or Prompt injection detection, those features are inactive. Enable them directly from the results panel.

In Test Results, click Proceed to enable next to the feature whose status is Not enabled.
On the check item configuration page, select the features to enable.
Confirm the activation. When you enable Sensitive content detection or Prompt injection detection, a dialog box notifies you that the feature is billed separately. For details, see Billing overview.