Platform for AI: Monitoring and logging

Last Updated: Mar 10, 2026

Monitor PAI resources using CloudMonitor, track configuration changes with Cloud Config, and audit operations with ActionTrail.

Alibaba Cloud health status

Track the health status of Alibaba Cloud resources to handle exceptions quickly. Visit Alibaba Cloud Health Status to monitor service status.

Check service health status by region and subscribe to RSS feeds for exception notifications.

CloudMonitor

CloudMonitor Basic provides free real-time monitoring for PAI resources, including operational status, ECS resource usage, website performance, and business disruptions.

Enable CloudMonitor Basic for PAI to use monitoring capabilities. See Cloud service monitoring.

Enable alerts for critical metrics

Enable alerts for multiple critical PAI metrics at once to set up alerting quickly and gain visibility into resource usage and business operations. See Enable the initiative alert feature.

Configure custom alerts

Create a custom dashboard to manage all metrics on a single platform. See Manage the monitoring charts of a custom dashboard.

Configure alert rules for each metric to receive notifications through phone calls, text messages, emails, DingTalk chatbots, or the Alibaba Cloud app.

Create an alert blacklist to block alerts for specific metrics. See Manage blacklist policies.
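The interaction between alert rules and a blacklist can be sketched in a few lines of Python. This is an illustrative model only and does not call CloudMonitor: the `AlertRule` class, the `alerts_to_send` function, and the metric names are hypothetical, and the simple "value above threshold" comparison is an assumption rather than the way CloudMonitor evaluates rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    """Hypothetical alert rule: one metric, one threshold, notification channels."""
    metric: str
    threshold: float
    channels: tuple


def alerts_to_send(samples, rules, blacklist):
    """Return (metric, value, channels) for each breached rule.

    Metrics listed in `blacklist` are skipped even when they exceed
    their threshold, mirroring the idea of an alert blacklist.
    """
    fired = []
    for rule in rules:
        value = samples.get(rule.metric)
        if value is None or rule.metric in blacklist:
            continue
        if value > rule.threshold:
            fired.append((rule.metric, value, rule.channels))
    return fired


# Hypothetical metric names and values, for illustration only.
rules = [
    AlertRule("gpu_utilization", 90.0, ("email",)),
    AlertRule("memory_utilization", 80.0, ("dingtalk",)),
]
samples = {"gpu_utilization": 95.0, "memory_utilization": 85.0}
blacklist = {"memory_utilization"}

fired = alerts_to_send(samples, rules, blacklist)
```

In this sketch both metrics exceed their thresholds, but only the GPU alert is returned because the memory metric is blacklisted.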

Cloud Config

Cloud Config monitors configuration changes of all cloud resources and ensures continuous compliance of your cloud infrastructure.

Track resource configuration changes

Cloud Config audits operations of your Alibaba Cloud account and RAM users. Configuration changes are recorded every 10 minutes by default.

Enable compliance pre-check for MLPS 2.0

Cloud Config uses rules aligned with the MLPS 2.0 baseline to evaluate resource configuration compliance. Enable compliance pre-check in a few clicks; the system then checks resources for compliance continuously. Download the pre-check report and submit it to an inspection agency.

Query and analyze audit data

Send historical configuration changes and non-compliant events to a Simple Log Service Logstore to query and analyze audit data centrally. See Deliver resource data to a Logstore in Simple Log Service.

ActionTrail

Enable ActionTrail for PAI to monitor and record Alibaba Cloud account operations centrally, including console logon and cloud resource access. Perform security analysis, intrusion detection, resource change tracking, and compliance auditing based on these records.

ActionTrail generates logs for cloud service access through the Alibaba Cloud Management Console, API operations, and developer tools. For details about audit events, see Audit events of ECS.

ActionTrail tracks and retains events for 90 days by default. To retain events longer, create a trail that sends events to a Simple Log Service Logstore or OSS bucket. See Getting Started.
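To decide whether an event is still covered by the default retention period or only available through a trail, a check like the following may help. It is a minimal sketch: the function name is hypothetical, and how ActionTrail counts the 90-day boundary is an assumption here.

```python
from datetime import datetime, timedelta, timezone

# Default retention period stated in this topic.
DEFAULT_RETENTION = timedelta(days=90)


def is_within_default_retention(event_time, now=None):
    """Return True if event_time lies in the 90-day window ending at `now`.

    Both datetimes must be timezone-aware. The exact boundary handling
    is an assumption, not documented ActionTrail behavior.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    age = now - event_time
    return timedelta(0) <= age <= DEFAULT_RETENTION


# Example with fixed dates so the result is reproducible.
now = datetime(2026, 3, 10, tzinfo=timezone.utc)
recent = datetime(2026, 1, 1, tzinfo=timezone.utc)   # 68 days old
old = datetime(2025, 12, 1, tzinfo=timezone.utc)     # 99 days old
```

Events older than the window, such as `old` above, would need to come from a Logstore or OSS bucket that a trail delivered them to.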

After creating a trail to send events to a Simple Log Service Logstore or OSS bucket, query or analyze events in the Simple Log Service or OSS console. See Query events in the Simple Log Service or OSS console.
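Delivered events can also be analyzed offline once they are exported. The sketch below filters events by service. It assumes the events have already been downloaded and converted to one JSON object per line, and that each event carries `serviceName`, `eventName`, and `eventTime` fields; verify both the file layout and the field names against your own delivered data. The sample records are made up for illustration.

```python
import json


def filter_events(lines, service_name):
    """Yield parsed events whose (assumed) serviceName field matches."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        if event.get("serviceName") == service_name:
            yield event


# Made-up sample records; real events contain many more fields.
sample_lines = [
    '{"serviceName": "Pai", "eventName": "CreateJob", "eventTime": "2026-03-01T08:00:00Z"}',
    '{"serviceName": "Ecs", "eventName": "StopInstance", "eventTime": "2026-03-01T09:00:00Z"}',
    '',
    '{"serviceName": "Pai", "eventName": "DeleteJob", "eventTime": "2026-03-02T10:00:00Z"}',
]

pai_events = list(filter_events(sample_lines, "Pai"))
pai_event_names = [event["eventName"] for event in pai_events]
```

For large volumes, querying in the Simple Log Service console is likely the better fit; a local filter like this suits spot checks on exported files.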

To trace a historical event, submit a ticket to request the required permissions.

Workspace notification

PAI provides a notification mechanism for workspaces. Create notification rules to monitor DLC job and pipeline job status, or trigger events based on model version approval status. Receive notifications through DingTalk, phone calls, and emails. See Workspace notification.

TensorBoard

Create a TensorBoard instance in Machine Learning Designer or for a DLC job to view visualized analytical reports on model training. See the following topics: