The AI Gateway for Model Service is an intelligent middleware service that centrally manages and optimizes AI model calls across your enterprise. Its core features include smart routing, load balancing, identity authentication, traffic shaping, and cache acceleration. The smart routing feature automatically assigns each request to the optimal model. The service also provides real-time monitoring and data analytics.
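To make the feature list above concrete, the following is a minimal sketch of the request flow such a gateway performs. The class, model names, load model, and cache policy are all illustrative assumptions, not the service's actual implementation.

```python
import hashlib

class GatewaySketch:
    """Hypothetical illustration of load balancing plus cache acceleration."""

    def __init__(self, backends):
        # backends: list of (model_name, current_load) pairs (assumed shape)
        self.backends = dict(backends)
        self.cache = {}

    def route(self, prompt):
        # Cache acceleration: a repeated prompt is served from the cache
        # without touching any backend.
        key = hashlib.sha256(prompt.encode()).hexdigest()
        if key in self.cache:
            return self.cache[key]
        # Load balancing: pick the backend with the lowest current load.
        model = min(self.backends, key=self.backends.get)
        self.backends[model] += 1
        answer = f"{model} answered"  # stand-in for the real model call
        self.cache[key] = answer
        return answer
```

A real gateway would also apply authentication and traffic shaping before routing; this sketch only shows why repeated requests and uneven backend load are handled centrally.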
Prerequisites
You have connected AI Application Observability to Cloud Monitor 2.0.
View the service list
Log on to the Cloud Monitor 2.0 console, and select a workspace. In the left navigation pane, choose .
On the AI Observability page, choose .
On the AI Gateway Service page, you can view the service lists for AI Gateway and AI API, together with their basic information:
AI Gateway fields include the following: Instance Name, AI Gateway ID, Region, Specifications, Status, Virtual Network ID, Virtual Switch ID, Number of Replicas, and Resource Group ID.
AI API fields include the following: API Name, API ID, Region, Model Provider, API Protocol, API Base Path, and Gateway ID.
Filter the service list data.
On the AI Gateway and AI API tabs, you can click the filter icon in a column to filter by that field. In the upper-right corner of the service list, you can filter the data by selecting a time range, such as 1 minute, 5 minutes, 15 minutes, 1 hour, or 1 day, or by specifying a custom time range.
Click the target instance name or API name to access the visualization dashboard for the AI Gateway or AI API and view metrics such as queries per second (QPS), request success rate, token consumption, and response time.
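The dashboard metrics named in this step are aggregates over raw request records. The sketch below shows one plausible way they are derived; the record fields (`status`, `tokens`, `latency_ms`) are assumptions for illustration, not the console's actual schema.

```python
def summarize(requests, window_seconds):
    """Aggregate hypothetical request records into dashboard-style metrics."""
    total = len(requests)
    ok = sum(1 for r in requests if r["status"] == 200)
    tokens = sum(r["tokens"] for r in requests)
    avg_ms = sum(r["latency_ms"] for r in requests) / total if total else 0.0
    return {
        # QPS: requests completed per second over the selected time range.
        "qps": total / window_seconds,
        # Success rate: share of requests that returned HTTP 200.
        "success_rate": ok / total if total else 0.0,
        # Token consumption: total tokens billed across all requests.
        "tokens": tokens,
        # Response time: mean latency in milliseconds.
        "avg_latency_ms": avg_ms,
    }
```

For example, two requests in a 2-second window, one of which failed, yield a QPS of 1.0 and a success rate of 0.5.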