Multimedia analysis provides algorithm-based services to analyze multimedia content. These services include foundation model services and advanced model services, which offer out-of-the-box algorithm capabilities. This topic describes the billing details and usage instructions for multimedia analysis.
Background information
Multimedia analysis supports the following algorithm services:
Foundation model services: These services provide out-of-the-box algorithm capabilities for images. They include model services such as multi-label image tagging, image quality assessment, facial attribute analysis (such as attractiveness, face shape, hairstyle, and hair color), age analysis, figure modification (slimming or plus-size), and watermark removal.
Advanced model services: These services provide out-of-the-box algorithm capabilities for videos. They include model services such as video classification and tagging, video quality assessment, dynamic classification and tagging for posts with images and videos (used for tagging multimodal content such as dynamic posts and threads), and AI-generated image tagging. The tags improve the training of AI image generation models.
Billing details
Multimedia analysis supports two billing methods: pay-as-you-go and subscription resource plans. For more information, see Billing details for multimedia analysis.
Usage guide
Activate multimedia analysis and purchase a resource plan
First-time users must activate the service in the Multimedia Analysis section, which is under Solutions on the Platform for AI (PAI) page. The procedure is as follows.
Log on to the PAI console.
Follow the instructions in the figure to activate the Multimedia Analysis service.
The pay-as-you-go billing method is used by default. You are billed based on the number of calls.

You can also purchase a resource plan with a one-time payment for a lower price.
On the Basic Model Service tab of the Multimedia Analytics page, click Purchase Resource Plan.
On the Subscription Model Service page, configure the Quantity, Scenarios, API Calls and Duration parameters, and then click Buy Now.
To use multimedia analysis services, set the Scenarios parameter to Multimedia Analysis-Basic Model Service or Multimedia Analysis-Advanced Model Service. Configure the other parameters based on your business requirements.
Python SDK instructions
After activating the multimedia analysis service, you can use the Python software development kit (SDK) to call various algorithm services. For more information, see Multimedia analysis: Python SDK instructions.
Java SDK instructions
After activating the multimedia analysis service, refer to the Java SDK GitHub for details about using the Java SDK to call API operations for algorithm services. The parameters for the Java SDK are almost identical to those for the Python SDK. For parameter details, see Multimedia analysis: Python SDK instructions.
Multimedia analysis capabilities matrix
Specification | Model service name | Consumption per service call | Description | Example |
Foundation model service | Image quality assessment | 1 foundation model service call | Provides image quality assessment and returns a floating-point score from 0 to 100. |
|
Facial attribute analysis | 1 foundation model service call |
|
| |
Age analysis | 1 foundation model service call |
| Age ranges include the following: | |
Multi-label image tagging | 1 foundation model service call | Provides multi-label image tagging. It can output the top K tags with the highest probabilities and their corresponding high-dimensional features. | Examples of frequent tags: girl, selfie, boy, daily life, screenshot, food, car, cuisine, game, cartoon, animal, Korean fashion. | |
Figure modification | 1 foundation model service call | Provides a figure modification feature. You can upload a portrait and adjust the figure by changing the degree parameter. This includes making the figure slimmer or larger. A | The API operation returns the Base64 encoding of the modified image. | |
Watermark removal | 1 foundation model service call | Removes watermarks from an image. | The API operation returns the Base64 encoding of the image after watermark removal. | |
AI-generated image tagging | 1 foundation model service call | Provides multi-label image tagging capabilities for training AI image generation models, such as Stable Diffusion. Better tags improve the quality of the generated images. |
| |
Custom model service | N foundation model service calls. The value of N varies based on the complexity of the custom model. | Provides custom model services for images and videos. | Depends on the specific type of custom model. | |
Advanced model service | Dynamic classification and tagging for posts with images and videos | 1 advanced model service call | Provides classification and tagging for dynamic posts or threads that contain multimodal content. Supports classification and tagging using text and image or text and video combinations. Also supports returning high-dimensional feature embeddings. |
|
Video quality assessment | 1 advanced model service call | Provides short video quality assessment and returns a floating-point quality score from 0 to 100. |
| |
Video classification and tagging | 1 advanced model service call | Provides short video classification and tagging. Returns the video class and the top K tags with the highest probabilities. Also supports outputting high-dimensional video features. |
|
Testing and service
For further testing and support, contact us by submitting a ticket for technical support.