This topic describes the billing items for Intelligent Media Management (IMM) and related details.
Pricing of Billing Items for Alibaba Cloud International Website
Intelligent Media Management (IMM) includes the following categories of billing items: image intelligence, metadata management, media processing, document processing, and file processing.
Starting at 11:00 UTC+8 on July 28, 2025, IMM will charge for some previously free capabilities and adjust prices for certain existing billing items. For more information, see the IMM Pricing Adjustment Notice.
Image Intelligence
For detailed pricing of image intelligence billing items, see the following table.
Billing Item | Billing Item Description | Related API Operations | Related x-oss-process Operations | Price (USD) | Unit |
ImageDetect | Face detection | | | 0.028 | per 1,000 calls |
ImageDetect | Body detection | DetectImageBodies (body detection) | image/bodies | 0.028 | per 1,000 calls |
ImageDetect | Vehicle detection | DetectImageCars | image/cars | 0.028 | per 1,000 calls |
ImageLabel | Image tagging | DetectImageLabels (image label detection) | image/labels | 0.142 | per 1,000 calls |
ImageFace | Face images | CreateFacesSearchingTask (create a similar-face search task) | | 0.028 | per 1,000 calls |
ImageFaceClustering | Face clustering | | | 7.0754717 | per 1,000 calls |
GenerateStory | Story generation | CreateStory (create a story) | | 7.0754717 | per 1,000 calls |
ImageMosaic | Image mosaic | AddImageMosaic (add an image mosaic) | | 0.0074 | per 1,000 calls |
ImageCropping | Smart image cropping suggestions | DetectImageCropping (detect visually optimal cropping regions) | image/crop,g_auto | 0.1415094 | per 1,000 calls |
ImageQRCodes | QR code detection in images | DetectImageCodes (QR code detection) | image/codes | 0.1132075 | per 1,000 calls |
ImageSplicing | Image stitching | CreateImageSplicingTask (create an image stitching task) | | | per 1,000 calls |
ImageToPDF | Image to PDF conversion | CreateImageToPDFTask (create an image-to-PDF task) | | 0.0074 | per 1,000 images |
ImageScoring | Image quality scoring | DetectImageScore (image quality scoring) | image/scoring | 0.0424528 | per 1,000 calls |
LocationDateClustering | Spatiotemporal clustering | CreateLocationDateClusteringTask (create a spatiotemporal clustering task) | | Free for a limited time | per 1,000 calls |
SimilarImageClustering | Image clustering | CreateSimilarImageClusteringTask (create a similar-image clustering task) | | Free for a limited time | per 1,000 calls |
Blindwatermark | Blind watermarking for images | | | 0.0990566 | per 1,000 calls |
ReverseGeocoding | Reverse geocoding | DetectMediaMeta (retrieve media file metadata). Note: Charged only when the media file contains geographic location information. | | 0.1415094 | per 1,000 calls |
ImageTexts | Optical character recognition (OCR) for images | DetectImageTexts (perform OCR on images) | | 7.0754717 | per 1,000 calls |
Metadata Management
For detailed pricing of metadata management billing items, see the following table.
Billing Item | Billing Item Description | Related API Operations | Related x-oss-process Operations | Price (USD) | Unit |
StandardQueryL0 | Basic queries | | task/get | 0.001 | per 1,000 calls |
StandardQueryL1 | Standard queries | | | 0.002 | per 1,000 calls |
StandardQueryL2 | Advanced queries | | | 0.074 | per 1,000 calls |
MediaMeta | Retrieve media information | | | 0.1415094 | per 1,000 calls |
SemanticAnalyze | Semantic analysis | SemanticQuery (natural language query) | | 0.52 | per 1,000 calls |
Media Processing
For detailed pricing of media processing billing items, see the following table.
Billing Item | Billing Item Description | Related API Operations | Related x-oss-process Operations | Price (USD) | Unit |
AudioCompress | Audio transcoding | CreateMediaConvertTask (create a media transcoding task) | | 0.0000141509 | per second of audio |
VideoCompressCopy | Container format conversion | CreateMediaConvertTask (create a media transcoding task) | | 0.00001433525 | per second of video |
VideoCompress264LD | H.264-LD transcoding (see Video transcoding details) | CreateMediaConvertTask (create a media transcoding task) | | 0.0000509434 | per second of video |
VideoCompress264SD | H.264-SD transcoding (see Video transcoding details) | CreateMediaConvertTask (create a media transcoding task) | | 0.0000707547 | per second of video |
VideoCompress264HD | H.264-HD transcoding (see Video transcoding details) | CreateMediaConvertTask (create a media transcoding task) | | 0.0001273585 | per second of video |
VideoCompress2642K | H.264-2K transcoding (see Video transcoding details) | CreateMediaConvertTask (create a media transcoding task) | | 0.0002830189 | per second of video |
VideoCompress2644K | H.264-4K transcoding (see Video transcoding details) | CreateMediaConvertTask (create a media transcoding task) | | 0.0006367925 | per second of video |
VideoCompress265LD | H.265-LD transcoding (see Video transcoding details) | CreateMediaConvertTask (create a media transcoding task) | | 0.0002122642 | per second of video |
VideoCompress265SD | H.265-SD transcoding (see Video transcoding details) | CreateMediaConvertTask (create a media transcoding task) | | 0.0003537736 | per second of video |
VideoCompress265HD | H.265-HD transcoding (see Video transcoding details) | CreateMediaConvertTask (create a media transcoding task) | | 0.0007075472 | per second of video |
VideoCompress2652K | H.265-2K transcoding (see Video transcoding details) | CreateMediaConvertTask (create a media transcoding task) | | 0.0011320755 | per second of video |
VideoCompress2654K | H.265-4K transcoding (see Video transcoding details) | CreateMediaConvertTask (create a media transcoding task) | | 0.0022641509 | per second of video |
MediaAnimation | Video to GIF conversion | CreateMediaConvertTask (create a media transcoding task) | video/animation | | per 1,000 frames |
ExtractSubtitleText | Text subtitle extraction from videos | CreateMediaConvertTask (create a media transcoding task) | | 0.223 | per 1,000 streams |
ExtractSubtitleImage | Image subtitle extraction from videos | CreateMediaConvertTask (create a media transcoding task) | | 0.015 | per 1,000 frames |
VideoFraming | Video snapshot | CreateMediaConvertTask (create a media transcoding task) | | 0.015 | per 1,000 frames |
VideoClassification | Video label detection | CreateVideoLabelClassificationTask (create a video label detection task) | | 7.0754717 | per 1,000 calls |
LiveTranscoding | Real-time transcoding (see Real-time transcoding billing details) | GenerateVideoPlaylist (generate a real-time transcoding playlist) | | 0.0000141509 | per CountUnit (CU) |
Document Processing
For detailed pricing of document processing billing items, see the following table.
Projects created before December 1, 2023 for online preview or online editing are billed based on the number of times documents are opened. Projects created on or after December 1, 2023 are billed based on the number of API calls.
Billing Item | Billing Item Description | Related API Operations | Related x-oss-process Operations | Price (USD) | Unit |
DocumentConvert | Document conversion | CreateOfficeConversionTask (create a document conversion task) | | 11.3207547 | per 1,000 calls |
DocumentConvert | Document text extraction | ExtractDocumentText (extract document text) | | 11.3207547 | per 1,000 calls |
DocumentWebofficeEdit | Online editing (Weboffice) (see Document preview and editing billing details) | | doc/edit | 2.8301887 | per 1,000 calls |
DocumentWebofficePreview | Online preview (Weboffice) (see Document preview and editing billing details) | | doc/preview | 1.4150943 | per 1,000 calls |
DocumentWebofficeCachePreview | Cached preview (Weboffice) | | | 0.9905660 | per 1,000 calls. Important: This refers to the number of API calls. |
File Processing
For detailed pricing of file processing billing items, see the following table.
Billing Item | Billing Item Description | Related API Operations | Related x-oss-process Operations | Price (USD) | Unit |
PointCloudCompress | Point cloud compression | CreateCompressPointCloudTask (create a point cloud compression task) | pointcloud/compress | 0.03 | per 1,000 calls |
FileProcess | File packaging and download | CreateFileCompressionTask (create a file compression task) | | 0.00074 | per GB |
FileProcess | Archive decompression | CreateFileUncompressionTask (create a decompression task) | | 0.00074 | per GB |
FilePreview | Archive preview | CreateArchiveFileInspectionTask (create an archive inspection task) | | 0.0074 | per TB |
API operations that incur multiple billing items
The SemanticQuery operation incurs charges for both the StandardQueryL2 and SemanticAnalyze billing items.
The CreateFacesSearchingTask operation incurs charges for both the ImageDetect and ImageFace billing items.
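As a rough illustration of how the two billing items add up, the following sketch sums the per-1,000-call prices listed in this topic. The helper names (`price_per_1000_calls`, `estimate_cost`) are hypothetical, and the linear proration of partial blocks of 1,000 calls is an assumption that this topic does not confirm.

```python
from decimal import Decimal

# Prices in USD per 1,000 calls, copied from the pricing tables in this topic.
PRICE_PER_1000 = {
    "StandardQueryL2": Decimal("0.074"),
    "SemanticAnalyze": Decimal("0.52"),
    "ImageDetect": Decimal("0.028"),
    "ImageFace": Decimal("0.028"),
}

# Operations that incur more than one billing item, as stated above.
BILLING_ITEMS = {
    "SemanticQuery": ("StandardQueryL2", "SemanticAnalyze"),
    "CreateFacesSearchingTask": ("ImageDetect", "ImageFace"),
}


def price_per_1000_calls(operation: str) -> Decimal:
    """Sum the per-1,000-call prices of every billing item the operation incurs."""
    return sum((PRICE_PER_1000[item] for item in BILLING_ITEMS[operation]), Decimal("0"))


def estimate_cost(operation: str, calls: int) -> Decimal:
    """Estimate the charge for `calls` calls.

    Assumption: charges are prorated linearly rather than rounded to blocks of 1,000.
    """
    return price_per_1000_calls(operation) * calls / 1000
```

Under these assumptions, 1,000 SemanticQuery calls would cost 0.074 + 0.52 = 0.594 USD, and 1,000 CreateFacesSearchingTask calls would cost 0.028 + 0.028 = 0.056 USD.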
Video transcoding details
H.264 transcoding: Output video uses the H.264 encoder.
H.265 transcoding: Output video uses the H.265 encoder.
LD: Output video resolution ≤ 640 × 480
SD: Output video resolution ≤ 1280 × 720
HD: Output video resolution ≤ 1920 × 1080
2K: Output video resolution ≤ 2560 × 1440
4K: Output video resolution ≤ 3840 × 2160
Video transcoding is billed per second of video. Any fraction of a second is rounded up and billed as one full second.
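The tier selection and the round-up rule above can be sketched as follows. This is an illustration, not the service's implementation: `billing_item` and `transcoding_cost` are hypothetical helpers, and the tier test (the longer and shorter sides must each fit within the tier bounds, regardless of orientation) is an assumption, because this topic does not state how portrait or unusual aspect ratios are classified.

```python
import math
from decimal import Decimal

# USD per second of video, copied from the media processing table in this topic.
RATES = {
    "VideoCompress264LD": Decimal("0.0000509434"),
    "VideoCompress264SD": Decimal("0.0000707547"),
    "VideoCompress264HD": Decimal("0.0001273585"),
    "VideoCompress2642K": Decimal("0.0002830189"),
    "VideoCompress2644K": Decimal("0.0006367925"),
    "VideoCompress265LD": Decimal("0.0002122642"),
    "VideoCompress265SD": Decimal("0.0003537736"),
    "VideoCompress265HD": Decimal("0.0007075472"),
    "VideoCompress2652K": Decimal("0.0011320755"),
    "VideoCompress2654K": Decimal("0.0022641509"),
}

# (tier, longer-side limit, shorter-side limit) from the resolution list above.
TIERS = [
    ("LD", 640, 480),
    ("SD", 1280, 720),
    ("HD", 1920, 1080),
    ("2K", 2560, 1440),
    ("4K", 3840, 2160),
]


def billing_item(codec: str, width: int, height: int) -> str:
    """Return the billing item name for codec "264" or "265".

    Assumption: a resolution fits a tier when both sides are within its bounds.
    """
    long_side, short_side = max(width, height), min(width, height)
    for tier, max_long, max_short in TIERS:
        if long_side <= max_long and short_side <= max_short:
            return f"VideoCompress{codec}{tier}"
    raise ValueError("resolution exceeds the 4K tier listed in this topic")


def transcoding_cost(codec: str, width: int, height: int, duration_seconds: float) -> Decimal:
    """Bill whole seconds: a fraction of a second counts as one full second."""
    billed_seconds = math.ceil(duration_seconds)
    return RATES[billing_item(codec, width, height)] * billed_seconds
```

For example, a 10.2-second H.264 output at 640 × 480 is billed as 11 seconds at the VideoCompress264LD rate.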
Document preview and editing billing details
For projects created before December 1, 2023, online editing and online preview are billed based on the number of times documents are opened—not the number of API calls.
For projects created on or after December 1, 2023, billing is based on the number of API calls. To switch to the new billing model, simply create a new project.
In the API call billing model, each API call can serve only one user. If a call is reused by another user, only the most recent user retains access, and access for all other users is revoked.
If the Permission.Readonly parameter in the GenerateWebofficeToken operation is set to true, you are charged for document preview. If it is set to false, you are charged for online editing.
The RefreshWebofficeToken operation is billed based on the parameters used when GenerateWebofficeToken was called. If Permission.Readonly was set to true, you are charged for document preview. Otherwise, you are charged for online editing.
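The rules in this section can be summarized in a small decision helper. This is a sketch under stated assumptions: `weboffice_billing` is a hypothetical function, and mapping "document preview" and "online editing" to the DocumentWebofficePreview and DocumentWebofficeEdit billing items is inferred from the document processing table rather than stated explicitly here.

```python
from datetime import date

# Projects created on or after this date are billed by API calls.
API_CALL_BILLING_START = date(2023, 12, 1)


def weboffice_billing(project_created: date, readonly: bool) -> tuple[str, str]:
    """Return (billing item, billing basis) for a GenerateWebofficeToken call.

    `readonly` mirrors the Permission.Readonly parameter. RefreshWebofficeToken
    is billed with the value that was used in the original GenerateWebofficeToken call.
    """
    # Assumed mapping: preview -> DocumentWebofficePreview, editing -> DocumentWebofficeEdit.
    item = "DocumentWebofficePreview" if readonly else "DocumentWebofficeEdit"
    basis = "document opens" if project_created < API_CALL_BILLING_START else "API calls"
    return item, basis
```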
Real-time transcoding billing details
Billing components:
When generating a playlist, you can control the initial transcoding duration using the InitialTranscode parameter. This generates charges for LiveTranscoding.
When playing a video, if a TS file has not been transcoded, a new transcoding job starts. This generates additional LiveTranscoding charges.
Reading source video files from OSS and writing transcoded files back to OSS incurs charges. Playing video files directly from OSS also incurs charges. For details about OSS-related charges, see the OSS billing items.
LiveTranscoding count formula:
Video
Codec eff values: h264 = 0.3, h265 = 1.8
Formula:
Ceiling(eff * Ceiling(Height / 240) * Ceiling(Width / 240) * Ceiling(FrameRate / 30) + 1) * Ceiling(VideoStreamDuration)
Audio
eff value: 0.3
Formula:
Ceiling(eff * Ceiling(AudioStreamDuration))
Billing rule: Each audio or video stream specified in TargetVideo.Stream or TargetAudio.Stream is processed separately and billed individually. The following examples illustrate real-time transcoding charges.
Example 1 (playlist generated only, no playback, no LiveTranscoding charges):
A user calls GenerateVideoPlaylist. The output video is 38 minutes long, with resolution 800 × 600, frame rate 30, codec h264, and initial transcoding duration 0 seconds. TranscodeAhead uses the default value. No playback occurs.
Example 2 (playlist generated only, pre-transcoding configured, charges only for pre-transcoding):
A user calls GenerateVideoPlaylist. The output video is 38 minutes long, with resolution 800 × 600, frame rate 30, codec h264, and initial transcoding duration 30 seconds. TranscodeAhead uses the default value. No playback occurs.
Charges:
LiveTranscoding (CU count formula):
Ceiling(0.3 * Ceiling(800 / 240) * Ceiling(600 / 240) * Ceiling(30 / 30) + 1) * Ceiling(30) + Ceiling(0.3 * Ceiling(30)) = 159 (CountUnit)
Example 3 (playlist generated, partial playback, charges only for played segments):
A user calls GenerateVideoPlaylist. The output video is 38 minutes long, with resolution 800 × 600, frame rate 30, codec h264, and initial transcoding duration 0 seconds. TranscodeAhead uses the default value. The user plays the m3u8 playlist from the start, stops at minute 5 (default forward transcoding is 2 minutes), then jumps to minute 15 and plays to the end.
Charges:
LiveTranscoding (CU count formula):
Ceiling(0.3 * Ceiling(800 / 240) * Ceiling(600 / 240) * Ceiling(30 / 30) + 1) * (Ceiling((5 + 2) * 60) + Ceiling((38 - 15) * 60)) + Ceiling(0.3 * Ceiling((5 + 2) * 60)) + Ceiling(0.3 * Ceiling((38 - 15) * 60)) = 9540 (CountUnit)
Example 4 (multiple users play the same video, repeated playback incurs charges only once):
A user calls GenerateVideoPlaylist. The output video is 38 minutes long, with resolution 800 × 600, frame rate 30, codec h264, and initial transcoding duration 0 seconds. TranscodeAhead uses the default value.
User A plays the m3u8 playlist from the start and stops at minute 5.
User B plays the m3u8 playlist starting at minute 15 and plays to the end.
User C plays the m3u8 playlist from start to finish.
Charges:
LiveTranscoding (CU count formula):
Ceiling(0.3 * Ceiling(800 / 240) * Ceiling(600 / 240) * Ceiling(30 / 30) + 1) * Ceiling(38 * 60) + Ceiling(0.3 * Ceiling(38 * 60)) = 12084 (CountUnit)
Term definitions:
Width: Width of the output video resolution
Height: Height of the output video resolution
FrameRate: Video frame rate
VideoStreamDuration: Duration of the video stream
AudioStreamDuration: Duration of the audio stream
eff: CU coefficient
Ceiling(x) function: Returns the smallest integer value greater than or equal to x.
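The LiveTranscoding count formulas can be checked with a short script. This is a minimal sketch: `video_cu` and `audio_cu` are hypothetical helper names, and exact fractions are used instead of floats so that rounding error cannot affect the Ceiling results.

```python
import math
from fractions import Fraction

# CU coefficients (eff) from this topic, stored as exact fractions.
VIDEO_EFF = {"h264": Fraction(3, 10), "h265": Fraction(18, 10)}
AUDIO_EFF = Fraction(3, 10)


def video_cu(codec: str, width: int, height: int, frame_rate: float, duration_seconds: float) -> int:
    """CountUnits for one video stream over a transcoded duration."""
    per_second = math.ceil(
        VIDEO_EFF[codec]
        * math.ceil(Fraction(height) / 240)
        * math.ceil(Fraction(width) / 240)
        * math.ceil(Fraction(frame_rate) / 30)
        + 1
    )
    return per_second * math.ceil(duration_seconds)


def audio_cu(duration_seconds: float) -> int:
    """CountUnits for one audio stream over a transcoded duration."""
    return math.ceil(AUDIO_EFF * math.ceil(duration_seconds))


# Example 2: 30 seconds of initial transcoding, no playback.
example_2 = video_cu("h264", 800, 600, 30, 30) + audio_cu(30)

# Example 3: two transcoded segments, (5 + 2) minutes and (38 - 15) minutes.
segments = [(5 + 2) * 60, (38 - 15) * 60]
example_3 = sum(video_cu("h264", 800, 600, 30, s) + audio_cu(s) for s in segments)

# Example 4: the whole 38-minute video is transcoded once.
example_4 = video_cu("h264", 800, 600, 30, 38 * 60) + audio_cu(38 * 60)

print(example_2, example_3, example_4)  # 159 9540 12084
```

Each audio or video stream is counted separately, so a playlist with several target streams would call these helpers once per stream and sum the results.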
Operator and billing item mapping
When you create metadata indexes by attaching an OSS bucket or by calling IndexFileMeta or BatchIndexFileMeta, running the operators listed in "Mapping between workflow templates and operators" incurs fees for data processing, index storage, and OSS requests. OSS request fees are charged by OSS. For more information, see OSS request fees. The following table maps operators to billing items:
Operator | Billing Item | Billed by |
OSSMeta operator | | OSS |
MIME operator | No charge | None |
FaceDetection operator | ImageFace (see the note below this table) | IMM |
LabelClassification operator (image) | ImageClassification (see the note below this table) | IMM |
LabelClassification operator (video) | VideoClassification | IMM |
ImageScoring operator | ImageScoring (see the note below this table) | IMM |
ReGEO operator | ReverseGeocoding | IMM |
MediaMeta operator | MediaMeta | IMM |
EXIF operator | | OSS |
ExtractDocumentText operator | DocumentConvert | IMM |
ExtractImageEmbeddings operator | Free for a limited time | IMM |
To process various image formats, IMM uses the image processing capabilities of Object Storage Service (OSS) to perform operations such as format conversion and image scaling one or more times. These operations incur corresponding fees, which are charged by OSS. For more information about these fees, see Data processing fees.
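For quick lookups, the operator mapping can be expressed as a dictionary. This is only a sketch: the dictionary keys and the `imm_billing_items` helper are hypothetical, `None` marks a cell that is empty in the table, and reading "OSS" as the billing party for the OSSMeta and EXIF operators is an assumption about the table layout.

```python
from typing import Optional

# operator -> (billing item, billed by), transcribed from the table above.
# None means the table gives no billing item or no billing party.
OPERATOR_BILLING: dict[str, tuple[Optional[str], Optional[str]]] = {
    "OSSMeta": (None, "OSS"),
    "MIME": (None, None),  # no charge
    "FaceDetection": ("ImageFace", "IMM"),
    "LabelClassification (image)": ("ImageClassification", "IMM"),
    "LabelClassification (video)": ("VideoClassification", "IMM"),
    "ImageScoring": ("ImageScoring", "IMM"),
    "ReGEO": ("ReverseGeocoding", "IMM"),
    "MediaMeta": ("MediaMeta", "IMM"),
    "EXIF": (None, "OSS"),
    "ExtractDocumentText": ("DocumentConvert", "IMM"),
    "ExtractImageEmbeddings": (None, "IMM"),  # free for a limited time
}


def imm_billing_items(operators: list[str]) -> list[str]:
    """List the distinct IMM billing items that a set of operators would incur."""
    items = {
        OPERATOR_BILLING[op][0]
        for op in operators
        if OPERATOR_BILLING[op][1] == "IMM" and OPERATOR_BILLING[op][0] is not None
    }
    return sorted(items)
```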
External request fees
Using Intelligent Media Management (IMM) to access OSS incurs OSS request fees. For more information, see Request fees.