Intelligent Media Management: Billing Items for Intelligent Media Management

Last Updated: Feb 15, 2026

This topic describes the billing items for Intelligent Media Management (IMM) and related details.

Pricing of Billing Items for Alibaba Cloud International Website

Intelligent Media Management (IMM) includes the following categories of billing items: image intelligence, metadata management, media processing, document processing, and file processing.

Important

Starting at 11:00 UTC+8 on July 28, 2025, IMM will charge for some previously free capabilities and adjust prices for certain existing billing items. For more information, see the IMM Pricing Adjustment Notice.

Image Intelligence

For detailed pricing of image intelligence billing items, see the following table.

Billing Item

Billing Item Description

Related API Operations

Related x-oss-process Operations

Price (USD)

Unit

ImageDetect

Face detection

  • DetectImageFaces (face detection)

  • CompareImageFaces (face similarity comparison)

  • SearchImageFigureCluster (query face clusters)

  • CreateFacesSearchingTask (create a similar-face search task)

  • image/faces

  • image/crop,g_face

  • image/blur,g_face

  • image/blur,g_faces

0.028

per 1,000 calls

Body detection

DetectImageBodies (body detection)

image/bodies

Vehicle detection

DetectImageCars

image/cars

ImageLabel

Image tagging

DetectImageLabels (image label detection)

image/labels

0.142

per 1,000 calls

ImageFace

Face images

CreateFacesSearchingTask (create a similar-face search task)

0.028

per 1,000 calls

ImageFaceClustering

Face clustering

  • CreateFigureClusteringTask (create a face clustering task)

  • CreateFigureClustersMergingTask (create a task to merge face clusters)

7.0754717

per 1,000 calls

GenerateStory

Story generation

CreateStory (create a story)

7.0754717

per 1,000 calls

ImageMosaic

Image mosaic

AddImageMosaic (add an image mosaic)

0.0074

per 1,000 calls

ImageCropping

Smart image cropping suggestions

DetectImageCropping (detect visually optimal cropping regions)

image/crop,g_auto

0.1415094

per 1,000 calls

ImageQRCodes

QR code detection in images

DetectImageCodes (QR code detection)

image/codes

0.1132075

per 1,000 calls

ImageSplicing

Image stitching

CreateImageSplicingTask (create an image stitching task)

  • 0.745 for output resolution less than 6000 × 4000

  • 2.234 for output resolution greater than 6000 × 4000

per 1,000 calls

ImageToPDF

Image to PDF conversion

CreateImageToPDFTask (create an image-to-PDF task)

0.0074

per 1,000 images

ImageScoring

Image quality scoring

DetectImageScore (image quality scoring)

image/scoring

0.0424528

per 1,000 calls

LocationDateClustering

Spatiotemporal clustering

CreateLocationDateClusteringTask (create a spatiotemporal clustering task)

Free during limited period

per 1,000 calls

SimilarImageClustering

Image clustering

CreateSimilarImageClusteringTask (create a similar-image clustering task)

Free during limited period

per 1,000 calls

Blindwatermark

Blind watermarking for images

  • EncodeBlindWatermark (add a blind watermark)

  • CreateDecodeBlindWatermarkTask (create a blind watermark decoding task)

  • image/blindwatermark

  • image/deblindwatermark

0.0990566

per 1,000 calls

ReverseGeocoding

Reverse geocoding

DetectMediaMeta (retrieve media file metadata)

Note

Charged only when the media file contains geographic location information.

0.1415094

per 1,000 calls

ImageTexts

Optical character recognition (OCR) for images

DetectImageTexts (perform OCR on images)

7.0754717

per 1,000 calls

Metadata Management

For detailed pricing of metadata management billing items, see the following table.

Billing Item

Billing Item Description

Related API Operations

Related x-oss-process Operations

Price (USD)

Unit

StandardQueryL0

Basic queries

  • GetFileMeta (retrieve file metadata)

  • DeleteFileMeta (delete file metadata)

  • UpdateFileMeta (update file metadata)

  • BatchDeleteFileMeta (batch delete file metadata)

  • GetFigureCluster (retrieve face cluster information)

  • UpdateFigureCluster (update face cluster information)

  • UpdateStory (update a story)

  • GetStory (retrieve story information)

  • DeleteStory (delete a story)

  • UpdateLocationDateCluster (update spatiotemporal clusters)

  • DeleteLocationDateCluster (delete spatiotemporal clusters)

task/get

0.001

per 1,000 calls

StandardQueryL1

Standard queries

  • BatchGetFileMeta (batch retrieve file metadata)

  • BatchUpdateFileMeta (batch update file metadata)

  • ListFaceGroups (list face groups in a media set)

  • AddStoryFiles (add files to a story)

  • QueryStories (query stories)

  • RemoveStoryFiles (remove files from a story)

  • QueryFigureClusters (query face clusters)

  • QueryLocationDateClusters (query spatiotemporal clusters)

0.002

per 1,000 calls

StandardQueryL2

Advanced queries

  • SimpleQuery (simple query)

  • FuzzyQuery (fuzzy query)

0.074

per 1,000 calls

MediaMeta

Retrieve media information

  • DetectMediaMeta (retrieve media file metadata)

  • GetMediaMeta (retrieve media file metadata)

  • audio/info

  • video/info

0.1415094

per 1,000 calls

SemanticAnalyze

Semantic analysis

SemanticQuery (natural language query)

0.52

per 1,000 calls

Media Processing

For detailed pricing of media processing billing items, see the following table.

Billing Item

Billing Item Description

Related API Operations

Related x-oss-process Operations

Price (USD)

Unit

AudioCompress

Audio transcoding

CreateMediaConvertTask (create a media transcoding task)

  • audio/concat

  • audio/compress

0.0000141509

per second of audio

VideoCompressCopy

Container format conversion

CreateMediaConvertTask (create a media transcoding task)

0.00001433525

per second of video

VideoCompress264LD

H.264-LD transcoding (see Video transcoding details)

CreateMediaConvertTask (create a media transcoding task)

  • video/concat

  • video/convert

0.0000509434

per second of video

VideoCompress264SD

H.264-SD transcoding (see Video transcoding details)

CreateMediaConvertTask (create a media transcoding task)

  • video/concat

  • video/convert

0.0000707547

per second of video

VideoCompress264HD

H.264-HD transcoding (see Video transcoding details)

CreateMediaConvertTask (create a media transcoding task)

  • video/concat

  • video/convert

0.0001273585

per second of video

VideoCompress2642K

H.264-2K transcoding (see Video transcoding details)

CreateMediaConvertTask (create a media transcoding task)

  • video/concat

  • video/convert

0.0002830189

per second of video

VideoCompress2644K

H.264-4K transcoding (see Video transcoding details)

CreateMediaConvertTask (create a media transcoding task)

  • video/concat

  • video/convert

0.0006367925

per second of video

VideoCompress265LD

H.265-LD transcoding (see Video transcoding details)

CreateMediaConvertTask (create a media transcoding task)

  • video/concat

  • video/convert

0.0002122642

per second of video

VideoCompress265SD

H.265-SD transcoding (see Video transcoding details)

CreateMediaConvertTask (create a media transcoding task)

  • video/concat

  • video/convert

0.0003537736

per second of video

VideoCompress265HD

H.265-HD transcoding (see Video transcoding details)

CreateMediaConvertTask (create a media transcoding task)

  • video/concat

  • video/convert

0.0007075472

per second of video

VideoCompress2652K

H.265-2K transcoding (see Video transcoding details)

CreateMediaConvertTask (create a media transcoding task)

  • video/concat

  • video/convert

0.0011320755

per second of video

VideoCompress2654K

H.265-4K transcoding (see Video transcoding details)

CreateMediaConvertTask (create a media transcoding task)

  • video/concat

  • video/convert

0.0022641509

per second of video

MediaAnimation

Video to GIF conversion

CreateMediaConvertTask (create a media transcoding task)

video/animation

  • Basic format: 0.012

  • Advanced format: 0.074

per 1,000 frames

ExtractSubtitleText

Text subtitle extraction from videos

CreateMediaConvertTask (create a media transcoding task)

0.223

per 1,000 streams

ExtractSubtitleImage

Image subtitle extraction from videos

CreateMediaConvertTask (create a media transcoding task)

0.015

per 1,000 frames

VideoFraming

Video snapshot

CreateMediaConvertTask (create a media transcoding task)

  • video/snapshots

  • video/sprite

0.015

per 1,000 frames

VideoClassification

Video label detection

CreateVideoLabelClassificationTask (create a video label detection task)

7.0754717

per 1,000 calls

LiveTranscoding

Real-time transcoding (see Real-time transcoding billing details)

GenerateVideoPlaylist (generate a real-time transcoding playlist)

  • hls/m3u8

  • hls/ts

0.0000141509

per CountUnit (CU)

Document Processing

For detailed pricing of document processing billing items, see the following table.

Important

Projects created before December 1, 2023 for online preview or online editing are billed based on the number of times documents are opened. Projects created on or after December 1, 2023 are billed based on the number of API calls.

Billing Item

Billing Item Description

Related API Operations

Related x-oss-process Operations

Price (USD)

Unit

DocumentConvert

Document conversion

CreateOfficeConversionTask (create a document conversion task)

  • doc/convert

  • doc/snapshot

11.3207547

per 1,000 calls

Document text extraction

ExtractDocumentText (extract document text)

DocumentWebofficeEdit

Online editing (Weboffice) (see Document preview and editing billing details)

  • GenerateWebofficeToken (obtain a Weboffice token)

  • RefreshWebofficeToken (refresh a Weboffice token)

doc/edit

2.8301887

per 1,000 calls

DocumentWebofficePreview

Online preview (Weboffice) (see Document preview and editing billing details)

  • GenerateWebofficeToken (obtain a Weboffice token)

  • RefreshWebofficeToken (refresh a Weboffice token)

doc/preview

1.4150943

per 1,000 calls

DocumentWebofficeCachePreview

Cached preview (Weboffice)

  • GenerateWebofficeToken (obtain a Weboffice token)

  • RefreshWebofficeToken (refresh a Weboffice token)

0.9905660

per 1,000 calls

Important

This refers to the number of API calls.

File Processing

For detailed pricing of file processing billing items, see the following table.

Billing Item

Billing Item Description

Related API Operations

Related x-oss-process Operations

Price (USD)

Unit

PointCloudCompress

Point cloud compression

CreateCompressPointCloudTask (create a point cloud compression task)

pointcloud/compress

0.03

per 1,000 calls

FileProcess

File packaging and download

CreateFileCompressionTask (create a file compression task)

0.00074

per GB

Archive decompression

CreateFileUncompressionTask (create a decompression task)

FilePreview

Archive preview

CreateArchiveFileInspectionTask (create an archive inspection task)

0.0074

per TB

API operations that incur multiple billing items

Note
  • The SemanticQuery operation incurs charges for both StandardQueryL2 and SemanticAnalyze billing items.

  • The CreateFacesSearchingTask operation incurs charges for both ImageDetect and ImageFace billing items.
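
As a rough illustration of the two notes above, the following Python sketch adds up the per-1,000-call prices of every billing item that an operation triggers. The prices are copied from the tables in this topic. The dictionary and function names are hypothetical, and actual invoices may round amounts differently.

```python
from decimal import Decimal

# Prices in USD per 1,000 calls, copied from the tables in this topic.
PRICE_PER_1000_CALLS = {
    "StandardQueryL2": Decimal("0.074"),
    "SemanticAnalyze": Decimal("0.52"),
    "ImageDetect": Decimal("0.028"),
    "ImageFace": Decimal("0.028"),
}

# Operations that incur more than one billing item, per the notes above.
BILLING_ITEMS_BY_OPERATION = {
    "SemanticQuery": ["StandardQueryL2", "SemanticAnalyze"],
    "CreateFacesSearchingTask": ["ImageDetect", "ImageFace"],
}


def estimate_cost(operation: str, calls: int) -> Decimal:
    """Estimate the USD charge for `calls` invocations of `operation`."""
    items = BILLING_ITEMS_BY_OPERATION[operation]
    per_thousand = sum(PRICE_PER_1000_CALLS[item] for item in items)
    return per_thousand * calls / 1000


print(estimate_cost("SemanticQuery", 1000))
print(estimate_cost("CreateFacesSearchingTask", 10000))
```

Under these assumptions, 1,000 SemanticQuery calls cost the sum of the StandardQueryL2 and SemanticAnalyze prices.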

Video transcoding details

Note
  • H.264 transcoding: Output video uses the H.264 encoder.

  • H.265 transcoding: Output video uses the H.265 encoder.

  • LD: Output video resolution ≤ 640 × 480

  • SD: Output video resolution ≤ 1280 × 720

  • HD: Output video resolution ≤ 1920 × 1080

  • 2K: Output video resolution ≤ 2560 × 1440

  • 4K: Output video resolution ≤ 3840 × 2160

  • Video transcoding is billed per second of video. Durations are rounded up to the next whole second, so any fraction of a second is billed as one full second.
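
The tier limits and the round-up rule above can be sketched in Python as follows. This is only an estimate under stated assumptions: the source does not say how a resolution is matched to a tier, so the sketch assumes that the longer and shorter sides must both fit within the tier's dimensions. The prices are the H.264 per-second prices from the Media Processing table, and all names are hypothetical.

```python
import math
from decimal import Decimal

# (tier, max long side, max short side) in ascending order, from the note above.
TIERS = [
    ("LD", 640, 480),
    ("SD", 1280, 720),
    ("HD", 1920, 1080),
    ("2K", 2560, 1440),
    ("4K", 3840, 2160),
]

# USD per second of video for H.264 output, copied from the Media Processing table.
H264_PRICE_PER_SECOND = {
    "LD": Decimal("0.0000509434"),
    "SD": Decimal("0.0000707547"),
    "HD": Decimal("0.0001273585"),
    "2K": Decimal("0.0002830189"),
    "4K": Decimal("0.0006367925"),
}


def resolution_tier(width: int, height: int) -> str:
    """Return the smallest tier that fits the output resolution (assumed rule)."""
    long_side, short_side = max(width, height), min(width, height)
    for name, max_long, max_short in TIERS:
        if long_side <= max_long and short_side <= max_short:
            return name
    raise ValueError("resolution exceeds the 4K tier listed in this topic")


def h264_transcoding_cost(width: int, height: int, duration_seconds: float) -> Decimal:
    """Estimate the USD charge, billing any fraction of a second as a full second."""
    billed_seconds = math.ceil(duration_seconds)
    return H264_PRICE_PER_SECOND[resolution_tier(width, height)] * billed_seconds


# A 90.2-second 1280 x 720 output is billed as 91 seconds at the SD price.
print(resolution_tier(1280, 720), h264_transcoding_cost(1280, 720, 90.2))
```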

Document preview and editing billing details

Note
  • For projects created before December 1, 2023, online editing and online preview are billed based on the number of times documents are opened, not the number of API calls.

  • For projects created on or after December 1, 2023, billing is based on the number of API calls. To switch to the new billing model, simply create a new project.

  • In the API call billing model, each API call can serve only one user. If the same call is reused for several users, only the last user retains access, and access permissions for all other users are revoked.

  • If the Permission.Readonly parameter in the GenerateWebofficeToken operation is set to true, you are charged for document preview. If it is set to false, you are charged for online editing.

  • The RefreshWebofficeToken operation is billed based on the parameters used when GenerateWebofficeToken was called. If Permission.Readonly was set to true, you are charged for document preview. Otherwise, you are charged for online editing.
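
The Permission.Readonly rule above can be sketched in Python as follows, assuming a project that uses the API call billing model. The function names are hypothetical, the prices are copied from the Document Processing table, and the DocumentWebofficeCachePreview item is not modeled because the notes do not describe when it applies.

```python
from decimal import Decimal

# USD per 1,000 calls, copied from the Document Processing table.
PRICE_PER_1000_CALLS = {
    "DocumentWebofficePreview": Decimal("1.4150943"),
    "DocumentWebofficeEdit": Decimal("2.8301887"),
}


def weboffice_billing_item(readonly: bool) -> str:
    """Map the Permission.Readonly value passed to GenerateWebofficeToken to a billing item."""
    return "DocumentWebofficePreview" if readonly else "DocumentWebofficeEdit"


def weboffice_cost(readonly: bool, generate_calls: int, refresh_calls: int) -> Decimal:
    """Estimate the USD charge; refresh calls follow the original token's Readonly setting."""
    item = weboffice_billing_item(readonly)
    return PRICE_PER_1000_CALLS[item] * (generate_calls + refresh_calls) / 1000


# 600 GenerateWebofficeToken calls plus 400 refreshes, all with Readonly set to false.
print(weboffice_billing_item(False), weboffice_cost(False, 600, 400))
```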

Real-time transcoding billing details

Note
  • Billing components:

    • When generating a playlist, you can control the initial transcoding duration using the InitialTranscode parameter. This generates charges for LiveTranscoding.

    • When playing a video, if a TS file has not been transcoded, a new transcoding job starts. This generates additional LiveTranscoding charges.

    • Reading source video files from OSS and writing transcoded files back to OSS incurs charges. Playing video files directly from OSS also incurs charges. For details about OSS-related charges, see the OSS billing items.

  • LiveTranscoding count formula:

    • Video

      • Codec eff values: h264 = 0.3, h265 = 1.8

      • Formula:

    Ceiling(eff * Ceiling(Height / 240) * Ceiling(Width / 240) * Ceiling(FrameRate / 30) + 1) * Ceiling(VideoStreamDuration)

    • Audio

      • eff value: 0.3

      • Formula:

    Ceiling(eff * Ceiling(AudioStreamDuration))

  • Billing rule: Each audio or video stream specified in TargetVideo.Stream or TargetAudio.Stream is processed separately and billed individually. The following examples illustrate real-time transcoding charges.

    • Example 1 (playlist generated only, no playback, no LiveTranscoding charges):

      • A user calls GenerateVideoPlaylist. The output video is 38 minutes long, with resolution 800 × 600, frame rate 30, codec h264, and initial transcoding duration 0 seconds. TranscodeAhead uses the default value. No playback occurs.

    • Example 2 (playlist generated only, pre-transcoding configured, charges only for pre-transcoding):

      • A user calls GenerateVideoPlaylist. The output video is 38 minutes long, with resolution 800 × 600, frame rate 30, codec h264, and initial transcoding duration 30 seconds. TranscodeAhead uses the default value. No playback occurs.

      • Charges:

        • LiveTranscoding (CU count formula): Ceiling(0.3 * Ceiling(800 / 240) * Ceiling(600 / 240) * Ceiling(30 / 30) + 1) * Ceiling(30) + Ceiling(0.3 * Ceiling(30)) = 159 (CountUnit)

    • Example 3 (playlist generated, partial playback, charges only for played segments):

      • A user calls GenerateVideoPlaylist. The output video is 38 minutes long, with resolution 800 × 600, frame rate 30, codec h264, and initial transcoding duration 0 seconds. TranscodeAhead uses the default value. The user plays the m3u8 playlist from the start, stops at minute 5 (default forward transcoding is 2 minutes), then jumps to minute 15 and plays to the end.

      • Charges:

        • LiveTranscoding (CU count formula): Ceiling(0.3 * Ceiling(800 / 240) * Ceiling(600 / 240) * Ceiling(30 / 30) + 1) * (Ceiling((5 + 2) * 60) + Ceiling((38 - 15) * 60)) + Ceiling(0.3 * Ceiling((5 + 2) * 60)) + Ceiling(0.3 * Ceiling((38 - 15) * 60)) = 9540 (CountUnit)

    • Example 4 (multiple users play the same video, repeated playback incurs charges only once):

      • A user calls GenerateVideoPlaylist. The output video is 38 minutes long, with resolution 800 × 600, frame rate 30, codec h264, and initial transcoding duration 0 seconds. TranscodeAhead uses the default value.

        User A plays the m3u8 playlist from the start and stops at minute 5.

        User B plays the m3u8 playlist starting at minute 15 and plays to the end.

        User C plays the m3u8 playlist from start to finish.

      • Charges:

        • LiveTranscoding (CU count formula): Ceiling(0.3 * Ceiling(800 / 240) * Ceiling(600 / 240) * Ceiling(30 / 30) + 1) * Ceiling(38 * 60) + Ceiling(0.3 * Ceiling(38 * 60)) = 12084 (CountUnit)

  • Term definitions:

    • Width: Width of the output video resolution

    • Height: Height of the output video resolution

    • FrameRate: Video frame rate

    • VideoStreamDuration: Duration of the video stream

    • AudioStreamDuration: Duration of the audio stream

    • eff: CU coefficient

    • Ceiling(x) function: Returns the smallest integer value greater than or equal to x.
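
The video and audio CU formulas above can be checked with a short Python sketch. The function names are hypothetical. The eff coefficients are held as exact fractions, which is a choice made for this sketch so that Ceiling is not affected by binary floating-point rounding.

```python
import math
from fractions import Fraction

# Codec coefficients (eff) from the note above, as exact fractions.
VIDEO_EFF = {"h264": Fraction(3, 10), "h265": Fraction(18, 10)}
AUDIO_EFF = Fraction(3, 10)


def video_cu(codec, width, height, frame_rate, duration_seconds):
    """CU count for one video stream, following the video formula above."""
    per_second = math.ceil(
        VIDEO_EFF[codec]
        * math.ceil(Fraction(height) / 240)
        * math.ceil(Fraction(width) / 240)
        * math.ceil(Fraction(frame_rate) / 30)
        + 1
    )
    return per_second * math.ceil(duration_seconds)


def audio_cu(duration_seconds):
    """CU count for one audio stream, following the audio formula above."""
    return math.ceil(AUDIO_EFF * math.ceil(duration_seconds))


# Example 2: 30 seconds of initial transcoding at 800 x 600, 30 fps, h264.
example_2 = video_cu("h264", 800, 600, 30, 30) + audio_cu(30)

# Example 3: two transcoded segments, (5 + 2) minutes and (38 - 15) minutes.
segments = [(5 + 2) * 60, (38 - 15) * 60]
example_3 = sum(video_cu("h264", 800, 600, 30, s) + audio_cu(s) for s in segments)

# Example 4: the whole 38-minute video is transcoded once.
example_4 = video_cu("h264", 800, 600, 30, 38 * 60) + audio_cu(38 * 60)

print(example_2, example_3, example_4)  # 159 9540 12084
```

Multiplying a CU count by the LiveTranscoding price in the Media Processing table gives an estimated charge in USD.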

Operator and billing item mapping

When you create metadata indexes by attaching an OSS bucket or by calling IndexFileMeta or BatchIndexFileMeta, running the operators listed in "Mapping between workflow templates and operators" incurs fees for data processing, index storage, and OSS requests. OSS request fees are charged by OSS. For more information, see OSS request fees. The mapping between operators and billing items is as follows:

Operator

Billing Item

Billed by

OSSMeta operator

GetRequest

OSS

MIME operator

No charge

None

FaceDetection operator

ImageFace (see the Important note below)

IMM

LabelClassification operator (image)

ImageClassification (see the Important note below)

IMM

LabelClassification operator (video)

VideoClassification

IMM

ImageScoring operator

ImageScoring (see the Important note below)

IMM

ReGEO operator

ReverseGeocoding

IMM

MediaMeta operator

MediaMeta

IMM

EXIF operator

GetRequest

OSS

ExtractDocumentText operator

DocumentConvert

IMM

ExtractImageEmbeddings operator

Free during limited period

IMM

Important

To process various image formats, IMM uses the image processing capabilities of Object Storage Service (OSS) to perform operations such as format conversion and image scaling one or more times. These operations incur corresponding fees, which are charged by OSS. For more information about these fees, see Data processing fees.

External request fees

Using Intelligent Media Management (IMM) to access OSS incurs OSS request fees. For more information, see Request fees.