Billing Items for Intelligent Media Management

Last Updated: Mar 10, 2026

This topic describes the billing items for Intelligent Media Management (IMM) and related details.

Billing Items for Alibaba Cloud International Website

Intelligent Media Management (IMM) includes the following billing items: image intelligence, metadata management, media processing, document processing, and file processing.

Important

Starting at 11:00 UTC+8 on July 28, 2025, IMM will begin charging for some previously free features and will adjust prices for some existing billing items. For more information, refer to the IMM Pricing Adjustment Notice.

Image Intelligence

For detailed pricing of image intelligence billing items, refer to the following table.

| Billing Item | Billing Item Description | Related API Operations | Related x-oss-process Operations | Price (USD) | Unit |
| --- | --- | --- | --- | --- | --- |
| ImageDetect | Face detection | DetectImageFaces (face detection); CompareImageFaces (face similarity comparison); SearchImageFigureCluster (query face clusters); CreateFacesSearchingTask (create a similar-face search task) | image/faces; image/crop,g_face; image/blur,g_face; image/blur,g_faces | 0.028 | per 1,000 calls |
| ImageDetect | Human body detection | DetectImageBodies (human body detection) | image/bodies | 0.028 | per 1,000 calls |
| ImageDetect | Vehicle detection | DetectImageCars | image/cars | 0.028 | per 1,000 calls |
| ImageLabel | Image tagging | DetectImageLabels (image label detection) | image/labels | 0.142 | per 1,000 calls |
| ImageFace | Face images | CreateFacesSearchingTask (create a similar-face search task) | - | 0.028 | per 1,000 calls |
| ImageFaceClustering | Face clustering | CreateFigureClusteringTask (create a face clustering task); CreateFigureClustersMergingTask (create a face cluster merging task) | - | 7.0754717 | per 1,000 calls |
| GenerateStory | Story generation | CreateStory (create a story) | - | 7.0754717 | per 1,000 calls |
| ImageMosaic | Image mosaic | AddImageMosaic (add an image mosaic) | - | 0.0074 | per 1,000 calls |
| ImageCropping | Smart image cropping suggestions | DetectImageCropping (detect visually optimal cropping regions) | image/crop,g_auto | 0.1415094 | per 1,000 calls |
| ImageQRCodes | QR code detection in images | DetectImageCodes (QR code detection) | image/codes | 0.1132075 | per 1,000 calls |
| ImageSplicing | Image stitching | CreateImageSplicingTask (create an image stitching task) | - | 0.745 (output resolution less than 6000 × 4000); 2.234 (output resolution greater than or equal to 6000 × 4000) | per 1,000 calls |
| ImageToPDF | Image-to-PDF conversion | CreateImageToPDFTask (create an image-to-PDF conversion task) | - | 0.0074 | per 1,000 images |
| ImageScoring | Image quality scoring | DetectImageScore (image quality scoring) | image/scoring | 0.0424528 | per 1,000 calls |
| LocationDateClustering | Spatiotemporal clustering | CreateLocationDateClusteringTask (create a spatiotemporal clustering task) | - | Free for a limited time | per 1,000 calls |
| SimilarImageClustering | Image clustering | CreateSimilarImageClusteringTask (create a similar-image clustering task) | - | Free for a limited time | per 1,000 calls |
| Blindwatermark | Blind watermarking for images | EncodeBlindWatermark (add a blind watermark); CreateDecodeBlindWatermarkTask (create a blind watermark decoding task) | image/blindwatermark; image/deblindwatermark | 0.0990566 | per 1,000 calls |
| ReverseGeocoding | Reverse geocoding | DetectMediaMeta (retrieve media file metadata). Note: Charged only when the media file contains geographic location information. | - | 0.1415094 | per 1,000 calls |
| ImageTexts | Optical character recognition (OCR) for images | DetectImageTexts (OCR for images) | - | 7.0754717 | per 1,000 calls |

Metadata Management

For detailed pricing of metadata management billing items, refer to the following table.

| Billing Item | Billing Item Description | Related API Operations | Related x-oss-process Operations | Price (USD) | Unit |
| --- | --- | --- | --- | --- | --- |
| StandardQueryL0 | Basic queries | GetFileMeta (retrieve file metadata); DeleteFileMeta (delete file metadata); UpdateFileMeta (update file metadata); BatchDeleteFileMeta (batch delete file metadata); GetFigureCluster (retrieve face cluster information); UpdateFigureCluster (update face clusters); UpdateStory (update a story); GetStory (retrieve story information); DeleteStory (delete a story); UpdateLocationDateCluster (update spatiotemporal clusters); DeleteLocationDateCluster (delete spatiotemporal clusters) | task/get | 0.001 | per 1,000 calls |
| StandardQueryL1 | Standard queries | BatchGetFileMeta (batch retrieve file metadata); BatchUpdateFileMeta (batch update file metadata); ListFaceGroups (list face groups in a media set); AddStoryFiles (add files to a story); QueryStories (query stories); RemoveStoryFiles (remove files from a story); QueryFigureClusters (query face clusters); QueryLocationDateClusters (query spatiotemporal clusters) | - | 0.002 | per 1,000 calls |
| StandardQueryL2 | Advanced queries | SimpleQuery (simple query); FuzzyQuery (fuzzy query) | - | 0.074 | per 1,000 calls |
| MediaMeta | Retrieve media information | DetectMediaMeta (retrieve media file metadata); GetMediaMeta (retrieve media file metadata) | audio/info; video/info | 0.1415094 | per 1,000 calls |
| SemanticAnalyze | Semantic analysis | SemanticQuery (natural language query) | - | 0.52 | per 1,000 calls |

Media Processing

For detailed pricing of media processing billing items, refer to the following table.

| Billing Item | Billing Item Description | Related API Operations | Related x-oss-process Operations | Price (USD) | Unit |
| --- | --- | --- | --- | --- | --- |
| AudioCompress | Audio transcoding | CreateMediaConvertTask (create a media transcoding task) | audio/concat; audio/compress | 0.0000141509 | per second of audio |
| VideoCompressCopy | Container format conversion | CreateMediaConvertTask (create a media transcoding task) | - | 0.00001433525 | per second of video |
| VideoCompress264LD | H.264 LD transcoding* | CreateMediaConvertTask (create a media transcoding task) | video/concat; video/convert | 0.0000509434 | per second of video |
| VideoCompress264SD | H.264 SD transcoding* | CreateMediaConvertTask (create a media transcoding task) | video/concat; video/convert | 0.0000707547 | per second of video |
| VideoCompress264HD | H.264 HD transcoding* | CreateMediaConvertTask (create a media transcoding task) | video/concat; video/convert | 0.0001273585 | per second of video |
| VideoCompress2642K | H.264 2K transcoding* | CreateMediaConvertTask (create a media transcoding task) | video/concat; video/convert | 0.0002830189 | per second of video |
| VideoCompress2644K | H.264 4K transcoding* | CreateMediaConvertTask (create a media transcoding task) | video/concat; video/convert | 0.0006367925 | per second of video |
| VideoCompress265LD | H.265 LD transcoding* | CreateMediaConvertTask (create a media transcoding task) | video/concat; video/convert | 0.0002122642 | per second of video |
| VideoCompress265SD | H.265 SD transcoding* | CreateMediaConvertTask (create a media transcoding task) | video/concat; video/convert | 0.0003537736 | per second of video |
| VideoCompress265HD | H.265 HD transcoding* | CreateMediaConvertTask (create a media transcoding task) | video/concat; video/convert | 0.0007075472 | per second of video |
| VideoCompress2652K | H.265 2K transcoding* | CreateMediaConvertTask (create a media transcoding task) | video/concat; video/convert | 0.0011320755 | per second of video |
| VideoCompress2654K | H.265 4K transcoding* | CreateMediaConvertTask (create a media transcoding task) | video/concat; video/convert | 0.0022641509 | per second of video |
| MediaAnimation | Video animation | CreateMediaConvertTask (create a media transcoding task) | video/animation | 0.012 (basic format); 0.074 (advanced format) | per 1,000 frames |
| ExtractSubtitleText | Text subtitle extraction from videos | CreateMediaConvertTask (create a media transcoding task) | - | 0.223 | per 1,000 streams |
| ExtractSubtitleImage | Image subtitle extraction from videos | CreateMediaConvertTask (create a media transcoding task) | - | 0.015 | per 1,000 frames |
| VideoFraming | Video snapshot | CreateMediaConvertTask (create a media transcoding task) | video/snapshots; video/sprite | 0.015 | per 1,000 frames |
| VideoClassification | Video label detection | CreateVideoLabelClassificationTask (create a video label classification task) | - | 7.0754717 | per 1,000 calls |
| LiveTranscoding | Real-time transcoding* | GenerateVideoPlaylist (generate a real-time transcoding playlist) | hls/m3u8; hls/ts | 0.0000141509 | per CountUnit |

* For details, see Video transcoding details and Real-time transcoding billing details.

Document Processing

For detailed pricing of document processing billing items, refer to the following table.

Important

Projects created before December 1, 2023 for online preview and online editing are billed based on the number of times documents are opened. Projects created on or after December 1, 2023 are billed based on the number of API calls.

| Billing Item | Billing Item Description | Related API Operations | Related x-oss-process Operations | Price (USD) | Unit |
| --- | --- | --- | --- | --- | --- |
| DocumentConvert | Document conversion | CreateOfficeConversionTask (create a document conversion task) | doc/convert; doc/snapshot | 11.3207547 | per 1,000 calls |
| DocumentConvert | Document content extraction | ExtractDocumentText (extract document text) | - | 11.3207547 | per 1,000 calls |
| DocumentWebofficeEdit | Online editing (Weboffice)* | GenerateWebofficeToken (obtain a Weboffice token); RefreshWebofficeToken (refresh a Weboffice token) | doc/edit | 2.8301887 | per 1,000 calls |
| DocumentWebofficePreview | Online preview (Weboffice)* | GenerateWebofficeToken (obtain a Weboffice token); RefreshWebofficeToken (refresh a Weboffice token) | doc/preview | 1.4150943 | per 1,000 calls |
| DocumentWebofficeCachePreview | Cached preview (Weboffice) | GenerateWebofficeToken (obtain a Weboffice token); RefreshWebofficeToken (refresh a Weboffice token) | - | 0.9905660 | per 1,000 calls |

* For details, see Document preview and editing billing details.

Important

This refers to the number of API calls.

File Processing

For detailed pricing of file processing billing items, refer to the following table.

| Billing Item | Billing Item Description | Related API Operations | Related x-oss-process Operations | Price (USD) | Unit |
| --- | --- | --- | --- | --- | --- |
| PointCloudCompress | Point cloud compression | CreateCompressPointCloudTask (create a point cloud compression task) | pointcloud/compress | 0.03 | per 1,000 calls |
| FileProcess | File packaging and download | CreateFileCompressionTask (create a file compression task) | - | 0.00074 | per GB |
| FileProcess | Archive decompression | CreateFileUncompressionTask (create a decompression task) | - | 0.00074 | per GB |
| FilePreview | Archive preview | CreateArchiveFileInspectionTask (create an archive preview and parsing task) | - | 0.0074 | per TB |

API operations with multiple billing items

Note
  • The SemanticQuery operation incurs charges for both StandardQueryL2 and SemanticAnalyze.

  • The CreateFacesSearchingTask operation incurs charges for both ImageDetect and ImageFace.
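
The stacking of billing items described above can be sketched as a small estimator. This is an illustrative sketch only: the function and dictionary names are hypothetical, the prices are copied from the tables in this topic, and the linear pro-rating (no rounding, no minimum charge) is an assumption, because this topic does not state how partial blocks of 1,000 calls are billed.

```python
from decimal import Decimal

# USD per 1,000 calls, copied from the pricing tables in this topic.
PRICE_PER_1000_CALLS = {
    "StandardQueryL2": Decimal("0.074"),
    "SemanticAnalyze": Decimal("0.52"),
    "ImageDetect": Decimal("0.028"),
    "ImageFace": Decimal("0.028"),
}

# Operations that incur more than one billing item, per the note above.
BILLING_ITEMS_BY_OPERATION = {
    "SemanticQuery": ("StandardQueryL2", "SemanticAnalyze"),
    "CreateFacesSearchingTask": ("ImageDetect", "ImageFace"),
}


def estimate_cost(operation: str, calls: int) -> Decimal:
    """Sum the price of every billing item that the operation incurs.

    Assumption: charges scale linearly with the number of calls.
    """
    items = BILLING_ITEMS_BY_OPERATION[operation]
    per_1000 = sum((PRICE_PER_1000_CALLS[item] for item in items), Decimal("0"))
    return per_1000 * calls / 1000


if __name__ == "__main__":
    print(estimate_cost("SemanticQuery", 1000))
    print(estimate_cost("CreateFacesSearchingTask", 10000))
```

Under these assumptions, 1,000 SemanticQuery calls combine the StandardQueryL2 and SemanticAnalyze prices (0.074 + 0.52 USD).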

Video transcoding details

Note
  • H.264 transcoding: Output video uses the H.264 encoder.

  • H.265 transcoding: Output video uses the H.265 encoder.

  • LD: Output resolution is less than or equal to 640 × 480.

  • SD: Output resolution is less than or equal to 1280 × 720.

  • HD: Output resolution is less than or equal to 1920 × 1080.

  • 2K: Output resolution is less than or equal to 2560 × 1440.

  • 4K: Output resolution is less than or equal to 3840 × 2160.

  • Video transcoding is billed per second of output video. Durations are rounded up, so any fraction of a second is billed as a full second.
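
As a rough illustration of the rules above, the following sketch selects a resolution tier and estimates a transcoding charge. The function names are hypothetical and the prices are copied from the Media Processing table. How a resolution is compared against a tier limit is not specified in this topic; the sketch assumes that the longer side is compared with the tier width and the shorter side with the tier height.

```python
import math
from decimal import Decimal

# (tier, max long side, max short side), smallest first, from the note above.
TIERS = [
    ("LD", 640, 480),
    ("SD", 1280, 720),
    ("HD", 1920, 1080),
    ("2K", 2560, 1440),
    ("4K", 3840, 2160),
]

# USD per second of output video, from the Media Processing table.
PRICE_PER_SECOND = {
    ("h264", "LD"): Decimal("0.0000509434"),
    ("h264", "SD"): Decimal("0.0000707547"),
    ("h264", "HD"): Decimal("0.0001273585"),
    ("h264", "2K"): Decimal("0.0002830189"),
    ("h264", "4K"): Decimal("0.0006367925"),
    ("h265", "LD"): Decimal("0.0002122642"),
    ("h265", "SD"): Decimal("0.0003537736"),
    ("h265", "HD"): Decimal("0.0007075472"),
    ("h265", "2K"): Decimal("0.0011320755"),
    ("h265", "4K"): Decimal("0.0022641509"),
}


def resolution_tier(width: int, height: int) -> str:
    """Return the smallest tier that fits the output resolution.

    Assumption: orientation is ignored (long side vs. short side).
    """
    long_side, short_side = max(width, height), min(width, height)
    for name, max_long, max_short in TIERS:
        if long_side <= max_long and short_side <= max_short:
            return name
    raise ValueError("output resolution exceeds the 4K tier")


def transcoding_cost(codec: str, width: int, height: int, duration_seconds: float) -> Decimal:
    """Estimate the charge; fractions of a second are billed as a full second."""
    billed_seconds = math.ceil(duration_seconds)
    return PRICE_PER_SECOND[(codec, resolution_tier(width, height))] * billed_seconds


if __name__ == "__main__":
    # A 90.2-second 1280 x 720 H.264 output is billed as 91 seconds of SD.
    print(transcoding_cost("h264", 1280, 720, 90.2))
```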

Document preview and editing billing details

Note
  • For projects created before December 1, 2023, online editing and online preview are billed based on the number of times documents are opened—not the number of API calls.

  • Projects created on or after December 1, 2023 are billed based on the number of API calls. To switch to the new billing model, create a new project.

  • In the API call billing model, each API call can be used by only one user. If an API call is reused, only the last user retains access, and access permissions for other users are revoked.

  • If the Permission.Readonly parameter in the GenerateWebofficeToken operation is set to true, you are charged for document preview. If it is set to false, you are charged for online editing.

  • The RefreshWebofficeToken operation is billed based on the Permission.Readonly setting used when generating the original token. If Permission.Readonly was set to true, you are charged for document preview. Otherwise, you are charged for online editing.
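
The Permission.Readonly rule above can be expressed as a short sketch for the API-call billing model. The function names are hypothetical, and two assumptions apply: "charged for document preview" is taken to mean the DocumentWebofficePreview billing item, and charges are pro-rated linearly per call. Cached preview is not modeled.

```python
from decimal import Decimal

# USD per 1,000 calls, from the Document Processing table.
PRICE_PER_1000_CALLS = {
    "DocumentWebofficePreview": Decimal("1.4150943"),
    "DocumentWebofficeEdit": Decimal("2.8301887"),
}


def weboffice_billing_item(readonly: bool) -> str:
    """Map the Permission.Readonly value of the original token to a billing item."""
    return "DocumentWebofficePreview" if readonly else "DocumentWebofficeEdit"


def estimate_weboffice_cost(readonly_flags: list[bool]) -> Decimal:
    """Estimate the cost of GenerateWebofficeToken and RefreshWebofficeToken calls.

    Each entry is the Permission.Readonly value that applies to one call.
    A refresh call reuses the value of the token that it refreshes.
    """
    total = Decimal("0")
    for readonly in readonly_flags:
        total += PRICE_PER_1000_CALLS[weboffice_billing_item(readonly)] / 1000
    return total


if __name__ == "__main__":
    # Two preview calls (for example, generate + refresh) and one editing call.
    print(estimate_weboffice_cost([True, True, False]))
```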

Real-time transcoding billing details

Note
  • Billing components:

    • When generating a playlist, you can control initial transcoding duration using the InitialTranscode parameter. This incurs LiveTranscoding charges.

    • During playback, if a .ts file has not been transcoded, a new transcoding job starts. This incurs LiveTranscoding charges.

    • Reading source video files from OSS and writing transcoded files back to OSS incurs fees. Reading video files from OSS for playback also incurs fees. For details about OSS-related fees, refer to the OSS billing items.

  • LiveTranscoding count formula:

    • Video

      • Codec `eff` values: h264 = 0.3, h265 = 1.8

      • Formula:

    Ceiling(eff * Ceiling(Height/240) * Ceiling(Width/240) * Ceiling(FrameRate/30) + 1) * Ceiling(VideoStreamDuration)

    • Audio

      • `eff` value: 0.3

      • Formula:

    Ceiling(eff * Ceiling(AudioStreamDuration))

  • Billing rule: Each audio or video stream specified in TargetVideo.Stream or TargetAudio.Stream is processed separately and billed individually. The following examples illustrate real-time transcoding costs.

    • Example 1 (playlist generated only; no playback, so no LiveTranscoding charges):

      • A user calls GenerateVideoPlaylist. The output video length is 38 minutes, with a resolution of 800 × 600, a frame rate of 30, and an h264 video codec. The initial transcoding duration is 0 seconds, and TranscodeAhead uses the default value. No video is played.

    • Example 2 (playlist generated only; pre-transcoding configured; only pre-transcoding incurs LiveTranscoding charges):

      • A user calls GenerateVideoPlaylist. The output video length is 38 minutes, with a resolution of 800 × 600, a frame rate of 30, and an h264 video codec. The initial transcoding duration is 30 seconds, and TranscodeAhead uses the default value. No video is played.

      • Charges:

        • LiveTranscoding (CU count formula): Ceiling(0.3 * Ceiling(800/240) * Ceiling(600/240) * Ceiling(30/30) + 1) * Ceiling(30) + Ceiling(0.3 * Ceiling(30)) = 5 * 30 + 9 = 159 (CountUnit)

    • Example 3 (playlist generated and partially played; only played segments incur LiveTranscoding charges):

      • A user calls GenerateVideoPlaylist. The output video length is 38 minutes, with a resolution of 800 × 600, a frame rate of 30, and an h264 video codec. The initial transcoding duration is 0 seconds, and TranscodeAhead uses the default value. The user plays the m3u8 playlist from the start, stops at minute 5 (default forward transcoding is 2 minutes), then jumps to minute 15 and plays to the end.

      • Charges:

        • LiveTranscoding (CU count formula): Ceiling(0.3 * Ceiling(800/240) * Ceiling(600/240) * Ceiling(30/30) + 1) * (Ceiling((5+2)*60) + Ceiling((38-15)*60)) + Ceiling(0.3 * Ceiling((5+2) * 60)) + Ceiling(0.3 * Ceiling((38-15) * 60)) = 5 * 1800 + 126 + 414 = 9540 (CountUnit)

    • Example 4 (multiple users play the same video; repeated playback incurs LiveTranscoding charges only once):

      • A user calls GenerateVideoPlaylist. The output video length is 38 minutes, with a resolution of 800 × 600, a frame rate of 30, and an h264 video codec. The initial transcoding duration is 0 seconds, and TranscodeAhead uses the default value.

        User A plays the m3u8 playlist from the start to minute 5, then exits.

        User B plays the m3u8 playlist from minute 15 to the end.

        User C plays the m3u8 playlist from start to finish.

      • Charges:

        • LiveTranscoding (CU count formula): Ceiling(0.3 * Ceiling(800/240) * Ceiling(600/240) * Ceiling(30/30) + 1) * Ceiling(38*60) + Ceiling(0.3 * Ceiling(38 * 60)) = 5 * 2280 + 684 = 12084 (CountUnit)

  • Term definitions:

    • Width: The width of the output video resolution.

    • Height: The height of the output video resolution.

    • FrameRate: The video frame rate.

    • VideoStreamDuration: The duration of the video stream, in seconds.

    • AudioStreamDuration: The duration of the audio stream, in seconds.

    • `eff`: The CountUnit (CU) coefficient.

    • Ceiling(x) function: Returns the smallest integer greater than or equal to x.
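
The LiveTranscoding count formulas above can be checked with a short script. This is a sketch under stated assumptions: the function names are hypothetical, durations are given in seconds, and exact fractions are used for `eff` to avoid floating-point rounding before the ceiling is taken. The calls at the bottom reproduce the totals given in Examples 2 to 4 (159, 9540, and 12084 CountUnit).

```python
import math
from fractions import Fraction

# Codec coefficients (eff) from the formulas above, as exact fractions.
VIDEO_EFF = {"h264": Fraction(3, 10), "h265": Fraction(9, 5)}
AUDIO_EFF = Fraction(3, 10)


def video_count_units(codec: str, width: int, height: int,
                      frame_rate: float, duration_seconds: float) -> int:
    """Ceiling(eff * Ceiling(H/240) * Ceiling(W/240) * Ceiling(FPS/30) + 1) * Ceiling(duration)."""
    per_second = math.ceil(
        VIDEO_EFF[codec]
        * math.ceil(height / 240)
        * math.ceil(width / 240)
        * math.ceil(frame_rate / 30)
        + 1
    )
    return per_second * math.ceil(duration_seconds)


def audio_count_units(duration_seconds: float) -> int:
    """Ceiling(eff * Ceiling(duration)) for one audio stream."""
    return math.ceil(AUDIO_EFF * math.ceil(duration_seconds))


def stream_pair_count_units(duration_seconds: float) -> int:
    """One 800 x 600, 30 fps h264 video stream plus one audio stream, as in the examples."""
    return (video_count_units("h264", 800, 600, 30, duration_seconds)
            + audio_count_units(duration_seconds))


if __name__ == "__main__":
    # Example 2: 30 seconds of initial transcoding.
    print(stream_pair_count_units(30))
    # Example 3: two separately transcoded ranges, (5+2) minutes and (38-15) minutes.
    print(stream_pair_count_units((5 + 2) * 60) + stream_pair_count_units((38 - 15) * 60))
    # Example 4: the full 38 minutes, transcoded once.
    print(stream_pair_count_units(38 * 60))
```

Multiplying a CountUnit total by the LiveTranscoding unit price in the Media Processing table gives an estimate of the charge. OSS read and write fees are billed separately.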

Operator and billing item mapping

When you create metadata indexes by binding an OSS bucket or by calling IndexFileMeta or BatchIndexFileMeta, IMM executes the operators specified in the workflow template and operator mapping. Executing these operators generates data processing fees, index storage fees, and OSS request fees. OSS request fees are charged by OSS. For more information, refer to OSS request fees. The following table maps operators to billing items:

| Operator | Billing Item | Billed by |
| --- | --- | --- |
| OSSMeta operator | GetRequest | OSS |
| MIME operator | No charge | N/A |
| FaceDetection operator | ImageFace* | IMM |
| LabelClassification operator (image) | ImageClassification* | IMM |
| LabelClassification operator (video) | VideoClassification | IMM |
| ImageScoring operator | ImageScoring* | IMM |
| ReGEO operator | ReverseGeocoding | IMM |
| MediaMeta operator | MediaMeta | IMM |
| EXIF operator | GetRequest | OSS |
| ExtractDocumentText operator | DocumentConvert | IMM |
| ExtractImageEmbeddings operator | Free for a limited time | IMM |

* See the Important note that follows this table.

Important

To process various image formats, IMM uses the image processing feature of Object Storage Service (OSS) to perform one or more operations, such as format conversion or image scaling. These operations incur fees charged by OSS. For more information about these fees, refer to Data processing fees.

External request fees

Accessing OSS through Intelligent Media Management (IMM) incurs OSS request fees. For more information, refer to Request fees.