Intelligent Media Management: Billable items

Last Updated: Nov 27, 2025

This topic describes the billable items of Intelligent Media Management (IMM) and includes important notes.

Pricing of billable items on Alibaba Cloud International Website

Intelligent Media Management (IMM) has billable items in the following categories: smart imaging, metadata management, media processing, document processing, and file processing.

Important

Starting from 11:00 on July 28, 2025 (UTC+8), Intelligent Media Management (IMM) will charge for some features that are currently free and adjust the prices of some existing billable items. For more information, see IMM Billing Adjustment Announcement.
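The before and after price columns in the tables that follow can be selected by timestamp. The snippet below is a minimal sketch: it assumes that a request made exactly at the cutover instant is billed at the new price and that "Free for a limited time" means a price of zero. The `PRICES` dictionary and the `unit_price` helper are illustrative names, not part of any IMM SDK, and only three rows from the tables are included.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

# 11:00 on July 28, 2025 (UTC+8), as stated in the announcement above.
CUTOVER = datetime(2025, 7, 28, 11, 0, tzinfo=timezone(timedelta(hours=8)))

# (price before, price after) in USD per unit; None stands for "Free for a limited time".
PRICES = {
    "StandardQueryL0": (Decimal("0.014"), Decimal("0.001")),  # per 1,000 calls
    "VideoFraming": (Decimal("0.142"), Decimal("0.015")),     # per 1,000 frames
    "SemanticAnalyze": (None, Decimal("0.52")),               # per 1,000 calls
}


def unit_price(item, when):
    """Return the unit price of a billable item at a timezone-aware datetime."""
    before, after = PRICES[item]
    # Assumption: the cutover instant itself already uses the new price.
    price = after if when >= CUTOVER else before
    return Decimal("0") if price is None else price


# 02:59 UTC is 10:59 UTC+8, one minute before the adjustment takes effect.
print(unit_price("VideoFraming", datetime(2025, 7, 28, 2, 59, tzinfo=timezone.utc)))
print(unit_price("VideoFraming", datetime(2025, 7, 28, 3, 0, tzinfo=timezone.utc)))
```

The `when` argument must carry a time zone. Comparing a naive datetime with `CUTOVER` raises a `TypeError`.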

Smart imaging

The following table describes the pricing of the smart imaging billable items.

Billable item

Description

Related API operations

Related x-oss-process operations

Price before 11:00 on July 28, 2025 (USD)

Price after 11:00 on July 28, 2025 (USD)

Unit

ImageDetect

Face detection

  • DetectImageFaces (Face detection)

  • CompareImageFaces (Compare face similarity in images)

  • SearchImageFigureCluster (Query the cluster to which a face in an image belongs)

  • CreateFacesSearchingTask (Create a task to search for images with similar faces)

  • image/faces

  • image/crop,g_face

  • image/blur,g_face

  • image/blur,g_faces

0.028

0.028

Per 1,000 calls

Body detection

DetectImageBodies (Body detection in images)

image/bodies

Free for a limited time

Vehicle detection

DetectImageCars

image/cars

Free for a limited time

ImageLabel

Image tagging

DetectImageLabels (Image tag detection)

image/labels

0.142

0.142

Per 1,000 calls

ImageFace

Face image

CreateFacesSearchingTask (Create a task to search for images with similar faces)

Free for a limited time

0.028

Per 1,000 calls

ImageFaceClustering

Face clustering

  • CreateFigureClusteringTask (Create a face clustering task for figures)

  • CreateFigureClustersMergingTask (Create a task to merge face cluster groups)

7.0754717

7.0754717

Per 1,000 calls

GenerateStory

Story generation

CreateStory (Create a story)

7.0754717

7.0754717

Per 1,000 calls

ImageMosaic

Image mosaic

AddImageMosaic (Add a mosaic to an image)

Free for a limited time

0.0074

Per 1,000 calls

ImageCropping

Smart cropping suggestions for images

DetectImageCropping (Detect visually appealing crop boxes in an image)

image/crop,g_auto

0.1415094

0.1415094

Per 1,000 calls

ImageQRCodes

QR code detection in images

DetectImageCodes (QR code detection in images)

image/codes

0.1132075

0.1132075

Per 1,000 calls

ImageSplicing

Image splicing

CreateImageSplicingTask (Create an image splicing task)

Free for a limited time

  • If the output resolution is smaller than 6000 × 4000: 0.745

  • If the output resolution is larger than 6000 × 4000: 2.234

Per 1,000 calls

ImageToPDF

Image to PDF conversion

CreateImageToPDFTask (Create an image-to-PDF conversion task)

Free for a limited time

0.0074

Per 1,000 images

ImageScoring

Image quality scoring

DetectImageScore (Image quality scoring)

image/scoring

0.0424528

0.0424528

Per 1,000 calls

LocationDateClustering

Spatiotemporal clustering

CreateLocationDateClusteringTask (Create a spatiotemporal clustering task)

Free for a limited time

Free for a limited time

Per 1,000 calls

SimilarImageClustering

Image clustering

CreateSimilarImageClusteringTask (Create a similar image clustering task)

Free for a limited time

Free for a limited time

Per 1,000 calls

Blindwatermark

Blind watermark for images

  • EncodeBlindWatermark (Add a blind watermark to an image)

  • CreateDecodeBlindWatermarkTask (Create a blind watermark parsing task)

  • image/blindwatermark

  • image/deblindwatermark

0.0990566

0.0990566

Per 1,000 calls

ReverseGeocoding

Reverse geocoding

DetectMediaMeta (Get media file metadata)

Note

Charged when the media file contains geographic location information.

0.1415094

0.1415094

Per 1,000 calls

ImageTexts

Image text recognition (OCR)

DetectImageTexts (Image text recognition)

7.0754717

7.0754717

Per 1,000 calls

Metadata management

The following table describes the pricing of the metadata management billable items.

Billable item

Description

Related API operations

Related x-oss-process operations

Price before 11:00 on July 28, 2025 (USD)

Price after 11:00 on July 28, 2025 (USD)

Unit

StandardQueryL0

Basic query

  • GetFileMeta (Get file metadata)

  • DeleteFileMeta (Delete object metadata)

  • UpdateFileMeta (Update file metadata)

  • BatchDeleteFileMeta (Batch delete object metadata)

  • GetFigureCluster (Get face clustering information for a figure)

  • UpdateFigureCluster (Update figure cluster)

  • UpdateStory (Update story)

  • GetStory (Get story information)

  • DeleteStory (Delete story)

  • UpdateLocationDateCluster (Update spatiotemporal cluster)

  • DeleteLocationDateCluster (Delete spatiotemporal cluster group)

task/get

0.014

0.001

Per 1,000 calls

StandardQueryL1

Standard query

  • BatchGetFileMeta (Batch get file metadata)

  • BatchUpdateFileMeta (Batch update object metadata)

  • ListFaceGroups (Get the list of face groups in a media set)

  • AddStoryFiles (Add files to a story)

  • QueryStories (Query stories)

  • RemoveStoryFiles (Remove files from a story)

  • QueryFigureClusters (Query figure clusters)

  • QueryLocationDateClusters (Query spatiotemporal clusters)

0.0283

0.002

Per 1,000 calls

StandardQueryL2

Advanced query

  • SimpleQuery (Simple query)

  • FuzzyQuery (Fuzzy query)

0.708

0.074

Per 1,000 calls

MediaMeta

Get media information

  • DetectMediaMeta (Get media file metadata)

  • GetMediaMeta (Get media file metadata)

  • audio/info

  • video/info

0.1415094

0.1415094

Per 1,000 calls

SemanticAnalyze

Semantic analysis

SemanticQuery (Natural language query)

Free for a limited time

0.52

Per 1,000 calls

Media processing

The following table describes the pricing of the media processing billable items.

Billable item

Description

Related API operations

Related x-oss-process operations

Price before 11:00 on July 28, 2025 (USD)

Price after 11:00 on July 28, 2025 (USD)

Unit

AudioCompress

Audio transcoding

CreateMediaConvertTask (Create a media transcoding task)

  • audio/concat

  • audio/compress

0.0000141509

0.0000141509

Per second of audio

VideoCompressCopy

Container format conversion

CreateMediaConvertTask (Create a media transcoding task)

0.0001415094

0.0001415094

Per second of video

VideoCompress264LD

H.264 transcoding - LD (see Notes on video transcoding)

CreateMediaConvertTask (Create a media transcoding task)

  • video/concat

  • video/convert

0.0000509434

0.0000509434

Per second of video

VideoCompress264SD

H.264 transcoding - SD (see Notes on video transcoding)

CreateMediaConvertTask (Create a media transcoding task)

  • video/concat

  • video/convert

0.0000707547

0.0000707547

Per second of video

VideoCompress264HD

H.264 transcoding - HD (see Notes on video transcoding)

CreateMediaConvertTask (Create a media transcoding task)

  • video/concat

  • video/convert

0.0001273585

0.0001273585

Per second of video

VideoCompress2642K

H.264 transcoding - 2K (see Notes on video transcoding)

CreateMediaConvertTask (Create a media transcoding task)

  • video/concat

  • video/convert

0.0002830189

0.0002830189

Per second of video

VideoCompress2644K

H.264 transcoding - 4K (see Notes on video transcoding)

CreateMediaConvertTask (Create a media transcoding task)

  • video/concat

  • video/convert

0.0006367925

0.0006367925

Per second of video

VideoCompress265LD

H.265 transcoding - LD (see Notes on video transcoding)

CreateMediaConvertTask (Create a media transcoding task)

  • video/concat

  • video/convert

0.0002122642

0.0002122642

Per second of video

VideoCompress265SD

H.265 transcoding - SD (see Notes on video transcoding)

CreateMediaConvertTask (Create a media transcoding task)

  • video/concat

  • video/convert

0.0003537736

0.0003537736

Per second of video

VideoCompress265HD

H.265 transcoding - HD (see Notes on video transcoding)

CreateMediaConvertTask (Create a media transcoding task)

  • video/concat

  • video/convert

0.0007075472

0.0007075472

Per second of video

VideoCompress2652K

H.265 transcoding - 2K (see Notes on video transcoding)

CreateMediaConvertTask (Create a media transcoding task)

  • video/concat

  • video/convert

0.0011320755

0.0011320755

Per second of video

VideoCompress2654K

H.265 transcoding - 4K (see Notes on video transcoding)

CreateMediaConvertTask (Create a media transcoding task)

  • video/concat

  • video/convert

0.0022641509

0.0022641509

Per second of video

MediaAnimation

Video to animated image conversion

CreateMediaConvertTask (Create a media transcoding task)

video/animation

Free for a limited time

  • Basic format: 0.012

  • Advanced format: 0.074

Per 1,000 frames

ExtractSubtitleText

Video text caption extraction

CreateMediaConvertTask (Create a media transcoding task)

Free for a limited time

0.223

Per 1,000 streams

ExtractSubtitleImage

Video image caption extraction

CreateMediaConvertTask (Create a media transcoding task)

Free for a limited time

0.015

Per 1,000 frames

VideoFraming

Video snapshot

CreateMediaConvertTask (Create a media transcoding task)

  • video/snapshots

  • video/sprite

0.142

0.015

Per 1,000 frames

VideoClassification

Video tag detection

CreateVideoLabelClassificationTask (Create a video label detection task)

7.0754717

7.0754717

Per 1,000 calls

LiveTranscoding

Transcoding during playback (see Notes on billing for transcoding during playback)

GenerateVideoPlaylist (Generate a playlist for transcoding during playback)

  • hls/m3u8

  • hls/ts

0.0000141509

0.0000141509

Per compute unit (CU)

Document processing

The following table describes the pricing of the document processing billable items.

Important

For projects created before December 1, 2023, online preview and online editing are billed based on the number of times a document is opened. For projects created on or after December 1, 2023, these features are billed based on the number of API operation calls.

Billable item

Description

Related API operations

Related x-oss-process operations

Price before 11:00 on July 28, 2025 (USD)

Price after 11:00 on July 28, 2025 (USD)

Unit

DocumentConvert

Document conversion

CreateOfficeConversionTask (Create a document conversion task)

  • doc/convert

  • doc/snapshot

11.3207547

11.3207547

Per 1,000 calls

Document content extraction

ExtractDocumentText (Extract document text)

DocumentWebofficeEdit

Online editing (Weboffice) (see Notes on billing for document preview and editing)

  • GenerateWebofficeToken (Get a Weboffice token)

  • RefreshWebofficeToken (Refresh a Weboffice token)

doc/edit

2.8301887

2.8301887

Per 1,000 calls

DocumentWebofficePreview

Online preview (Weboffice) (see Notes on billing for document preview and editing)

  • GenerateWebofficeToken (Get a Weboffice token)

  • RefreshWebofficeToken (Refresh a Weboffice token)

doc/preview

1.4150943

1.4150943

Per 1,000 calls

DocumentWebofficeCachePreview

Cached preview (Weboffice)

  • GenerateWebofficeToken (Get a Weboffice token)

  • RefreshWebofficeToken (Refresh a Weboffice token)

0.9905660

0.9905660

Per 1,000 calls

Important

This refers to the number of API operation calls.

File processing

The following table describes the pricing of the file processing billable items.

Billable item

Description

Related API operations

Related x-oss-process operations

Price before 11:00 on July 28, 2025 (USD)

Price after 11:00 on July 28, 2025 (USD)

Unit

PointCloudCompress

Point cloud compression

CreateCompressPointCloudTask (Create a point cloud compression task)

pointcloud/compress

Free for a limited time

0.03

Per 1,000 calls

FileProcess

File packaging and download

CreateFileCompressionTask (Create a file compression task)

Free for a limited time

0.00074

GB

Compressed package decompression

CreateFileUncompressionTask (Create a decompression task)

Free for a limited time

FilePreview

Compressed package preview

CreateArchiveFileInspectionTask (Create a compressed package preview and parsing task)

Free for a limited time

0.0074

TB

Notes on API operations that involve multiple billable items

Note
  • The SemanticQuery API operation incurs fees for two billable items: StandardQueryL2 and SemanticAnalyze.

  • The CreateFacesSearchingTask API operation incurs fees for two billable items: ImageDetect and ImageFace.
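The combined charge for these two operations can be worked out by adding the per-item prices. The snippet below is a rough sketch: it uses the prices listed for the period after 11:00 on July 28, 2025, and it assumes that each call is charged once per billable item and that fees scale linearly with the number of calls. The dictionary and function names are illustrative only.

```python
from decimal import Decimal

# Prices in USD per 1,000 calls, copied from the tables above (after the adjustment).
PRICE_PER_1000 = {
    "StandardQueryL2": Decimal("0.074"),
    "SemanticAnalyze": Decimal("0.52"),
    "ImageDetect": Decimal("0.028"),
    "ImageFace": Decimal("0.028"),
}

# Billable items incurred by each operation, as listed in the note above.
BILLABLE_ITEMS = {
    "SemanticQuery": ("StandardQueryL2", "SemanticAnalyze"),
    "CreateFacesSearchingTask": ("ImageDetect", "ImageFace"),
}


def cost_usd(operation, calls):
    """Estimated fee for a number of calls, summed over all incurred items."""
    per_1000 = sum(PRICE_PER_1000[item] for item in BILLABLE_ITEMS[operation])
    # Assumption: linear pro-rating; the actual invoice rounding is not specified here.
    return per_1000 * calls / 1000


print(cost_usd("SemanticQuery", 1000))              # 0.594
print(cost_usd("CreateFacesSearchingTask", 10000))  # 0.56
```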

Notes on video transcoding

Note
  • H.264 transcoding: The output video uses the H.264 encoder.

  • H.265 transcoding: The output video uses the H.265 encoder.

  • LD: The resolution of the transcoded video is less than or equal to 640 × 480.

  • SD: The resolution of the transcoded video is less than or equal to 1280 × 720.

  • HD: The resolution of the transcoded video is less than or equal to 1920 × 1080.

  • 2K: The resolution of the transcoded video is less than or equal to 2560 × 1440.

  • 4K: The resolution of the transcoded video is less than or equal to 3840 × 2160.

  • Video transcoding is billed per second of video. The transcoding length is rounded up to the nearest second. Durations less than 1 second are billed as 1 second.
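The rounding rule above can be sketched as follows. This is an illustrative estimate only: the prices are the per-second values from the H.264 and H.265 rows of the transcoding table, the caller supplies the tier (LD, SD, HD, 2K, or 4K) rather than having it derived from the resolution, and the helper names are hypothetical.

```python
import math
from decimal import Decimal

# USD per second of output video, copied from the H.264 and H.265 table rows.
PRICE_PER_SECOND = {
    "h264": {"LD": "0.0000509434", "SD": "0.0000707547", "HD": "0.0001273585",
             "2K": "0.0002830189", "4K": "0.0006367925"},
    "h265": {"LD": "0.0002122642", "SD": "0.0003537736", "HD": "0.0007075472",
             "2K": "0.0011320755", "4K": "0.0022641509"},
}


def billed_seconds(duration_s):
    """Round the duration up to a whole second, with a minimum of 1 second."""
    return max(1, math.ceil(duration_s))


def transcode_cost(codec, tier, duration_s):
    """Estimated transcoding fee in USD for one output video."""
    return Decimal(PRICE_PER_SECOND[codec][tier]) * billed_seconds(duration_s)


print(billed_seconds(0.4))                   # 1
print(billed_seconds(61.2))                  # 62
print(transcode_cost("h264", "HD", 61.2))    # 0.0078962270
```

The tier is passed in explicitly because the notes define each tier only by its maximum resolution and do not say how non-standard aspect ratios are classified.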

Notes on billing for document preview and editing

Note
  • For projects created before December 1, 2023, online editing and online preview are billed based on the number of times a document is opened, not the number of API operation calls.

  • Projects created on or after December 1, 2023 are billed based on the number of API operation calls. To switch to the new billing method, you must create a new project.

  • In the billing mode based on the number of API operation calls, a single API call can be used by only one user. If the call is reused, only the last user can access the document. The access permissions of other users are revoked.

  • If the Permission.Readonly parameter in the GenerateWebofficeToken API operation is set to true, you are charged for document preview. If this parameter is set to false, you are charged for online editing.

  • The billing for RefreshWebofficeToken depends on the parameters used in the original GenerateWebofficeToken API call. If the Permission.Readonly parameter was set to true, you are charged for document preview. Otherwise, you are charged for online editing.
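For projects billed by API operation calls, the last two rules can be sketched as a small lookup. The snippet assumes that each GenerateWebofficeToken or RefreshWebofficeToken call counts as one billed call and that fees scale linearly. It does not cover DocumentWebofficeCachePreview, which these notes do not describe. The function names are illustrative.

```python
from decimal import Decimal

# USD per 1,000 calls, copied from the document processing table.
PRICE_PER_1000_CALLS = {
    "DocumentWebofficePreview": Decimal("1.4150943"),
    "DocumentWebofficeEdit": Decimal("2.8301887"),
}


def billable_item(readonly):
    """Map Permission.Readonly of GenerateWebofficeToken to a billable item."""
    return "DocumentWebofficePreview" if readonly else "DocumentWebofficeEdit"


def session_cost(readonly, refresh_calls):
    """Estimated fee for one token plus its refreshes.

    Refresh calls follow the Permission.Readonly value of the original
    GenerateWebofficeToken call, so all calls share one billable item.
    """
    calls = 1 + refresh_calls  # assumption: every call is billed once
    return PRICE_PER_1000_CALLS[billable_item(readonly)] * calls / 1000


print(billable_item(True))     # DocumentWebofficePreview
print(session_cost(False, 1))  # two calls billed as online editing
```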

Notes on billing for transcoding during playback

Note
  • Billing is based on the following components:

    • When you generate a playlist, you can set the InitialTranscode parameter to control the duration of the initial transcoding. This incurs LiveTranscoding fees.

    • When you play a video, if you play a TS file that has not been transcoded, a new transcoding task is triggered. This incurs LiveTranscoding fees.

    • During transcoding, fees are incurred for reading the source video file from OSS and writing the transcoded file to OSS. Fees are also incurred for reading the video file from OSS for playback. For more information about OSS-related fees, see OSS billable items.

  • Formula for calculating LiveTranscoding compute units (CUs):

    • Video

      • The `eff` parameter values for the codec of different video outputs are: h264: 0.3, h265: 1.8.

      • The formula is as follows:

    Ceiling(eff * Ceiling(Height/240) * Ceiling(Width/240) * Ceiling(FrameRate/30) + 1) * Ceiling(VideoStreamDuration)

    • Audio

      • The `eff` parameter value is 0.3.

      • The formula is as follows:

    Ceiling(eff * Ceiling(AudioStreamDuration))

  • Billing rules: Real-time processing of multiple video or audio streams is performed based on the settings of TargetVideo.Stream or TargetAudio.Stream. Each audio and video stream is billed separately. The following examples describe how real-time transcoding fees are calculated.

    • Example 1 (Only a playlist is generated. No transcoding-during-playback fees are incurred if the video is not played.):

      • A user calls GenerateVideoPlaylist. The output video is 38 minutes long. The resolution is 800 × 600, the frame rate is 30, and the video encoding format is H.264. The initial transcoding duration is 0 seconds. The default value is used for TranscodeAhead. The video is not played.

    • Example 2 (Only a playlist is generated. Pre-transcoding is configured. Only pre-transcoding fees are incurred if the video is not played.):

      • A user calls GenerateVideoPlaylist. The output video is 38 minutes long. The resolution is 800 × 600, the frame rate is 30, and the video encoding format is H.264. The initial transcoding duration is 30 seconds. The default value is used for TranscodeAhead. The video is not played.

      • Fees incurred:

        • LiveTranscoding (The number of CUs is calculated using the following formula): Ceiling(0.3 * Ceiling(800/240) * Ceiling(600/240) * Ceiling(30/30) + 1) * Ceiling(30) + Ceiling(0.3 * Ceiling(30)) = 5 * 30 + 9 = 159 (CUs)

    • Example 3 (After a playlist is generated, a part of the video is played. Transcoding-during-playback fees are incurred only for the played part of the video.):

      • A user calls GenerateVideoPlaylist. The output video is 38 minutes long. The resolution is 800 × 600, the frame rate is 30, and the video encoding format is H.264. The initial transcoding duration is 0 seconds. The default value is used for TranscodeAhead. The user plays the video using the M3U8 file, starts playing from the beginning to the 5th minute (the video is transcoded 2 minutes ahead by default), and then jumps to the 15th minute and plays to the end of the video.

      • Fees incurred:

        • LiveTranscoding (The number of CUs is calculated using the following formula): Ceiling(0.3 * Ceiling(800/240) * Ceiling(600/240) * Ceiling(30/30) + 1) * (Ceiling((5+2)*60) + Ceiling((38-15)*60)) + Ceiling(0.3 * Ceiling((5+2)*60)) + Ceiling(0.3 * Ceiling((38-15)*60)) = 5 * 1800 + 126 + 414 = 9540 (CUs)

    • Example 4: If multiple users play the video, transcoding-during-playback fees are incurred only once for the parts that are played repeatedly.

      • A user calls GenerateVideoPlaylist. The output video is 38 minutes long. The resolution is 800 × 600, the frame rate is 30, and the video encoding format is H.264. The initial transcoding duration is 0 seconds. The default value is used for TranscodeAhead.

        User A plays the video using the M3U8 file, starts playing from the beginning to the 5th minute, and then stops playing.

        User B plays the video using the M3U8 file, starts playing from the 15th minute to the end.

        User C plays the video using the M3U8 file from the beginning to the end.

      • Fees incurred:

        • LiveTranscoding (The number of CUs is calculated using the following formula): Ceiling(0.3 * Ceiling(800/240) * Ceiling(600/240) * Ceiling(30/30) + 1) * Ceiling(38*60) + Ceiling(0.3 * Ceiling(38*60)) = 5 * 2280 + 684 = 12084 (CUs)

  • Terminology:

    • Width: The width of the output video resolution.

    • Height: The height of the output video resolution.

    • FrameRate: The video frame rate.

    • VideoStreamDuration: The length of the video stream.

    • AudioStreamDuration: The length of the audio stream.

    • eff: The CU coefficient.

    • Ceiling(x) function: Returns the smallest integer that is greater than or equal to x.
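The CU formulas and the worked examples above can be checked with a short script. This is a sketch under stated assumptions rather than an official calculator: it models one video stream and one audio stream per output, treats each transcoded segment independently (as Example 3 does), and uses exact fractions for `eff` so that floating-point rounding cannot affect the Ceiling steps. The function names are illustrative.

```python
import math
from fractions import Fraction

# eff coefficients from the notes above, stored as exact fractions.
EFF_VIDEO = {"h264": Fraction(3, 10), "h265": Fraction(9, 5)}
EFF_AUDIO = Fraction(3, 10)


def video_cus(width, height, frame_rate, duration_s, codec="h264"):
    """CUs for one video stream segment that lasts duration_s seconds."""
    per_second = math.ceil(
        EFF_VIDEO[codec]
        * math.ceil(height / 240)
        * math.ceil(width / 240)
        * math.ceil(frame_rate / 30)
        + 1
    )
    return per_second * math.ceil(duration_s)


def audio_cus(duration_s):
    """CUs for one audio stream segment that lasts duration_s seconds."""
    return math.ceil(EFF_AUDIO * math.ceil(duration_s))


def live_transcoding_cus(width, height, frame_rate, segments_s, codec="h264"):
    """Total CUs for one video and one audio stream over all transcoded segments."""
    return sum(
        video_cus(width, height, frame_rate, s, codec) + audio_cus(s)
        for s in segments_s
    )


# Example 2: 30 seconds of initial transcoding, no playback.
example_2 = live_transcoding_cus(800, 600, 30, [30])
# Example 3: minutes 0 to 7 (5 played plus 2 ahead) and minutes 15 to 38.
example_3 = live_transcoding_cus(800, 600, 30, [(5 + 2) * 60, (38 - 15) * 60])
# Example 4: the whole 38-minute video, transcoded only once.
example_4 = live_transcoding_cus(800, 600, 30, [38 * 60])
print(example_2, example_3, example_4)  # 159 9540 12084
```

For an 800 × 600 H.264 output at 30 frames per second, the video term works out to 5 CUs per second, which is the factor that appears in all three examples.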

Mapping between operators and billable items

Creating metadata indexes, either by attaching an OSS bucket or by calling the IndexFileMeta or BatchIndexFileMeta operation, incurs fees. Running the operators described in Mappings between workflow templates and operators incurs data processing fees, index storage fees, and OSS request fees. OSS request fees are charged by OSS. For more information, see OSS Request fees. The following table shows the mappings between operators and billable items.

| Operator | Billable item | Billed by |
| --- | --- | --- |
| OSSMeta operator | GetRequest | OSS |
| MIME operator | No charge | N/A |
| FaceDetection operator | ImageFace (see the Important note after this table) | IMM |
| LabelClassification operator (image) | ImageClassification (see the Important note after this table) | IMM |
| LabelClassification operator (video) | VideoClassification | IMM |
| ImageScoring operator | ImageScoring (see the Important note after this table) | IMM |
| ReGEO operator | ReverseGeocoding | IMM |
| MediaMeta operator | MediaMeta | IMM |
| EXIF operator | GetRequest | OSS |
| ExtractDocumentText operator | DocumentConvert | IMM |
| ExtractImageEmbeddings operator | Free for a limited time | IMM |

Important

To process image files in various formats, IMM uses the image processing capabilities of Object Storage Service (OSS) to perform one or more operations, such as format conversion and image scaling. These operations incur fees that are charged by OSS. For more information about these fees, see Data processing fees.

External request fees

Accessing Object Storage Service (OSS) through Intelligent Media Management (IMM) incurs OSS request fees. For more information, see Request fees.