Intelligent Media Management: API overview

Last Updated: Mar 17, 2026

API standard and pre-built SDKs in multiple languages

The OpenAPI specification of this product (imm/2020-09-30) follows the RPC standard. Alibaba Cloud provides pre-built SDKs for popular programming languages to abstract low-level complexities such as request signing. This enables developers to call APIs using language-specific syntax without dealing with HTTP details directly.
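As a rough illustration of what the SDKs hide, an RPC-style call names the operation and the API version as request parameters rather than encoding them in a URL path. The sketch below only assembles such parameters and does not sign or send anything. The parameter names `Action` and `Version`, the helper `build_rpc_params`, and the `ProjectName` value are assumptions made for illustration and are not taken from this page.

```python
from urllib.parse import urlencode

# Version string taken from the product identifier imm/2020-09-30.
API_VERSION = "2020-09-30"


def build_rpc_params(action: str, **business_params: str) -> dict:
    """Assemble the name/value pairs of an unsigned RPC-style request.

    Assumption: the operation and version travel as `Action` and `Version`.
    """
    params = {"Action": action, "Version": API_VERSION}
    params.update(business_params)
    return params


# Hypothetical call: the project name is a placeholder.
params = build_rpc_params("GetProject", ProjectName="demo-project")

# Sorting the pairs gives a stable, reproducible query string.
query_string = urlencode(sorted(params.items()))
print(query_string)
```

A pre-built SDK would add authentication parameters and the signature on top of this, which is the part that is tedious to reproduce by hand.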

Custom signature

If the SDK does not support a specific requirement, such as a customized signature, sign requests manually by using the signature mechanism. Note that manual signing requires significant effort (usually about 5 business days). For support, join our DingTalk group (ID: 147535001692).

Before you begin

An Alibaba Cloud account has full administrative privileges. A compromised AccessKey pair exposes all associated resources to unauthorized access, posing a significant security risk. To call APIs securely, create a Resource Access Management (RAM) user with API access only, configure its AccessKey pairs, and implement the principle of least privilege (PoLP) through RAM policies. Use the Alibaba Cloud account only when its permissions are explicitly required for specific scenarios.
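The least-privilege advice above can be sketched as a small policy document that allows only the operations a RAM user actually needs. This is a minimal sketch under stated assumptions: the `imm:<OperationName>` action format, the wildcard resource, and the helper name `build_allow_policy` are illustrative and should be checked against the RAM documentation before use.

```python
import json


def build_allow_policy(operations: list[str]) -> dict:
    """Build a policy document that allows only the listed IMM operations."""
    return {
        "Version": "1",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumption: IMM actions are written as "imm:<OperationName>".
                "Action": [f"imm:{op}" for op in operations],
                # Assumption: "*" is used for brevity; narrow it in practice.
                "Resource": "*",
            }
        ],
    }


# A read-only user that can inspect projects and asynchronous tasks.
policy = build_allow_policy(["GetProject", "ListProjects", "GetTask"])
print(json.dumps(policy, indent=2))
```

Listing operations explicitly, instead of granting a wildcard action, keeps the impact of a leaked AccessKey pair limited to the operations named in the policy.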

Service Region

API

Title

Description

ListRegions List of Regions Supporting IMM Service Queries the list of regions that support IMM.

Project Management

API

Title

Description

CreateProject CreateProject Creates a project.
UpdateProject UpdateProject Updates information about a project.
GetProject GetProject Queries the basic information, datasets, and file statistics of a project.
ListProjects ListProjects Queries projects. You can call this operation to query the basic information, datasets, and file statistics of multiple projects at the same time.
DeleteProject DeleteProject Deletes a project.

Metadata Management

API

Title

Description

Dataset management
CreateDataset Create Dataset Creates a dataset.
UpdateDataset Update Dataset Updates information about a dataset.
GetDataset GetDataset Queries a dataset.
ListDatasets ListDatasets Queries a list of datasets. You can query the list by dataset prefix.
DeleteDataset DeleteDataset Deletes a dataset.
Metadata Index
IndexFileMeta IndexFileMeta Performs data processing on input files for tasks such as label detection, face detection, and location detection. This operation extracts object metadata and creates an index, which lets you retrieve data from a dataset.
BatchIndexFileMeta BatchIndexFileMeta This operation performs a batch index of object metadata by processing input files for tasks such as label detection, face detection, and location detection. The object metadata is then indexed into a dataset to support various data retrieval methods.
UpdateFileMeta UpdateFileMeta Updates the partial metadata of the indexed files in a dataset.
BatchUpdateFileMeta BatchUpdateFileMeta Updates some metadata items of files indexed into a dataset.
GetFileMeta GetFileMeta Queries metadata of a file whose metadata is indexed into the dataset.
BatchGetFileMeta BatchGetFileMeta Queries metadata of multiple objects or files in the specified dataset.
DeleteFileMeta DeleteFileMeta Removes the metadata of a file from a dataset.
BatchDeleteFileMeta BatchDeleteFileMeta Deletes the metadata of multiple files from a dataset.
Query and statistics
SimpleQuery SimpleQuery Queries files in a dataset by performing a simple query operation. The operation supports logical expressions.
SemanticQuery SemanticQuery Queries metadata in a dataset by inputting natural language.
FuzzyQuery FuzzyQuery Queries the extracted file metadata, including the file name, labels, path, custom tags, and other fields. If the value of a metadata field of a file matches the specified string, the metadata of the file is returned.
Intelligent management
Face clustering
CreateFigureClusteringTask CreateFigureClusteringTask Creates a figure clustering task. This task uses an intelligent algorithm to group the faces of different people in images that are indexed in a dataset.
CreateFigureClustersMergingTask CreateFigureClustersMergingTask Merges two or more figure clustering groups into a single figure clustering group.
GetFigureCluster GetFigureCluster Obtains basic information about face clustering, including the creation time, number of images, and cover.
QueryFigureClusters QueryFigureClusters Queries face groups based on given conditions.
BatchGetFigureCluster BatchGetFigureCluster Queries face clusters.
UpdateFigureCluster UpdateFigureCluster Updates information about a face cluster, such as the cluster name and labels.
SearchImageFigureCluster SearchImageFigureCluster Queries face clusters that contain a specific face in an image. Each face cluster contains information such as bounding boxes and similarity.
CreateFacesSearchingTask CreateFacesSearchingTask Searches a media set for the top N images most similar to a specified image or face ID. The operation returns the corresponding face IDs and bounding boxes, sorted by similarity in descending order.
Spatio-temporal clustering
CreateLocationDateClusteringTask CreateLocationDateClusteringTask The spatio-temporal clustering feature classifies files in a dataset based on their time and location. This feature works on indexed files, such as images and videos, that contain shooting time and location data. These classifications can represent content from a user's trip, where files have similar timestamps and locations. The classifications can also represent content shot at different places where a user lives or works. Analyzing the locations and time ranges of these classifications lets you categorize media files, create highlight reels, and generate photo and video stories.
QueryLocationDateClusters QueryLocationDateClusters Queries a list of spatiotemporal clusters based on the specified conditions.
UpdateLocationDateCluster UpdateLocationDateCluster Updates a spatiotemporal cluster.
DeleteLocationDateCluster DeleteLocationDateCluster Deletes a spatiotemporal cluster.
Story
CreateStory CreateStory Creates a story.
QueryStories QueryStories Queries stories based on the specified conditions.
GetStory GetStory Queries a story.
CreateCustomizedStory CreateCustomizedStory Creates a story based on the specified images and videos.
UpdateStory UpdateStory Updates the information about a story, such as the story name and cover image.
AddStoryFiles AddStoryFiles Adds objects to a story.
RemoveStoryFiles RemoveStoryFiles Deletes files from a story.
DeleteStory DeleteStory Deletes a story.
Image clustering
CreateSimilarImageClusteringTask CreateSimilarImageClusteringTask The similar image clustering feature groups images that you have indexed in a dataset into clusters based on visual similarity. This feature is useful for scenarios such as deduplicating images or selecting the best shots. For example, you can use it to filter burst photos in an album.
QuerySimilarImageClusters QuerySimilarImageClusters Queries the list of similar image clusters.
Data Binding
CreateBinding CreateBinding Creates a binding relationship between a dataset and an Object Storage Service (OSS) bucket. This allows for the automatic synchronization of incremental and full data and indexing.
GetBinding GetBinding Queries the binding relationship between a specific dataset and an Object Storage Service (OSS) bucket.
ListBindings ListBindings Queries bindings between a dataset and Object Storage Service (OSS) buckets.
DeleteBinding DeleteBinding Deletes the binding between a dataset and an Object Storage Service (OSS) bucket.
AttachOSSBucket AttachOSSBucket Binds an Object Storage Service (OSS) bucket to the specified project. The binding enables you to use IMM features by using the x-oss-process parameter.
DetachOSSBucket DetachOSSBucket Unbinds an Object Storage Service (OSS) bucket from the corresponding project.
GetOSSBucketAttachment GetOSSBucketAttachment Queries the name of the project bound to an Object Storage Service (OSS) bucket.
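The SimpleQuery entry in the table above mentions logical expressions. One way to picture such an expression is as a small tree of conditions combined with `and`. The sketch below builds that tree as plain data. The key names (`Field`, `Operation`, `Value`, `SubQueries`), the operator names (`prefix`, `gt`, `and`), and the field names `ContentType` and `Size` are assumptions made for illustration; consult the SimpleQuery reference for the actual request format.

```python
import json


def condition(field: str, operation: str, value) -> dict:
    """One leaf condition, for example Size > 1000000 (assumed key names)."""
    return {"Field": field, "Operation": operation, "Value": str(value)}


def all_of(*subqueries: dict) -> dict:
    """Combine conditions so that every one of them must match."""
    return {"Operation": "and", "SubQueries": list(subqueries)}


# Hypothetical query: image files larger than 1,000,000 bytes.
query = all_of(
    condition("ContentType", "prefix", "image/"),
    condition("Size", "gt", 1_000_000),
)
print(json.dumps(query, indent=2))
```

Building the expression from small helpers keeps nested conditions readable and makes it easy to serialize the result once, just before the request is sent.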

Image Processing

API

Title

Description

EncodeBlindWatermark EncodeBlindWatermark Embeds specific textual information into an image as watermarks. These watermarks are visually imperceptible and do not affect the aesthetics of the image or the integrity of the original data. The watermarks can be extracted by using the CreateDecodeBlindWatermarkTask operation.
CreateDecodeBlindWatermarkTask CreateDecodeBlindWatermarkTask Extracts a blind watermark.
GetDecodeBlindWatermarkResult GetDecodeBlindWatermarkResult Queries the result of an invisible watermark parsing task.
DetectImageLabels DetectImageLabels Detects scene, object, and event information in an image. Scene information includes natural landscapes, daily life, and disasters. Event information includes talent shows, office events, performances, and production events. Object information includes tableware, electronics, furniture, and transportation. The DetectImageLabels operation supports more than 30 different categories and thousands of labels.
DetectImageScore DetectImageScore Calculates the aesthetics quality score of an image based on metrics such as the composition, brightness, contrast, color, and resolution. The operation returns a score within the range from 0 to 1. A higher score indicates better image quality.
DetectImageCodes DetectImageCodes Detects barcodes and QR codes in an image.
DetectImageFaces DetectImageFaces Detects faces from an image, including face boundary information, attributes, and quality. The boundary information includes the distance from the y-coordinate of the vertex to the top edge (Top), distance from the x-coordinate of the vertex to the left edge (Left), height (Height), and width (Width). Face attributes include the age (Age), age standard deviation (AgeSD), gender (Gender), emotion (Emotion), mouth opening (Mouth), beard (Beard), hat wearing (Hat), mask wearing (Mask), glasses wearing (Glasses), head orientation (HeadPose), attractiveness (Attractive), and confidence levels for preceding attributes. Quality information includes the face quality score (FaceQuality) and face resolution (Sharpness).
DetectImageCropping DetectImageCropping Detects the cropping area that produces the optimal visual effect based on a given image ratio by using AI model capabilities.
AddImageMosaic AddImageMosaic Adds mosaics, Gaussian blurs, or solid color shapes to blur one or more areas of an image for privacy protection and saves the output image to the specified path in Object Storage Service (OSS).
CreateImageToPDFTask CreateImageToPDFTask Converts multiple images into a single PDF file and saves the file as a specified OSS object.
CreateImageSplicingTask CreateImageSplicingTask Stitches multiple images into a single image based on specified rules and saves the output to a specified OSS object.
CompareImageFaces CompareImageFaces Compares the similarity of the largest faces in two images. The largest face refers to the largest face frame in an image after face detection.
DetectImageBodies DetectImageBodies Detects human body information, such as the confidence level and body bounding box, in an image.
DetectImageCars DetectImageCars Detects the outline data, attributes, and license plate information of vehicles in an image. The vehicle attributes include the vehicle color (CarColor) and vehicle type (CarType). The license plate information includes the recognition content (Content) and plate frame (Boundary).
DetectImageTexts DetectImageTexts Recognizes and extracts text content from an image.
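The DetectImageFaces entry above describes a face boundary as Top, Left, Height, and Width. Drawing or cropping code often wants corner coordinates instead, so the sketch below converts one form to the other. It assumes the vertex is the top-left corner of the box and that all values are in pixels; the `Boundary` class and its lowercase attribute names are hypothetical and not part of any SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boundary:
    """Face box as described above: offsets from the top and left edges plus size."""

    top: int
    left: int
    height: int
    width: int

    def corners(self) -> tuple:
        """Return (x_min, y_min, x_max, y_max), assuming a top-left vertex."""
        return (
            self.left,
            self.top,
            self.left + self.width,
            self.top + self.height,
        )


# Made-up values for illustration only.
box = Boundary(top=40, left=25, height=120, width=100)
print(box.corners())  # (25, 40, 125, 160)
```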

Media Processing

API

Title

Description

CreateMediaConvertTask Create Media Transcoding Task Creates an asynchronous media transcoding task. This task processes audio and video files. It supports media transcoding, media concatenation, video snapshots, and converting videos into animated images.
DetectMediaMeta DetectMediaMeta Queries media metadata, including the media format and stream information.
CreateVideoLabelClassificationTask CreateVideoLabelClassificationTask Detects labels for scenarios, objects, and events in video content. This feature supports more than 30 categories and thousands of labels. Scenario labels include natural landscapes, life scenes, and disaster scenes. Event labels include talent shows, office work, performances, and production. Object labels include tableware, electronic products, furniture, and vehicles.
GetVideoLabelClassificationResult GetVideoLabelClassificationResult Queries the results of a video label detection task.
GenerateVideoPlaylist GenerateVideoPlaylist Generates a playlist from a video file for live transcoding. The output is an M3U8 file that enables immediate playback and on-demand transcoding based on playback progress. Compared with offline transcoding, this method significantly reduces transcoding wait times and lowers transcoding and storage overhead.
CreateHighlightTask CreateHighlightTask Creates a highlight task.

Document Processing

API

Title

Description

GenerateWebofficeToken Obtain Weboffice Token Generates an access credential for document preview and editing.
RefreshWebofficeToken Refresh Weboffice Token Refreshes the access credential for document preview and editing.
CreateOfficeConversionTask CreateOfficeConversionTask Creates a document conversion task that converts documents, such as Word, PowerPoint, Excel, and PDF files, stored in Object Storage Service (OSS) into images, text files, or PDF files.
ExtractDocumentText Extract Document Text Extracts text from a document.

File Processing

API

Title

Description

Compression and decompression
CreateFileCompressionTask CreateFileCompressionTask Creates a file compression task that packages files for download.
CreateArchiveFileInspectionTask CreateArchiveFileInspectionTask Creates a task to inspect a compressed file and retrieve a list of its contents without decompressing the file.
CreateFileUncompressionTask CreateFileUncompressionTask A file decompression task lets you decompress specific files or an entire compressed package to a specified location. Supported formats include Zip, RAR, and 7z.
Point cloud compression
CreateCompressPointCloudTask CreateCompressPointCloudTask The point cloud compression feature compresses point cloud data in Object Storage Service (OSS). This helps reduce data transmission over the network.

Content Security

API

Title

Description

DetectTextAnomaly DetectTextAnomaly Detects whether specified text contains anomalies, such as pornography, advertisements, excessive junk content, politically sensitive content, and abuse.
CreateImageModerationTask CreateImageModerationTask Detects non-compliant content in images, such as pornography, terrorism, undesirable scenes, logos, and text-in-image violations.
CreateVideoModerationTask CreateVideoModerationTask Detects threats or non-compliant content in videos. This operation can be used in scenarios such as pornography detection, terrorism and politically sensitive content detection, text and image violation detection, undesirable scene detection, and logo detection.
GetImageModerationResult GetImageModerationResult Queries an image compliance detection task.
GetVideoModerationResult GetVideoModerationResult Queries the result of a video moderation task.

Task Management

API

Title

Description

GetTask GetTask Queries information about an asynchronous task. Intelligent Media Management (IMM) has multiple asynchronous data processing capabilities, each of which has its own operation for creating tasks. For example, you can call the CreateFigureClusteringTask operation to create a face clustering task and the CreateFileCompressionTask operation to create a file compression task. The GetTask operation is a general operation. You can call this operation to query information about asynchronous tasks by task ID or type.
ListTasks ListTasks Lists tasks based on specific conditions, such as by time range and by tag.
Trigger
CreateTrigger CreateTrigger Creates a trigger to start data processing in Intelligent Media Management (IMM). The trigger is activated by event sources, such as Object Storage Service (OSS), and uses data processing templates to process media files, such as images, videos, and documents.
SuspendTrigger SuspendTrigger Suspends a running trigger.
ResumeTrigger ResumeTrigger Resumes a trigger that is in the Suspended or Failed state.
UpdateTrigger UpdateTrigger Updates information about a trigger, such as the input data source, data processing settings, and tags.
GetTrigger GetTrigger Queries the information about a trigger.
ListTriggers ListTriggers Queries triggers by tag or status.
DeleteTrigger DeleteTrigger Deletes a trigger.
Batch Processing
CreateBatch CreateBatch Creates a batch processing task that performs specified operations, such as transcoding and format conversion, on multiple existing files.
SuspendBatch SuspendBatch Suspends a batch processing task.
ResumeBatch ResumeBatch Resumes a batch processing task that is in the Suspended or Failed state.
UpdateBatch UpdateBatch Updates information for a batch processing task, such as the data source configuration, data processing configuration, and tags.
ListBatches ListBatches Queries batch processing tasks. You can query batch processing tasks based on conditions such as task tags and status. The results can be sorted.
GetBatch GetBatch Queries the information about a batch processing task.
DeleteBatch DeleteBatch Deletes a batch processing task.
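Because the Create...Task operations listed on this page are asynchronous and GetTask is the general way to check on them, a caller typically polls until the task leaves its running state. The sketch below shows that loop with the actual API call injected as a function, so it runs without any SDK. The `Status` key, the `"Running"` and `"Succeeded"` values, and the names `wait_for_task` and `fake_get_task` are assumptions made for illustration.

```python
import time
from typing import Callable


def wait_for_task(
    get_task: Callable[[str], dict],
    task_id: str,
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
) -> dict:
    """Poll `get_task` until the task is no longer running, then return it."""
    for _ in range(max_attempts):
        task = get_task(task_id)
        # Assumption: a task that is still in progress reports "Running".
        if task.get("Status") != "Running":
            return task
        time.sleep(interval_seconds)
    raise TimeoutError(f"task {task_id} still running after {max_attempts} attempts")


# Stand-in for a real GetTask call: two running responses, then a final one.
_responses = iter(
    [{"Status": "Running"}, {"Status": "Running"}, {"Status": "Succeeded"}]
)


def fake_get_task(task_id: str) -> dict:
    return next(_responses)


result = wait_for_task(fake_get_task, "task-123", interval_seconds=0.0)
print(result["Status"])
```

Passing the lookup in as a function keeps the retry policy separate from the transport, so the same loop works with an SDK client, a hand-signed request, or a test double like the one shown here.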

Others

API

Title

Description

ContextualAnswer Q&A API in AI Assistant Provides the Q&A operation of AI Assistant (phase II).
ContextualRetrieval ContextualRetrieval Retrieves semantically similar documents. The operation is designed for multi-turn conversations and can process message input in historical conversations. The operation returns results that are highly related to the current conversation based on an in-depth understanding of contextual content. It provides consistent and efficient information retrieval in multi-turn conversations.
ListAttachedOSSBuckets List OSS Bucket Binding Relationships Lists the attached OSS buckets.