IMM API Functions & Endpoints Overview - Intelligent Media Management

API standard and pre-built SDKs in multiple languages

The OpenAPI specification of this product (imm/2020-09-30) follows the RPC standard. Alibaba Cloud provides pre-built SDKs for popular programming languages to abstract low-level complexities such as request signing. This enables developers to call APIs using language-specific syntax without dealing with HTTP details directly.
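
As a rough illustration of what the SDKs abstract away, an RPC-style call is an HTTP request whose parameters name the operation and the API version. The sketch below only assembles the unsigned parameters. The version string comes from the product identifier above; the common parameter names (Action, Format, SignatureNonce, and so on) and the ProjectName parameter are assumptions based on the general Alibaba Cloud RPC convention and should be checked against the signature documentation.

```python
import uuid
from datetime import datetime, timezone

API_VERSION = "2020-09-30"  # taken from the product identifier imm/2020-09-30


def build_rpc_params(action, access_key_id, **operation_params):
    """Assemble the unsigned parameters of an RPC-style request.

    The common parameter names below are assumed from the general
    Alibaba Cloud RPC convention, not stated on this page.
    """
    params = {
        "Action": action,
        "Version": API_VERSION,
        "Format": "JSON",
        "AccessKeyId": access_key_id,
        "SignatureMethod": "HMAC-SHA1",
        "SignatureVersion": "1.0",
        "SignatureNonce": uuid.uuid4().hex,  # unique per request
        "Timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    # Operation-specific parameters, for example ProjectName (assumed name).
    params.update(operation_params)
    return params


params = build_rpc_params("GetProject", "<access-key-id>", ProjectName="demo-project")
```

An SDK performs this assembly, plus signing and transport, behind a single method call.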

Custom signature

If the SDK does not support your requirements, such as a customized signature, sign requests manually by using the signature mechanism. Manual signing requires significant effort (usually about 5 business days). For support, join our DingTalk group (ID: 147535001692).
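
To give a sense of what manual signing involves, here is a minimal sketch of the legacy HMAC-SHA1 signature used by Alibaba Cloud RPC-style APIs. Whether this product expects this scheme or a newer signature version is an assumption; the function names are hypothetical, and the official signature mechanism documentation is authoritative.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value):
    # RFC 3986 encoding: only A-Z, a-z, 0-9, '-', '_', '.', '~' stay unescaped.
    return quote(str(value), safe="~")


def sign_rpc_request(params, access_key_secret, http_method="GET"):
    """Return the Base64 signature for a dict of request parameters."""
    # 1. Sort parameters by name and build the canonical query string.
    canonical = "&".join(
        f"{percent_encode(k)}={percent_encode(v)}" for k, v in sorted(params.items())
    )
    # 2. Build the string to sign: METHOD & encoded "/" & encoded query string.
    string_to_sign = f"{http_method}&{percent_encode('/')}&{percent_encode(canonical)}"
    # 3. The HMAC key is the AccessKey secret followed by "&".
    key = (access_key_secret + "&").encode("utf-8")
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


signature = sign_rpc_request(
    {"Action": "ListRegions", "Version": "2020-09-30"}, "<access-key-secret>"
)
```

The resulting value would be sent as an additional Signature parameter. Details such as encoding edge cases and timestamp handling are where most of the manual effort goes.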

Before you begin

An Alibaba Cloud account has full administrative privileges. A compromised AccessKey pair exposes all associated resources to unauthorized access, posing a significant security risk. To call APIs securely, create a Resource Access Management (RAM) user with API access only, configure its AccessKey pairs, and implement the principle of least privilege (PoLP) through RAM policies. Use the Alibaba Cloud account only when its permissions are explicitly required for specific scenarios.
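
As an illustration of least privilege, the sketch below builds a RAM policy document that grants read-only access to project information. The action names are an assumption that follows the usual `imm:<OperationName>` pattern; confirm the exact action and resource identifiers in the RAM authorization documentation before use.

```python
import json

# Hypothetical read-only policy: only the two project query operations are allowed.
read_only_policy = {
    "Version": "1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["imm:GetProject", "imm:ListProjects"],  # assumed action names
            "Resource": "*",  # narrow this to specific resources where possible
        }
    ],
}

# Serialized form that can be pasted into a custom RAM policy.
policy_document = json.dumps(read_only_policy, indent=2)
```

Attach a policy like this to the RAM user instead of granting full IMM access, and add actions only as the application needs them.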

Service Region

API	Title	Description
ListRegions	List of Regions Supporting IMM Service	Queries the regions in which IMM is available.

Project Management

API	Title	Description
CreateProject	CreateProject	Creates a project.
UpdateProject	UpdateProject	Updates information about a project.
GetProject	GetProject	Queries the basic information, datasets, and file statistics of a project.
ListProjects	ListProjects	Queries projects. You can call this operation to query the basic information, datasets, and file statistics of multiple projects at the same time.
DeleteProject	DeleteProject	Deletes a project.

Metadata Management

API	Title	Description
Dataset management	Dataset management
CreateDataset	Create Dataset	Creates a dataset.
UpdateDataset	Update Dataset	Updates the information for a dataset.
GetDataset	GetDataset	Queries a dataset.
ListDatasets	ListDatasets	Queries a list of datasets. You can query the list by dataset prefix.
DeleteDataset	DeleteDataset	Deletes a dataset.
Metadata Index	Metadata Index
IndexFileMeta	IndexFileMeta	Performs data processing on input files for tasks such as label detection, face detection, and location detection. This operation extracts object metadata and creates an index, which lets you retrieve data from a dataset.
BatchIndexFileMeta	BatchIndexFileMeta	This operation performs a batch index of object metadata by processing input files for tasks such as label detection, face detection, and location detection. The object metadata is then indexed into a dataset to support various data retrieval methods.
UpdateFileMeta	UpdateFileMeta	Updates the partial metadata of the indexed files in a dataset.
BatchUpdateFileMeta	BatchUpdateFileMeta	Updates some metadata items of files indexed into a dataset.
GetFileMeta	GetFileMeta	Queries metadata of a file whose metadata is indexed into the dataset.
BatchGetFileMeta	BatchGetFileMeta	Queries metadata of multiple objects or files in the specified dataset.
DeleteFileMeta	DeleteFileMeta	Removes the metadata of a file from a dataset.
BatchDeleteFileMeta	BatchDeleteFileMeta	Deletes the metadata of multiple files from a dataset.
Query and statistics	Query and statistics
SimpleQuery	SimpleQuery	Queries files in a dataset by performing a simple query operation. The operation supports logical expressions.
SemanticQuery	SemanticQuery	Queries metadata in a dataset by inputting natural language.
FuzzyQuery	FuzzyQuery	Queries the extracted file metadata, including the file name, labels, path, custom tags, and other fields. If the value of a metadata field of a file matches the specified string, the metadata of the file is returned.
Intelligent management	Intelligent management
Face clustering	Face clustering
CreateFigureClusteringTask	CreateFigureClusteringTask	Creates a figure clustering task. This task uses an intelligent algorithm to group the faces of different people in images that are indexed in a dataset.
CreateFigureClustersMergingTask	CreateFigureClustersMergingTask	Merges two or more figure clustering groups into a single figure clustering group.
GetFigureCluster	GetFigureCluster	Obtains basic information about face clustering, including the creation time, number of images, and cover.
QueryFigureClusters	QueryFigureClusters	Queries face groups based on given conditions.
BatchGetFigureCluster	BatchGetFigureCluster	Queries face clusters.
UpdateFigureCluster	UpdateFigureCluster	Updates information about a face cluster, such as the cluster name and labels.
SearchImageFigureCluster	SearchImageFigureCluster	Queries face clusters that contain a specific face in an image. Each face cluster contains information such as bounding boxes and similarity.
CreateFacesSearchingTask	CreateFacesSearchingTask	Searches a media set for the top N images most similar to a specified image or face ID. The operation returns the corresponding face IDs and bounding boxes, sorted by similarity in descending order.
Spatio-temporal clustering	Spatio-temporal clustering
CreateLocationDateClusteringTask	CreateLocationDateClusteringTask	The spatio-temporal clustering feature classifies files in a dataset based on their time and location. This feature works on indexed files, such as images and videos, that contain shooting time and location data. These classifications can represent content from a user's trip, where files have similar timestamps and locations. The classifications can also represent content shot at different places where a user lives or works. Analyzing the locations and time ranges of these classifications lets you categorize media files, create highlight reels, and generate photo and video stories.
QueryLocationDateClusters	QueryLocationDateClusters	Queries a list of spatiotemporal clusters based on the specified conditions.
UpdateLocationDateCluster	UpdateLocationDateCluster	Updates a spatiotemporal cluster.
DeleteLocationDateCluster	DeleteLocationDateCluster	Deletes a spatiotemporal cluster.
Story	Story
CreateStory	CreateStory	Creates a story.
QueryStories	QueryStories	Queries stories based on the specified conditions.
GetStory	GetStory	Queries a story.
CreateCustomizedStory	CreateCustomizedStory	Creates a story based on the specified images and videos.
UpdateStory	UpdateStory	Updates the information about a story, such as the story name and cover image.
AddStoryFiles	AddStoryFiles	Adds objects to a story.
RemoveStoryFiles	RemoveStoryFiles	Deletes files from a story.
DeleteStory	DeleteStory	Deletes a story.
Image clustering	Image clustering
CreateSimilarImageClusteringTask	CreateSimilarImageClusteringTask	The similar image clustering feature groups images that you have indexed in a dataset into clusters based on visual similarity. This feature is useful for scenarios such as deduplicating images or selecting the best shots. For example, you can use it to filter burst photos in an album.
QuerySimilarImageClusters	QuerySimilarImageClusters	Queries the list of similar image clusters.
Data Binding	Data Binding
CreateBinding	CreateBinding	Creates a binding relationship between a dataset and an Object Storage Service (OSS) bucket. This allows for the automatic synchronization of incremental and full data and indexing.
GetBinding	GetBinding	Queries the binding relationship between a specific dataset and an Object Storage Service (OSS) bucket.
ListBindings	ListBindings	Queries bindings between a dataset and Object Storage Service (OSS) buckets.
DeleteBinding	DeleteBinding	Deletes the binding between a dataset and an Object Storage Service (OSS) bucket.
AttachOSSBucket	AttachOSSBucket	Binds an Object Storage Service (OSS) bucket to the specified project. The binding enables you to use IMM features by using the x-oss-process parameter.
DetachOSSBucket	DetachOSSBucket	Unbinds an Object Storage Service (OSS) bucket from the corresponding project.
GetOSSBucketAttachment	GetOSSBucketAttachment	Queries the name of the project bound to an Object Storage Service (OSS) bucket.
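
The SimpleQuery row above mentions logical expressions. A minimal sketch of how such an expression might be composed is shown below. The JSON shape (Field, Operation, Value, SubQueries), the operator names, and the field names are assumptions for illustration; the SimpleQuery reference defines the actual syntax.

```python
import json


def condition(field, operation, value):
    # A leaf condition comparing one metadata field to a value (assumed shape).
    return {"Field": field, "Operation": operation, "Value": str(value)}


def combine(operation, *sub_queries):
    # A logical node ("and", "or", "not") over nested conditions (assumed shape).
    return {"Operation": operation, "SubQueries": list(sub_queries)}


# JPEG files larger than 1 MiB; field names are hypothetical examples.
query = combine(
    "and",
    condition("ContentType", "eq", "image/jpeg"),
    condition("Size", "gt", 1048576),
)
query_json = json.dumps(query)
```

Building the expression from small helpers keeps nested conditions readable and avoids hand-written JSON strings.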

Image Processing

API	Title	Description
EncodeBlindWatermark	EncodeBlindWatermark	Embeds specific textual information into an image as watermarks. These watermarks are visually imperceptible and do not affect the aesthetics of the image or the integrity of the original data. The watermarks can be extracted by using the CreateDecodeBlindWatermarkTask operation.
CreateDecodeBlindWatermarkTask	CreateDecodeBlindWatermarkTask	Extracts a blind watermark.
GetDecodeBlindWatermarkResult	GetDecodeBlindWatermarkResult	Queries the result of a blind watermark decoding task.
DetectImageLabels	DetectImageLabels	Detects scene, object, and event information in an image. Scene information includes natural landscapes, daily life, and disasters. Event information includes talent shows, office events, performances, and production events. Object information includes tableware, electronics, furniture, and transportation. The DetectImageLabels operation supports more than 30 different categories and thousands of labels.
DetectImageScore	DetectImageScore	Calculates the aesthetics quality score of an image based on metrics such as the composition, brightness, contrast, color, and resolution. The operation returns a score within the range from 0 to 1. A higher score indicates better image quality.
DetectImageCodes	DetectImageCodes	Detects barcodes and QR codes in an image.
DetectImageFaces	DetectImageFaces	Detects faces from an image, including face boundary information, attributes, and quality. The boundary information includes the distance from the y-coordinate of the vertex to the top edge (Top), distance from the x-coordinate of the vertex to the left edge (Left), height (Height), and width (Width). Face attributes include the age (Age), age standard deviation (AgeSD), gender (Gender), emotion (Emotion), mouth opening (Mouth), beard (Beard), hat wearing (Hat), mask wearing (Mask), glasses wearing (Glasses), head orientation (HeadPose), attractiveness (Attractive), and confidence levels for preceding attributes. Quality information includes the face quality score (FaceQuality) and face resolution (Sharpness).
DetectImageCropping	DetectImageCropping	Detects the cropping area that produces the optimal visual effect based on a given image ratio by using AI model capabilities.
AddImageMosaic	AddImageMosaic	Adds mosaics, Gaussian blurs, or solid color shapes to blur one or more areas of an image for privacy protection and saves the output image to the specified path in Object Storage Service (OSS).
CreateImageToPDFTask	CreateImageToPDFTask	Converts multiple images into a single PDF file and saves the file as a specified OSS object.
CreateImageSplicingTask	CreateImageSplicingTask	Stitches multiple images into a single image based on specified rules and saves the output to a specified OSS object.
CompareImageFaces	CompareImageFaces	Compares the similarity of the largest faces in two images. The largest face refers to the largest face frame in an image after face detection.
DetectImageBodies	DetectImageBodies	Detects human body information, such as the confidence level and body bounding box, in an image.
DetectImageCars	DetectImageCars	Detects the outline data, attributes, and license plate information of vehicles in an image. The vehicle attributes include the vehicle color (CarColor) and vehicle type (CarType). The license plate information includes the recognition content (Content) and plate frame (Boundary).
DetectImageTexts	DetectImageTexts	Recognizes and extracts text content from an image.
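
Several detection operations above describe a bounding box by Top, Left, Height, and Width. The sketch below converts such a box into corner coordinates, which drawing and cropping libraries often expect. It assumes pixel units with the origin at the top-left corner of the image and a plain dictionary as input; the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boundary:
    top: int
    left: int
    height: int
    width: int

    @classmethod
    def from_response(cls, data):
        # Field names follow the Top/Left/Height/Width terms used in the table.
        return cls(top=data["Top"], left=data["Left"],
                   height=data["Height"], width=data["Width"])

    def corners(self):
        # Returns (left, top, right, bottom).
        return (self.left, self.top, self.left + self.width, self.top + self.height)


box = Boundary.from_response({"Top": 40, "Left": 10, "Height": 100, "Width": 80})
print(box.corners())  # (10, 40, 90, 140)
```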

Media Processing

API	Title	Description
CreateMediaConvertTask	Create Media Transcoding Task	Creates an asynchronous media transcoding task. This task processes audio and video files. It supports media transcoding, media concatenation, video snapshots, and converting videos into animated images.
DetectMediaMeta	DetectMediaMeta	Queries media metadata, including the media format and stream information.
CreateVideoLabelClassificationTask	CreateVideoLabelClassificationTask	Detects labels for scenarios, objects, and events in video content. This feature supports more than 30 categories and thousands of labels. Scenario labels include natural landscapes, life scenes, and disaster scenes. Event labels include talent shows, office work, performances, and production. Object labels include tableware, electronic products, furniture, and vehicles.
GetVideoLabelClassificationResult	GetVideoLabelClassificationResult	Queries the results of a video label detection task.
GenerateVideoPlaylist	GenerateVideoPlaylist	Generates a playlist from a video file for live transcoding. The output is an M3U8 file that enables immediate playback and on-demand transcoding based on playback progress. Compared with offline transcoding, this method significantly reduces transcoding wait times and lowers transcoding and storage overhead.
CreateHighlightTask	CreateHighlightTask	Creates a highlight task.

Document Processing

API	Title	Description
GenerateWebofficeToken	Obtain Weboffice Token	Generates an access credential for document preview and editing.
RefreshWebofficeToken	Refresh Weboffice Token	Refreshes the access credential for document preview and editing.
CreateOfficeConversionTask	CreateOfficeConversionTask	Creates a document conversion task that converts documents, such as Word, PowerPoint, Excel, and PDF files, stored in Object Storage Service (OSS) into images, text files, or PDF files.
ExtractDocumentText	Extract Document Text	Extracts text from a document.

File Processing

API	Title	Description
Compression and decompression	Compression and decompression
CreateFileCompressionTask	CreateFileCompressionTask	Creates a file compression task that packages files for download.
CreateArchiveFileInspectionTask	CreateArchiveFileInspectionTask	Creates a task to inspect a compressed file and retrieve a list of its contents without decompressing the file.
CreateFileUncompressionTask	CreateFileUncompressionTask	A file decompression task lets you decompress specific files or an entire compressed package to a specified location. Supported formats include Zip, RAR, and 7z.
Point cloud compression	Point cloud compression
CreateCompressPointCloudTask	CreateCompressPointCloudTask	The point cloud compression feature compresses point cloud data in Object Storage Service (OSS). This helps reduce data transmission over the network.

Content Security

API	Title	Description
DetectTextAnomaly	DetectTextAnomaly	Detects whether specified text contains anomalies, such as pornography, advertisements, excessive junk content, politically sensitive content, and abuse.
CreateImageModerationTask	CreateImageModerationTask	Detects non-compliant content in images, such as pornography, terrorism, undesirable scenes, logos, and text-in-image violations.
CreateVideoModerationTask	CreateVideoModerationTask	Detects threats or non-compliant content in videos. This operation can be used in scenarios such as pornography detection, terrorism and politically sensitive content detection, text and image violation detection, undesirable scene detection, and logo detection.
GetImageModerationResult	GetImageModerationResult	Queries the result of an image moderation task.
GetVideoModerationResult	GetVideoModerationResult	Queries the result of a video moderation task.

Task Management

API	Title	Description
GetTask	GetTask	Queries information about an asynchronous task. Intelligent Media Management (IMM) has multiple asynchronous data processing capabilities, each of which has its own operation for creating tasks. For example, you can call the CreateFigureClusteringTask operation to create a face clustering task and the CreateFileCompressionTask operation to create a file compression task. The GetTask operation is a general operation. You can call this operation to query information about asynchronous tasks by task ID or type.
ListTasks	ListTasks	Lists tasks based on specific conditions, such as by time range and by tag.
Trigger	Trigger
CreateTrigger	CreateTrigger	Creates a trigger to start data processing in Intelligent Media Management (IMM). The trigger is activated by event sources, such as Object Storage Service (OSS), and uses data processing templates to process media files, such as images, videos, and documents.
SuspendTrigger	SuspendTrigger	Suspends a running trigger.
ResumeTrigger	ResumeTrigger	Resumes a trigger that is in the Suspended or Failed state.
UpdateTrigger	UpdateTrigger	Updates information about a trigger, such as the input data source, data processing settings, and tags.
GetTrigger	GetTrigger	Queries the information about a trigger.
ListTriggers	ListTriggers	Queries triggers by tag or status.
DeleteTrigger	DeleteTrigger	Deletes a trigger.
Batch Processing	Batch Processing
CreateBatch	CreateBatch	Creates a batch processing task that performs specified operations, such as transcoding and format conversion, on multiple existing files.
SuspendBatch	SuspendBatch	Suspends a batch processing task.
ResumeBatch	ResumeBatch	Resumes a batch processing task that is in the Suspended or Failed state.
UpdateBatch	UpdateBatch	Updates information for a batch processing task, such as the data source configuration, data processing configuration, and tags.
ListBatches	ListBatches	Queries batch processing tasks. You can query batch processing tasks based on conditions such as task tags and status. The results can be sorted.
GetBatch	GetBatch	Queries the information about a batch processing task.
DeleteBatch	DeleteBatch	Deletes a batch processing task.
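
Because the task creation operations above are asynchronous, a caller typically polls GetTask until the task reaches a final state. The sketch below shows one way to structure that loop. The `get_task` callable stands in for whatever SDK or HTTP call you use, and the Status field with its Succeeded and Failed values is an assumption to verify against the GetTask reference.

```python
import time

TERMINAL_STATES = {"Succeeded", "Failed"}  # assumed status values


def wait_for_task(get_task, task_id, task_type,
                  interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll a GetTask-like callable until the task finishes or the timeout expires.

    get_task is a hypothetical callable that accepts task_id and task_type
    keyword arguments and returns a dict containing a "Status" key.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = get_task(task_id=task_id, task_type=task_type)
        if task.get("Status") in TERMINAL_STATES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout} s")
        sleep(interval)
```

Passing `sleep` as a parameter keeps the loop easy to test with a stub. For long-running jobs, a notification-based design may be preferable to polling.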

Others

API	Title	Description
ContextualAnswer	Q&A API in AI Assistant	Q&A operation of AI Assistant (phase II).
ContextualRetrieval	ContextualRetrieval	Retrieves semantically similar documents. The operation is designed for multi-turn conversations and can process message input in historical conversations. The operation returns results that are highly related to the current conversation based on an in-depth understanding of contextual content. It provides consistent and efficient information retrieval in multi-turn conversations.
ListAttachedOSSBuckets	List OSS Bucket Binding Relationships	Queries the list of bound OSS buckets.