You can create a data index and use the metadata and semantic content of objects as index conditions to quickly search for images, videos, documents, and audio files in Object Storage Service (OSS).
Why use data indexing?
Traditional file search methods exhibit significant limitations, which OSS Data Indexing effectively addresses:
Traditional Search | OSS Data Indexing |
Complex operations: Requires traversing data with ListObjects and extracting metadata to build a custom database, which is time-consuming and cumbersome. | Simplified operations: Eliminates the need for data migration or custom search systems by enabling direct filtering and statistics via automatically built OSS indexes. |
Low retrieval performance: Slow and inefficient when handling massive amounts of data. | High-performance retrieval: Returns index and aggregation results within seconds and scales to index libraries that contain billions of files. |
Limited retrieval capabilities: Restricted to OSS metadata-based searches. | Multi-modal support: Satisfies diverse requirements through advanced methods such as content semantics and file characterization. |
Supported data indexing methods
OSS supports two data indexing methods: MetaSearch and AISearch. The following table compares them.
Item | MetaSearch | AISearch |
Description | Search for specific objects based on metadata attributes, such as object metadata, ETags, and tags. | Search for specific objects based on the information about documents, images, videos, and audio files. You can specify semantic content as index conditions, and OSS compares the semantic content with objects in OSS. |
Scenario | Object query and statistics | Multimodal search and complex object search |
Sample index condition | Search for Standard objects whose access control list (ACL) is private and that were uploaded on September 14, 2024 | Search for images related to the semantic content "apple" |
Sample result | Return Standard objects whose ACL is private and that were uploaded on September 14, 2024 | Return images related to the semantic content "apple" |
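As a rough illustration, the sample MetaSearch condition in the table could be written as a nested query. The sketch below only builds the JSON string and sends nothing. The field names (`OSSStorageClass`, `ObjectACL`, `FileModifiedTime`), the operator names, the `SubQueries` nesting, and the UTC+8 time zone are assumptions made for this example; check them against the appendix of supported fields and operators before use.

```python
import json
from datetime import date, datetime, timedelta, timezone


def condition(field, operation, value):
    """Build one leaf condition: field, operator, value."""
    return {"Field": field, "Value": value, "Operation": operation}


def all_of(*subqueries):
    """Combine conditions so that every one of them must match."""
    return {"Operation": "and", "SubQueries": list(subqueries)}


def on_day(field, day, utc_offset_hours=8):
    """Express 'on this calendar day' as a half-open time range."""
    tz = timezone(timedelta(hours=utc_offset_hours))  # assumed time zone
    start = datetime(day.year, day.month, day.day, tzinfo=tz)
    end = start + timedelta(days=1)
    return [
        condition(field, "gte", start.isoformat()),
        condition(field, "lt", end.isoformat()),
    ]


# Standard objects with a private ACL, modified on September 14, 2024.
query = all_of(
    condition("OSSStorageClass", "eq", "Standard"),
    condition("ObjectACL", "eq", "private"),
    *on_day("FileModifiedTime", date(2024, 9, 14)),
)
query_json = json.dumps(query)
print(query_json)
```

The date is expressed as a range rather than an equality check because a timestamp field rarely equals a whole day exactly.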
Instructions on selecting a data indexing method
Comparison of search conditions
Search condition | MetaSearch | AISearch |
OSS metadata | ✅ | ✅ |
Object tags and ETags | ✅ | ✅ |
User metadata | ❌ | ✅ |
Multimedia metadata | ❌ | ✅ |
Semantic content | ❌ | ✅ |
For more information about the fields and operators supported by MetaSearch, see Appendix: Fields and operators supported in scalar search.
For more information about the fields and operators supported by AISearch, see Appendix: Fields and operators supported by AISearch.
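The comparison of search conditions can also be read as a small lookup: given the kinds of conditions a workload needs, which methods cover all of them? The sketch below encodes the table as plain Python sets. The key names such as `user_metadata` are hypothetical labels chosen for this example and are not OSS identifiers.

```python
# Capability matrix copied from the comparison table; keys are illustrative labels.
SUPPORTED = {
    "MetaSearch": {"oss_metadata", "object_tags_and_etags"},
    "AISearch": {
        "oss_metadata",
        "object_tags_and_etags",
        "user_metadata",
        "multimedia_metadata",
        "semantic_content",
    },
}


def methods_for(required):
    """Return the indexing methods that support every required condition type."""
    required = set(required)
    known = set().union(*SUPPORTED.values())
    unknown = required - known
    if unknown:
        raise ValueError(f"unknown condition types: {sorted(unknown)}")
    return [name for name, caps in SUPPORTED.items() if required <= caps]


print(methods_for(["oss_metadata"]))
print(methods_for(["oss_metadata", "semantic_content"]))
```

When both methods qualify, the typical scenarios in the next section help decide between them.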
Typical scenarios
Cost Optimization Analytics: Identify non-critical or cold data by using OSS metadata such as timestamps to reduce storage costs.
MetaSearch is recommended.
Data Validation: After data processing or data cleansing, verify the results by comparing metrics such as data amount and file size via OSS metadata.
MetaSearch is recommended.
Data Auditing: Perform in-depth statistics and auditing of file content by combining OSS metadata with vector semantics to meet compliance requirements.
AISearch is recommended.
Multi-modal Search: Search by multimedia data and vector semantics in advanced scenarios such as chat history search, media asset search, and semantic search.
AISearch is recommended.
Process
The following figures show how MetaSearch and AISearch work.
How MetaSearch works
The following figure shows how to use MetaSearch to search for objects based on metadata attributes.
You upload files, such as images, videos, documents, and audio files, from an application to an OSS bucket.
You use a RAM user that has the permissions to manage OSS to enable data indexing for the bucket and select MetaSearch.
OSS uses the default index table structure to automatically create data indexes that contain OSS metadata, object ETags, and object tags.
The application calls the DoMetaQuery operation to search for objects based on metadata attributes.
OSS returns the objects that meet the search conditions.
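The query step above can be sketched as building the request body that an application would send with DoMetaQuery. This is a minimal sketch under stated assumptions: the XML element names (`MetaQuery`, `MaxResults`, `Query`, `Sort`, `Order`) and the JSON condition layout are assumptions for illustration, and `build_meta_query_body` is a hypothetical helper. Request signing and the HTTP call itself are omitted, so consult the DoMetaQuery API reference for the authoritative format.

```python
import json
import xml.etree.ElementTree as ET


def build_meta_query_body(query, max_results=100, sort_field=None, order=None):
    """Serialize a metadata condition into an XML request body (assumed layout)."""
    root = ET.Element("MetaQuery")
    ET.SubElement(root, "MaxResults").text = str(max_results)
    # The condition travels as a JSON string inside the <Query> element.
    ET.SubElement(root, "Query").text = json.dumps(query)
    if sort_field is not None:
        ET.SubElement(root, "Sort").text = sort_field
    if order is not None:
        ET.SubElement(root, "Order").text = order
    return ET.tostring(root, encoding="unicode")


# Hypothetical condition: objects larger than 1 MiB, largest first.
body = build_meta_query_body(
    {"Field": "Size", "Value": "1048576", "Operation": "gt"},
    max_results=20,
    sort_field="Size",
    order="desc",
)
print(body)
```

Building the body with an XML library rather than string concatenation keeps characters such as `<` or `&` inside the condition correctly escaped.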
How AISearch works
The following figure shows how to use AISearch to search for objects based on metadata attributes and semantic content.
You upload files, such as images, videos, documents, and audio files, from an application to an OSS bucket.
You use a RAM user that has the permissions to manage OSS to enable data indexing for the bucket and select AISearch.
OSS uses the default index table structure and an embedding model to automatically create data indexes that contain OSS metadata, object ETags, object tags, user metadata, multimedia metadata, and semantic content.
The application calls the DoMetaQuery operation to search for objects based on metadata attributes and semantic content.
OSS returns the objects that meet the search conditions.
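For AISearch, the same DoMetaQuery step combines a semantic description with an optional metadata filter. The sketch below builds such a request body for the "apple" image example. The element names `MediaTypes`, `MediaType`, and `SimpleQuery`, and the use of `Query` for the natural-language text, are assumptions for illustration; `build_semantic_query_body` is a hypothetical helper, and signing and sending the request are omitted.

```python
import json
import xml.etree.ElementTree as ET


def build_semantic_query_body(text, media_type, metadata_filter=None, max_results=100):
    """Serialize a semantic query plus optional metadata filter (assumed layout)."""
    root = ET.Element("MetaQuery")
    ET.SubElement(root, "MaxResults").text = str(max_results)
    # Natural-language description that OSS compares with the indexed content.
    ET.SubElement(root, "Query").text = text
    media_types = ET.SubElement(root, "MediaTypes")
    ET.SubElement(media_types, "MediaType").text = media_type
    if metadata_filter is not None:
        # The metadata condition narrows the semantic matches.
        ET.SubElement(root, "SimpleQuery").text = json.dumps(metadata_filter)
    return ET.tostring(root, encoding="unicode")


# Images related to "apple"; the size filter is a hypothetical extra condition.
semantic_body = build_semantic_query_body(
    "apple",
    "image",
    metadata_filter={"Field": "Size", "Value": "30", "Operation": "gt"},
    max_results=10,
)
print(semantic_body)
```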
Get started
For more information about how to use MetaSearch and AISearch, see:
Use MetaSearch to search for OSS objects based on metadata attributes
Use AISearch to quickly search for objects based on semantic content and multimedia metadata
For further instructions in different use cases, see:
References
For details of the performance of different indexing methods, see:



