Index OSS data to quickly find images, videos, documents, and audio by metadata or semantic content.
Why use data indexing
Traditional retrieval methods have limitations that data indexing addresses:
| Traditional retrieval | OSS data indexing |
| --- | --- |
| Complex operations: Requires iterating objects with ListObjects and building a custom metadata database. | Easy to use: No data migration or custom search system needed. Query and analyze data directly through indexes that OSS builds automatically. |
| Low retrieval performance: Slow and inefficient at massive scale. | High-performance retrieval: Sub-second indexing and aggregation across tens of billions of objects. |
| Limited search capabilities: Only supports OSS metadata-based searches. | Multimodal support: Semantic search and object feature analysis across content types. |
Supported data retrieval methods
OSS supports two retrieval methods: MetaSearch and AISearch.
| Item | MetaSearch | AISearch |
| --- | --- | --- |
| Description | Queries objects by metadata attributes including OSS metadata, ETags, and object tags. | Converts documents, images, videos, and audio into vectors, then retrieves objects by semantic similarity. |
| Use cases | Object search and statistics. | Multimodal search and complex object retrieval. |
| Example query | Search for objects uploaded on September 14, 2024, with a private ACL and the Standard storage class. The OSS Metadata panel filters by storage class, ACL, Object Name, Upload Type, Last Modified Time, Object Size, and Version. | Search for images related to "apple". Enter "apple" in the Semantic Content field (AI tag) and select Image under Multimedia Metadata. Filter further by OSS Metadata: storage class, ACL, Object Name (wildcard supported), Upload Type, Last Modified Time, Object Size, and Object Version. |
| Example result | Returns a list of objects uploaded on September 14, 2024, with a private ACL and the Standard storage class. The query returns three objects: | Returns a list of image objects related to "apple". For example, the object |
Choose a data retrieval method
Comparison of search criteria
| Search criteria | MetaSearch | AISearch |
| --- | --- | --- |
| OSS metadata | ✅ | ✅ |
| Object tags and ETags | ✅ | ✅ |
| User metadata | ❌ | ✅ |
| Multimedia metadata | ❌ | ✅ |
| Semantic content | ❌ | ✅ |
- All supported MetaSearch metadata fields are listed in Appendix: List of fields and operators for MetaSearch.
- All supported AISearch metadata fields are listed in Appendix: List of AISearch fields and operators.
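The comparison above can be encoded as a small lookup to decide which method covers a given set of search criteria. This is only an illustrative sketch: the criterion keys (`oss_metadata`, `semantic_content`, and so on) and the function name `methods_supporting` are hypothetical labels invented here, not identifiers from the OSS API.

```python
# Search criteria each method supports, transcribed from the comparison above.
# The key names are hypothetical labels, not OSS field names.
SUPPORTED_CRITERIA = {
    "MetaSearch": {"oss_metadata", "object_tags_and_etags"},
    "AISearch": {
        "oss_metadata",
        "object_tags_and_etags",
        "user_metadata",
        "multimedia_metadata",
        "semantic_content",
    },
}


def methods_supporting(criteria):
    """Return the retrieval methods that cover every requested criterion."""
    wanted = set(criteria)
    known = set().union(*SUPPORTED_CRITERIA.values())
    unknown = wanted - known
    if unknown:
        raise ValueError(f"unknown search criteria: {sorted(unknown)}")
    return [name for name, supported in SUPPORTED_CRITERIA.items() if wanted <= supported]


print(methods_supporting(["oss_metadata"]))
print(methods_supporting(["oss_metadata", "semantic_content"]))
```

Any query that touches user metadata, multimedia metadata, or semantic content resolves to AISearch only, which matches the comparison.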
Recommended use cases
- Cost optimization statistics
  Use metadata such as timestamps to identify unused or cold data and optimize storage costs.
  Recommended: MetaSearch.
- Data validation
  After data processing or cleaning, compare metrics like data volume and object size to verify results.
  Recommended: MetaSearch.
- Data auditing
  Combine metadata and semantic content to audit object content for compliance.
  Recommended: AISearch.
- Multimodal search
  Retrieve objects by multimedia metadata and semantic content. This suits chat histories, media asset libraries, and semantic search.
  Recommended: AISearch.
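As a rough illustration of the cost optimization case, the sketch below totals the objects that have not been modified within a chosen idle window. The records are hypothetical stand-ins for metadata returned by a metadata query, and the 90-day threshold and the function name `summarize_cold_objects` are assumptions made for this example. In practice the same filter and sum could likely be expressed in the index query itself rather than computed client-side.

```python
from datetime import datetime, timedelta, timezone


def summarize_cold_objects(records, now, max_idle_days=90):
    """Return (count, total_bytes) for objects not modified within max_idle_days."""
    cutoff = now - timedelta(days=max_idle_days)
    cold = [r for r in records if r["last_modified"] < cutoff]
    return len(cold), sum(r["size"] for r in cold)


# Hypothetical records; real values would come from query results.
now = datetime(2024, 9, 14, tzinfo=timezone.utc)
records = [
    {"key": "logs/2023/app.log", "size": 4_096,
     "last_modified": datetime(2023, 11, 2, tzinfo=timezone.utc)},
    {"key": "img/cat.jpg", "size": 250_000,
     "last_modified": datetime(2024, 9, 1, tzinfo=timezone.utc)},
    {"key": "backup/db.tar", "size": 1_000_000,
     "last_modified": datetime(2024, 1, 15, tzinfo=timezone.utc)},
]

count, total_bytes = summarize_cold_objects(records, now)
print(f"{count} cold objects, {total_bytes} bytes")
```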
How it works
How MetaSearch works
- An application uploads objects (such as images, videos, documents, and audio) to an OSS bucket.
- A RAM user with OSS management permissions enables data indexing and selects MetaSearch.
- OSS automatically creates a data index using the default schema, containing OSS metadata, ETags, and object tags.
- The application calls the DoMetaQuery API to query objects by metadata attributes.
- OSS returns objects matching the query conditions.
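To make the query step more concrete, here is a minimal sketch that builds a request body for a metadata query matching the earlier example (objects modified on September 14, 2024, with a private ACL and the Standard storage class). It only constructs the body; signing and sending the request are omitted. The XML layout (`MetaQuery`, `MaxResults`, `Query`), the field names (`FileModifiedTime`, `ObjectACL`, `OSSStorageClass`), the operator names (`eq`, `gte`, `lt`, `and`), and the timestamp format are all assumptions here and should be checked against the MetaSearch appendix of fields and operators. The helper names are hypothetical.

```python
import json
from xml.etree import ElementTree as ET


def condition(field, operation, value):
    """One comparison on a single metadata field (assumed JSON shape)."""
    return {"Field": field, "Value": value, "Operation": operation}


def all_of(*conditions):
    """Combine conditions so that every one of them must match (assumed shape)."""
    return {"Operation": "and", "SubQueries": list(conditions)}


def build_meta_query_body(query, max_results=100):
    """Wrap the JSON query in the assumed XML envelope and return it as a string."""
    root = ET.Element("MetaQuery")
    ET.SubElement(root, "MaxResults").text = str(max_results)
    ET.SubElement(root, "Query").text = json.dumps(query)
    return ET.tostring(root, encoding="unicode")


# Field names, operators, and values below are assumptions for illustration.
query = all_of(
    condition("FileModifiedTime", "gte", "2024-09-14T00:00:00Z"),
    condition("FileModifiedTime", "lt", "2024-09-15T00:00:00Z"),
    condition("ObjectACL", "eq", "private"),
    condition("OSSStorageClass", "eq", "Standard"),
)
body = build_meta_query_body(query)
print(body)
```

Expressing the single day as a half-open time range avoids depending on how the service compares timestamps for equality.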
How AISearch works
- An application uploads objects (such as images, videos, documents, and audio) to an OSS bucket.
- A RAM user with OSS management permissions enables data indexing and selects AISearch.
- OSS automatically creates a data index using the default schema and an embedding model, containing OSS metadata, ETags, object tags, user metadata, multimedia metadata, and semantic content.
- The application calls the DoMetaQuery API to query objects by metadata attributes and semantic content.
- OSS returns objects matching the query conditions.
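A semantic query pairs free text with optional metadata filters. The sketch below builds a request body for the earlier "apple" image search. It does not sign or send the request. The element names (`Query`, `MediaTypes`, `MediaType`, `SimpleQuery`), the filter's JSON shape, and the `OSSStorageClass` field name are assumptions for illustration and should be verified against the AISearch appendix of fields and operators. The function name is hypothetical.

```python
import json
from xml.etree import ElementTree as ET


def build_semantic_query_body(text, media_types, metadata_filter=None, max_results=20):
    """Build an assumed XML body combining semantic text with a metadata filter."""
    root = ET.Element("MetaQuery")
    ET.SubElement(root, "MaxResults").text = str(max_results)
    # Free-text description of the content to search for.
    ET.SubElement(root, "Query").text = text
    media = ET.SubElement(root, "MediaTypes")
    for media_type in media_types:
        ET.SubElement(media, "MediaType").text = media_type
    # Optional metadata condition narrowing the semantic matches.
    if metadata_filter is not None:
        ET.SubElement(root, "SimpleQuery").text = json.dumps(metadata_filter)
    return ET.tostring(root, encoding="unicode")


# The filter's field name and operator are assumptions, not confirmed values.
semantic_body = build_semantic_query_body(
    "apple",
    ["image"],
    metadata_filter={"Field": "OSSStorageClass", "Value": "Standard", "Operation": "eq"},
)
print(semantic_body)
```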
Get started
Get started with MetaSearch and AISearch:
- Use MetaSearch to search for OSS objects by metadata attributes
- Use AISearch to quickly search for objects using semantic content and multimedia metadata
Tutorials for specific use cases:
- Statistics scenario: Tutorial: Use OSS data indexing for large-scale data statistics
- Multimodal search scenario: Tutorial: Multimodal search with OSS data indexing
- Intelligent video search scenario: Tutorial: Build a smart semantic search system for IPC devices
Performance reference
Once enabled, data indexing builds the metadata index, keeps it continuously updated, and serves queries with dedicated query capacity (QPS).
MetaSearch and AISearch differ in index build time, update latency, and QPS limits: