AI Content Awareness uses multi-modal AI models to extract semantic descriptions and concise summaries from images and videos stored in OSS. These descriptions are vectorized and indexed, so you can search your media library using natural language instead of object names or manual tags.
Use cases
Smart security and IPC monitoring: Run semantic searches on surveillance footage for video playback, event pushes, and trending keyword recommendations (for example, "Today's Trending Keywords").
RAG knowledge base: Build a retrieval-augmented generation (RAG) knowledge base on top of OSS data. AI Content Awareness enriches the retrieval step, giving large models more accurate and relevant context.
Knowledge and media asset management: Index nearly 70 metadata items per multimedia object to power knowledge management platforms, media asset management systems, and enterprise smart office tools.
How it works
AI Content Awareness is built around semantic search: it matches natural-language queries against object content rather than object names.
Enable and index: When you enable the feature for a bucket, OSS asynchronously scans all existing supported objects and builds an initial index.
Extract and describe: A large model extracts core semantic features for each object and generates two content-aware descriptions — a detailed description (about 100 words) and a concise summary (under 20 words).
Vectorize and index: OSS vectorizes the semantic features and descriptions, then builds a vector index library for high-performance search at scale.
Search and rerank: When you submit a query, OSS vectorizes the search text, retrieves the most similar objects from the vector index, and reranks results by combining vector representations with content-aware descriptions. This improves accuracy and recall.
Return results: Search results include the detailed descriptions and concise summaries. Use these directly to build applications such as "Search Trends" or "Daily Summary."
Initial indexing runs asynchronously. Depending on the number and size of objects in the bucket, it may take several minutes to several hours. Existing objects are not searchable during this time. For new or updated objects, OSS automatically triggers an incremental index update.
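The retrieve-then-rerank flow described above can be illustrated with a toy sketch. It is not how OSS implements the feature: the bag-of-words `embed` function, the word-overlap rerank score, the equal 0.5 weights, and the sample descriptions are all assumptions made for illustration, whereas the real service relies on large-model features and a managed vector index.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(query, descriptions, top_k=2):
    """Retrieve the top_k objects by vector similarity, then rerank them by
    combining that similarity with word overlap against each description."""
    q_vec = embed(query)
    index = {name: embed(desc) for name, desc in descriptions.items()}
    candidates = sorted(index, key=lambda n: cosine(q_vec, index[n]), reverse=True)[:top_k]
    q_words = set(q_vec)

    def combined(name):
        overlap = len(q_words & set(index[name])) / (len(q_words) or 1)
        return 0.5 * cosine(q_vec, index[name]) + 0.5 * overlap

    return sorted(candidates, key=combined, reverse=True)

# Hypothetical descriptions standing in for the generated content-aware descriptions.
descriptions = {
    "VideoA.mp4": "a yard with a parked car near the gate",
    "VideoB.mp4": "a snow covered forest seen from above",
    "VideoC.mp4": "a busy street with a car and a bus",
}
results = search("a yard with a parked car", descriptions)
print(results)  # ['VideoA.mp4', 'VideoC.mp4']
```

The two-stage shape is the point of the sketch: a cheap similarity pass narrows the candidates, and a second score that also looks at the description text decides the final order.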
Supported regions
AI Content Awareness is available in the following regions: China (Beijing), China (Zhangjiakou), China (Hangzhou), China (Shanghai), China (Shenzhen), China (Chengdu), Singapore, and US (Virginia).
To use AI Content Awareness in Singapore or US (Virginia), submit a ticket to apply for activation.
Step 1: Create a bucket and upload objects
Log on to the OSS console.
On the Buckets page, click Create Bucket.
Enter a bucket name relevant to your use case, such as videos-oss-metaquery. Keep the default settings for all other parameters.
Click Create. On the success page, click View Bucket.
On the Objects page, click Upload File > Scan Files. Select the video files to upload, such as VideoA.mp4, VideoB.mp4, and VideoC.mp4. Keep the default settings and click Upload.
Step 2: Enable AI Content Awareness
In the left navigation pane, choose Object Management > Data Indexing.
If this is your first time using Data Indexing, follow the on-screen instructions to grant permissions to the AliyunMetaQueryDefaultRole role. Click Enable Data Indexing.
Select AISearch. In the AI Content Awareness section, choose the content type to analyze:
| Option | Description |
|---|---|
| Image Content Awareness | Analyzes image objects in the bucket |
| Video Content Awareness | Analyzes video objects in the bucket |
(Optional) Configure object filtering rules to limit AI analysis to specific objects. You can add up to five rules and filter by prefix, object size, LastModifiedTime, or ObjectTag. Example: Add a prefix rule with the value videos/ to process only objects in the videos/ folder.
If you enable object filtering, Data Indexing - AISearch fees and AI Content Awareness fees are charged based only on the number of objects that match the filtering rules.
Click Enable.
Building the index takes time. The duration depends on the number of objects in the bucket. Refresh the page to check the status.
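As an alternative to refreshing the page, the status check could be automated. The following is a minimal polling sketch, not an official client: `wait_until` is a hypothetical helper, and the status source is passed in as a callable so that it can wrap whatever call you use to read the index status (for example, the GetMetaQueryStatus API). The values "building" and "done" are placeholders, not real OSS status strings.

```python
import time

def wait_until(get_status, is_done, interval_s=30.0, timeout_s=3600.0, sleep=time.sleep):
    """Call get_status() repeatedly until is_done(status) is true.
    Returns the final status, or raises TimeoutError once timeout_s has been waited."""
    waited = 0.0
    while True:
        status = get_status()
        if is_done(status):
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"index not ready after {waited:.0f}s (last status: {status!r})")
        sleep(interval_s)
        waited += interval_s

# Demo with a fake status source and a no-op sleep so that it finishes instantly.
# Replace the placeholder values with the ones documented for GetMetaQueryStatus.
fake_statuses = iter(["building", "building", "done"])
final = wait_until(lambda: next(fake_statuses), lambda s: s == "done",
                   interval_s=1.0, sleep=lambda _: None)
print(final)  # done
```

Injecting `get_status` and `sleep` keeps the loop independent of any particular SDK and makes it easy to test without waiting.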
Step 3: Run a semantic search
After the index is built, query your media objects using natural language. The service is fully serverless — you do not need to manage resource scaling, the vectorization pipeline, or index storage.
Console
On the Buckets page, click your bucket name.
On the Objects page, confirm the uploaded objects are listed.
In the left navigation pane, choose Object Management > Data Indexing.
In the search box, enter a natural-language description such as "a yard with a parked car". In the Media Type section, select Video. Click Query Now.
The search returns matching videos along with their content summaries.
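Returned summaries can feed simple downstream features such as a trending-keywords list. The sketch below is one possible approach under stated assumptions: the sample summaries, the stopword list, and the `trending_keywords` helper are all hypothetical, and it counts in how many summaries each word appears.

```python
from collections import Counter

# Hypothetical stopword list; tune it for your own content.
STOPWORDS = {"a", "an", "the", "in", "on", "at", "of", "with", "and", "is"}

def trending_keywords(summaries, top_n=3):
    """Return the top_n words that appear in the most summaries.
    Ties are broken alphabetically so the result is deterministic."""
    counts = Counter()
    for summary in summaries:
        words = {w.strip(".,").lower() for w in summary.split()}
        counts.update(w for w in words if w and w not in STOPWORDS)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]

# Hypothetical concise summaries, as they might come back from a search.
summaries = [
    "A car parked in a yard.",
    "A delivery van enters the yard.",
    "A car leaves the yard at night.",
]
keywords = trending_keywords(summaries, top_n=2)
print(keywords)  # ['yard', 'car']
```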
SDK
Java SDK
Available in Java SDK 3.18.2 and later. For the full API reference, see Vector Search (Java SDK V1).
All examples use MetaQueryMode.SEMANTIC to submit semantic search requests.
import com.aliyun.oss.*;
import com.aliyun.oss.common.auth.CredentialsProviderFactory;
import com.aliyun.oss.common.auth.EnvironmentVariableCredentialsProvider;
import com.aliyun.oss.common.comm.SignVersion;
import com.aliyun.oss.model.*;
import java.util.ArrayList;
import java.util.List;

public class DoMetaQuery {
    public static void main(String[] args) throws Exception {
        // Replace with the endpoint for your bucket's region.
        String endpoint = "https://oss-cn-hangzhou.aliyuncs.com";
        String bucketName = "examplebucket";
        // Credentials are read from the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables.
        EnvironmentVariableCredentialsProvider credentialsProvider = CredentialsProviderFactory.newEnvironmentVariableCredentialsProvider();
        String region = "cn-hangzhou";

        ClientBuilderConfiguration clientBuilderConfiguration = new ClientBuilderConfiguration();
        clientBuilderConfiguration.setSignatureVersion(SignVersion.V4);
        OSS ossClient = OSSClientBuilder.create()
                .endpoint(endpoint)
                .credentialsProvider(credentialsProvider)
                .clientConfiguration(clientBuilderConfiguration)
                .region(region)
                .build();

        try {
            int maxResults = 20;
            List<String> mediaTypes = new ArrayList<String>();
            mediaTypes.add("image");
            String query = "Snow";
            String simpleQuery = "{\"Operation\":\"gt\", \"Field\": \"Size\", \"Value\": \"30\"}";
            String sort = "Size";
            DoMetaQueryRequest doMetaQueryRequest = new DoMetaQueryRequest(
                    bucketName, maxResults, query, sort, MetaQueryMode.SEMANTIC, mediaTypes, simpleQuery);
            DoMetaQueryResult doMetaQueryResult = ossClient.doMetaQuery(doMetaQueryRequest);
        } catch (OSSException oe) {
            System.out.println("Error Message: " + oe.getErrorMessage());
            System.out.println("Error Code: " + oe.getErrorCode());
            System.out.println("Request ID: " + oe.getRequestId());
            System.out.println("Host ID: " + oe.getHostId());
        } catch (ClientException ce) {
            System.out.println("Error Message: " + ce.getMessage());
        } finally {
            if (ossClient != null) {
                ossClient.shutdown();
            }
        }
    }
}
Python SDK
For the full API reference, see AISearch (Python SDK V2).
import argparse
import alibabacloud_oss_v2 as oss

parser = argparse.ArgumentParser(description="do meta query semantic sample")
parser.add_argument('--region', help='The region in which the bucket is located.', required=True)
parser.add_argument('--bucket', help='The name of the bucket.', required=True)
parser.add_argument('--endpoint', help='The domain names that other services can use to access OSS')

def main():
    args = parser.parse_args()

    # Credentials are read from the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables.
    credentials_provider = oss.credentials.EnvironmentVariableCredentialsProvider()

    cfg = oss.config.load_default()
    cfg.credentials_provider = credentials_provider
    cfg.region = args.region
    if args.endpoint is not None:
        cfg.endpoint = args.endpoint

    client = oss.Client(cfg)

    result = client.do_meta_query(oss.DoMetaQueryRequest(
        bucket=args.bucket,
        mode='semantic',
        meta_query=oss.MetaQuery(
            max_results=1000,
            query='An aerial view of a snow-covered forest',
            order='desc',
            media_types=oss.MetaQueryMediaTypes(
                media_type=['image']
            ),
            simple_query='{"Operation":"gt", "Field": "Size", "Value": "30"}',
        ),
    ))

    print(vars(result))

if __name__ == "__main__":
    main()
Go SDK
For the full API reference, see AISearch (Go SDK V2).
package main

import (
	"context"
	"flag"
	"log"

	"github.com/aliyun/alibabacloud-oss-go-sdk-v2/oss"
	"github.com/aliyun/alibabacloud-oss-go-sdk-v2/oss/credentials"
)

var (
	region     string
	bucketName string
)

func init() {
	flag.StringVar(&region, "region", "", "The region in which the bucket is located.")
	flag.StringVar(&bucketName, "bucket", "", "The name of the bucket.")
}

func main() {
	flag.Parse()
	if len(bucketName) == 0 {
		flag.PrintDefaults()
		log.Fatalf("invalid parameters, bucket name required")
	}
	if len(region) == 0 {
		flag.PrintDefaults()
		log.Fatalf("invalid parameters, region required")
	}

	// Credentials are read from the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables.
	cfg := oss.LoadDefaultConfig().
		WithCredentialsProvider(credentials.NewEnvironmentVariableCredentialsProvider()).
		WithRegion(region)

	client := oss.NewClient(cfg)

	request := &oss.DoMetaQueryRequest{
		Bucket: oss.Ptr(bucketName),
		Mode:   oss.Ptr("semantic"),
		MetaQuery: &oss.MetaQuery{
			MaxResults:  oss.Ptr(int64(99)),
			Query:       oss.Ptr("Overlook the snow-covered forest"),
			MediaType:   oss.Ptr("image"),
			SimpleQuery: oss.Ptr(`{"Operation":"gt", "Field": "Size", "Value": "30"}`),
		},
	}

	result, err := client.DoMetaQuery(context.TODO(), request)
	if err != nil {
		log.Fatalf("failed to do meta query %v", err)
	}
	log.Printf("do meta query result:%#v\n", result)
}
PHP SDK
For the full API reference, see AISearch (PHP SDK V2).
<?php
require_once __DIR__ . '/../vendor/autoload.php';

use AlibabaCloud\Oss\V2 as Oss;

$optsdesc = [
    "region" => ['help' => 'The region in which the bucket is located.', 'required' => true],
    "endpoint" => ['help' => 'The domain names that other services can use to access OSS.', 'required' => false],
    "bucket" => ['help' => 'The name of the bucket.', 'required' => true],
];
$longopts = \array_map(function ($key) {
    return "$key:";
}, array_keys($optsdesc));

$options = getopt("", $longopts);
foreach ($optsdesc as $key => $value) {
    if ($value['required'] === true && empty($options[$key])) {
        $help = $value['help'];
        echo "Error: the following arguments are required: --$key, $help" . PHP_EOL;
        exit(1);
    }
}

$region = $options["region"];
$bucket = $options["bucket"];

// Credentials are read from the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables.
$credentialsProvider = new Oss\Credentials\EnvironmentVariableCredentialsProvider();

$cfg = Oss\Config::loadDefault();
$cfg->setCredentialsProvider($credentialsProvider);
$cfg->setRegion($region);
if (isset($options["endpoint"])) {
    $cfg->setEndpoint($options["endpoint"]);
}

$client = new Oss\Client($cfg);

$request = new Oss\Models\DoMetaQueryRequest($bucket, new Oss\Models\MetaQuery(
    maxResults: 99,
    query: "Overlook the snow-covered forest",
    mediaTypes: new Oss\Models\MetaQueryMediaTypes('image'),
    simpleQuery: '{"Operation":"gt", "Field": "Size", "Value": "30"}',
), 'semantic');

$result = $client->doMetaQuery($request);

printf(
    'status code: ' . $result->statusCode . PHP_EOL .
    'request id: ' . $result->requestId . PHP_EOL .
    'result: ' . var_export($result, true)
);
ossutil
The following example queries objects in examplebucket that semantically match the description.
ossutil api do-meta-query --bucket examplebucket \
    --meta-query "{\"Query\":\"Overlooking the snow covered forest\",\"MediaTypes\":{\"MediaType\":\"video\"},\"SimpleQuery\":\"{\\\"Operation\\\":\\\"gt\\\", \\\"Field\\\": \\\"Size\\\", \\\"Value\\\": \\\"1\\\"}\"}" \
    --meta-query-mode semantic
For the full command reference, see do-meta-query.
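Hand-escaping the nested JSON in the --meta-query value is error-prone, because SimpleQuery is itself a JSON string embedded inside another JSON document. As a hedged sketch, the value can be generated instead of typed. The helper name `build_meta_query` is hypothetical; the key names and flags are taken from the command above, and the script only prints the command rather than running it.

```python
import json
import shlex

def build_meta_query(query, media_type, field, operation, value):
    """Return the JSON text for --meta-query.
    SimpleQuery is serialized first, then embedded as a string value."""
    simple_query = json.dumps({"Operation": operation, "Field": field, "Value": value})
    return json.dumps({
        "Query": query,
        "MediaTypes": {"MediaType": media_type},
        "SimpleQuery": simple_query,
    })

meta_query = build_meta_query("Overlooking the snow covered forest", "video", "Size", "gt", "1")

# shlex.quote wraps the JSON in single quotes so a POSIX shell passes it through unchanged.
command = " ".join([
    "ossutil api do-meta-query --bucket examplebucket",
    "--meta-query", shlex.quote(meta_query),
    "--meta-query-mode semantic",
])
print(command)
```

Serializing in two steps lets `json.dumps` produce the inner escaping, so only one layer of shell quoting remains to be handled.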
Billing
Using AI Content Awareness incurs three types of fees:
| Fee type | Description |
|---|---|
| Data indexing fees | Charged for Data Indexing - AISearch mode |
| AI Content Awareness fees | Charged per image or video object processed. Charges apply to existing objects when you enable the feature, and automatically to each newly uploaded object. We recommend that you monitor your bills closely. |
| API request fees | Charged per API call during initial indexing and incremental index updates |
The following API operations trigger API request fees:
| Scenario | API |
|---|---|
| Scanning objects in a bucket | ListObjects |
| Building an index for objects | HeadObject and GetObject |
| An object in the bucket has tags | GetObjectTagging |
| An object has custom metadata | GetObjectMeta |
| The bucket contains a symbolic link | GetSymlink |
To stop incurring charges, disable AISearch promptly.
API reference
AI Content Awareness is built on the following RESTful APIs. Applications with custom integration requirements can call these APIs directly, but must then implement signature calculation themselves.
| Purpose | API |
|---|---|
| Enable metadata management | OpenMetaQuery |
| Check metadata management status | GetMetaQueryStatus |
| Query objects by condition | DoMetaQuery |
| Disable metadata management | CloseMetaQuery |