Object Storage Service: AI Content Awareness

Last Updated: Mar 20, 2026

AI Content Awareness uses multi-modal AI models to extract semantic descriptions and concise summaries from images and videos stored in OSS. These descriptions are vectorized and indexed, so you can search your media library using natural language instead of object names or manual tags.

Use cases

  • Smart security and IPC monitoring: Run semantic searches on surveillance footage for video playback, event pushes, and trending keyword recommendations (for example, "Today's Trending Keywords").

  • RAG knowledge base: Build a retrieval-augmented generation (RAG) knowledge base on top of OSS data. AI Content Awareness enriches the retrieval step, giving large models more accurate and relevant context.

  • Knowledge and media asset management: Index nearly 70 metadata items per multimedia object to power knowledge management platforms, media asset management systems, and enterprise smart office tools.

How it works

AI Content Awareness is built around semantic search: it matches natural-language queries against object content rather than object names.

  1. Enable and index: When you enable the feature for a bucket, OSS asynchronously scans all existing supported objects and builds an initial index.

  2. Extract and describe: A large model extracts core semantic features for each object and generates two content-aware descriptions — a detailed description (about 100 words) and a concise summary (under 20 words).

  3. Vectorize and index: OSS vectorizes the semantic features and descriptions, then builds a vector index library for high-performance search at scale.

  4. Search and rerank: When you submit a query, OSS vectorizes the search text, retrieves the most similar objects from the vector index, and reranks results by combining vector representations with content-aware descriptions. This improves accuracy and recall.

  5. Return results: Search results include the detailed descriptions and concise summaries. Use these directly to build applications such as "Search Trends" or "Daily Summary."
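The retrieve-then-rerank flow in steps 3 and 4 can be illustrated with a toy sketch. This is not how OSS implements the feature: the bag-of-words `embed` function stands in for a real embedding model, the 50/50 score blend is an arbitrary choice, and all function names are hypothetical. It only shows the shape of the two stages: vector retrieval first, then reranking the candidates with a description-based signal.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, descriptions: dict[str, str], top_k: int = 2) -> list[str]:
    """Retrieve the top_k objects by vector similarity, then rerank them."""
    q_vec = embed(query)
    # Stage 1: vector retrieval over the indexed descriptions.
    scored = [(cosine(q_vec, embed(desc)), name) for name, desc in descriptions.items()]
    candidates = sorted(scored, reverse=True)[:top_k]

    # Stage 2: rerank by blending the vector score with a description signal
    # (here: the fraction of query words that appear in the description).
    q_words = set(query.lower().split())

    def rerank_score(item: tuple[float, str]) -> float:
        vec_score, name = item
        desc_words = set(descriptions[name].lower().split())
        overlap = len(q_words & desc_words) / max(len(q_words), 1)
        return 0.5 * vec_score + 0.5 * overlap

    return [name for _, name in sorted(candidates, key=rerank_score, reverse=True)]


# Hypothetical descriptions, as the model might generate them in step 2.
index = {
    "VideoA.mp4": "a car parked in a yard next to a house",
    "VideoB.mp4": "a dog running on a beach",
    "VideoC.mp4": "a person walking in a parking garage",
}
results = search("a yard with a parked car", index)
print(results)  # VideoA.mp4 ranks first
```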

Initial indexing runs asynchronously. Depending on the number and size of objects in the bucket, it may take several minutes to several hours. Existing objects are not searchable during this time. For new or updated objects, OSS automatically triggers an incremental index update.

Supported regions

AI Content Awareness is available in the following regions: China (Beijing), China (Zhangjiakou), China (Hangzhou), China (Shanghai), China (Shenzhen), China (Chengdu), Singapore, and US (Virginia).

To use AI Content Awareness in Singapore or US (Virginia), submit a ticket to apply for activation.

Step 1: Create a bucket and upload objects

  1. Log on to the OSS console.

  2. On the Buckets page, click Create Bucket.

  3. Enter a bucket name relevant to your use case, such as videos-oss-metaquery. Keep the default settings for all other parameters.

  4. Click Create. On the success page, click View Bucket.

  5. On the Objects page, click Upload File > Scan Files. Select the video files to upload, such as VideoA.mp4, VideoB.mp4, and VideoC.mp4. Keep the default settings and click Upload.

Step 2: Enable AI Content Awareness

  1. In the left navigation pane, choose Object Management > Data Indexing.

  2. If this is your first time using Data Indexing, follow the on-screen instructions to grant permissions to the AliyunMetaQueryDefaultRole role. Click Enable Data Indexing.

  3. Select AISearch. In the AI Content Awareness section, choose the content type to analyze:

    Option                     Description
    Image Content Awareness    Analyzes image objects in the bucket
    Video Content Awareness    Analyzes video objects in the bucket

  4. (Optional) Configure object filtering rules to limit AI analysis to specific objects. You can add up to five rules. You can filter by prefix, object size, LastModifiedTime, or ObjectTag. Example: Add a prefix rule with the value videos/ to process only objects in the videos/ folder.

    If you enable object filtering, Data Indexing - AISearch fees and AI Content Awareness fees are charged based on the number of filtered objects only.

  5. Click Enable.

Building the index takes time. The duration depends on the number of objects in the bucket. Refresh the page to check the status.
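If you prefer to script the status check rather than refresh the console, a generic polling loop is one option. The sketch below is a minimal, SDK-independent helper. `fetch_phase` is a hypothetical callable that you would implement yourself, for example as a wrapper around the GetMetaQueryStatus API listed in the API reference. The phase names in the usage example are assumptions and should be checked against the GetMetaQueryStatus documentation.

```python
import time
from typing import Callable, Iterable


def wait_for_phase(
    fetch_phase: Callable[[], str],
    done_phases: Iterable[str],
    interval_s: float = 30.0,
    timeout_s: float = 6 * 3600,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_phase() until it returns one of done_phases or the timeout elapses."""
    done = set(done_phases)
    waited = 0.0
    while True:
        phase = fetch_phase()
        if phase in done:
            return phase
        if waited >= timeout_s:
            raise TimeoutError(f"index still in phase {phase!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Usage with a fake status source; the phase names are assumed, not confirmed.
fake_phases = iter(["FullScanning", "FullScanning", "IncrementalScanning"])
final = wait_for_phase(
    lambda: next(fake_phases),
    done_phases={"IncrementalScanning"},
    interval_s=1,
    sleep=lambda seconds: None,  # skip real sleeping in this demonstration
)
print(final)
```

Because initial indexing can take from several minutes to several hours, a polling interval of tens of seconds or more is usually sufficient.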

Step 3: Run a semantic search

After the index is built, query your media objects using natural language. The service is fully serverless — you do not need to manage resource scaling, the vectorization pipeline, or index storage.

Console

  1. On the Buckets page, click your bucket name.

  2. On the Objects page, confirm the uploaded objects are listed.

  3. In the left navigation pane, choose Object Management > Data Indexing.

  4. In the search box, enter a natural-language description such as a yard with a parked car. In the Media Type section, select Video. Click Query Now.

The search returns matching videos along with their content summaries.
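The returned content summaries can feed simple downstream features such as the "Today's Trending Keywords" use case mentioned earlier. The following is a rough sketch, assuming you have already collected the summaries into a list of strings. The function name, the stop-word list, and the sample summaries are all hypothetical.

```python
from collections import Counter

# Small illustrative stop-word list; a real application would use a fuller one.
STOP_WORDS = {"a", "an", "the", "in", "on", "at", "of", "with", "and", "is", "to"}


def trending_keywords(summaries: list[str], top_n: int = 3) -> list[tuple[str, int]]:
    """Count how many summaries mention each keyword and return the top_n."""
    counts: Counter = Counter()
    for summary in summaries:
        # Use a set so that a word is counted once per summary.
        words = {w.strip(".,!?\"'").lower() for w in summary.split()}
        counts.update(w for w in words if w and w not in STOP_WORDS)
    # Sort by descending count, then alphabetically, for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


sample_summaries = [
    "A car parked in a yard.",
    "A delivery truck enters the yard.",
    "A cat sits on a car in the yard.",
]
print(trending_keywords(sample_summaries, top_n=2))  # [('yard', 3), ('car', 2)]
```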

SDK

Java SDK

Available in Java SDK 3.18.2 and later. For the full API reference, see Vector Search (Java SDK V1).

All examples use MetaQueryMode.SEMANTIC to submit semantic search requests.

import com.aliyun.oss.*;
import com.aliyun.oss.common.auth.CredentialsProviderFactory;
import com.aliyun.oss.common.auth.EnvironmentVariableCredentialsProvider;
import com.aliyun.oss.common.comm.SignVersion;
import com.aliyun.oss.model.*;
import java.util.ArrayList;
import java.util.List;

public class DoMetaQuery {
    public static void main(String[] args) throws Exception {
        // Replace with the endpoint for your bucket's region.
        String endpoint = "https://oss-cn-hangzhou.aliyuncs.com";
        String bucketName = "examplebucket";
        // Credentials are read from the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables.
        EnvironmentVariableCredentialsProvider credentialsProvider = CredentialsProviderFactory.newEnvironmentVariableCredentialsProvider();
        String region = "cn-hangzhou";

        ClientBuilderConfiguration clientBuilderConfiguration = new ClientBuilderConfiguration();
        clientBuilderConfiguration.setSignatureVersion(SignVersion.V4);
        OSS ossClient = OSSClientBuilder.create()
                .endpoint(endpoint)
                .credentialsProvider(credentialsProvider)
                .clientConfiguration(clientBuilderConfiguration)
                .region(region)
                .build();

        try {
            int maxResults = 20;
            List<String> mediaTypes = new ArrayList<String>();
            mediaTypes.add("image");
            String query = "Snow";
            String simpleQuery = "{\"Operation\":\"gt\", \"Field\": \"Size\", \"Value\": \"30\"}";
            String sort = "Size";
            DoMetaQueryRequest doMetaQueryRequest = new DoMetaQueryRequest(
                    bucketName, maxResults, query, sort, MetaQueryMode.SEMANTIC, mediaTypes, simpleQuery);
            DoMetaQueryResult doMetaQueryResult = ossClient.doMetaQuery(doMetaQueryRequest);
        } catch (OSSException oe) {
            System.out.println("Error Message: " + oe.getErrorMessage());
            System.out.println("Error Code:    " + oe.getErrorCode());
            System.out.println("Request ID:    " + oe.getRequestId());
            System.out.println("Host ID:       " + oe.getHostId());
        } catch (ClientException ce) {
            System.out.println("Error Message: " + ce.getMessage());
        } finally {
            if (ossClient != null) {
                ossClient.shutdown();
            }
        }
    }
}

Python SDK

For the full API reference, see AISearch (Python SDK V2).

import argparse
import alibabacloud_oss_v2 as oss

parser = argparse.ArgumentParser(description="do meta query semantic sample")
parser.add_argument('--region', help='The region in which the bucket is located.', required=True)
parser.add_argument('--bucket', help='The name of the bucket.', required=True)
parser.add_argument('--endpoint', help='The domain names that other services can use to access OSS')

def main():
    args = parser.parse_args()

    # Credentials are read from the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables.
    credentials_provider = oss.credentials.EnvironmentVariableCredentialsProvider()

    cfg = oss.config.load_default()
    cfg.credentials_provider = credentials_provider
    cfg.region = args.region
    if args.endpoint is not None:
        cfg.endpoint = args.endpoint

    client = oss.Client(cfg)

    result = client.do_meta_query(oss.DoMetaQueryRequest(
        bucket=args.bucket,
        mode='semantic',
        meta_query=oss.MetaQuery(
            max_results=1000,
            query='An aerial view of a snow-covered forest',
            order='desc',
            media_types=oss.MetaQueryMediaTypes(
                media_type=['image']
            ),
            simple_query='{"Operation":"gt", "Field": "Size", "Value": "30"}',
        ),
    ))

    print(vars(result))

if __name__ == "__main__":
    main()

Go SDK

For the full API reference, see AISearch (Go SDK V2).

package main

import (
	"context"
	"flag"
	"log"

	"github.com/aliyun/alibabacloud-oss-go-sdk-v2/oss"
	"github.com/aliyun/alibabacloud-oss-go-sdk-v2/oss/credentials"
)

var (
	region     string
	bucketName string
)

func init() {
	flag.StringVar(&region, "region", "", "The region in which the bucket is located.")
	flag.StringVar(&bucketName, "bucket", "", "The name of the bucket.")
}

func main() {
	flag.Parse()

	if len(bucketName) == 0 {
		flag.PrintDefaults()
		log.Fatalf("invalid parameters, bucket name required")
	}
	if len(region) == 0 {
		flag.PrintDefaults()
		log.Fatalf("invalid parameters, region required")
	}

	// Credentials are read from the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables.
	cfg := oss.LoadDefaultConfig().
		WithCredentialsProvider(credentials.NewEnvironmentVariableCredentialsProvider()).
		WithRegion(region)

	client := oss.NewClient(cfg)

	request := &oss.DoMetaQueryRequest{
		Bucket: oss.Ptr(bucketName),
		Mode:   oss.Ptr("semantic"),
		MetaQuery: &oss.MetaQuery{
			MaxResults:  oss.Ptr(int64(99)),
			Query:       oss.Ptr("Overlook the snow-covered forest"),
			MediaType:   oss.Ptr("image"),
			SimpleQuery: oss.Ptr(`{"Operation":"gt", "Field": "Size", "Value": "30"}`),
		},
	}
	result, err := client.DoMetaQuery(context.TODO(), request)
	if err != nil {
		log.Fatalf("failed to do meta query %v", err)
	}

	log.Printf("do meta query result:%#v\n", result)
}

PHP SDK

For the full API reference, see AISearch (PHP SDK V2).

<?php

require_once __DIR__ . '/../vendor/autoload.php';

use AlibabaCloud\Oss\V2 as Oss;

$optsdesc = [
    "region"   => ['help' => 'The region in which the bucket is located.', 'required' => true],
    "endpoint" => ['help' => 'The domain names that other services can use to access OSS.', 'required' => false],
    "bucket"   => ['help' => 'The name of the bucket.', 'required' => true],
];

$longopts = \array_map(function ($key) {
    return "$key:";
}, array_keys($optsdesc));

$options = getopt("", $longopts);

foreach ($optsdesc as $key => $value) {
    if ($value['required'] === true && empty($options[$key])) {
        $help = $value['help'];
        echo "Error: the following arguments are required: --$key, $help" . PHP_EOL;
        exit(1);
    }
}

$region = $options["region"];
$bucket = $options["bucket"];

// Credentials are read from the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables.
$credentialsProvider = new Oss\Credentials\EnvironmentVariableCredentialsProvider();

$cfg = Oss\Config::loadDefault();
$cfg->setCredentialsProvider($credentialsProvider);
$cfg->setRegion($region);
if (isset($options["endpoint"])) {
    $cfg->setEndpoint($options["endpoint"]);
}

$client = new Oss\Client($cfg);

$request = new Oss\Models\DoMetaQueryRequest($bucket, new Oss\Models\MetaQuery(
    maxResults: 99,
    query: "Overlook the snow-covered forest",
    mediaTypes: new Oss\Models\MetaQueryMediaTypes('image'),
    simpleQuery: '{"Operation":"gt", "Field": "Size", "Value": "30"}',
), 'semantic');

$result = $client->doMetaQuery($request);
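// Use echo rather than printf so that a '%' in the dumped result is not parsed as a format specifier.
echo 'status code: ' . $result->statusCode . PHP_EOL .
    'request id: ' . $result->requestId . PHP_EOL .
    'result: ' . var_export($result, true) . PHP_EOL;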

ossutil

The following example queries objects in examplebucket that semantically match the description.

ossutil api do-meta-query --bucket examplebucket \
  --meta-query "{\"Query\":\"Overlooking the snow covered forest\",\"MediaTypes\":{\"MediaType\":\"video\"},\"SimpleQuery\":\"{\\\"Operation\\\":\\\"gt\\\", \\\"Field\\\": \\\"Size\\\", \\\"Value\\\": \\\"1\\\"}\"}" \
  --meta-query-mode semantic

For the full command reference, see do-meta-query.
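The nested escaping in the --meta-query value is easy to get wrong by hand, because SimpleQuery is itself a JSON document embedded as a string. One way to avoid hand-escaping is to generate the argument programmatically. The sketch below uses only the Python standard library and reproduces the field names from the command above. The helper name `build_meta_query_arg` is hypothetical.

```python
import json
import shlex


def build_meta_query_arg(query: str, media_type: str, field: str, operation: str, value: str) -> str:
    """Return the JSON for --meta-query, with SimpleQuery nested as a JSON-encoded string."""
    simple_query = json.dumps({"Operation": operation, "Field": field, "Value": value})
    return json.dumps({
        "Query": query,
        "MediaTypes": {"MediaType": media_type},
        "SimpleQuery": simple_query,
    })


arg = build_meta_query_arg("Overlooking the snow covered forest", "video", "Size", "gt", "1")

# shlex.quote wraps the JSON in single quotes so a POSIX shell passes it through unchanged.
command = (
    "ossutil api do-meta-query --bucket examplebucket "
    f"--meta-query {shlex.quote(arg)} --meta-query-mode semantic"
)
print(command)
```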

Billing

Using AI Content Awareness incurs three types of fees:

Fee type                     Description
Data indexing fees           Charged for Data Index - AISearch mode
AI Content Awareness fees    Charged per image or video object processed. Processing is triggered for existing objects when you enable the feature, and automatically for each newly uploaded object. We recommend that you monitor your bills closely.
API request fees             Charged per API call during initial indexing and incremental index updates

The following API operations trigger API request fees:

Operation                          API
Scan objects in a bucket           ListObjects
Build an index for objects         HeadObject and GetObject
Object in the bucket has a tag     GetObjectTag
Object has custom metadata         GetObjectMeta
Bucket contains a symbolic link    GetSymlink

To stop incurring charges, disable AISearch promptly.
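For a rough sense of how many API requests initial indexing might generate, the operations listed above can be turned into a back-of-the-envelope estimate. The sketch below rests on assumptions the documentation does not confirm: one HeadObject and one GetObject call per object, one extra call per tagged object, per object with custom metadata, and per symbolic link, and 1,000 objects per ListObjects call. Treat the output as an order-of-magnitude guide only, and rely on your actual bill for real figures.

```python
import math


def estimate_initial_index_requests(
    object_count: int,
    tagged_count: int = 0,
    custom_meta_count: int = 0,
    symlink_count: int = 0,
    list_page_size: int = 1000,  # assumed number of objects returned per ListObjects call
) -> dict[str, int]:
    """Rough per-API request counts for the initial scan, under the stated assumptions."""
    return {
        "ListObjects": math.ceil(object_count / list_page_size),
        "HeadObject": object_count,
        "GetObject": object_count,
        "GetObjectTag": tagged_count,
        "GetObjectMeta": custom_meta_count,
        "GetSymlink": symlink_count,
    }


estimate = estimate_initial_index_requests(2500, tagged_count=100)
print(estimate, "total:", sum(estimate.values()))
```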

API reference

AI Content Awareness is built on the following RESTful APIs. For applications with custom integration requirements, call these APIs directly (note that you must implement signature calculation manually).

Purpose                             API
Enable metadata management          OpenMetaQuery
Check metadata management status    GetMetaQueryStatus
Query objects by condition          DoMetaQuery
Disable metadata management         CloseMetaQuery
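If you call DoMetaQuery directly, you need to assemble the request body yourself. The sketch below builds one with the Python standard library. It assumes the body is XML with a MetaQuery root element and a MaxResults child. The Query, MediaTypes, MediaType, and SimpleQuery names mirror the ossutil example in this topic. Confirm the exact schema, the request URL, and the signing procedure against the DoMetaQuery reference. Request signing is not shown.

```python
import json
import xml.etree.ElementTree as ET
from typing import Optional


def build_do_meta_query_body(
    query: str,
    media_types: list[str],
    max_results: int,
    simple_query: Optional[dict] = None,
) -> bytes:
    """Serialize a semantic-search request body (assumed XML layout)."""
    root = ET.Element("MetaQuery")  # root element name is an assumption
    ET.SubElement(root, "MaxResults").text = str(max_results)
    ET.SubElement(root, "Query").text = query
    types_el = ET.SubElement(root, "MediaTypes")
    for media_type in media_types:
        ET.SubElement(types_el, "MediaType").text = media_type
    if simple_query is not None:
        # SimpleQuery carries a JSON document as text; ElementTree escapes it for XML.
        ET.SubElement(root, "SimpleQuery").text = json.dumps(simple_query)
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


body = build_do_meta_query_body(
    "An aerial view of a snow-covered forest",
    ["image"],
    max_results=20,
    simple_query={"Operation": "gt", "Field": "Size", "Value": "30"},
)
print(body.decode("utf-8"))
```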