
Object Storage Service: Best practices for OSS vector buckets: quickly build a multimodal semantic image search system

Last Updated: Mar 26, 2026

Use an OSS Vector Bucket to store, query, and manage vector data. Combine an OSS Vector Bucket with the multimodal embedding model from Alibaba Cloud Model Studio to build an intelligent semantic search system for massive image datasets. This solution enables text-to-image search based on natural language descriptions and is a best practice for scenarios like e-commerce product search, smart photo albums, media asset management, AI-powered semantic search, and image knowledge bases.

Solution overview

Building a multimodal image semantic search system involves the following steps:

  1. Prepare your environment: Obtain access credentials and install the OSS Python SDK and the Alibaba Cloud Model Studio SDK.

  2. Upload image data to OSS: Prepare the image dataset for search and upload it to an OSS bucket.

  3. Create an OSS Vector Bucket and a vector index: Set up a bucket and index to store the vector data.

  4. Generate and write vectors: Use the Alibaba Cloud Model Studio multimodal embedding model to convert images into high-dimensional vectors and write them to the OSS vector index.

  5. Perform a semantic search: Convert query text into a vector and perform a similarity search in the vector index, filtering the results with metadata.

  6. Build a visual demo: Create a web interface to visualize the semantic search results.

1. Prepare your environment

Obtain access credentials

Install the SDKs

  1. Install Python 3.12 or later.

  2. Run the following commands to install the Alibaba Cloud OSS Python SDK V2 and the Alibaba Cloud Model Studio SDK.

    pip install alibabacloud-oss-v2
    pip install dashscope

Configure environment variables

For security and portability, configure your access credentials as environment variables.

# Model Studio API Key
export DASHSCOPE_API_KEY=<your_api_key>

# OSS access credentials
export oss_test_access_key_id=<your_access_key_id>
export oss_test_access_key_secret=<your_access_key_secret>
export oss_test_region=<your_region, e.g., cn-hangzhou>
export oss_test_account_id=<your_alibaba_cloud_account_id>
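
The examples below read these variables at startup, so it helps to fail fast if any are missing. The following is an optional sketch (the variable names match the exports above; `missing_vars` is an illustrative helper, not part of either SDK):

```python
import os

# The credential variables exported above.
REQUIRED_VARS = [
    "DASHSCOPE_API_KEY",
    "oss_test_access_key_id",
    "oss_test_access_key_secret",
    "oss_test_region",
    "oss_test_account_id",
]

def missing_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Usage: fail fast before calling any API.
# missing = missing_vars()
# if missing:
#     raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
```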

2. Upload image data to OSS

Upload your local image data to an OSS bucket. The Model Studio embedding model requires access to these images through OSS file URLs to vectorize them. The following code demonstrates how to batch upload images from a local folder to a specified bucket.

# -*- coding: utf-8 -*-
"""
Example: Uploads images using the file upload manager

This example shows how to use the OSS SDK's file upload manager for efficient uploads.
This is suitable for large files or scenarios requiring resumable uploads.
"""

import os
import alibabacloud_oss_v2 as oss
from alibabacloud_oss_v2.models import PutObjectRequest


def create_oss_client():
    """Create an OSS client."""
    access_key_id = os.environ.get('oss_test_access_key_id')
    access_key_secret = os.environ.get('oss_test_access_key_secret')
    region = os.environ.get('oss_test_region')
    
    cfg = oss.config.load_default()
    cfg.credentials_provider = oss.credentials.StaticCredentialsProvider(
        access_key_id, access_key_secret
    )
    cfg.region = region
    return oss.Client(cfg)


def upload_with_uploader(client, bucket_name: str, local_path: str, oss_key: str):
    """
    Uploads a file using the upload manager.
    
    Args:
        client: The OSS client.
        bucket_name: The name of the OSS bucket.
        local_path: The local file path.
        oss_key: The OSS object key.
    """
    # Create an upload manager.
    uploader = client.uploader()
    
    # Execute the upload.
    result = uploader.upload_file(
        filepath=local_path,
        request=PutObjectRequest(
            bucket=bucket_name,
            key=oss_key
        )
    )
    return result


def main():
    client = create_oss_client()
    
    bucket_name = "your-bucket-name"
    # Note: The data/photograph/ directory in the GitHub repository linked at the end of this topic contains sample images you can use.
    # You can also modify the local_image_path variable to point to your own image directory.
    local_image_path = "data/photograph/"
    oss_prefix = "photograph/"
    
    image_files = os.listdir(local_image_path)
    print(f"Number of images to upload: {len(image_files)}")
    
    for i, image_name in enumerate(image_files, 1):
        local_path = os.path.join(local_image_path, image_name)
        oss_key = f"{oss_prefix}{image_name}"
        
        try:
            result = upload_with_uploader(client, bucket_name, local_path, oss_key)
            print(f"[{i}/{len(image_files)}] Upload successful: {image_name}, status: {result.status_code}")
        except Exception as e:
            print(f"[{i}/{len(image_files)}] Upload failed for {image_name}: {e}")
    
    print(f"\nUpload complete!")


if __name__ == "__main__":
    main()

3. Create a vector bucket and index

3.1 Create a vector bucket

Use the OSS SDK to create a vector bucket. This bucket stores all your vector data and vector indexes.

# -*- coding: utf-8 -*-
"""
Example: Create a vector bucket

This example shows how to create an OSS Vector Bucket.

Prerequisites:
1. alibabacloud-oss-v2 is installed: pip install alibabacloud-oss-v2
2. Environment variables are set (see the client initialization example).
"""

import os
import alibabacloud_oss_v2 as oss
import alibabacloud_oss_v2.vectors as oss_vectors


def main():
    # Obtain credentials from environment variables.
    access_key_id = os.environ.get('oss_test_access_key_id')
    access_key_secret = os.environ.get('oss_test_access_key_secret')
    region = os.environ.get('oss_test_region')
    account_id = os.environ.get('oss_test_account_id')
    
    # Initialize the client.
    cfg = oss.config.load_default()
    cfg.credentials_provider = oss.credentials.StaticCredentialsProvider(
        access_key_id, access_key_secret
    )
    cfg.region = region
    cfg.account_id = account_id
    client = oss_vectors.Client(cfg)
    
    # Specify the vector bucket name.
    vector_bucket_name = "my-test-2"
    
    print(f"Creating vector bucket: {vector_bucket_name}")
    
    try:
        # Create the vector bucket.
        result = client.put_vector_bucket(oss_vectors.models.PutVectorBucketRequest(
            bucket=vector_bucket_name,
        ))
        print(f"Creation successful!")
        print(f"  status code: {result.status_code}")
        print(f"  request id: {result.request_id}")
    except Exception as e:
        print(f"Creation failed: {e}")
        print("Note: This operation returns an error if the bucket already exists.")


if __name__ == "__main__":
    main()

3.2 Create a vector index

After creating a vector bucket, you must create a vector index to store and query vector data. An index defines the vector dimension and distance metric, which are essential for writing and retrieving vectors. Once the index is created, you can add vector data and its associated scalar metadata row by row.

Note: The dimension of the vector index must match the dimension of the embedding model used in Model Studio.
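
A dimension mismatch surfaces only later, when you write or query vectors, so it can help to assert the match up front. A minimal sketch (`check_dimension` is a hypothetical helper, not part of either SDK):

```python
def check_dimension(vector: list[float], expected: int = 1024) -> list[float]:
    """Raise early if an embedding does not match the index dimension.

    multimodal-embedding-v1 outputs 1,024-dimension vectors, which must equal
    the dimension parameter used when the vector index was created.
    """
    if len(vector) != expected:
        raise ValueError(
            f"Embedding has {len(vector)} dimensions, but the index expects {expected}."
        )
    return vector
```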

The following example creates a vector index with 1,024 dimensions and uses cosine distance as the distance metric.

# -*- coding: utf-8 -*-
"""
Example: Create a vector index

This example shows how to create a vector index in a vector bucket.

Prerequisites:
1. alibabacloud-oss-v2 is installed: pip install alibabacloud-oss-v2
2. Environment variables are set.
3. A vector bucket is created.
"""

import os
import alibabacloud_oss_v2 as oss
import alibabacloud_oss_v2.vectors as oss_vectors


def main():
    # Obtain credentials from environment variables.
    access_key_id = os.environ.get('oss_test_access_key_id')
    access_key_secret = os.environ.get('oss_test_access_key_secret')
    region = os.environ.get('oss_test_region')
    account_id = os.environ.get('oss_test_account_id')
    
    # Initialize the client.
    cfg = oss.config.load_default()
    cfg.credentials_provider = oss.credentials.StaticCredentialsProvider(
        access_key_id, access_key_secret
    )
    cfg.region = region
    cfg.account_id = account_id
    client = oss_vectors.Client(cfg)
    
    # Specify the configuration parameters.
    vector_bucket_name = "my-test-2"
    vector_index_name = "test1"
    dimension = 1024  # The output dimension of the Model Studio multimodal embedding model.
    
    print(f"Creating vector index:")
    print(f"  Bucket: {vector_bucket_name}")
    print(f"  Index: {vector_index_name}")
    print(f"  Dimension: {dimension}")
    
    # Create the vector index.
    result = client.put_vector_index(oss_vectors.models.PutVectorIndexRequest(
        bucket=vector_bucket_name,
        index_name=vector_index_name,
        dimension=dimension,
        data_type='float32',           # The vector data type.
        distance_metric='cosine',       # This example uses cosine distance.
        metadata={
            "nonFilterableMetadataKeys": ["key1", "key2"]  # Metadata fields to exclude from filtering.
        }
    ))   
    print(f"\nCreation successful!")
    print(f"  status code: {result.status_code}")
    print(f"  request id: {result.request_id}")

if __name__ == "__main__":
    main()

Parameters:

- dimension: The vector dimension. This must match the output dimension of the embedding model.
- data_type: The vector data type. Supported value: float32.
- distance_metric: The distance metric. Supported values: cosine and euclidean.
- metadata: The metadata configuration. Configure non-filterable metadata fields to store additional descriptive information that is not used in search filters.
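
To make the distance_metric choice concrete: with cosine distance, a smaller value means the vectors point in more similar directions, and it is commonly defined as 1 minus the cosine similarity. A plain-Python sketch for intuition only (the service computes the distance server-side):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance: 1 - (a . b) / (|a| * |b|).

    Range is [0, 2]: 0 means identical direction, 1 means orthogonal,
    2 means opposite direction.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)
```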

4. Generate and write vectors

OSS Vector Bucket supports writing vectors from any source, including Alibaba Cloud Model Studio or a self-hosted vectorization service. You can generate and write vectors by calling the Alibaba Cloud Model Studio SDK and the OSS SDK separately, or you can use the OSS-Vectors-Embed-CLI to perform both actions in a single command.

Using SDKs

This example uses the multimodal-embedding-v1 model from Alibaba Cloud Model Studio to convert raw images into 1,024-dimension vectors and write them to a vector index in an OSS Vector Bucket.

4.1 Generate vector data

import dashscope
from dashscope import MultiModalEmbeddingItemImage


def embedding_image(image_url: str) -> list[float]:
    """
    Convert an image to a vector.
    
    Args:
        image_url: The URL of the image. Both OSS URLs and public URLs are supported.
    
    Returns:
        A list of floats representing the 1,024-dimension vector.
    """
    resp = dashscope.MultiModalEmbedding.call(
        model="multimodal-embedding-v1",
        input=[MultiModalEmbeddingItemImage(image=image_url, factor=1.0)]
    )
    return resp.output["embeddings"][0]["embedding"]


def main():
    # The URL of the sample image. Replace it with an accessible URL. If the image is private, use a signed temporary URL.
    image_url = "http://your-bucket-name.oss-cn-hangzhou.aliyuncs.com/photograph/Zsd0YhBa8LM.jpg"
    
    print(f"Vectorizing the image: {image_url}")
    
    # Call the Embedding API.
    resp = dashscope.MultiModalEmbedding.call(
        model="multimodal-embedding-v1",
        input=[MultiModalEmbeddingItemImage(image=image_url, factor=1.0)]
    )
    
    # Print the full response.
    print("\nFull response:")
    print(resp)
    
    # Obtain the vector.
    embedding = resp.output["embeddings"][0]["embedding"]
    print(f"\nVector dimension: {len(embedding)}")
    print(f"First 10 elements of the vector: {embedding[:10]}")


if __name__ == "__main__":
    main()
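
The data.json file read in the next step pairs each embedding with a key and scalar metadata. Based on the data sample printed in section 4.2, each entry takes roughly the following shape (`build_vector_entry` is an illustrative helper, and the metadata fields shown are examples):

```python
def build_vector_entry(key: str, embedding: list[float], metadata: dict) -> dict:
    """Assemble one vector record in the layout used by the batch-write example:
    a key, a float32 vector payload, and optional scalar metadata."""
    return {
        "key": key,                          # For example, the image file name.
        "data": {"float32": embedding},      # The 1,024-dimension vector.
        "metadata": metadata,                # For example, {"city": "hangzhou", "height": "1024"}.
    }
```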

4.2 Write vector data

# -*- coding: utf-8 -*-
"""
Example: Batch write image vector data

This example shows how to batch write vectorized image data to a vector index.

Prerequisites:
1. alibabacloud-oss-v2 is installed: pip install alibabacloud-oss-v2
2. Environment variables are set.
3. A vector index is created.
4. The image vector data file (data/data.json) is prepared.
"""

import os
import json
import alibabacloud_oss_v2 as oss
import alibabacloud_oss_v2.vectors as oss_vectors


def main():
    # Obtain credentials from environment variables.
    access_key_id = os.environ.get('oss_test_access_key_id')
    access_key_secret = os.environ.get('oss_test_access_key_secret')
    region = os.environ.get('oss_test_region')
    account_id = os.environ.get('oss_test_account_id')
    
    # Initialize the client.
    cfg = oss.config.load_default()
    cfg.credentials_provider = oss.credentials.StaticCredentialsProvider(
        access_key_id, access_key_secret
    )
    cfg.region = region
    cfg.account_id = account_id
    client = oss_vectors.Client(cfg)
    
    # Specify the configuration parameters.
    vector_bucket_name = "my-test-2"
    vector_index_name = "test1"
    
    # Load pre-processed image vector data.
    # Note: The data/ directory in the GitHub repository linked at the end of this topic contains a sample file you can use.
    # You can also modify the data_file variable to point to your own data file.
    data_file = "./data/data.json"
    print(f"Loading image vector data: {data_file}")
    
    image_data_array = []
    with open(data_file, "r") as f:
        image_data_array = json.load(f)
    
    print(f"Loaded {len(image_data_array)} image vector data entries")
    
    # Print a data sample.
    if len(image_data_array) > 0:
        sample = image_data_array[0]
        print(f"\nData sample:")
        print(f"  key: {sample.get('key', 'N/A')}")
        if 'metadata' in sample:
            print(f"  metadata: {sample['metadata']}")
        if 'data' in sample and 'float32' in sample['data']:
            print(f"  Vector dimension: {len(sample['data']['float32'])}")
    
    # Batch write in chunks of 500.
    batch_size = 500
    vectors = []
    total_written = 0
    
    print(f"\nStarting batch write (batch_size={batch_size})...")
    
    for idx in range(len(image_data_array)):
        vectors.append(image_data_array[idx])
        
        if len(vectors) == batch_size:
            result = client.put_vectors(oss_vectors.models.PutVectorsRequest(
                bucket=vector_bucket_name,
                index_name=vector_index_name,
                vectors=vectors,
            ))
            total_written += len(vectors)
            print(f"  Wrote {total_written}/{len(image_data_array)} entries, "
                  f"status code: {result.status_code}")
            vectors = []
    
    # Write the remaining data.
    if len(vectors) > 0:
        result = client.put_vectors(oss_vectors.models.PutVectorsRequest(
            bucket=vector_bucket_name,
            index_name=vector_index_name,
            vectors=vectors,
        ))
        total_written += len(vectors)
        print(f"  Wrote {total_written}/{len(image_data_array)} entries, "
              f"status code: {result.status_code}")
    
    print(f"\nWrite complete! A total of {total_written} vector data entries were written.")


if __name__ == "__main__":
    main()
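
The batching loop above accumulates records and flushes every 500 entries so that each put_vectors request carries a bounded batch. The same pattern can be written as a reusable chunking helper (a sketch; the 500-entry batch size is taken from the example above):

```python
def chunked(items: list, size: int = 500):
    """Yield successive fixed-size slices of a list; the last slice may be smaller."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Usage sketch: write each chunk with one put_vectors call.
# for batch in chunked(image_data_array):
#     client.put_vectors(oss_vectors.models.PutVectorsRequest(
#         bucket=vector_bucket_name, index_name=vector_index_name, vectors=batch))
```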

Using the CLI

The OSS-Vectors-Embed-CLI calls the Alibaba Cloud Model Studio vector model to vectorize raw files in OSS or on your local machine and writes the resulting vectors to an OSS Vector Bucket. The tool also supports multimodal semantic search. It offers features such as batch processing, custom configurations, and writing scalar metadata.

oss-vectors-embed \
  --account-id <your-account-id> \                      // Your Alibaba Cloud account ID
  --vectors-region cn-hangzhou \                        // The region of the OSS Vector Bucket
  put \
  --region cn-hangzhou \                                // The region of the bucket that contains the source OSS files
  --vector-bucket-name my-vector-bucket \               // The name of the OSS Vector Bucket
  --index-name my-index \                               // The name of the vector index
  --model-id multimodal-embedding-v1 \                  // The embedding model to use
  --image "oss://bucket/path/*" \                       // Batch vectorize files under this OSS prefix and write the results
  --filename-as-key                                     // Set the vector key to the key of the source file

5. Perform a semantic search

Use natural language text to perform a vector semantic search or a hybrid search of vectors and scalars. You can use the OSS SDK or the OSS-Vectors-Embed-CLI command-line tool to find the most similar images in a vector index. The OSS-Vectors-Embed-CLI encapsulates the vectorization of query content and the similarity search, and supports multimodal semantic search scenarios such as text-to-image and image-to-image search. For more information, see OSS-Vectors-Embed-CLI Command-Line Tool.

5.1 Basic search

Vectorize a query text, such as "a dog", and search the index for the top-k most similar image vectors.

Using SDKs

# -*- coding: utf-8 -*-
"""
Example: Query vectors

This example shows how to convert query text into a vector and perform a similarity search.

Prerequisites:
1. alibabacloud-oss-v2 and dashscope are installed.
2. Environment variables are set.
3. Set a Model Studio API Key: export DASHSCOPE_API_KEY=<your_api_key>
4. Vector data has been written to the index.
"""

import os
import alibabacloud_oss_v2 as oss
import alibabacloud_oss_v2.vectors as oss_vectors
import dashscope
from dashscope import MultiModalEmbeddingItemText


def embedding(text: str) -> list[float]:
    """
    Vectorize text to convert a query text into a vector for text-to-image search.
    
    Args:
        text: The text to convert.
    
    Returns:
        A list of floats that represents the 1024-dimension vector.
    """
    return dashscope.MultiModalEmbedding.call(
        model="multimodal-embedding-v1",
        input=[MultiModalEmbeddingItemText(text=text, factor=1.0)]
    ).output["embeddings"][0]["embedding"]


def main():
    # Obtain credentials from environment variables.
    access_key_id = os.environ.get('oss_test_access_key_id')
    access_key_secret = os.environ.get('oss_test_access_key_secret')
    region = os.environ.get('oss_test_region')
    account_id = os.environ.get('oss_test_account_id')
    
    # Initialize the client.
    cfg = oss.config.load_default()
    cfg.credentials_provider = oss.credentials.StaticCredentialsProvider(
        access_key_id, access_key_secret
    )
    cfg.region = region
    cfg.account_id = account_id
    client = oss_vectors.Client(cfg)
    
    # Specify the configuration parameters.
    vector_bucket_name = "my-test-2"
    vector_index_name = "test1"
    
    # Specify the query text.
    query_text = "a dog"
    
    print(f"Performing vector search:")
    print(f"  Bucket: {vector_bucket_name}")
    print(f"  Index: {vector_index_name}")
    print(f"  Query text: {query_text}")
    
    # Convert query text to a vector.
    print(f"\nConverting query text to a vector...")
    query_vector = embedding(query_text)
    print(f"  Vector dimension: {len(query_vector)}")
    
    # Execute vector search.
    print(f"\nExecuting vector search ...")
    result = client.query_vectors(oss_vectors.models.QueryVectorsRequest(
        bucket=vector_bucket_name,
        index_name=vector_index_name,
        query_vector={
            "float32":query_vector
        },
        top_k=5,                    # Return the top 5 most similar results.
        return_distance=True,       # Return the distance.
        return_metadata=True,       # Return the metadata.
    ))
    
    print(f"\nSearch results (total {len(result.vectors)}):")
    for i, vector in enumerate(result.vectors, 1):
        print(f"\n  [{i}] key: {vector.get('key', 'N/A')}")
        if 'distance' in vector:
            print(f"      distance: {vector['distance']:.6f}")
        if 'metadata' in vector:
            print(f"      metadata: {vector['metadata']}")
    


if __name__ == "__main__":
    main()

Using the CLI

oss-vectors-embed \
  --account-id <your-account-id> \
  --vectors-region cn-hangzhou \
  query \
  --vector-bucket-name my-vector-bucket \
  --index-name my-index \
  --model-id multimodal-embedding-v1 \
  --text-value "a dog" \               // The query text.
  --top-k 100                        // Return the 100 most similar vector results.
{
  "results": [
    {
      "Key": "myimage03.jpg",                                        // A vector result containing the vector key and scalar metadata.
      "metadata": {
        "OSSVECTORS-EMBED-SRC-CONTENT-TYPE": "TEXT",
        "OSSVECTORS-EMBED-SRC-LOCATION":  "./images/photo.jpg",     // The CLI automatically adds source information fields (OSSVECTORS-EMBED-SRC-*) to the vector's row in the vector bucket to trace the vector's origin.
      }
    },
    ...
  ],
  "summary": {
    "queryType": "text",
    "model": "multimodal-embedding-v1",
    "index": "my-index",
    "resultsFound": 100,
    "queryDimensions": 1024
  }
}

5.2 Search with filters

When you perform a vector similarity search, you can apply a precise filter based on image metadata, such as city and height, to narrow the search scope. Vector search supports metadata filtering using operators such as $in, $and, and $or.

Using SDKs

Vectorize a query text, such as "a dog", and search the index for the top-k most similar image vectors, while applying various scalar metadata filters.

# -*- coding: utf-8 -*-
"""
Example: Advanced vector query

This example demonstrates advanced vector search techniques, including complex filters and multiple query examples.

Prerequisites:
1. alibabacloud-oss-v2 and dashscope are installed.
2. Environment variables are set.
3. Set a Model Studio API Key: export DASHSCOPE_API_KEY=<your_api_key>
4. Vector data has been written to the index.
"""

import os
import alibabacloud_oss_v2 as oss
import alibabacloud_oss_v2.vectors as oss_vectors
import dashscope
from dashscope import MultiModalEmbeddingItemText


def embedding(text: str) -> list[float]:
    """Vectorize text to convert a query text into a vector for text-to-image search."""
    return dashscope.MultiModalEmbedding.call(
        model="multimodal-embedding-v1",
        input=[MultiModalEmbeddingItemText(text=text, factor=1.0)]
    ).output["embeddings"][0]["embedding"]


def create_client():
    """Create an OSS vector client."""
    access_key_id = os.environ.get('oss_test_access_key_id')
    access_key_secret = os.environ.get('oss_test_access_key_secret')
    region = os.environ.get('oss_test_region')
    account_id = os.environ.get('oss_test_account_id')
    
    cfg = oss.config.load_default()
    cfg.credentials_provider = oss.credentials.StaticCredentialsProvider(
        access_key_id, access_key_secret
    )
    cfg.region = region
    cfg.account_id = account_id
    return oss_vectors.Client(cfg)


def query_with_filter(client, bucket, index, query_text, filter_body, top_k=5):
    """Perform a vector search with a filter."""
    result = client.query_vectors(oss_vectors.models.QueryVectorsRequest(
        bucket=bucket,
        index_name=index,
        query_vector={"float32": embedding(query_text)},
        filter=filter_body,
        top_k=top_k,
        return_distance=True,
        return_metadata=True,
    ))
    return result.vectors


def main():
    client = create_client()
    
    vector_bucket_name = "my-test-2"
    vector_index_name = "test1"
    
    print("=" * 60)
    print("Advanced Vector Search Examples")
    print("=" * 60)
    
    # Example 1: Use the $in operator to match multiple cities.
    print("\n[Example 1] Use the $in operator to search for images in Hangzhou or Shanghai.")
    print("-" * 40)
    filter_in = {
        "city": {"$in": ["hangzhou", "shanghai"]}
    }
    results = query_with_filter(client, vector_bucket_name, vector_index_name, 
                                "cityscape", filter_in)
    print(f"Query: 'cityscape', Filter: city in ['hangzhou', 'shanghai']")
    print(f"Number of results: {len(results)}")
    for v in results[:3]:
        print(f"  - {v.get('key')}: {v.get('metadata', {}).get('city', 'N/A')}")
    
    # Example 2: Use the $and operator to combine multiple conditions.
    print("\n[Example 2] Use the $and operator to combine multiple filters.")
    print("-" * 40)
    filter_and = {
        "$and": [
            {"city": {"$in": ["hangzhou", "shanghai"]}},
            {"height": {"$in": ["1024"]}}
        ]
    }
    results = query_with_filter(client, vector_bucket_name, vector_index_name,
                                "skyscrapers", filter_and)
    print(f"Query: 'skyscrapers', Filter: city in [hangzhou, shanghai] AND height=1024")
    print(f"Number of results: {len(results)}")
    for v in results[:3]:
        meta = v.get('metadata', {})
        print(f"  - {v.get('key')}: city={meta.get('city')}, height={meta.get('height')}")
    
    # Example 3: Compare semantic search results for different query texts.
    print("\n[Example 3] Compare semantic search results for different query texts.")
    print("-" * 40)
    query_texts = ["a dog", "sunset by the sea", "city night view", "food"]
    
    for qt in query_texts:
        results = query_with_filter(client, vector_bucket_name, vector_index_name,
                                    qt, None, top_k=3)
        print(f"\nQuery: '{qt}'")
        for i, v in enumerate(results, 1):
            print(f"  [{i}] {v.get('key')}, distance: {v.get('distance', 0):.4f}")


if __name__ == "__main__":
    main()

Using the CLI

# AND: Both conditions must be met.
oss-vectors-embed \
  --account-id <your-account-id> \
  --vectors-region cn-hangzhou \
  query \
  --vector-bucket-name my-vector-bucket \
  --index-name my-index \
  --model-id multimodal-embedding-v1 \
  --text-value "a dog" \
  --filter '{
        "$and": [
            {"city": {"$in": ["hangzhou", "shanghai"]}},
            {"height": {"$in": ["1024"]}}
        ]
    }' \                                                // The combined filter.
  --top-k 5
{
  "results": [
    {
      "Key": "fd91808c-8d7c-480e-a72b-2bfa7d313a80",
      "metadata": {
        "OSSVECTORS-EMBED-SRC-CONTENT-TYPE": "IMAGE",
        "author": "admin",
        "city": "hangzhou",
        "OSSVECTORS-EMBED-SRC-CONTENT": "a dog",
        "height": "1024",
        "OSSVECTORS-EMBED-SRC-LOCATION": "./images/photo.jpg"
      }
    },
    ...
  ],
  "summary": {
    "queryType": "text",
    "model": "multimodal-embedding-v1",
    "index": "my-index",
    "resultsFound": 5,
    "queryDimensions": 1024
  }
}

Filters:

- $in: Matches any value in a list. Example: {"city": {"$in": ["hangzhou", "beijing"]}}
- $and: Logical AND. Example: {"$and": [condition1, condition2]}
- $or: Logical OR. Example: {"$or": [condition1, condition2]}

Complex filter:

{
    "$and": [
        {"city": {"$in": ["hangzhou", "shanghai"]}},
        {
            "$or": [
                {"height": "1024"},
                {"height": "768"}
            ]
        }
    ]
}
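
Filters like the one above can also be composed programmatically instead of hand-writing nested JSON. A small sketch (these helper names are illustrative, not part of the SDK):

```python
def in_(field: str, values: list) -> dict:
    """Match any value in a list: {field: {"$in": [...]}}."""
    return {field: {"$in": list(values)}}

def and_(*conditions: dict) -> dict:
    """Logical AND of conditions."""
    return {"$and": list(conditions)}

def or_(*conditions: dict) -> dict:
    """Logical OR of conditions."""
    return {"$or": list(conditions)}

# Rebuilds the complex filter shown above.
complex_filter = and_(
    in_("city", ["hangzhou", "shanghai"]),
    or_({"height": "1024"}, {"height": "768"}),
)
```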

6. Build a visual search interface

To better visualize the search results, you can use Gradio to build a simple web interface. This interface provides an interactive search experience with a text input field, filter options, and an image result display.

  1. Install the web UI framework.

    pip install gradio==5.44.1
  2. Save the following code as gradio_app.py.

    # -*- coding: utf-8 -*-
    
    import json
    import logging
    import os
    
    import alibabacloud_oss_v2 as oss
    import alibabacloud_oss_v2.vectors as oss_vectors
    import dashscope
    import gradio as gr
    from PIL import Image
    from dashscope import MultiModalEmbeddingItemText
    
    logging.basicConfig(level=logging.INFO)
    
    logger = logging.getLogger(__name__)
    
    
    class Util:
        access_key_id = os.environ.get('oss_test_access_key_id')
        access_key_secret = os.environ.get('oss_test_access_key_secret')
        region = os.environ.get('oss_test_region')
        account_id = os.environ.get('oss_test_account_id')
    
        cfg = oss.config.load_default()
        cfg.credentials_provider = oss.credentials.StaticCredentialsProvider(access_key_id, access_key_secret)
        cfg.region = region
        cfg.account_id = account_id
        client = oss_vectors.Client(cfg)
    
        vector_bucket_name = "my-test-2"
        vector_index_name = "test1"
        dimension = 1024
    
        @staticmethod
        def embedding(text) -> list[float]:
            return dashscope.MultiModalEmbedding.call(
                model="multimodal-embedding-v1",
                input=[MultiModalEmbeddingItemText(text=text, factor=1.0)]
            ).output["embeddings"][0]["embedding"]
    
        @staticmethod
        def query_text(text: str, top_k: int = 5, city: list[str] = None, height: list[str] = None, return_meta: bool = True, return_distance: bool = True) -> list[tuple[Image.Image, str]]:
            logger.info(f"search text:{text}, top_k:{top_k}, city:{city}, height:{height}")
    
            sub_filter = []
            if city is not None and len(city) > 0:
                sub_filter.append({"city": {"$in": city}})
            if height is not None and len(height) > 0:
                sub_filter.append({"height": {"$in": height}})
            if len(sub_filter) > 0:
                filter_body = {"$and": sub_filter}
            else:
                filter_body = None
    
            result = Util.client.query_vectors(oss_vectors.models.QueryVectorsRequest(
                bucket=Util.vector_bucket_name,
                index_name=Util.vector_index_name,
                query_vector={
                    "float32": Util.embedding(text)
                },
                filter=filter_body,
                top_k=top_k,
                return_distance=return_distance,
                return_metadata=return_meta,
            ))
    
            gallery_data = []
            current_dir = os.path.dirname(os.path.abspath(__file__))
            # The web interface depends on local image files. Ensure you have prepared the image resources according to the repository structure.
            # - By default, the system uses images from the data/photograph/ directory in the repository linked at the end of this topic. The web interface reads and displays these files.
            # - To use your own images, place them in a different directory and update the path variable below to point to it.
            for vector in result.vectors:
                file_path = os.path.join(current_dir, "data/photograph/", vector["key"])
                img = Image.open(file_path)
                gallery_data.append((img, json.dumps(vector)))
            ret = gallery_data
            logger.info(f"search text:{text}, top_k:{top_k}, request_id:{result.request_id}, ret:{ret}")
            return ret
    
        @staticmethod
        def on_gallery_box_select(evt: gr.SelectData):
            result = ""
            img_data = evt.value["caption"]
            img_data = json.loads(img_data)
            for key in img_data:
                img_data_item = img_data[key]
                if type(img_data_item) is str:
                    img_data_item = img_data_item.replace("\n", "\\n").replace("\t", "\\t").replace("\r", "\\r")
                if type(img_data_item) is dict:
                    for sub_key in img_data_item:
                        img_data_item[sub_key] = img_data_item[sub_key].replace("\n", "\\n").replace("\t", "\\t").replace("\r", "\\r")
                        result += f' - **{sub_key}**: &nbsp; {img_data_item[sub_key]}\r\n'
                    continue
                result += f' - **{key}**: &nbsp; {img_data_item}\r\n'
            return result
    
    
    with gr.Blocks(title="OSS Demo") as demo:
        with gr.Tab("OSS QueryVector Image Demo") as search_tab:
            with gr.Row():
                query_text_box = gr.Textbox(label='query_text', interactive=True, value="a dog")
                top_k_box = gr.Slider(minimum=1, maximum=30, value=10, step=1, label='top_k', interactive=True)
                with gr.Column():
                    return_meta_box = gr.Checkbox(label='return_meta', interactive=True, value=True)
                    return_distance_box = gr.Checkbox(label='return_distance', interactive=True, value=True)
            with gr.Row():
                city_box = gr.Dropdown(label='city', multiselect=True, choices=["hangzhou", "shanghai", "beijing", "shenzhen", "guangzhou"])
                height_box = gr.Dropdown(label='height', multiselect=True, choices=["1024", "683", "768", "576"])
            with gr.Row():
                query_button = gr.Button(value="query", variant='primary')
            with gr.Row():
                with gr.Column(scale=8):
                    gallery_box = gr.Gallery(columns=5, show_label=False, preview=False, allow_preview=False, visible=True, show_download_button=False)
                with gr.Column(scale=2):
                    with gr.Row(variant="panel"):
                        md_box = gr.Markdown(visible=True, elem_classes="image_detail")
                gallery_box.select(Util.on_gallery_box_select, [], [md_box])
            query_button.click(
                Util.query_text,
                inputs=[
                    query_text_box,
                    top_k_box,
                    city_box,
                    height_box,
                    return_meta_box,
                    return_distance_box
                ],
                outputs=[
                    gallery_box,
                ],
                concurrency_limit=1,
            )
    
    if __name__ == "__main__":
        demo.launch(server_name="0.0.0.0", server_port=7860)
  3. Start the interface.

    python gradio_app.py

    After the application starts, open http://localhost:7860 in your browser to use the search interface. Search example: Enter "a dog" to return images of dogs.

    Feature descriptions:

    - query_text: Enter a natural language description, such as "a dog" or "a mountain peak".
    - top_k: Set the number of results to return (1 to 30).
    - city: Filter by city. Multiple selections are supported.
    - height: Filter by image height. Multiple selections are supported.
    - return_meta: Specifies whether to return metadata.
    - return_distance: Specifies whether to return the similarity distance.

Related documents