Use the API operations of AnalyticDB for PostgreSQL to search for images

Last Updated: Mar 28, 2026

AnalyticDB for PostgreSQL provides API operations for vector-based image search, covering image upload, upload progress tracking, and search by text or image. This guide walks through each operation with Python code examples.

How it works

Vector-based image search represents images as multi-dimensional vectors and finds matches based on similarity, not keywords.

  1. Extract visual features (color, shape, texture) from images and convert them into multi-dimensional vectors.

  2. Store the vectors in AnalyticDB for PostgreSQL and build an index for fast retrieval.

  3. When a query arrives (text or image), convert it into a feature vector and find the closest vectors using Euclidean distance or cosine similarity.

  4. Return results ranked by similarity score.

AnalyticDB for PostgreSQL integrates vectorization algorithms and vector search capabilities, so you can focus on building the application rather than the underlying infrastructure.
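
To make the ranking step concrete, here is a minimal sketch of similarity ranking in plain Python. It is illustrative only: the service computes embeddings and performs the search internally, and the three-dimensional vectors, file names, and function names below are made up for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between two vectors; smaller means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_by_cosine(query, stored, top_k=3):
    """Return the top_k (name, score) pairs, highest similarity first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in stored.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

# Hypothetical feature vectors; real embeddings have hundreds of dimensions.
stored_vectors = {
    "dog.jpg": [0.9, 0.1, 0.0],
    "cat.jpg": [0.7, 0.3, 0.1],
    "bicycle.jpg": [0.0, 0.2, 0.9],
}
query_vector = [1.0, 0.0, 0.0]

for name, score in rank_by_cosine(query_vector, stored_vectors, top_k=2):
    print(f"{name}: {score:.3f}")
```

A production index avoids comparing the query against every stored vector, which is why step 2 builds an index before any search runs.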

Prerequisites

Before you begin, ensure that you have:

Before uploading images

Complete the following setup before uploading any images:

  1. Prepare your image data (cleansed and preprocessed) and create a vector index. See Create a vector index.

  2. Create a namespace for your data, or use an existing one. See Create a namespace.

  3. Create a collection in the namespace, or use an existing one. See CreateDocumentCollection.

When calling CreateDocumentCollection, set the EmbeddingModel parameter to specify the vectorization algorithm for the collection.

Upload images

All upload operations use the UploadDocumentAsync API operation, which is asynchronous. After the call returns, use the job ID to track progress (see Check upload progress).

Supported image formats: .bmp, .jpg, .jpeg, .png, .tiff.
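
Checking the extension locally before calling the API can save a failed upload job. The helper below is a sketch, not part of the SDK; the function name is hypothetical, and treating extensions as case-insensitive is an assumption that the service may not share.

```python
from pathlib import Path

# Formats listed in this guide.
SUPPORTED_IMAGE_EXTENSIONS = {".bmp", ".jpg", ".jpeg", ".png", ".tiff"}

def is_supported_image(file_name: str) -> bool:
    """Return True if file_name ends with one of the supported extensions.

    Assumption: the comparison ignores case, so "PHOTO.JPG" is accepted.
    """
    return Path(file_name).suffix.lower() in SUPPORTED_IMAGE_EXTENSIONS

for candidate in ["photo.jpg", "scan.TIFF", "notes.txt"]:
    print(candidate, is_supported_image(candidate))
```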

Initialize the client

All examples in this section use the same client initialization. Create the client once and reuse it across operations:

# -*- coding: utf-8 -*-
import os
from alibabacloud_gpdb20160503.client import Client as gpdb20160503Client
from alibabacloud_tea_openapi import models as open_api_models
from alibabacloud_gpdb20160503 import models as gpdb_20160503_models
from alibabacloud_tea_util import models as util_models

def create_client() -> gpdb20160503Client:
    # Credentials are read from environment variables—never hardcode them.
    config = open_api_models.Config(
        access_key_id=os.environ["ALIBABA_CLOUD_ACCESS_KEY_ID"],
        access_key_secret=os.environ["ALIBABA_CLOUD_ACCESS_KEY_SECRET"]
    )
    config.endpoint = "gpdb.aliyuncs.com"
    return gpdb20160503Client(config)
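
Because create_client() reads os.environ directly, a missing variable surfaces as a bare KeyError. A small pre-flight check, sketched below, can report the problem more clearly. The helper name is hypothetical; only the two variable names come from the example above.

```python
import os

REQUIRED_ENV_VARS = (
    "ALIBABA_CLOUD_ACCESS_KEY_ID",
    "ALIBABA_CLOUD_ACCESS_KEY_SECRET",
)

def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty.

    env defaults to os.environ; pass a dict to check other settings.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]

# Demo against an explicit dict so the output does not depend on the machine.
example_env = {"ALIBABA_CLOUD_ACCESS_KEY_ID": "example-id"}
print("Missing:", missing_credentials(example_env))
```

In a script, call missing_credentials() with no arguments before create_client() and stop early if the returned list is not empty.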

Upload a single image

Use UploadDocumentAsyncAdvanceRequest for local files (pass a file object) and UploadDocumentAsyncRequest for remote images (pass a URL string). The remaining parameters are identical.

Upload a local image:

client = create_client()
with open("<image_file_path>", "rb") as f:
    # image_file_path: absolute path to the local image file
    request = gpdb_20160503_models.UploadDocumentAsyncAdvanceRequest(
        region_id="<your-instance-region-id>",
        dbinstance_id="<your-instance-id>",
        namespace="<your-namespace-name>",
        namespace_password="<your-namespace-password>",
        collection="<your-collection-name>",
        file_name="<filename-with-extension>",  # e.g., photo.jpg
        file_url_object=f,
        dry_run=False,
        metadata={"caption": "sample image", "category": "nature"},  # dict format
    )
    runtime = util_models.RuntimeOptions()
    try:
        response = client.upload_document_async_advance(request, runtime)
        print("Job ID:", response.body.job_id)
    except Exception as error:
        print(error)

Upload a remote image:

client = create_client()
request = gpdb_20160503_models.UploadDocumentAsyncRequest(
    region_id="<your-instance-region-id>",
    dbinstance_id="<your-instance-id>",
    namespace="<your-namespace-name>",
    namespace_password="<your-namespace-password>",
    collection="<your-collection-name>",
    file_name="<filename-with-extension>",  # e.g., photo.jpg
    file_url="<image_file_url>",            # publicly accessible URL
    dry_run=False,
    metadata={"caption": "sample image", "category": "nature"},  # dict format
)
runtime = util_models.RuntimeOptions()
try:
    response = client.upload_document_async_with_options(request, runtime)
    print("Job ID:", response.body.job_id)
except Exception as error:
    print(error)

Parameter reference:

| Parameter | Description |
| --- | --- |
| region_id | The region ID of the AnalyticDB for PostgreSQL instance |
| dbinstance_id | The ID of the AnalyticDB for PostgreSQL instance |
| namespace | The name of the namespace |
| namespace_password | The password of the namespace |
| collection | The name of the collection |
| file_name | The image file name, including the extension (.bmp, .jpg, .jpeg, .png, or .tiff) |
| file_url_object | (Local) The file object opened in binary read mode |
| file_url | (Remote) The URL of the remote image |
| metadata | Metadata for the image, in dict format. Fields you add here (e.g., caption, category) are returned in search results. |
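
When image sources arrive as a mixed list of paths and URLs, a script has to decide which of the two request types to build. The sketch below shows one way to make that decision. The function name is hypothetical, and it assumes remote images are always served over http or https.

```python
from urllib.parse import urlparse

def is_remote_source(source: str) -> bool:
    """Return True for http(s) URLs, False for anything else (local paths).

    Assumption: remote images use the http or https scheme only.
    """
    return urlparse(source).scheme in ("http", "https")

for source in ["https://example.com/photo.jpg", "/data/images/photo.jpg"]:
    kind = "UploadDocumentAsyncRequest" if is_remote_source(source) \
        else "UploadDocumentAsyncAdvanceRequest"
    print(f"{source} -> {kind}")
```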

Upload multiple images

To upload multiple images at once, pack them into a compressed archive and upload the archive. The UploadDocumentAsync operation extracts and processes each image in the archive.

client = create_client()
with open("<compress_file_path>", "rb") as f:
    # compress_file_path: absolute path to the local archive (.tar, .gz, or .zip)
    request = gpdb_20160503_models.UploadDocumentAsyncAdvanceRequest(
        region_id="<your-instance-region-id>",
        dbinstance_id="<your-instance-id>",
        namespace="<your-namespace-name>",
        namespace_password="<your-namespace-password>",
        collection="<your-collection-name>",
        file_name="<archive-filename-with-extension>",  # e.g., images.zip
        file_url_object=f,
        dry_run=False,
        metadata={"batch": "upload-batch-1"},
    )
    runtime = util_models.RuntimeOptions()
    try:
        response = client.upload_document_async_advance(request, runtime)
        print("Job ID:", response.body.job_id)
    except Exception as error:
        print(error)

Important: Each compressed archive can contain up to 100 images. Supported compression formats: TAR, GZ, and ZIP.
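
For collections larger than the per-archive limit, the images have to be split across several archives. The following sketch uses only the Python standard library to write ZIP files of at most 100 images each. The function name and archive naming scheme are hypothetical, and the demo creates placeholder files rather than real images.

```python
import tempfile
import zipfile
from pathlib import Path

MAX_IMAGES_PER_ARCHIVE = 100  # per-archive limit stated in this guide

def pack_images(image_paths, output_dir, prefix="images",
                batch_size=MAX_IMAGES_PER_ARCHIVE):
    """Write the images into ZIP archives of at most batch_size files each.

    Files are stored under their base name, so inputs that share a file
    name but live in different directories would collide inside an archive.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    paths = [Path(p) for p in image_paths]
    archives = []
    for start in range(0, len(paths), batch_size):
        archive_path = output_dir / f"{prefix}-{start // batch_size + 1:03d}.zip"
        with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
            for path in paths[start:start + batch_size]:
                zf.write(path, arcname=path.name)
        archives.append(archive_path)
    return archives

# Demo: 250 placeholder files become three archives.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    sources = []
    for i in range(250):
        placeholder = tmp_dir / f"img_{i:03d}.png"
        placeholder.write_bytes(b"placeholder")
        sources.append(placeholder)
    demo_counts = []
    for archive in pack_images(sources, tmp_dir / "out"):
        with zipfile.ZipFile(archive) as zf:
            demo_counts.append(len(zf.namelist()))

print(demo_counts)  # [100, 100, 50]
```

Each resulting archive can then be uploaded with its own UploadDocumentAsync call, which returns a separate job ID per archive.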

Check upload progress

UploadDocumentAsync is asynchronous. Poll GetUploadDocumentJob with the job ID until the status is Success.

client = create_client()
request = gpdb_20160503_models.GetUploadDocumentJobRequest(
    region_id="<your-instance-region-id>",
    dbinstance_id="<your-instance-id>",
    namespace="<your-namespace-name>",
    namespace_password="<your-namespace-password>",
    collection="<your-collection-name>",
    job_id="<job_id>",  # job_id returned by UploadDocumentAsync
)
runtime = util_models.RuntimeOptions()
try:
    response = client.get_upload_document_job_with_options(request, runtime)
    print("Status:", response.body.job.status)
except Exception as error:
    print(error)

When job.status returns Success, all images in the upload job are indexed and ready for search. For more information, see GetUploadDocumentJob.
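
The example above checks the status once. A polling loop around it might look like the sketch below. The helper takes any zero-argument function that returns the current status string, so it can be tried without a live instance. Only the Success status comes from this guide; the failure status names, the interval, and the timeout are assumptions to adjust against the GetUploadDocumentJob reference.

```python
import time

def wait_for_job(fetch_status, interval_seconds=2.0, timeout_seconds=300.0,
                 failure_statuses=("Failed", "Cancelled"),  # assumed names
                 sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns "Success", fails, or times out."""
    deadline = clock() + timeout_seconds
    while True:
        status = fetch_status()
        if status == "Success":
            return status
        if status in failure_statuses:
            raise RuntimeError(f"Upload job ended with status {status!r}")
        if clock() >= deadline:
            raise TimeoutError(
                f"Upload job still {status!r} after {timeout_seconds} s")
        sleep(interval_seconds)

# Demo with a stand-in status source instead of a real API call.
# "Pending" and "Running" are placeholder values for non-final states.
fake_statuses = iter(["Pending", "Running", "Success"])
final_status = wait_for_job(lambda: next(fake_statuses), sleep=lambda _: None)
print("Final status:", final_status)
```

With the real client, fetch_status would wrap the call shown earlier, for example a lambda that returns client.get_upload_document_job_with_options(request, runtime).body.job.status.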

Search for images

Search by text

QueryContent accepts a text string, converts it into a feature vector, and returns the top-k most similar images.

# -*- coding: utf-8 -*-
import os
from urllib.request import urlopen
from PIL import Image
from alibabacloud_gpdb20160503.client import Client as gpdb20160503Client
from alibabacloud_tea_openapi import models as open_api_models
from alibabacloud_gpdb20160503 import models as gpdb_20160503_models
from alibabacloud_tea_util import models as util_models

client = create_client()  # create_client() defined in the "Initialize the client" section

request = gpdb_20160503_models.QueryContentRequest(
    region_id="<your-instance-region-id>",
    dbinstance_id="<your-instance-id>",
    namespace="<your-namespace-name>",
    namespace_password="<your-namespace-password>",
    collection="<your-collection-name>",
    content="Dog",  # the text query
    top_k=3,
)
runtime = util_models.RuntimeOptions()
try:
    response = client.query_content_with_options(request, runtime)
    if response.status_code != 200:
        raise Exception(f"QueryContent failed: {response.body}")

    for match in response.body.matches.match_list:
        url = match.file_url
        caption = match.metadata.get("caption")
        print(f"URL: {url}, Caption: {caption}")
        Image.open(urlopen(url)).show()
except Exception as error:
    print(error)

Each item in match_list has a file_url pointing to the matched image and a metadata dict containing fields you set during upload (such as caption or category).
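
If some images were uploaded without metadata, calling .get() on a missing dict would raise an error. The sketch below flattens matches into plain dicts while tolerating absent fields. The function name is hypothetical, whether the service ever omits metadata is an assumption, and SimpleNamespace objects stand in for real response items.

```python
from types import SimpleNamespace

def summarize_matches(match_list):
    """Convert match objects into dicts with 'url' and 'caption' keys.

    Missing metadata or a missing caption yields an empty caption string.
    """
    rows = []
    for match in match_list or []:
        metadata = getattr(match, "metadata", None) or {}
        rows.append({
            "url": getattr(match, "file_url", None),
            "caption": metadata.get("caption", ""),
        })
    return rows

# Stand-ins for items of response.body.matches.match_list.
sample_matches = [
    SimpleNamespace(file_url="https://example.com/dog1.jpg",
                    metadata={"caption": "sample image"}),
    SimpleNamespace(file_url="https://example.com/dog2.jpg", metadata=None),
]
for row in summarize_matches(sample_matches):
    print(row)
```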

Parameter reference:

| Parameter | Description |
| --- | --- |
| region_id | The region ID of the AnalyticDB for PostgreSQL instance |
| dbinstance_id | The ID of the AnalyticDB for PostgreSQL instance |
| namespace | The name of the namespace |
| namespace_password | The password of the namespace |
| collection | The name of the collection |
| content | The text query string |
| top_k | The number of results to return |

Sample output (querying "Dog"): [three result images, not reproduced here]

Results vary based on the images in your collection.

Search by image

To search using a local image as the query, use QueryContentAdvanceRequest with a file object and filename.

# -*- coding: utf-8 -*-
import os
from urllib.request import urlopen
from PIL import Image
from alibabacloud_gpdb20160503.client import Client as gpdb20160503Client
from alibabacloud_tea_openapi import models as open_api_models
from alibabacloud_gpdb20160503 import models as gpdb_20160503_models
from alibabacloud_tea_util import models as util_models

client = create_client()  # create_client() defined in the "Initialize the client" section

query_file_path = "<image_file_path>"  # absolute path to the query image
with open(query_file_path, "rb") as f:
    request = gpdb_20160503_models.QueryContentAdvanceRequest(
        region_id="<your-instance-region-id>",
        dbinstance_id="<your-instance-id>",
        namespace="<your-namespace-name>",
        namespace_password="<your-namespace-password>",
        collection="<your-collection-name>",
        file_url_object=f,
        file_name=os.path.basename(query_file_path),
        top_k=3,
    )
    runtime = util_models.RuntimeOptions()
    try:
        response = client.query_content_advance(request, runtime)
        if response.status_code != 200:
            raise Exception(f"QueryContent failed: {response.body}")

        for match in response.body.matches.match_list:
            url = match.file_url
            caption = match.metadata.get("caption")
            print(f"URL: {url}, Caption: {caption}")
            Image.open(urlopen(url)).show()
    except Exception as error:
        print(error)

Parameter reference:

| Parameter | Description |
| --- | --- |
| region_id | The region ID of the AnalyticDB for PostgreSQL instance |
| dbinstance_id | The ID of the AnalyticDB for PostgreSQL instance |
| namespace | The name of the namespace |
| namespace_password | The password of the namespace |
| collection | The name of the collection |
| file_url_object | The query image file object, opened in binary read mode |
| file_name | The filename of the query image, including the extension |
| top_k | The number of results to return |

Sample output (querying with a bicycle image): [three result images, not reproduced here]

Results vary based on the images in your collection.

Build a web UI with Streamlit

Streamlit is an open-source Python framework that turns data scripts into web applications, and it is widely used for machine learning and data visualization demos. Use it to add a search interface on top of your AnalyticDB for PostgreSQL image search backend.

Install Streamlit:

pip install streamlit

The following example builds a text-to-image search UI. Run the script with streamlit run <script_name>.py.

# -*- coding: utf-8 -*-
import os
import streamlit as st
from alibabacloud_gpdb20160503.client import Client as gpdb20160503Client
from alibabacloud_tea_openapi import models as open_api_models
from alibabacloud_gpdb20160503 import models as gpdb_20160503_models
from alibabacloud_tea_util import models as util_models

def create_client() -> gpdb20160503Client:
    config = open_api_models.Config(
        access_key_id=os.environ["ALIBABA_CLOUD_ACCESS_KEY_ID"],
        access_key_secret=os.environ["ALIBABA_CLOUD_ACCESS_KEY_SECRET"]
    )
    config.endpoint = "gpdb.aliyuncs.com"
    return gpdb20160503Client(config)

def search_by_text(content: str) -> list:
    # client is an instance of gpdb20160503Client initialized with credentials from environment variables
    client = create_client()
    request = gpdb_20160503_models.QueryContentRequest(
        region_id="<your-instance-region-id>",
        dbinstance_id="<your-instance-id>",
        namespace="<your-namespace-name>",
        namespace_password="<your-namespace-password>",
        collection="<your-collection-name>",
        content=content,
        top_k=3,
    )
    runtime = util_models.RuntimeOptions()
    try:
        response = client.query_content_with_options(request, runtime)
        if response.status_code != 200:
            raise Exception(f"QueryContent failed: {response.body}")
        return [(m.file_url, m.metadata.get("caption")) for m in response.body.matches.match_list]
    except Exception as error:
        print(error)
        return []

# Streamlit UI
st.header('Demo for searching for images by text')
text_query = st.chat_input("Enter a keyword")
st.text(f"Keyword: {text_query}" if text_query else "Keyword: ")

if text_query:
    for url, caption in search_by_text(text_query):
        st.image(url)
        st.text(f"Description: {caption}")

Parameter reference:

| Parameter | Description |
| --- | --- |
| region_id | The region ID of the AnalyticDB for PostgreSQL instance |
| dbinstance_id | The ID of the AnalyticDB for PostgreSQL instance |
| namespace | The name of the namespace |
| namespace_password | The password of the namespace |
| collection | The name of the collection |

Sample output: [screenshot not reproduced here]
