This topic provides an overview of image-based search and describes how to build an image-based search system.
Overview
Image-based search is a content-based image retrieval technology that enables users to find similar or related images by uploading an image. It uses computer vision and machine learning to analyze visual features such as color, texture, and shape, and convert images into computable feature vectors. These vectors are then compared and matched with other images in a database.
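To make the comparison step concrete, the following is a minimal sketch of ranking stored feature vectors against a query vector. It assumes cosine distance as the metric, which is a common choice but is not confirmed here as the metric the search system uses. The function names and the toy 3-dimensional vectors are illustrative only.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a zero vector has no direction to compare")
    return 1.0 - dot / (norm_a * norm_b)

def rank_by_distance(query, candidates):
    """Sort (image_id, vector) pairs from most to least similar to the query."""
    scored = [(image_id, cosine_distance(query, vec)) for image_id, vec in candidates]
    return sorted(scored, key=lambda item: item[1])

# Toy 3-dimensional vectors (hypothetical data); real feature vectors are much longer.
database = [(1, [1.0, 0.0, 0.0]), (2, [0.0, 1.0, 0.0]), (3, [0.9, 0.1, 0.0])]
ranking = rank_by_distance([1.0, 0.0, 0.0], database)
print([image_id for image_id, _ in ranking])
```

A smaller distance means the two images are more similar, so the first entries of the ranking are the best matches.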
Image-based search applies to a wide range of scenarios. Examples:
E-commerce: Users can take or upload a product photo to search for similar products. This improves shopping recommendations and reduces manual search efforts.
Media management: Users can upload an image to find related images, which helps organize and manage large amounts of media resources.
Social media: Users can upload a photo to search for related images, such as posts or photos with similar themes or locations.
Image-based search systems can also be applied in areas like copyright protection and cybersecurity. For example, they can be used to detect and identify pirated images or harmful content online.
Build an image-based search system
Create an image data table
CREATE TABLE image(
id bigint(20) comment 'The ID for each image record',
image_address varchar(255) comment 'The location or representation of the image',
primary key(id)
);
The image_address column stores public URLs or Base64-encoded strings of images.
If the image_address column stores Base64-encoded strings, set its data type to LONGTEXT to ensure that complete encoded text is stored. Make sure that the Base64-encoded string does not exceed 2 MB.
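As a rough illustration of preparing a Base64 value on the client side, the sketch below encodes raw image bytes and rejects results over the size limit. It assumes that "2 MB" means 2 × 1024 × 1024 characters of encoded text. The helper name `encode_image_bytes` is hypothetical and not part of any PolarDB interface.

```python
import base64

# Assumption: the 2 MB limit is interpreted as 2 * 1024 * 1024 encoded characters.
MAX_BASE64_LENGTH = 2 * 1024 * 1024

def encode_image_bytes(raw: bytes) -> str:
    """Base64-encode image bytes and enforce the assumed 2 MB limit."""
    encoded = base64.b64encode(raw).decode("ascii")
    if len(encoded) > MAX_BASE64_LENGTH:
        raise ValueError(
            f"encoded image is {len(encoded)} characters, over the {MAX_BASE64_LENGTH} limit"
        )
    return encoded

# The 8-byte PNG signature stands in for real image data here.
sample = encode_image_bytes(b"\x89PNG\r\n\x1a\n")
print(sample)
```

Base64 output is about one third larger than the input, so under this assumption an image file needs to be roughly 1.5 MB or smaller to stay within the limit.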
Insert image data
INSERT INTO image(id, image_address) VALUES (1, 'https://xxx/image.bmp');
Create a vector table
/*polar4ai*/CREATE TABLE image_vector(
id bigint,
image_address varchar,
image_address_vector vector_512,
primary key(id)
);
The image_address_vector column stores 512-dimensional vectors.
Make sure that the id and image_address fields in the vector table have the same data types as those in the image data storage table to ensure complete data storage.
Different tenants can create different databases to store different types of data. If you need to store new images into the vector table, perform the following steps.
Vectorize images
PolarDB for AI supports online and offline methods to vectorize images and store them in the vector table. The online method can vectorize only one image at a time, while the offline method can batch vectorize images.
The image vectorization process uses the _polar4ai_image2vec model, which currently supports only 512-dimensional vector outputs.
Offline vectorization
To vectorize all images in the image data table (image), execute the following SQL statement:
/*polar4ai*/SELECT * FROM predict(model _polar4ai_image2vec, SELECT id,image_address FROM image)
with(
  primary_key='id',
  x_cols='image_address',
  mode='async',
  vec_col='image_address_vector',
  input_model='url'
) INTO image_vector;
Parameters
Parameter      Description
primary_key    The primary key column in the vector table (image_vector).
x_cols         The image_address column in the image data table (image).
mode           The image write mode. Valid value: async.
vec_col        The image_address_vector column in the vector table (image_vector).
input_mode     The image storage type. Valid values: url (default) and base64. For example, if the images are stored as public URLs in the image_address column, set the value to url. If the images are stored as Base64-encoded strings, set the value to base64.
After the offline task SQL is executed, a task_id is returned, such as 17c45d84-3633-11f0-add9-ab1f3c25b505. You can execute the following SQL statement to check whether the task is complete.
/*polar4ai*/SHOW TASK `17c45d84-3633-11f0-add9-ab1f3c25b505`;
If the taskStatus field value in the query result is finish, the offline task is complete. You can then perform image search.
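The status check can be automated with a small polling loop. The sketch below is only an outline under stated assumptions: `run_query` is a hypothetical callable that you supply (for example, a wrapper around your database driver) that executes one statement and returns the result row as a dictionary. The only status value taken from this topic is finish, and the "running" value in the stub is made up for the demonstration.

```python
import time

def wait_for_task(run_query, task_id, poll_seconds=5.0, max_polls=120):
    """Poll SHOW TASK until taskStatus is 'finish'; return False if polls run out."""
    sql = f"/*polar4ai*/SHOW TASK `{task_id}`;"
    for _ in range(max_polls):
        row = run_query(sql)  # assumed to return a dict for the single result row
        if row.get("taskStatus") == "finish":
            return True
        time.sleep(poll_seconds)
    return False

# Stub standing in for a real database call; "running" is a hypothetical status.
_stub_rows = iter([{"taskStatus": "running"}, {"taskStatus": "finish"}])

def stub_query(sql):
    return next(_stub_rows)

done = wait_for_task(stub_query, "17c45d84-3633-11f0-add9-ab1f3c25b505", poll_seconds=0)
print(done)
```

Because other status values are not listed here, the loop does not try to detect failures and relies on `max_polls` to stop.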
Online vectorization
To vectorize an image stored in the database online, execute the following SQL statement:
/*polar4ai*/SELECT * FROM predict(model _polar4ai_image2vec, SELECT image_address FROM image WHERE id=1) with();
To vectorize an image from an internet URL online, execute the following SQL statement:
/*polar4ai*/SELECT * FROM predict(model _polar4ai_image2vec, SELECT 'http://xxxx/image.png') with();
To vectorize an image stored as a Base64-encoded string online, execute the following SQL statement:
/*polar4ai*/SELECT * FROM predict(model _polar4ai_image2vec, SELECT '/9j/4AAQSk.....RAEREA//20==') with();
Perform image search
/*polar4ai*/SELECT id,'distance(image_address_vector, [1,2,3,4,5……,512])' FROM image_vector LIMIT 10;
image_address_vector is the image_address_vector field in the vector table (image_vector).
When performing image search, replace [1,2,3,4,5……,512] with the actual query vector.
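Substituting a 512-element vector by hand is error-prone, so the statement can be assembled in client code. The sketch below is one possible way to do this. It assumes that the query vector is already available as a list of numbers (for example, from online vectorization) and that fixed-point number formatting is accepted in the vector literal. The helper name `build_search_sql` is hypothetical.

```python
VECTOR_DIMENSIONS = 512  # the vector table stores 512-dimensional vectors

def build_search_sql(query_vector, limit=10):
    """Return the image search statement with the query vector substituted in."""
    if len(query_vector) != VECTOR_DIMENSIONS:
        raise ValueError(
            f"expected {VECTOR_DIMENSIONS} values, got {len(query_vector)}"
        )
    # Assumption: fixed-point formatting is used to avoid exponent notation.
    literal = "[" + ",".join(f"{float(v):.6f}" for v in query_vector) + "]"
    return (
        f"/*polar4ai*/SELECT id,'distance(image_address_vector, {literal})' "
        f"FROM image_vector LIMIT {int(limit)};"
    )

# Placeholder vector of zeros; use the real query vector in practice.
sql = build_search_sql([0.0] * VECTOR_DIMENSIONS)
print(sql[:70])
```

Checking the length before building the statement catches a truncated or mismatched vector before it reaches the database.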