The imgsmlr extension is a third-party extension for PolarDB for PostgreSQL that supports similar-image search. It uses the Haar wavelet transform to extract feature values from images, such as JPEG, PNG, and GIF files, and then uses an index to retrieve similar images.
Scope
Supported PolarDB for PostgreSQL versions:
PostgreSQL 16 (minor engine version 2.0.16.9.8.0 or later)
PostgreSQL 14 (minor engine version 14.10.18.0 or later)
You can view the minor engine version in the console or run the SHOW polardb_version; statement. If your cluster does not meet the minor engine version requirement, upgrade the minor engine version.
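The check described above can be sketched in SQL as follows. The first statement is the one named in the text. The second is an addition that relies only on the standard PostgreSQL catalog view pg_available_extensions; it is assumed, not confirmed by this document, that PolarDB exposes the view unchanged.

```sql
-- Show the minor engine version of the current cluster.
SHOW polardb_version;

-- Assumption: the standard pg_available_extensions view is available.
-- A returned row means imgsmlr can be installed on this cluster;
-- installed_version is NULL until CREATE EXTENSION has been run.
SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name = 'imgsmlr';
```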
How to use
Data types
The imgsmlr extension provides two data types: pattern and signature.
Data type | Storage size | Description |
pattern | 16388 bytes | The result of a Haar wavelet transform on an image. |
signature | 64 bytes | A compact representation of a pattern. A GiST index can be used for fast searches. |
Functions
The imgsmlr extension provides several functions. You can use these functions to convert various image types to the pattern type. The extension also provides a function to create a signature from a pattern to make retrieval easier.
Function | Return type | Description |
jpeg2pattern(bytea) | pattern | Converts a JPEG image to the pattern type. |
png2pattern(bytea) | pattern | Converts a PNG image to the pattern type. |
gif2pattern(bytea) | pattern | Converts a GIF image to the pattern type. |
pattern2signature(pattern) | signature | Creates a signature from a pattern. |
shuffle_pattern(pattern) | pattern | Shuffles a pattern to reduce sensitivity to image offset. |
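As a minimal sketch of how these functions chain together, the following query converts PNG data to a pattern and then derives both a signature and a shuffled pattern from it. The table png_image and its columns id and data are hypothetical names used only for illustration; data is assumed to be a bytea column that holds raw PNG files.

```sql
-- Hypothetical table: png_image(id integer, data bytea), one PNG file per row.
SELECT
    id,
    pattern2signature(p) AS signature,  -- compact form, suitable for a GiST index
    shuffle_pattern(p)   AS pattern     -- less sensitive to image offset
FROM (
    SELECT
        id,
        png2pattern(data) AS p          -- Haar wavelet transform of the PNG
    FROM
        png_image
) x;
```

The signature here is computed from the unshuffled pattern, which mirrors the order used in the Examples section of this topic.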
Operators
The pattern and signature types both support the <-> operator for Euclidean distance. The signature type also supports a GiST index on the <-> operator.
Operator | Left operand type | Right operand type | Return type | Description |
<-> | pattern | pattern | float8 | Calculates the Euclidean distance between two patterns. |
<-> | signature | signature | float8 | Calculates the Euclidean distance between two signatures. |
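To illustrate both forms of the operator, the following sketch compares two rows of a table named pat that has pattern and signature columns, like the one created in the Examples section. The id values 1 and 2 are placeholders.

```sql
-- Assumes a table pat(id, pattern, signature); ids 1 and 2 are placeholders.
SELECT
    a.pattern   <-> b.pattern   AS pattern_distance,    -- distance on full patterns
    a.signature <-> b.signature AS signature_distance   -- distance on compact signatures
FROM
    pat a,
    pat b
WHERE
    a.id = 1
    AND b.id = 2;
```

In both cases a smaller distance means that the two images are more similar.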
Examples
Install the extension
CREATE EXTENSION imgsmlr;
Create a table for image feature values
Assume an image table exists that contains an id column and a data column. The data column contains binary JPEG data. You can run the following SQL statement to create a table that contains the pattern and signature for the images.
CREATE TABLE pat AS (
    SELECT
        id,
        shuffle_pattern(pattern) AS pattern,
        pattern2signature(pattern) AS signature
    FROM (
        SELECT
            id,
            jpeg2pattern(data) AS pattern
        FROM
            image
    ) x
);
Create a GiST index
ALTER TABLE pat ADD PRIMARY KEY (id);
CREATE INDEX pat_signature_idx ON pat USING gist (signature);
Search for similar images
To find the 10 images that are most similar to an image with a specified id, use a two-stage query. In the following statement, :id is a placeholder for the ID of the query image. The subquery uses the GiST index on signature to retrieve the 100 nearest candidate images. The outer query then re-ranks these candidates by pattern distance and returns the 10 closest matches.
SELECT
    id,
    smlr
FROM
(
    SELECT
        id,
        pattern <-> (SELECT pattern FROM pat WHERE id = :id) AS smlr
    FROM pat
    WHERE id <> :id
    ORDER BY
        signature <-> (SELECT signature FROM pat WHERE id = :id)
    LIMIT 100
) x
ORDER BY x.smlr ASC
LIMIT 10;
Uninstall the extension
DROP EXTENSION imgsmlr;
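If objects such as the pat table still use the pattern or signature type, standard PostgreSQL dependency rules make a plain DROP EXTENSION fail. The following sketch shows two ways to handle this, assuming that PolarDB follows standard PostgreSQL behavior here and that the pat table from the earlier example is the only dependent object.

```sql
-- Option 1: remove the dependent table first, then drop the extension.
DROP TABLE IF EXISTS pat;
DROP EXTENSION imgsmlr;

-- Option 2 (alternative to option 1): let PostgreSQL remove dependent objects.
-- CASCADE drops the columns and indexes that use the extension's types,
-- so confirm that this data is no longer needed before you run it.
-- DROP EXTENSION imgsmlr CASCADE;
```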