imgsmlr is a PostgreSQL extension that adds similar image search to PolarDB for PostgreSQL. It applies the Haar wavelet transform to extract feature values from images, then uses a GiST index to retrieve visually similar results.
Prerequisites
Before you begin, ensure that you have:
A PolarDB for PostgreSQL cluster running one of the supported versions:
PostgreSQL version Minimum minor engine version PostgreSQL 16 2.0.16.9.8.0 PostgreSQL 14 14.10.18.0 To check your minor engine version, run
SHOW polardb_version;or view the version in the console. If your cluster does not meet the minimum version requirement, upgrade the minor engine version first.
Reference
Data types
| Data type | Storage size | Description |
|---|---|---|
pattern | 16388 bytes | Result of a Haar wavelet transform on an image |
signature | 64 bytes | Compact representation of a pattern; supports a GiST index for fast searches |
Functions
All *2pattern functions accept a bytea argument containing the raw binary image data.
| Function | Return type | Description |
|---|---|---|
jpeg2pattern(bytea) | pattern | Converts a JPEG image to the pattern type |
png2pattern(bytea) | pattern | Converts a PNG image to the pattern type |
gif2pattern(bytea) | pattern | Converts a GIF image to the pattern type |
pattern2signature(pattern) | signature | Creates a signature from a pattern |
shuffle_pattern(pattern) | pattern | Shuffles a pattern to reduce sensitivity to image offset |
Operators
Both pattern and signature support the <-> operator for Euclidean distance. The signature type also supports a GiST index on <-> for accelerated searches.
| Operator | Left type | Right type | Return type | Description |
|---|---|---|---|---|
<-> | pattern | pattern | float8 | Euclidean distance between two patterns |
<-> | signature | signature | float8 | Euclidean distance between two signatures |
Get started
Install the extension
CREATE EXTENSION imgsmlr;Create a feature table
The following example assumes an image table with an id column and a data column containing binary JPEG data. The query extracts a shuffled pattern and a signature for each image and stores them in a new pat table.
CREATE TABLE pat AS (
SELECT
id,
shuffle_pattern(pattern) AS pattern,
pattern2signature(pattern) AS signature
FROM (
SELECT
id,
jpeg2pattern(data) AS pattern
FROM
image
) x
);shuffle_pattern reduces sensitivity to small positional offsets in images.
Create a GiST index
Add a primary key and a GiST index on signature to enable fast candidate retrieval.
ALTER TABLE pat ADD PRIMARY KEY (id);
CREATE INDEX pat_signature_idx ON pat USING gist (signature);Search for similar images
The following query finds the top 10 images most similar to the image with a given :id.
SELECT
id,
smlr
FROM
(
SELECT
id,
pattern <-> (SELECT pattern FROM pat WHERE id = :id) AS smlr
FROM pat
WHERE id <> :id
ORDER BY
signature <-> (SELECT signature FROM pat WHERE id = :id)
LIMIT 100
) x
ORDER BY x.smlr ASC
LIMIT 10;The inner query uses the GiST index on signature to retrieve the top 100 candidates efficiently. The outer query re-ranks those 100 candidates by exact pattern distance to return the top 10 matches. Adjust both LIMIT values to tune the trade-off between search speed and accuracy for your image collection.
Uninstall the extension
DROP EXTENSION imgsmlr;