This topic describes the smlar plug-in, which calculates the similarity between two arrays of the same data type.
Prerequisites
The instance runs one of the following PostgreSQL versions:
- PostgreSQL 15
- PostgreSQL 14
- PostgreSQL 13
- PostgreSQL 12 (kernel version 20200421 and later)
- PostgreSQL 11 (kernel version 20200402 and later)
Background information
The smlar plug-in provides multiple functions to calculate the similarity between two arrays of the same data type. It also provides parameters to control the similarity calculation methods. All built-in data types are supported.
Function description
- float4 smlar(anyarray, anyarray)
Calculates the similarity between two arrays of the same data type.
- float4 smlar(anyarray, anyarray, bool useIntersect)
Calculates the similarity between two arrays of composite data types. The composite data type is defined as follows:
CREATE TYPE type_name AS (element_name anytype, weight_name FLOAT4);
When the useIntersect parameter is set to true, only the elements that appear in both arrays (the intersection) are used in the calculation. When the useIntersect parameter is set to false, all elements are used.
- float4 smlar( anyarray a, anyarray b, text formula )
Calculates the similarity between two arrays of the same data type. The similarity is computed by the expression specified in the formula parameter.
The predefined variables for formula are described as follows:
- N.i: The number of common elements in the two arrays.
- N.a: The number of distinct elements in array a.
- N.b: The number of distinct elements in array b.
- float4 set_smlar_limit(float4)
Sets the smlar.threshold parameter.
- float4 show_smlar_limit()
Displays the smlar.threshold parameter value.
- anyarray % anyarray
Returns true if the similarity between the two arrays is greater than the smlar.threshold parameter value. Otherwise, returns false.
- text[] tsvector2textarray(tsvector)
Converts a tsvector value to a text array.
- anyarray array_unique(anyarray)
Sorts the elements in an array and removes duplicate elements.
- float4 inarray(anyarray, anyelement)
Returns 1 if the anyelement parameter value exists in the anyarray parameter value. Otherwise, returns 0.
- float4 inarray(anyarray, anyelement, float4, float4)
Returns the third parameter value if anyelement exists in anyarray. Otherwise, returns the fourth parameter value.
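The threshold functions, the % operator, and the helper functions in the list above can be tried together. The following statements are a minimal sketch, assuming that the smlar extension is already installed in the current database. The threshold value 0.6 and the sample arrays are chosen only for illustration.

```sql
-- Assumption: the smlar extension is installed in the current database.

-- Set the threshold used by the % operator, then read it back.
SELECT set_smlar_limit(0.6);
SELECT show_smlar_limit();

-- Check whether the similarity of two int arrays exceeds the threshold.
SELECT '{1,4,6}'::int[] % '{5,4,6}'::int[] AS is_similar;

-- Sort an array and remove its duplicate elements.
SELECT array_unique('{3,1,3,2,1}'::int[]);

-- Test membership: the two-argument form returns 1 or 0.
SELECT inarray('{1,4,6}'::int[], 4);

-- The four-argument form returns caller-supplied values instead of 1 and 0.
SELECT inarray('{1,4,6}'::int[], 7, 1.0::float4, -1.0::float4);

-- Convert a tsvector value to a text array.
SELECT tsvector2textarray(to_tsvector('simple', 'fat cat fat rat'));
```

The array pair in the % comparison is the same pair that scores 0.666667 in the usage example in this topic, so it is expected to pass a threshold of 0.6 as long as the similarity calculation method has not been changed.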
For more information about parameters and supported data types, see the smlar documentation.
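The composite (weighted) form of smlar described in the function list can be sketched as follows. The type name weighted_item, its field names, and the sample weights are hypothetical and chosen only for illustration. The sketch assumes that the extension is installed and that the second field of the composite type is FLOAT4, as the type definition in this topic requires.

```sql
-- Hypothetical composite type: an element plus a FLOAT4 weight.
CREATE TYPE weighted_item AS (item int4, weight float4);

-- useIntersect = true: only elements present in both arrays are used.
SELECT smlar(
    ARRAY[ROW(1, 0.5), ROW(4, 1.0)]::weighted_item[],
    ARRAY[ROW(4, 1.0), ROW(5, 0.2)]::weighted_item[],
    true
);

-- useIntersect = false: all elements are used.
SELECT smlar(
    ARRAY[ROW(1, 0.5), ROW(4, 1.0)]::weighted_item[],
    ARRAY[ROW(4, 1.0), ROW(5, 0.2)]::weighted_item[],
    false
);

-- Clean up the illustrative type.
DROP TYPE weighted_item;
```

Comparing the two results for the same input is a quick way to see how useIntersect affects the score.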
Use smlar
- After you connect to an instance, execute the following statement to create the smlar plug-in:
testdb=> create extension smlar;
- Execute the following statements to use basic functions of smlar:
testdb=> SELECT smlar('{1,4,6}'::int[], '{5,4,6}' );
  smlar
----------
 0.666667
(1 row)

testdb=> SELECT smlar('{1,4,6}'::int[], '{5,4,6}', 'N.i / sqrt(N.a * N.b)' );
  smlar
----------
 0.666667
(1 row)
- Execute the following statement to remove smlar:
testdb=> drop extension smlar;
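While the extension is installed, the formula parameter can also express similarity measures other than the one shown in the steps above. For the arrays {1,4,6} and {5,4,6}, N.i is 2 and N.a and N.b are both 3, so N.i / sqrt(N.a * N.b) is 2 / 3, which matches the 0.666667 result. The following statements are a sketch of two other common measures written with the same predefined variables. The choice of Jaccard and Dice formulas is an illustration, not something prescribed by smlar.

```sql
-- Assumption: the smlar extension is installed in the current database.

-- Jaccard-style similarity: common elements divided by the size of the union.
SELECT smlar('{1,4,6}'::int[], '{5,4,6}', 'N.i / (N.a + N.b - N.i)');

-- Dice-style similarity: twice the common elements divided by the total count.
SELECT smlar('{1,4,6}'::int[], '{5,4,6}', '2 * N.i / (N.a + N.b)');
```

Worked by hand for these inputs, the Jaccard expression is 2 / (3 + 3 - 2) = 0.5 and the Dice expression is 4 / 6, or about 0.667.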