A window function that clusters geometry objects using the density-based spatial clustering of applications with noise (DBSCAN) algorithm and returns a cluster ID for each object.
Syntax
Syntax 1: Euclidean distance
integer ST_ClusterDBSCAN(geometry winset geom, float8 eps, integer minpoints)
Syntax 2: Spheroid distance
integer ST_ClusterDBSCANSpheroid(geometry winset geom, float8 eps, integer minpoints)
Parameters
| Parameter | Type | Description |
|---|---|---|
| geom | geometry winset | The geometry objects to cluster. |
| eps | float8 | The search radius. Two objects are considered neighbors if the distance between them is within eps. |
| minpoints | integer | The minimum number of neighboring objects required for a geometry object to be classified as a core object. |
Description
DBSCAN partitions geometry objects into clusters based on density, without requiring you to specify the number of clusters upfront.
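
Because the function is a window function, the OVER clause controls which rows are clustered together. The following is a minimal sketch, not taken from the original reference: the table poi and its columns id, category, and geom are hypothetical, and the eps and minpoints values are placeholders. It assumes that PARTITION BY behaves as it does for other window functions, so that each category is clustered independently.

```sql
-- Hypothetical table: poi(id integer, category text, geom geometry).
-- eps = 100 and minpoints = 5 are placeholder values; eps is in the
-- units of the geometry coordinates for the Euclidean variant.
SELECT id,
       category,
       ST_ClusterDBSCAN(geom, 100, 5) OVER (PARTITION BY category) AS cluster_id
FROM poi;
```

With an empty OVER () clause, all rows returned by the query are clustered as one set.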
How objects are assigned to clusters:
- Core object: An object with at least minpoints neighbors within eps distance. Core objects anchor each cluster.
- Boundary object: An object that is within eps distance of a core object but does not have enough neighbors to be a core object itself.
- Noise: An object that is not within eps distance of any core object. These objects are assigned a cluster ID of NULL.
If a boundary object falls within eps distance of core objects in multiple clusters, it is assigned to one cluster at random. This can result in a cluster containing fewer objects than minpoints.
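
The noise rule can be illustrated with a small query. This is a sketch under one assumption: that the neighbor count includes the object itself, which is what the minpoints=1 behavior described on this page suggests. Under that assumption, POINT(-3 -3) has no other point within eps=2, so with minpoints=2 it should be neither a core nor a boundary object and should receive NULL.

```sql
-- Four test points; minpoints = 2 means an isolated point cannot be a core object
-- (assuming the neighbor count includes the point itself).
SELECT ST_ClusterDBSCAN(geom, 2, 2) OVER () AS cluster_id,
       ST_AsText(geom) AS geom_text
FROM (
    SELECT unnest(ARRAY[
        'POINT(0 0)'::geometry,
        'POINT(1 1)'::geometry,
        'POINT(-1 -1)'::geometry,
        'POINT(-3 -3)'::geometry
    ]) AS geom
) AS test;
```

Rows whose cluster_id is NULL can then be filtered in an outer query with cluster_id IS NULL, because a window function cannot be referenced directly in a WHERE clause.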
Distance calculation:
- ST_ClusterDBSCAN (Syntax 1) uses Euclidean distance. The eps value is compared against the Euclidean distance between coordinates, so it is expressed in coordinate units.
- ST_ClusterDBSCANSpheroid (Syntax 2) uses the length measured on an ellipsoid. When the geometry has a spatial reference identifier (SRID) defined in longitude and latitude, clustering is performed in meters.
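
The following sketch shows how the spheroid variant might be called on longitude/latitude data. The coordinates are made up for illustration, eps=1000 is read as 1000 meters based on the description above, and ST_SetSRID and ST_MakePoint are assumed to be available as standard spatial constructors. The first two points are a few hundred meters apart and the third is tens of kilometers away, so the first two would be expected to share a cluster while the third gets its own.

```sql
-- Illustrative points in SRID 4326 (longitude, latitude).
-- eps = 1000 is assumed to be interpreted in meters.
SELECT ST_ClusterDBSCANSpheroid(geom, 1000, 1) OVER () AS cluster_id,
       ST_AsText(geom) AS geom_text
FROM (
    SELECT unnest(ARRAY[
        ST_SetSRID(ST_MakePoint(116.390, 39.900), 4326),
        ST_SetSRID(ST_MakePoint(116.395, 39.900), 4326),
        ST_SetSRID(ST_MakePoint(117.000, 39.900), 4326)
    ]) AS geom
) AS test;
```

Passing the same geometries to ST_ClusterDBSCAN would instead treat eps as a value in degrees, which is rarely what is intended for longitude/latitude data.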
Example
The following example clusters four points using eps=2 and minpoints=1. With minpoints=1, a single point qualifies as a core object, so all four points are assigned to clusters — none are classified as noise.
SELECT ST_ClusterDBSCAN(geom, 2, 1) OVER (), ST_AsText(geom)
FROM (
    SELECT unnest(ARRAY[
        'POINT(0 0)'::geometry,
        'POINT(1 1)'::geometry,
        'POINT(-1 -1)'::geometry,
        'POINT(-3 -3)'::geometry
    ]) AS geom
) AS test;
Output:
st_clusterdbscan | st_astext
------------------+--------------
0 | POINT(0 0)
0 | POINT(1 1)
0 | POINT(-1 -1)
1 | POINT(-3 -3)
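(4 rows)
POINT(0 0), POINT(1 1), and POINT(-1 -1) form cluster 0. POINT(1 1) and POINT(-1 -1) are each about 1.41 units from POINT(0 0), which is within eps=2, so the three points are linked through POINT(0 0) even though POINT(1 1) and POINT(-1 -1) are about 2.83 units apart. POINT(-3 -3) is about 2.83 units from its nearest neighbor, POINT(-1 -1), so it is not joined to cluster 0 and forms its own cluster 1.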
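
The distances behind this result can be checked directly. The query below is a sketch that assumes ST_Distance is available and returns the Euclidean distance for these geometries; the id column is added only to list each pair once. It shows every pair of example points with its distance and whether that distance is at most eps=2.

```sql
-- List each unordered pair of example points once, with its distance.
WITH pts(id, geom) AS (
    VALUES (1, 'POINT(0 0)'::geometry),
           (2, 'POINT(1 1)'::geometry),
           (3, 'POINT(-1 -1)'::geometry),
           (4, 'POINT(-3 -3)'::geometry)
)
SELECT ST_AsText(a.geom) AS geom_a,
       ST_AsText(b.geom) AS geom_b,
       ST_Distance(a.geom, b.geom) AS dist,
       ST_Distance(a.geom, b.geom) <= 2 AS within_eps
FROM pts AS a
JOIN pts AS b ON a.id < b.id
ORDER BY dist;
```

A pairwise listing like this is also a practical way to choose eps for a small sample before running the clustering on a full table.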
Related functions
ST_ClusterKMeans — Clusters geometry objects into a fixed number of clusters using the K-means algorithm. Use ST_ClusterKMeans when the number of clusters is known in advance.