ApsaraDB RDS: ST_JaccardSimilarity

Last Updated: Mar 28, 2026

Calculates the Jaccard index between two trajectories or sub-trajectories, returning a lower bound and an upper bound of the similarity score.

Syntax

record ST_JaccardSimilarity(trajectory tr1, trajectory tr2, double tol_dist,
                            text unit default '{}', interval tol_time default NULL,
                            timestamp ts default '-infinity', timestamp te default 'infinity');

The function returns a record with the following fields:

| Field | Type | Description |
| --- | --- | --- |
| nleaf1 | int | The number of trajectory points in tr1 that intersect with tr2. |
| nleaf2 | int | The number of trajectory points in tr2 that intersect with tr1. This value may differ from nleaf1. For example, if tr1 passes the same point on tr2 twice, nleaf1 is 1 and nleaf2 is 2. |
| inter1 | int | The number of trajectory points in tr1 whose distance to tr2 meets both the distance and time tolerances. |
| inter2 | int | The number of trajectory points in tr2 whose distance to tr1 meets both the distance and time tolerances. |
| jaccard_lower | double | The lower bound of the Jaccard index. Calculated as min(inter1, inter2) / (nleaf1 + nleaf2 - min(inter1, inter2)). |
| jaccard_upper | double | The upper bound of the Jaccard index. Calculated as max(inter1, inter2) / (nleaf1 + nleaf2 - max(inter1, inter2)). |

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| tr1 | trajectory | | The first trajectory. |
| tr2 | trajectory | | The second trajectory. |
| tol_dist | double | | The maximum allowed distance between a matching pair of trajectory points, in meters. |
| unit | text | '{}' | A JSON string that controls how distances are calculated. See unit parameter fields. |
| tol_time | interval | NULL | The maximum allowed time difference between a matching pair of trajectory points. If NULL or negative, the function matches points based on distance only, ignoring timestamps. |
| ts | timestamp | -infinity | The start of the time range. If specified, the function compares only the sub-trajectories between ts and te. |
| te | timestamp | infinity | The end of the time range. If specified, the function compares only the sub-trajectories between ts and te. |

unit parameter fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| Projection | string | None | The coordinate system to re-project the trajectories into before calculating distances. Valid values: auto (dynamically selects Lambert Azimuthal or UTM based on longitude and latitude; distance unit is meters), srid (re-projects using the specified spatial reference identifier, SRID). If omitted, calculations use the original coordinate system. |
| Unit | string | null | The unit for distance measurement. Valid values: null (Euclidean distance based on raw coordinates), M (distance based on the spatial reference of the trajectories, typically meters). |
| useSpheroid | bool | true | Specifies whether to use an ellipsoid model when Unit is M. true: uses an ellipsoid for accurate distances. false: uses a sphere for approximate distances. |

How it works

The classical Jaccard index for two sets is the size of their intersection divided by the size of their union. For trajectories, this extends to spatial (and optionally temporal) matching: a trajectory point in tr1 is considered to intersect with tr2 if there is a point in tr2 within tol_dist meters (and, if tol_time is set, within the specified time window).
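
The matching rule described above can be sketched outside the database. The following is a minimal illustration, not the function's actual implementation: it assumes great-circle (haversine) distance on a spherical Earth and a brute-force pairwise scan, and the names `haversine_m`, `count_matches`, and the `(lon, lat, epoch_seconds)` tuple layout are hypothetical. The coordinates and timestamps are taken from the Example section of this page.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; an assumption of this sketch


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def count_matches(src, dst, tol_dist, tol_time=None):
    """Count points of src that have at least one point of dst within
    tol_dist meters and, when tol_time is set and non-negative, within
    tol_time seconds. Points are (lon, lat, epoch_seconds) tuples."""
    use_time = tol_time is not None and tol_time >= 0
    matched = 0
    for lon1, lat1, t1 in src:
        for lon2, lat2, t2 in dst:
            if use_time and abs(t1 - t2) > tol_time:
                continue  # too far apart in time
            if haversine_m(lon1, lat1, lon2, lat2) <= tol_dist:
                matched += 1
                break  # one match is enough for this src point
    return matched


# Data from the Example section: two distinct locations, P1 and P2.
P1 = (114.49211, 37.97921)
P2 = (114.49145, 37.97781)
a = [(*P1, t) for t in (1590287775, 1590287778, 1590302169, 1590302171)]
b_times = [1590287765, 1590287771, 1590287778, 1590287780,
           1590295992, 1590295997, 1590296013, 1590296018, 1590296025,
           1590296032, 1590296055, 1590296073, 1590296081, 1590296081,
           1590302169, 1590302174, 1590302176, 1590302176, 1590302172, 1590302176]
b_places = [P1] * 4 + [P2] * 10 + [P1] * 6
b = [(lon, lat, t) for (lon, lat), t in zip(b_places, b_times)]

inter1 = count_matches(a, b, tol_dist=100, tol_time=20)
inter2 = count_matches(b, a, tol_dist=100, tol_time=20)
print(inter1, inter2)
```

Under these assumptions the sketch yields 4 and 10, which agree with the inter1 and inter2 values in the Example output. Passing `tol_time=None` (or a negative value) makes the sketch match on distance only, mirroring the documented behavior of tol_time.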

Because the matching relationship is not symmetric (tr1 passing near a point on tr2 once is counted differently from tr2 passing near the same point twice), the function returns two bounds rather than a single value:

  • `jaccard_lower` uses the smaller intersection count (min(inter1, inter2)), giving a conservative estimate of overlap. Use this when a confirmed minimum level of similarity is required.

  • `jaccard_upper` uses the larger intersection count (max(inter1, inter2)), giving an optimistic estimate of overlap. Use this to capture the broadest possible match between the two trajectories.

Both values range from 0 (no overlap) to 1 (complete overlap).
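
As a worked illustration of the two formulas, the sketch below computes both bounds from the four counts. The helper name `jaccard_bounds` is hypothetical, and returning 0.0 when a denominator is zero is an assumption of this sketch, not documented behavior. The input counts are those reported in the Example section of this page.

```python
def jaccard_bounds(nleaf1, nleaf2, inter1, inter2):
    """Return (jaccard_lower, jaccard_upper) from the four counts,
    following the formulas in the return-field descriptions."""
    lo = min(inter1, inter2)
    hi = max(inter1, inter2)

    def ratio(inter):
        denom = nleaf1 + nleaf2 - inter
        # Assumption of this sketch: treat an empty denominator as no similarity.
        return inter / denom if denom > 0 else 0.0

    return ratio(lo), ratio(hi)


# Counts from the Example output: nleaf1=4, nleaf2=20, inter1=4, inter2=10.
lower, upper = jaccard_bounds(4, 20, 4, 10)
print(lower, upper)  # 0.2 0.7142857142857143
```

The lower bound is 4 / (4 + 20 - 4) = 0.2 and the upper bound is 10 / (4 + 20 - 10), approximately 0.714, matching the values the function returns in the example.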

Example

The following example compares two trajectories within a three-day time window, using a 100-meter distance tolerance and a 20-second time tolerance.

WITH traj AS (
  SELECT
    ST_makeTrajectory('STPOINT'::leaftype,
      'SRID=4326;LINESTRING(114.49211 37.97921,114.49211 37.97921,114.49211 37.97921,114.49211 37.97921)'::geometry,
      ARRAY[
        to_timestamp(1590287775) AT TIME ZONE 'UTC',
        to_timestamp(1590287778) AT TIME ZONE 'UTC',
        to_timestamp(1590302169) AT TIME ZONE 'UTC',
        to_timestamp(1590302171) AT TIME ZONE 'UTC'
      ], '{}') a,
    ST_makeTrajectory('STPOINT'::leaftype,
      'SRID=4326;LINESTRING(114.49211 37.97921,114.49211 37.97921,114.49211 37.97921,114.49211 37.97921,114.49145 37.97781,114.49145 37.97781,114.49145 37.97781,114.49145 37.97781,114.49145 37.97781,114.49145 37.97781,114.49145 37.97781,114.49145 37.97781,114.49145 37.97781,114.49145 37.97781,114.49211 37.97921,114.49211 37.97921,114.49211 37.97921,114.49211 37.97921,114.49211 37.97921,114.49211 37.97921)'::geometry,
      ARRAY[
        to_timestamp(1590287765) AT TIME ZONE 'UTC',
        to_timestamp(1590287771) AT TIME ZONE 'UTC',
        to_timestamp(1590287778) AT TIME ZONE 'UTC',
        to_timestamp(1590287780) AT TIME ZONE 'UTC',
        to_timestamp(1590295992) AT TIME ZONE 'UTC',
        to_timestamp(1590295997) AT TIME ZONE 'UTC',
        to_timestamp(1590296013) AT TIME ZONE 'UTC',
        to_timestamp(1590296018) AT TIME ZONE 'UTC',
        to_timestamp(1590296025) AT TIME ZONE 'UTC',
        to_timestamp(1590296032) AT TIME ZONE 'UTC',
        to_timestamp(1590296055) AT TIME ZONE 'UTC',
        to_timestamp(1590296073) AT TIME ZONE 'UTC',
        to_timestamp(1590296081) AT TIME ZONE 'UTC',
        to_timestamp(1590296081) AT TIME ZONE 'UTC',
        to_timestamp(1590302169) AT TIME ZONE 'UTC',
        to_timestamp(1590302174) AT TIME ZONE 'UTC',
        to_timestamp(1590302176) AT TIME ZONE 'UTC',
        to_timestamp(1590302176) AT TIME ZONE 'UTC',
        to_timestamp(1590302172) AT TIME ZONE 'UTC',
        to_timestamp(1590302176) AT TIME ZONE 'UTC'
      ], '{}') b
)
SELECT ST_JaccardSimilarity(a, b, 100, '{"unit":"M"}', '20 second',
  '2020-05-23'::timestamptz AT TIME ZONE 'UTC',
  '2020-05-26'::timestamptz AT TIME ZONE 'UTC')
FROM traj;

Output:

       st_jaccardsimilarity
-----------------------------------
 (4,20,4,10,0.2,0.714285714285714)
(1 row)
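
When the record arrives on the client as text, as in the output above, it can be split into named fields. The following is a minimal sketch that assumes the six fields always appear in the documented order and contain no quoted or NULL values; `JaccardResult` and `parse_jaccard_record` are hypothetical names, not part of any driver or of ApsaraDB RDS.

```python
from typing import NamedTuple


class JaccardResult(NamedTuple):
    nleaf1: int
    nleaf2: int
    inter1: int
    inter2: int
    jaccard_lower: float
    jaccard_upper: float


def parse_jaccard_record(text: str) -> JaccardResult:
    """Parse the text form of the returned record, e.g. '(4,20,4,10,0.2,0.71)'."""
    parts = text.strip().strip("()").split(",")
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}: {text!r}")
    counts = [int(p) for p in parts[:4]]   # nleaf1, nleaf2, inter1, inter2
    bounds = [float(p) for p in parts[4:]]  # jaccard_lower, jaccard_upper
    return JaccardResult(*counts, *bounds)


result = parse_jaccard_record(" (4,20,4,10,0.2,0.714285714285714)")
print(result.nleaf2, result.inter2, result.jaccard_lower)
```

A named tuple keeps the field order tied to the documented return fields, so downstream code can refer to `result.jaccard_upper` instead of a positional index.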

The result maps to the return fields as follows:

| Field | Value | Meaning |
| --- | --- | --- |
| nleaf1 | 4 | 4 points in tr1 intersect with tr2 |
| nleaf2 | 20 | 20 points in tr2 intersect with tr1 |
| inter1 | 4 | 4 points in tr1 meet the distance and time tolerances |
| inter2 | 10 | 10 points in tr2 meet the distance and time tolerances |
| jaccard_lower | 0.2 | Conservative similarity: 4 / (4 + 20 - 4) |
| jaccard_upper | 0.714... | Optimistic similarity: 10 / (4 + 20 - 10) |