Path analysis functions compute user behavior paths from event sequence data and return structured results you can visualize in a Sankey diagram. Hologres provides two categories: path details functions that aggregate raw event data into serialized path structures, and path parsing functions that extract readable fields from those structures.
Usage notes
Hologres V2.2 and later support path analysis functions. If your instance runs V2.1 or earlier, contact Hologres technical support to upgrade.
Before using path analysis functions, create the
flow_analysisextension. The extension is database-scoped — create it once per database.NoteDo not create duplicate extensions in the same database.
CREATE extension flow_analysis;
How it works
All path analysis queries follow a two-step pipeline:
Call a path details function (
path_analysis_detailorpad_funnel) inside aGROUP BYquery to aggregate raw event rows per user into a serialized path structure.Pass the result to a path parsing function (
pad_full_pathorpad_session_path_array) to extract readable fields — full paths, subpath boundaries, durations, and session IDs.
The serialized output of step 1 (for example, {"",\x01a\x01b<,\x01b\x01c<,""}) cannot be read directly. Always pipe it into a parsing function.Path details functions
path_analysis_detail
path_analysis_detail aggregates per-user event rows into a serialized path structure. It captures the complete event series, parent-child relationships between events, and the time elapsed between consecutive events.
Syntax
path_analysis_detail(
event, -- (data column) event name
event_time, -- (data column) event timestamp
start_event, -- (config) start or end event to anchor the path
session_interval_sec, -- (config) seconds of inactivity before splitting a session
path_depth, -- (config) max number of events to follow from the anchor
path_offset, -- (config) events to skip from the anchor before tracking
is_reverse, -- (config) false = forward from start; true = backward from end
split_session_by_event -- (config) whether to split sessions on the anchor event
)Data arguments (columns from your table)
| Argument | Data type | Description |
|---|---|---|
event | text | The event name to analyze |
event_time | timestamp, timestamptz, or bigint | The timestamp of each event |
Configuration parameters (literal values)
| Parameter | Data type | Description |
|---|---|---|
start_event | text | The event to anchor the path. When is_reverse is false, this is the start event; when true, it is the end event. |
session_interval_sec | bigint | Inactivity threshold in seconds. If the gap between two consecutive events exceeds this value, Hologres splits them into separate sessions. Set to -1 to split sessions by start_event recurrence only. |
path_depth | bigint | Maximum number of events to track forward (or backward) from the anchor event. |
path_offset | bigint | Number of events to skip from the anchor before tracking begins. 0 means no offset. For example, with events a → b → c and path_offset = 1, tracking starts at b. |
is_reverse | bool | false: track events in chronological order from the start event. true: track events in reverse chronological order from the end event. |
split_session_by_event | bool | true: create a new session each time the anchor event recurs within the time window. false (default): split sessions based on session_interval_sec only. |
Return value
Returns a TEXT serialized array, for example: {"",\x01a\x01b<,\x01b\x01c<,""}. Pass this output to pad_full_path or pad_session_path_array — it cannot be read directly.
Behavior of `split_session_by_event`
This parameter controls how repeated paths within a single time window are counted.
| Value | Behavior | When to use |
|---|---|---|
false (default) | A repeated path a → b → c occurring 10 times within the session counts as one conversion for that session. Note If you set | Use when you want one aggregate journey per session window. |
true | Each recurrence of the anchor event starts a new session, so 10 repetitions count as 10 separate conversions. | Use when each triggered path is an independent user journey, such as tracking repeated purchases within one visit. |
pad_funnel
pad_funnel filters the serialized output of path_analysis_detail to a specific target event sequence (subpath), discarding events outside that sequence.
Syntax
pad_funnel(path_analysis_detail(), target_path)Parameters
| Parameter | Data type | Description |
|---|---|---|
path_analysis_detail() | text | The aggregated serialized output from path_analysis_detail |
target_path | text | The event sequence to filter to, as an array literal (for example, array['Log on', 'Purchase']) |
Return value
Returns a TEXT serialized array, for example: text []{"",\x01a\x01b<,\x01b\x01c<,""}. Pass the result to a path parsing function.
Path parsing functions
pad_full_path
pad_full_path parses the serialized output of path_analysis_detail (or pad_funnel) into structured fields: the full path per session, each subpath's start and end events, their positions in the full path, and the time elapsed between them.
Syntax
pad_full_path(path_analysis_detail())Parameters
| Parameter | Data type | Description |
|---|---|---|
path_analysis_detail() | text | The aggregated serialized output from path_analysis_detail |
Return value
pad_full_path returns an array. Use UNNEST to expand it into rows, then extract individual fields with the following sub-functions:
| Sub-function | Return type | Description |
|---|---|---|
pad_sub_path_left(result) | text | The start event of the subpath |
pad_sub_path_right(result) | text | The end event of the subpath |
pad_sub_index_left(result) | bigint | Position of the start event in the full path (0-based) |
pad_sub_index_right(result) | bigint | Position of the end event in the full path (0-based) |
pad_sub_cost(result) | bigint | Redirection duration of the subpath, in seconds |
pad_sub_session(result) | bigint | Index of the valid session this subpath belongs to (0-based) |
pad_session_path_array
pad_session_path_array extracts the ordered event series for a specific session from the serialized output of path_analysis_detail.
Syntax
pad_session_path_array(path_analysis_detail(), session_idx)Parameters
| Parameter | Data type | Description |
|---|---|---|
path_analysis_detail() | text | The aggregated serialized output from path_analysis_detail |
session_idx | bigint | The session index to extract (0-based) |
Return value
Returns an ARRAY of event names in the order they occurred within the session.
Examples
Prepare sample data
-- Create the extension (once per database).
CREATE extension flow_analysis;
-- Create a sample event table.
CREATE TABLE path_demo(
uid text,
event text,
event_time timestamptz
);
-- Insert sample events for four users.
INSERT INTO path_demo VALUES
('1', 'Register', '2023-11-24 16:01:23+08'),
('1', 'Log on', '2023-11-24 16:02:10+08'),
('1', 'Browse', '2023-11-24 16:02:15+08'),
('1', 'View live streams', '2023-11-24 16:03:10+08'),
('1', 'Browse', '2023-11-24 16:03:15+08'),
('1', 'Add to favorites', '2023-11-24 16:04:20+08'),
('1', 'Browse', '2023-11-24 16:07:21+08'),
('1', 'Purchase', '2023-11-24 16:08:23+08'),
('1', 'Exit', '2023-11-24 16:09:05+08'),
('2', 'Log on', '2023-11-24 16:10:23+08'),
('2', 'Purchase', '2023-11-24 16:12:23+08'),
('3', 'Log on', '2023-11-24 16:02:23+08'),
('3', 'Browse', '2023-11-24 16:02:23+08'),
('3', 'Add to favorites', '2023-11-24 16:03:53+08'),
('3', 'View live streams', '2023-11-24 16:04:53+08'),
('4', 'Log on', '2023-11-24 16:02:23+08'),
('4', 'Browse', '2023-11-24 16:03:53+08'),
('4', 'Purchase', '2023-11-24 16:04:23+08'),
('4', 'View live streams', '2023-11-24 16:05:53+08'),
('4', 'Cancel the order', '2023-11-24 16:06:53+08');All the following examples use this path_demo table.
Example 1: Get the full path for each user
The following two examples both use path_depth = 7 and session_interval_sec = 180. They differ in whether sessions split by inactivity only or by both inactivity and anchor event recurrence.
Split sessions by time only (`split_session_by_event = false`)
Anchor on Log on. A session ends when the gap between consecutive events exceeds 180 seconds.
SELECT
uid,
pad_full_path(
path_analysis_detail(
event,
event_time,
'Log on', -- start_event: anchor the path at 'Log on'
180, -- session_interval_sec: split after 180 s of inactivity
7, -- path_depth: track up to 7 events from the anchor
0, -- path_offset: no offset
false -- is_reverse: chronological order
-- split_session_by_event: omitted, defaults to false
)
) AS ret
FROM path_demo
GROUP BY uid;Result:
uid | ret
-----+---------------------------------------------------
3 | {Log on -> Add to favorites -> View live streams}
4 | {Log on -> Browse -> Purchase -> View live streams -> Cancel the order}
1 | {Log on -> Browse -> View live streams -> Browse -> Add to favorites}
2 | {Log on -> Purchase}
(4 rows)Split sessions by time and by anchor event recurrence (`split_session_by_event = true`)
Anchor on Browse. A new session starts both when inactivity exceeds 180 seconds and when Browse recurs.
SELECT
uid,
pad_full_path(
path_analysis_detail(
event,
event_time,
'Browse', -- start_event: anchor the path at 'Browse'
180, -- session_interval_sec
7, -- path_depth
0, -- path_offset
false, -- is_reverse: chronological order
true -- split_session_by_event: new session on each 'Browse'
)
) AS ret
FROM path_demo
GROUP BY uid;Result:
uid | ret
-----+-----------------------------------------------------------
1 | {Browse -> View live streams, Browse -> Add to favorites, Browse -> Purchase -> Exit}
2 | {}
4 | {Browse -> Purchase -> View live streams -> Cancel the order}
3 | {Browse -> Log on -> Add to favorites -> View live streams}User 1 has three separate sessions because Browse occurs three times. User 2 has no results because no Browse event exists for that user.
Example 2: Expand path results into rows
Use UNNEST to expand the path array into one row per session path, which is useful for downstream aggregation.
SELECT
uid,
unnest(
pad_full_path(
path_analysis_detail(
event,
event_time,
'Log on', -- start_event
180, -- session_interval_sec
7, -- path_depth
0, -- path_offset
false -- is_reverse
)
)
) AS ret
FROM path_demo
GROUP BY uid;Result:
uid | ret
-----+-------------------------------------------------
3 | Log on -> Add to favorites -> View live streams
1 | Log on -> Browse -> View live streams -> Browse -> Add to favorites
2 | Log on -> Purchase
4 | Log on -> Browse -> Purchase -> View live streams -> Cancel the order
(4 rows)For more information, see UNNEST clause.
Example 3: Expand subpaths and get per-event timing
UNNEST on the raw path_analysis_detail output (without pad_full_path) yields one row per subpath. Pass each row to the pad_sub_* sub-functions to extract the start event, end event, positions, and elapsed time.
SELECT
uid,
pad_sub_session(ret) AS session_id,
pad_sub_path_left(ret) AS sub_path_left,
pad_sub_path_right(ret) AS sub_path_right,
pad_sub_index_left(ret) AS sub_index_left,
pad_sub_index_right(ret) AS sub_index_right,
pad_sub_cost(ret) AS sub_cost
FROM (
SELECT
uid,
unnest(
path_analysis_detail(
event,
event_time,
'Log on', -- start_event
180, -- session_interval_sec
7, -- path_depth
0, -- path_offset
false -- is_reverse
)
) AS ret
FROM path_demo
GROUP BY uid
) a;Result:
uid | session_id | sub_path_left | sub_path_right | sub_index_left | sub_index_right | sub_cost
-----+------------+------------------+-------------------+----------------+-----------------+----------
1 | 0 | | Log on | -1 | 0 | 0
1 | 0 | Log on | Browse | 0 | 1 | 5
1 | 0 | Browse | View live streams | 1 | 2 | 55
1 | 0 | View live streams| Browse | 2 | 3 | 5
1 | 0 | Browse | Add to favorites | 3 | 4 | 65
2 | 0 | | Log on | -1 | 0 | 0
2 | 0 | Log on | Purchase | 0 | 1 | 120
3 | 0 | | Log on | -1 | 0 | 0
3 | 0 | Log on | Add to favorites | 0 | 1 | 90
3 | 0 | Add to favorites | View live streams | 1 | 2 | 60
4 | 0 | | Log on | -1 | 0 | 0
4 | 0 | Log on | Browse | 0 | 1 | 90
4 | 0 | Browse | Purchase | 1 | 2 | 30
4 | 0 | Purchase | View live streams | 2 | 3 | 90
4 | 0 | View live streams| Cancel the order | 3 | 4 | 60
(15 rows)Example 4: Get the event series for a specific session
pad_session_path_array extracts the ordered events for session index 0 (the first session for each user).
SELECT
uid,
pad_session_path_array(
path_analysis_detail(
event,
event_time,
'Log on', -- start_event
180, -- session_interval_sec
7, -- path_depth
0, -- path_offset
false -- is_reverse
),
0 -- session_idx: retrieve the first session (0-based)
) AS ret
FROM path_demo
GROUP BY uid;Result:
uid | ret
-----+-----------------------------------------------
1 | {Log on,Browse,View live streams,Browse,Add to favorites}
2 | {Log on,Purchase}
3 | {Log on,Add to favorites,View live streams}
4 | {Log on,Browse,Purchase,View live streams,Cancel the order}
(4 rows)Example 5: Count transitions per subpath
The following query counts how many times each subpath transition occurs across all users (page views). To count unique visitors instead, deduplicate on uid before aggregating.
SELECT
sub_index,
sub_path_left,
sub_path_right,
count(uid)
FROM (
SELECT
uid,
pad_sub_path_left(ret) AS sub_path_left,
pad_sub_path_right(ret) AS sub_path_right,
pad_sub_index_right(ret) AS sub_index
FROM (
SELECT
uid,
unnest(
path_analysis_detail(
event,
event_time,
'Log on', -- start_event
180, -- session_interval_sec
7, -- path_depth
0, -- path_offset
false -- is_reverse
)
) AS ret
FROM path_demo
GROUP BY uid
) a
) a
GROUP BY sub_index, sub_path_left, sub_path_right
ORDER BY sub_index, sub_path_left, sub_path_right;Result:
sub_index | sub_path_left | sub_path_right | count
-----------+------------------+-------------------+-------
0 | | Log on | 4
1 | Log on | Add to favorites | 1
1 | Log on | Browse | 2
1 | Log on | Purchase | 1
2 | Add to favorites | View live streams | 1
2 | Browse | View live streams | 1
2 | Browse | Purchase | 1
3 | View live streams| Browse | 1
3 | Purchase | View live streams | 1
4 | Browse | Add to favorites | 1
4 | View live streams| Cancel the order | 1
(11 rows)Example 6: Calculate average duration per subpath
SELECT
sub_path_left,
sub_path_right,
avg(sub_cost)
FROM (
SELECT
uid,
pad_sub_path_left(ret) AS sub_path_left,
pad_sub_path_right(ret) AS sub_path_right,
pad_sub_cost(ret) AS sub_cost
FROM (
SELECT
uid,
unnest(
path_analysis_detail(
event,
event_time,
'Log on', -- start_event
180, -- session_interval_sec
7, -- path_depth
0, -- path_offset
false -- is_reverse
)
) AS ret
FROM path_demo
GROUP BY uid
) a
) a
GROUP BY sub_path_left, sub_path_right
ORDER BY sub_path_left, sub_path_right;Result:
sub_path_left | sub_path_right | avg
------------------+-------------------+------------
Add to favorites | View live streams | 60.000000
Browse | Add to favorites | 65.000000
Browse | View live streams | 55.000000
Browse | Purchase | 30.000000
Log on | Add to favorites | 90.000000
Log on | Browse | 47.500000
Log on | Purchase | 120.000000
View live streams| Cancel the order | 60.000000
View live streams| Browse | 5.000000
Purchase | View live streams | 90.000000
| Log on | 0.000000
(11 rows)Example 7: Join full paths and subpaths in a single query
The following query correlates each subpath with the full session path it belongs to, using pad_full_path and UNNEST together.
SELECT
uid,
pad_sub_session(item) AS session_id,
full_path[pad_sub_session(item) + 1] AS full_path,
pad_sub_path_left(item) AS sub_path_left,
pad_sub_path_right(item) AS sub_path_right,
pad_sub_index_right(item) AS sub_idx,
pad_sub_cost(item) AS sub_cost
FROM (
SELECT
uid,
unnest(ret) AS item,
pad_full_path(ret) AS full_path
FROM (
SELECT
uid,
path_analysis_detail(
event,
event_time,
'Log on', -- start_event
180, -- session_interval_sec
7, -- path_depth
0, -- path_offset
false -- is_reverse
) AS ret
FROM path_demo
GROUP BY uid
) a
) a;Result:
uid | session_id | full_path | sub_path_left | sub_path_right | sub_idx | sub_cost
-----+------------+---------------------------------------------------------+------------------+-------------------+---------+----------
3 | 0 | Log on -> Add to favorites -> View live streams | | Log on | 0 | 0
3 | 0 | Log on -> Add to favorites -> View live streams | Log on | Add to favorites | 1 | 90
3 | 0 | Log on -> Add to favorites -> View live streams | Add to favorites | View live streams | 2 | 60
1 | 0 | Log on -> Browse -> View live streams -> Browse -> Add to favorites | | Log on | 0 | 0
1 | 0 | Log on -> Browse -> View live streams -> Browse -> Add to favorites | Log on | Browse | 1 | 5
1 | 0 | Log on -> Browse -> View live streams -> Browse -> Add to favorites | Browse | View live streams | 2 | 55
1 | 0 | Log on -> Browse -> View live streams -> Browse -> Add to favorites | View live streams| Browse | 3 | 5
1 | 0 | Log on -> Browse -> View live streams -> Browse -> Add to favorites | Browse | Add to favorites | 4 | 65
2 | 0 | Log on -> Purchase | | Log on | 0 | 0
2 | 0 | Log on -> Purchase | Log on | Purchase | 1 | 120
4 | 0 | Log on -> Browse -> Purchase -> View live streams -> Cancel the order| | Log on | 0 | 0
4 | 0 | Log on -> Browse -> Purchase -> View live streams -> Cancel the order| Log on | Browse | 1 | 90
4 | 0 | Log on -> Browse -> Purchase -> View live streams -> Cancel the order| Browse | Purchase | 2 | 30
4 | 0 | Log on -> Browse -> Purchase -> View live streams -> Cancel the order| Purchase | View live streams | 3 | 90
4 | 0 | Log on -> Browse -> Purchase -> View live streams -> Cancel the order| View live streams| Cancel the order | 4 | 60
(15 rows)Example 8: Filter to a specific funnel path
Use pad_funnel to restrict results to users who followed a specific event sequence. The following query finds users who progressed through Log on → Purchase.
SELECT
uid,
pad_full_path(
pad_funnel(
path_analysis_detail(
event,
event_time,
'Log on', -- start_event
180, -- session_interval_sec
7, -- path_depth
0, -- path_offset
false -- is_reverse
),
array['Log on', 'Purchase'] -- target_path: filter to this funnel
)
) AS ret
FROM path_demo
GROUP BY uid;Result:
uid | ret
-----+----------------------
3 | {Log on}
4 | {Log on -> Purchase}
1 | {Log on}
2 | {Log on -> Purchase}
(4 rows)Users 4 and 2 completed the full Log on → Purchase funnel. Users 1 and 3 reached Log on but did not proceed to Purchase within the tracked path depth.