windowFunnel searches a sliding time window for an ordered sequence of events and returns the length of the longest matching chain.
Limitations
Only Hologres V0.9 and later support the windowFunnel function.
Prerequisites
Before you begin, ensure that you have:
A Hologres instance running V0.9 or later
Superuser access to the target database
Install the flow_analysis extension as a superuser before calling any funnel functions:
CREATE extension flow_analysis; --Install the extension.The extension is installed at the database level. Install it once per database.
By default, the extension loads into the public schema and cannot be moved to another schema. To call the function from a different schema, use the fully qualified name, for example public.windowFunnel.
How it works
windowFunnel processes event data as follows:
When the first condition in the chain matches, the function sets the event counter to 1 and starts the sliding time window.
Each subsequent condition that matches in order increments the counter. If the sequence is broken, the counter stops incrementing.
Return value: The maximum number of consecutive matched conditions from the chain within the sliding time window. Returns an integer.
For a time window long enough to contain all events:
| User events | Specified conditions | Return value |
|---|---|---|
| c1, c2, c3, c4 | c1, c2, c3 | 3 |
| c4, c3, c2, c1 | c1, c2, c3 | 1 |
| c4, c3 | c1, c2, c3 | 0 |
Syntax
windowFunnel(window, mode, timestamp, cond1, cond2, ..., condN)Parameters:
| Parameter | Description |
|---|---|
window | The length of the time window. The windowFunnel function uses the time when the first matched event occurs as the start point of the time window and extracts data of subsequent events based on the length of the time window. |
mode | The matching mode. Accepted values: default (default) and strict. See Matching modes for details. |
timestamp | The column that records event times. Supported data types: TIMESTAMP, INT, and BIGINT. |
cond | A boolean condition representing one stage of the funnel. Specify conditions in the order you expect events to occur. |
Matching modes
Both modes use the same event chain logic. Given conditions A→B→C and user events:
`default` mode — matches as many events as possible starting from the first event in the specified time window.
| User events | Result |
|---|---|
| A, B, C | 3 — all three conditions matched |
| A, B, A, C | 3 — the repeated A is skipped; matching continues to C |
`strict` mode — stops matching when it encounters a repeated event.
| User events | Result |
|---|---|
| A, B, C | 3 — all three conditions matched |
| A, B, A, C | 2 — the repeated A stops matching; only A and B are counted |
Examples
The following examples use the GitHub public event dataset. To import the dataset into your Hologres instance, see Import public datasets with a few clicks.
The dataset table schema:
BEGIN;
CREATE TABLE hologres_dataset_github_event.hologres_github_event (
id BIGINT,
actor_id BIGINT,
actor_login TEXT,
repo_id BIGINT,
repo_name TEXT,
org_id BIGINT,
org_login TEXT,
type TEXT,
created_at TIMESTAMP WITH TIME ZONE NOT NULL,
action TEXT,
iss_or_pr_id BIGINT,
number BIGINT,
comment_id BIGINT,
commit_id TEXT,
member_id BIGINT,
rev_or_push_or_rel_id BIGINT,
ref TEXT,
ref_type TEXT,
state TEXT,
author_association TEXT,
language TEXT,
merged BOOLEAN,
merged_at TIMESTAMP WITH TIME ZONE,
additions BIGINT,
deletions BIGINT,
changed_files BIGINT,
push_size BIGINT,
push_distinct_size BIGINT,
hr TEXT,
month TEXT,
year TEXT,
ds TEXT
);
CALL set_table_property('hologres_dataset_github_event.hologres_github_event', 'orientation', 'column');
CALL set_table_property('hologres_dataset_github_event.hologres_github_event', 'bitmap_columns', 'actor_login,repo_name,org_login,type,action,commit_id,ref,ref_type,state,author_association,language,hr,month,year,ds');
CALL set_table_property('hologres_dataset_github_event.hologres_github_event', 'clustering_key', 'created_at:asc');
CALL set_table_property('hologres_dataset_github_event.hologres_github_event', 'dictionary_encoding_columns', 'actor_login:auto,repo_name:auto,org_login:auto,type:auto,action:auto,commit_id:auto,ref:auto,ref_type:auto,state:auto,author_association:auto,language:auto,hr:auto,month:auto,year:auto,ds:auto');
CALL set_table_property('hologres_dataset_github_event.hologres_github_event', 'distribution_key', 'id');
CALL set_table_property('hologres_dataset_github_event.hologres_github_event', 'segment_key', 'created_at');
CALL set_table_property('hologres_dataset_github_event.hologres_github_event', 'time_to_live_in_seconds', '3153600000');
COMMENT ON TABLE hologres_dataset_github_event.hologres_github_event IS NULL;
END;Calculate funnel levels per user
The following query identifies how far each user progressed through the conversion path CreateEvent → PushEvent → IssuesEvent within a 30-minute window, over a three-day period.
--Calculate the funnel data of each user.
SELECT
actor_id,
windowFunnel (1800, 'default', created_at, type = 'CreateEvent',type = 'PushEvent',type = 'IssuesEvent') AS level
FROM
hologres_dataset_github_event.hologres_github_event
WHERE
created_at >= TIMESTAMP '2024-01-28 10:00:00+08'
AND created_at < TIMESTAMP '2024-01-31 10:00:00+08'
GROUP BY
actor_id;The level column in the result indicates how many events in the funnel each user completed:
| Level | Meaning |
|---|---|
| 0 | Did not complete the first event (CreateEvent) |
| 1 | Completed CreateEvent only |
| 2 | Completed CreateEvent and PushEvent |
| 3 | Completed all three events |
Sample output:
actor_id | level
------------+------
143037332 | 0
38708562 | 0
157624788 | 1
137850795 | 1
69616418 | 2
158019532 | 2
727125 | 3Aggregate users by funnel stage
To count how many users reached each stage, use a cumulative sum over the level distribution:
WITH level_detail AS (
SELECT
level,
COUNT(1) AS count_user
FROM (
SELECT
actor_id,
windowFunnel (1800, 'default', created_at, type = 'CreateEvent', type = 'PushEvent',type = 'IssuesEvent') AS level
FROM
hologres_dataset_github_event.hologres_github_event
WHERE
created_at >= TIMESTAMP '2024-01-28 10:00:00+08'
AND created_at < TIMESTAMP '2024-01-31 10:00:00+08'
GROUP BY
actor_id) AS basic_table
GROUP BY
level
ORDER BY
level ASC
)
SELECT CASE level WHEN 0 THEN 'total'
WHEN 1 THEN 'CreateEvent'
WHEN 2 THEN 'PushEvent'
WHEN 3 THEN 'IssuesEvent'
END AS type
,SUM(count_user) over ( ORDER BY level DESC )
FROM
level_detail
GROUP BY
level,
count_user
ORDER BY
level ASC;Result:
type | sum
--------------+--------
total | 1338166
CreateEvent | 461088
PushEvent | 202221
IssuesEvent | 4727The total row is the cumulative count of all users analyzed. Each subsequent row shows how many users reached at least that stage in the funnel.