The dimension grouping funnel function lets you group and display results based on different dimensions and specify the associated properties of events. For example, you can group results by day, country, or IP address to achieve finer-grained funnel analysis. A user can only be in one group. If the user does not belong to a group, the user is assigned to the unreach group.
Limits
Only Hologres V2.2.32 and later support the dimension grouping funnel function.
Precautions
To use funnel functions, you must execute the following statement as a superuser to install an extension:
CREATE extension flow_analysis; --Install the extension.The extension is installed at the database level. For each database, you need to install the extension only once.
By default, the extension is loaded to the public schema. The extension cannot be loaded to other schemas. To use the extension in other schemas, you must specify a function name in the public.function format, such as
public.windowFunnel.
finder_group_funnel
This function is used to group specified events based on the selected dimension and calculate funnel results.
Function syntax
Syntax
finder_group_funnel(window, start_timestamp, step_interval, step_numbers, num_events, attr_related, group_event_index,time_zone,is_relative_window, server_timestamp, client_timestamp, group_dimension, prop1, prop2, ..., check_event1, check_event2...)Parameters
Parameter
Required
Description
window
Yes
The window for statistical analysis. Unit: milliseconds.
start_timestamp
Yes
The start time of the statistical analysis. The TIMESTAMP and TIMESTAMPTZ types are supported.
step_interval
Yes
The duration of a step size, which is the granularity for conversion calculation and analysis. Unit: seconds.
step_numbers
Yes
The number of steps to analyze. For example,
step_interval=86400 (1 day), step_number=7indicates that the funnel status is analyzed daily for seven days, starting from the time specified by `start_timestamp`.num_events
Yes
The number of events that need to be analyzed.
attr_related
Yes
Specifies whether an event has associated properties. This parameter takes a UINT8 number. When represented in binary, if the
ith bit is 1, it indicates that the(i+1)th event has associated properties. Typically, `attr_related` is used with `prop`. Ifattr_relatedis set to 1, the number of `prop` expressions you must enter is equal to the number of 1s you set.group_event_index
Yes
The event based on which grouping is performed.
For example, if you set
group_event_indexto 1, grouping starts as soon as the first event is reached. If you setgroup_event_indexto 2, grouping starts only when the second event is reached. Conversions that do not reach the second event are assigned to the `unreach` group.time_zone
Yes
The time zone of the input time. The value must be of the TEXT type and in a standard time zone format, such as
Asia/Shanghai. This parameter affects the output result only whenis_relative_windowis set to true. In this case, the result is displayed based on the time zone.is_relative_window
Yes
Specifies whether the window is a multi-calendar day window. The default value is false. If you set the parameter to true, the following limits are imposed on other parameters:
window: The parameter value must be an integer multiple of 86,400,000.
step_interval: The parameter value must be 86400, which means one observation step is one day.
NoteA calendar day refers to the period of time
from 00:00:00 to 23:59:59of each day. The first calendar day ranges from theevent time to 23:59:59, and the following calendar days are whole days. In most cases, you can use the calendar day as a window to observe the daily funnel data to implement refined business operations.server_timestamp
Yes
The server time when the event occurred. The TIMESTAMP and TIMESTAMPTZ types are supported. This parameter is used to calculate the event slot or step to which the event belongs when the function is running.
client_timestamp
Yes
The client time when the event occurred. The TIMESTAMP and TIMESTAMPTZ types are supported. The data type of this parameter must be the same as that of the start_timestamp parameter. This parameter is used to sort data when the function is running.
group_dimension
Yes
The dimension to group by. For example, if you select the
channel_idfield for dimension grouping, the grouped data is displayed based on the value ofchannel_id. To use multiple dimensions, connect them using `concat_ws` or `concat`. Currently, only fields of the TEXT type can be used for grouping.prop
No
The associated properties of the event. The data type of all properties must be the same. Otherwise, the comparison cannot be performed.
check_event
Yes
The list of conversion events to analyze. Events that meet the conditions within the duration defined by `window` are considered valid and are included in the conversion analysis. For example, if there are three events, enter
EventName = 'E0001', EventName = 'E0002', EventName = 'E0003'.Returned result
An encoded result of the BINARY type is returned. You must use the finder_group_funnel_res function to decode the result.
Example
This example describes how to use the finder_group_funnel function.
Prepare the finder_group_funnel_test table and insert data into the table.
CREATE TABLE finder_group_funnel_test(id INT, event_time TIMESTAMP, event TEXT, province TEXT,city TEXT); INSERT INTO finder_group_funnel_test VALUES (1111, '2024-01-02 00:00:00', 'registration', 'Beijing','Beijing'), (1111, '2024-01-02 00:00:01', 'logon', 'Beijing','Beijing'), (1111, '2024-01-02 00:00:02', 'payment', 'Beijing','Beijing'), (1111, '2024-01-02 00:00:03', 'exit', 'Beijing','Beijing'), (1111, '2024-01-03 00:00:00', 'registration', 'Beijing','Beijing'), (1111, '2024-01-03 00:00:01', 'logon', 'Beijing','Beijing'), (1111, '2024-01-03 00:00:02', 'payment', 'Beijing','Beijing'), (1111, '2024-01-04 00:00:00', 'registration', 'Beijing','Beijing'), (1111, '2024-01-04 00:00:01', 'logon', 'Beijing','Beijing'), (2222, '2024-01-02 00:00:00', 'registration', 'Zhejiang','Hangzhou'), (2222, '2024-01-02 00:00:00', 'logon', 'Zhejiang','Hangzhou'), (2222, '2024-01-02 00:00:01', 'payment', 'Zhejiang','Hangzhou'), (2222, '2024-01-02 00:00:03', 'payment', 'Zhejiang','Hangzhou');To group and display results by the province field, run the following command:
SELECT id, UNNEST(finder_group_funnel (86400000 * 3, EXTRACT(epoch FROM TIMESTAMP'2024-01-02 00:00:00')::BIGINT, 86400, 3, 4, 0, 1, 'Asia/Shanghai', FALSE, event_time, event_time, province, event = 'registration', event = 'logon', event = 'payment', event = 'exit')) AS result FROM finder_group_funnel_test GROUP BY id;The following result is returned: The value of result is the encoded result. You must use the finder_group_funnel_res function to decode the result. For more information, see finder_group_funnel_res.
id | result ------+----------------- 2222 | Zhejiang 2222 | unreach 1111 | Beijing 1111 | unreach (4 rows)
finder_group_funnel_res
This function is used to decode the funnel details in the BINARY result returned by the finder_group_funnel function.
Function syntax
Syntax
finder_group_funnel_res(finder_group_funnel())Parameters
finder_group_funnel(): This function is used to group specified events based on the selected dimension and calculate funnel results. For more information, see finder_group_funnel.
Returned result
The decoded result is returned.
Example
This example decodes the result from the finder_group_funnel example to show the detailed funnel for each user.
Run the following command:
SELECT id, finder_group_funnel_res (result) AS res FROM ( SELECT id, UNNEST(finder_group_funnel (86400000 * 3, EXTRACT(epoch FROM TIMESTAMP'2024-01-02 00:00:00')::BIGINT, 86400, 3, 4, 0, 1, 'Asia/Shanghai', FALSE, event_time, event_time, province, event = 'register', event = 'logon', event = 'pay', event = 'exit')) AS result FROM finder_group_funnel_test GROUP BY id) a;The following result is returned:
id | res ------+----------- 1111 | {4,4,3,2} 1111 | {0,0,0,0} 2222 | {3,3,0,0} 2222 | {0,0,0,0}Interpreting the decoded result: You can decode the result of `finder_group_funnel` to show the detailed funnel for each user. For example, consider the result `{4,4,3,2}` for user ID 1111:
4: The final event the user reached within 3 days is the fourth event, `exit`.
4: The user's fourth event on the first day is Exit.
3: The final event the user reached on the second day is the third event, `payment`.
2: The final event the user reached on the third day is the second event, `logon`.
finder_group_funnel_text_group
This function is used to decode the grouping field in the BINARY result returned by the finder_group_funnel function. This function is usually used in combination with the finder_group_funnel_res function.
Function description
Syntax
finder_group_funnel_text_group(finder_group_funnel())Parameters
finder_group_funnel(): This function is used to group specified events based on the selected dimension and calculate funnel results. For more information, see finder_group_funnel.
Returned result
The decoded result is returned.
Example
In this example, the result of the example for finder_group_funnel is decoded to display the funnel results, the final reached event, and the final reached event within each step size for each grouped user. Run the following command:
SELECT
id,
finder_group_funnel_text_group (result) AS key,
finder_group_funnel_res (result) AS res
FROM (
SELECT
id,
UNNEST(finder_group_funnel (86400000 * 3, EXTRACT(epoch FROM TIMESTAMP'2024-01-02 00:00:00')::BIGINT, 86400, 3, 4, 0, 1, 'Asia/Shanghai', FALSE, event_time, event_time, province, event = 'registration', event = 'logon', event = 'payment', event = 'exit')) AS result
FROM
finder_group_funnel_test
GROUP BY
id) a;The following result is returned:
id | key | res
------+---------+-----------
2222 | Zhejiang | {3,3,0,0}
2222 | unreach | {0,0,0,0}
1111 | Beijing | {4,4,3,2}
1111 | unreach | {0,0,0,0}
(4 rows)Aggregate function for funnel results (funnel_rep)
This function is used to aggregate the calculation results of FINDER_FUNNEL and finder_group_funnel to generate the aggregation results of all users at each layer of the funnel.
Function description
Syntax
funnel_rep(step_number, num_events, funnel_res)Parameters
Parameter
Required
Description
step_number
Yes
A UINT value that specifies the number of time slots. This value is typically the same as the
step_numbersvalue in the `finder_funnel` function.For example,
step_numbers=7indicates that seven time slots are observed.num_events
Yes
A UINT value that specifies the total number of events that participate in the conversion. This value is typically the same as the number of events in
check_eventin the `finder_funnel` function.funnel_res
Yes
The details of all conversion steps generated by each user, which is the output of finder_tunnel.
Returned result
The function returns a one-dimensional array of strings in the format
{"n1,...,nn","m1,...,mn"}. This array represents the overall funnel status (the number of users who match events 1 to N) and the funnel status within each step size (the number of users who match events 1 to N).
Example
This example describes how to calculate the event that each user reaches when the window period is three days and the step size is three days. The data in the example for finder_group_funnel is used. Run the following command:
-- With a 3-day window and a 3-day step size, calculate the event reached by each user.
SELECT
funnel_rep (3, 4, funnel_res)
FROM (
SELECT
id,
FINDER_FUNNEL (86400000 * 3, EXTRACT(epoch FROM TIMESTAMP'2024-01-02 00:00:00')::BIGINT, 86400, 3, 4, 0, 'Asia/Shanghai', FALSE, event_time, event_time, event = 'registration', event = 'logon', event = 'payment', event = 'exit') AS funnel_res
FROM
finder_group_funnel_test
GROUP BY
id) a;The following result is returned:
funnel_rep
-------------------------------------------
{"2,2,2,1","2,2,2,1","1,1,1,0","1,1,0,0"}
(1 row)Complete usage examples
Scenario 1: Group and display user funnel results using a multi-day window
Analyze the funnel data within 3 days and the funnel data of each day based on the province dimension when four events sequentially occurred.
To prepare data, run the following commands:
CREATE TABLE finder_group_funnel_test_1(id INT, event_time TIMESTAMP, event TEXT, province TEXT,city TEXT); INSERT INTO finder_group_funnel_test_1 VALUES (1111, '2024-01-02 00:00:00', 'registration', 'Beijing','Beijing'), (1111, '2024-01-02 00:00:01', 'logon', 'Beijing','Beijing'), (1111, '2024-01-02 00:00:02', 'payment', 'Beijing','Beijing'), (1111, '2024-01-02 00:00:03', 'exit', 'Beijing','Beijing'), (1111, '2024-01-03 00:00:00', 'registration', 'Beijing','Beijing'), (1111, '2024-01-03 00:00:01', 'logon', 'Beijing','Beijing'), (1111, '2024-01-03 00:00:02', 'payment', 'Beijing','Beijing'), (1111, '2024-01-04 00:00:00', 'registration', 'Beijing','Beijing'), (1111, '2024-01-04 00:00:01', 'logon', 'Beijing','Beijing'), (2222, '2024-01-02 00:00:00', 'registration', 'Zhejiang','Hangzhou'), (2222, '2024-01-02 00:00:00', 'logon', 'Zhejiang','Hangzhou'), (2222, '2024-01-02 00:00:01', 'payment', 'Zhejiang','Hangzhou'), (2222, '2024-01-02 00:00:03', 'payment', 'Zhejiang','Hangzhou'), (3333, '2024-01-02 00:00:00', 'registration', 'Shanghai','Shanghai'), (3333, '2024-01-02 00:00:00', 'logon', 'Shanghai','Shanghai'), (3333, '2024-01-02 00:00:01', 'payment', 'Shanghai','Shanghai'), (3333, '2024-01-02 00:00:03', 'payment', 'Shanghai','Shanghai'), (3333, '2024-01-02 00:00:04', 'exit', 'Shanghai','Shanghai');Analyze the funnel data within 3 days and the funnel data of each day based on the province dimension when four events sequentially occurred. Run the following command:
SELECT key, funnel_rep (3, 4, res) AS ans FROM ( SELECT id, finder_group_funnel_text_group (result) AS key, finder_group_funnel_res (result) AS res FROM ( SELECT id, UNNEST(finder_group_funnel (86400000 * 3, EXTRACT(epoch FROM TIMESTAMP'2024-01-02 00:00:00')::BIGINT, 86400, 3, 4, 0, 1, 'Asia/Shanghai', FALSE, event_time, event_time, province, event = 'Register', event = 'Log on', event = 'Payment', event = 'Exit')) AS result FROM finder_group_funnel_test_1 GROUP BY id) a) b GROUP BY key;The following result is returned:
key | ans ---------+------------------------------------------- Beijing | {"1,1,1,1","1,1,1,1","1,1,1,0","1,1,0,0"} unreach | {"0,0,0,0","0,0,0,0","0,0,0,0","0,0,0,0"} Shanghai | {"1,1,1,1","1,1,1,1","0,0,0,0","0,0,0,0"} Zhejiang | {"1,1,1,0","1,1,1,0","0,0,0,0","0,0,0,0"} (4 rows)
Scenario 2: Display the funnel results of users grouped by calendar day based on the multi-calendar day window
To prepare data, run the following commands:
CREATE TABLE finder_group_funnel_test_2(id INT, event_time TIMESTAMP, event TEXT, province TEXT,city TEXT); INSERT INTO finder_group_funnel_test_2 VALUES (1111, '2024-01-02 00:00:02', 'registration', 'Beijing','Beijing'), (1111, '2024-01-02 00:00:03', 'logon', 'Beijing','Beijing'), (1111, '2024-01-03 00:00:04', 'payment', 'Beijing','Beijing'), (1111, '2024-01-05 00:00:01', 'exit', 'Beijing','Beijing'), (2222, '2024-01-02 00:00:00', 'registration', 'Zhejiang','Hangzhou'), (2222, '2024-01-02 00:00:00', 'logon', 'Zhejiang','Hangzhou'), (2222, '2024-01-02 00:00:01', 'payment', 'Zhejiang','Hangzhou'), (2222, '2024-01-02 00:00:03', 'payment', 'Zhejiang','Hangzhou');Analyze the funnel data within 3 days and the funnel data of each calendar day based on the province dimension when four events sequentially occurred. Run the following command:
SELECT key, funnel_rep (3, 4, res) AS ans FROM ( SELECT id, finder_group_funnel_text_group (result) AS key, finder_group_funnel_res (result) AS res FROM ( SELECT id, UNNEST(finder_group_funnel (86400000 * 3, EXTRACT(epoch FROM TIMESTAMP'2024-01-02 00:00:00')::BIGINT, 86400, 3, 4, 0, 1, 'Asia/Shanghai', TRUE, event_time, event_time, province, event = 'register', event = 'logon', event = 'pay', event = 'exit')) AS result FROM finder_group_funnel_test_2 GROUP BY id) a) b GROUP BY key;The following result is returned:
key | ans ---------+------------------------------------------- unreach | {"0,0,0,0","0,0,0,0","0,0,0,0","0,0,0,0"} Zhejiang | {"1,1,1,0","1,1,1,0","0,0,0,0","0,0,0,0"} Beijing | {"1,1,1,0","1,1,1,0","0,0,0,0","0,0,0,0"} (3 rows)