For business scenarios with large data volumes and specific requirements on queries per second (QPS) and latency, combine Hologres RoaringBitmap with Dynamic Table to perform incremental computation and enable flexible UV calculation over any time period. This topic describes implementation approaches for both INT and TEXT field types.
Solution overview
This solution offers the following advantages: Dynamic Table uses incremental refresh to process only new data each time, delivering high performance with low resource consumption for high-QPS, low-latency UV calculations. It requires no additional scheduling tasks because Dynamic Table handles automatic refreshes. Queries are flexible and support any time range. Use this approach for high-QPS UV calculations over arbitrary long periods on large-scale data (hundreds of millions of records).
Depending on your business scenario and data type, implement RoaringBitmap in one of two ways:
Method 1: Long-period UV calculation for INT fields — Best for exact deduplication when UID is an INT type. Combined with Dynamic Table, it supports intersection, union, and difference operations on user groups based on any tag or property.
Method 2: TEXT fields with a mapping table — Best when UID is a TEXT type. Requires a user mapping table to convert TEXT UIDs to integers.
Method 1: RoaringBitmap long-period UV calculation for INT fields
Use this method for high-QPS UV calculations over arbitrary long periods on large-scale data (hundreds of millions of records) when the UID field is of type int.
Workflow
Create a user detail table to store granular data across all business dimensions.
Create a Dynamic Table based on your business logic to incrementally process the detail table. Group by base dimensions and aggregate UIDs into a RoaringBitmap stored in the Dynamic Table.
Query the Dynamic Table by desired dimensions. Apply RB_OR_AGG to the RoaringBitmap field to deduplicate, then compute cardinality for UV and count for PV—achieving sub-second query response.
Prepare base data
Create the RoaringBitmap extension before use:
CREATE EXTENSION IF NOT EXISTS roaringbitmap;Create a user detail table (partitioned by day is recommended). The following example shows table creation and property settings:
DROP TABLE IF EXISTS ods_app_detail;
BEGIN;
CREATE TABLE IF NOT EXISTS ods_app_detail (
uid int,
country text,
prov text,
city text,
ymd text NOT NULL
)
LOGICAL PARTITION BY LIST (ymd);
CALL set_table_property('ods_app_detail', 'orientation', 'column');
CALL set_table_property('ods_app_detail', 'bitmap_columns', 'country,prov,city,ymd');
CALL set_table_property('ods_app_detail', 'distribution_key', 'uid');
CALL set_table_property('ods_app_detail', 'clustering_key', 'ymd');
CALL set_table_property('ods_app_detail', 'event_time_column', 'ymd');
COMMIT;Create a Dynamic Table to process data
Create a Dynamic Table that aggregates the detail table by dimensions and computes RoaringBitmap using RB_BUILD_AGG on uid. Configure incremental refresh. The following example refreshes only the latest partition with 5 minute data freshness:
CREATE DYNAMIC TABLE dt_dws_app_rb
LOGICAL PARTITION BY LIST(ymd)
WITH (
freshness = '5 minutes',
auto_refresh_mode = 'incremental',
auto_refresh_partition_active_time = '1 days',
partition_key_time_format = 'YYYYMMDD'
)
AS
SELECT
RB_BUILD_AGG(uid) AS rb_uid,
country,
prov,
city,
ymd,
COUNT(1) AS pv
FROM ods_app_detail
GROUP BY country, prov, city, ymd;Manually REFRESH historical partitions. For example:
REFRESH DYNAMIC TABLE public.dt_dws_app_rb PARTITION(20251201) WITH (refresh_mode = 'full');Query UV over arbitrary long periods
-- Query UV and PV for a specific day
SELECT
RB_CARDINALITY(RB_OR_AGG(rb_uid)) AS uv,
country,
prov,
city,
sum(pv) AS pv
FROM dt_dws_app_rb
WHERE ymd = '20251223'
GROUP BY country, prov, city;-- Query UV and PV for a month
SELECT
RB_CARDINALITY(RB_OR_AGG(rb_uid)) AS uv,
country,
prov,
city,
sum(pv) AS pv
FROM dt_dws_app_rb
WHERE ymd >= '20251201' AND ymd <= '20251230'
GROUP BY country, prov, city;Method 2: RoaringBitmap long-period UV calculation for TEXT fields with a mapping table
Use this method when UID is a text type. Combine it with a user mapping table to map text UIDs to integers for RoaringBitmap processing. Use the hg_id_encoding function in Dynamic Table to automatically write to and update the mapping table.
Create and update the user mapping table
BEGIN;
CREATE TABLE uid_mapping (
uid text NOT NULL,
uid_int32 serial,
PRIMARY KEY (uid)
);
CALL set_table_property('uid_mapping', 'clustering_key', 'uid');
CALL set_table_property('uid_mapping', 'distribution_key', 'uid');
CALL set_table_property('uid_mapping', 'orientation', 'row');
COMMIT;Create a Dynamic Table to process data
In the Dynamic Table, use hg_id_encoding_int4(uid, 'uid_mapping') to map text UIDs to integers and automatically populate the mapping table. Then aggregate by dimensions into a RoaringBitmap:
CREATE DYNAMIC TABLE dt_dws_app_rb
WITH (
freshness = '5 minutes',
auto_refresh_mode = 'incremental',
auto_refresh_partition_active_time = '1 days',
partition_key_time_format = 'YYYYMMDD'
)
LOGICAL PARTITION BY LIST (ymd)
AS
SELECT
country,
prov,
city,
RB_BUILD_AGG(uid_int4) AS uid_rb,
ymd
FROM (
SELECT
country,
prov,
city,
hg_id_encoding_int4(uid, 'uid_mapping') AS uid_int4,
ymd
FROM ods_app_detail
) a
GROUP BY country, prov, city, ymd;Query UV over arbitrary long periods
SELECT
country,
prov,
city,
RB_CARDINALITY(RB_OR_AGG(uid_rb)) AS uv
FROM dt_dws_app_rb
WHERE ymd = '20251223'
GROUP BY country, prov, city;