RoaringBitmap pre-aggregates unique visitor (UV) counts into compressed bitmaps during each incremental refresh, reducing query-time work to a cardinality calculation over bitmaps. Combined with Dynamic Table's incremental refresh—which processes only newly arrived rows, requires no manual scheduling, and handles any time-range query—this approach delivers sub-second UV calculations on hundreds of millions of records under high-QPS, low-latency workloads.
Choose your implementation based on the UID field type:
| UID type | Method |
|---|---|
int | Method 1: RoaringBitmap UV for INT fields |
text | Method 2: RoaringBitmap UV for TEXT fields |
RoaringBitmap operates on integer inputs. ForintUID fields, no conversion is needed. FortextUID fields (Method 2), a mapping table converts text UIDs toint4values before aggregation.
When to use this approach
Use RoaringBitmap with Dynamic Table when:
Data volume is in the hundreds of millions of records or more
Queries require high QPS with low latency
UV calculations need to span arbitrary time ranges
Prerequisites
Before you begin, ensure that you have:
A Hologres instance with the required permissions to create tables and Dynamic Tables
The RoaringBitmap extension installed
Install the extension before creating any tables or Dynamic Tables:
CREATE EXTENSION IF NOT EXISTS roaringbitmap;Method 1: RoaringBitmap UV for INT fields
Use this method when the UID field is of type int. It supports intersection, union, and difference operations on user groups based on any tag or property.
How it works
Store granular event data in a detail table partitioned by day.
A Dynamic Table reads the detail table incrementally, groups rows by business dimensions, and aggregates UIDs into a RoaringBitmap per group using
RB_BUILD_AGG. Each row in the Dynamic Table holds a bitmap representing all distinct UIDs for that dimension combination and date.At query time,
RB_OR_AGGmerges bitmaps across the requested date range—deduplicating UIDs across days—andRB_CARDINALITYreturns the UV count.
Create the detail table
DROP TABLE IF EXISTS ods_app_detail;
BEGIN;
CREATE TABLE IF NOT EXISTS ods_app_detail (
uid int,
country text,
prov text,
city text,
ymd text NOT NULL
)
LOGICAL PARTITION BY LIST (ymd);
CALL set_table_property('ods_app_detail', 'orientation', 'column');
CALL set_table_property('ods_app_detail', 'bitmap_columns', 'country,prov,city,ymd');
CALL set_table_property('ods_app_detail', 'distribution_key', 'uid');
CALL set_table_property('ods_app_detail', 'clustering_key', 'ymd');
CALL set_table_property('ods_app_detail', 'event_time_column', 'ymd');
COMMIT;The table is partitioned by ymd (format YYYYMMDD) for efficient partition pruning at query time. The bitmap_columns property accelerates filtering on country, prov, city, and ymd.
Create the Dynamic Table
CREATE DYNAMIC TABLE dt_dws_app_rb
LOGICAL PARTITION BY LIST(ymd)
WITH (
freshness = '5 minutes',
auto_refresh_mode = 'incremental',
auto_refresh_partition_active_time = '1 days',
partition_key_time_format = 'YYYYMMDD'
)
AS
SELECT
RB_BUILD_AGG(uid) AS rb_uid, -- aggregates all UIDs in the group into a RoaringBitmap
country,
prov,
city,
ymd,
COUNT(1) AS pv
FROM ods_app_detail
GROUP BY country, prov, city, ymd;The Dynamic Table refreshes only the active partition (within the last day) every 5 minutes, processing only newly arrived rows. After each refresh, each row in dt_dws_app_rb contains a rb_uid bitmap for a specific (country, prov, city, ymd) combination—representing all distinct UIDs that appeared on that date with those dimension values. For example, the row for (US, California, San Francisco, 20251223) holds a single bitmap with every de-duplicated UID that visited on that date.
Refresh historical partitions
New partitions refresh automatically. For historical partitions outside the active window, run a full refresh manually. For example, to backfill December 1, 2025:
REFRESH DYNAMIC TABLE public.dt_dws_app_rb PARTITION(20251201) WITH (refresh_mode = 'full');Query UV over any time range
Both queries use RB_OR_AGG to union bitmaps across rows—deduplicating UIDs across dimension groups and dates—then RB_CARDINALITY to count distinct UIDs.
-- UV and PV for a single day
SELECT
RB_CARDINALITY(RB_OR_AGG(rb_uid)) AS uv, -- union bitmaps across rows, then count distinct UIDs
country,
prov,
city,
SUM(pv) AS pv
FROM dt_dws_app_rb
WHERE ymd = '20251223'
GROUP BY country, prov, city;
-- UV and PV for a month
SELECT
RB_CARDINALITY(RB_OR_AGG(rb_uid)) AS uv,
country,
prov,
city,
SUM(pv) AS pv
FROM dt_dws_app_rb
WHERE ymd >= '20251201' AND ymd <= '20251230'
GROUP BY country, prov, city;Method 2: RoaringBitmap UV for TEXT fields
Use this method when the UID field is of type text. Because RoaringBitmap requires integer inputs, a user mapping table stores the text-to-int4 mapping. The hg_id_encoding_int4 function handles this conversion automatically inside the Dynamic Table—looking up or inserting each text UID in the mapping table and returning its integer equivalent—so you do not need to manage the dictionary manually or run separate join jobs.
Create the user mapping table
BEGIN;
CREATE TABLE uid_mapping (
uid text NOT NULL,
uid_int32 serial,
PRIMARY KEY (uid)
);
CALL set_table_property('uid_mapping', 'clustering_key', 'uid');
CALL set_table_property('uid_mapping', 'distribution_key', 'uid');
CALL set_table_property('uid_mapping', 'orientation', 'row');
COMMIT;The mapping table uses row store for fast point lookups by uid. The uid_int32 serial column auto-assigns a unique integer to each new text UID.
Create the Dynamic Table
CREATE DYNAMIC TABLE dt_dws_app_rb
WITH (
freshness = '5 minutes',
auto_refresh_mode = 'incremental',
auto_refresh_partition_active_time = '1 days',
partition_key_time_format = 'YYYYMMDD'
)
LOGICAL PARTITION BY LIST (ymd)
AS
SELECT
country,
prov,
city,
RB_BUILD_AGG(uid_int4) AS uid_rb, -- aggregates mapped integer UIDs into a RoaringBitmap
ymd
FROM (
SELECT
country,
prov,
city,
hg_id_encoding_int4(uid, 'uid_mapping') AS uid_int4, -- converts text UID to int4, writes new entries to uid_mapping
ymd
FROM ods_app_detail
) a
GROUP BY country, prov, city, ymd;The subquery calls hg_id_encoding_int4(uid, 'uid_mapping') to look up or insert each text UID in uid_mapping, returning its integer equivalent. The outer query aggregates those integers into bitmaps per dimension group.
Query UV over any time range
SELECT
country,
prov,
city,
RB_CARDINALITY(RB_OR_AGG(uid_rb)) AS uv -- union bitmaps across rows, then count distinct UIDs
FROM dt_dws_app_rb
WHERE ymd = '20251223'
GROUP BY country, prov, city;Extend the WHERE clause to any date range, the same as Method 1.