This topic describes parameters in roaring bitmap functions and how to use roaring bitmap functions in Hologres.
Background information
Roaring bitmaps are efficiently compressed bitmaps used in various programming languages across multiple big data platforms. They are ideal for ultra-high-cardinality dimensions and can be used for deduplication, tag-based filtering, and collection of time series data.
The roaring bitmap algorithm divides 32-bit integers into 216 blocks (chunks). Each block corresponds to the high 16 bits of an integer. A container stores the low 16 bits of a value. Roaring bitmaps store containers in a dynamic array as primary indexes. Two types of containers are available: array containers for sparse chunks and bitmap containers for dense chunks. An array container can store up to 4,096 integers. A bitmap container can store more than 4,096 integers.
Roaring bitmaps use this storage structure to rapidly retrieve specific values. They also provide bitwise operations such as AND, OR, and XOR between the two types of containers. As a result, roaring bitmaps deliver excellent storage and computing performance.
Limits
When you use roaring bitmap functions in Hologres, note the following limits:
Only Hologres V0.10 and later exclusive instances support these functions.
NoteYou can check the version of your instance in the Hologres console. If your instance is earlier than V0.10, you can use Common error messages returned when upgrade preparations fail or join the Hologres DingTalk group to provide feedback. For more information, see How do I obtain more online support?.
These functions are loaded to the public schema by default and can be loaded only to the public schema.
Starting from Hologres V3.1, 64-bit roaring bitmaps (RoaringBitmap64) are supported. Some roaring bitmap functions can process data of the RoaringBitmap64 type. When processing data of the RoaringBitmap64 type, these functions do not support constant input parameters.
Before you use roaring bitmap functions, you must execute the following statement to enable the extension. The extension is a database-level function. You need to execute the statement only once for a database. If you create a new database, you must execute the statement again.
--Create an extension CREATE EXTENSION roaringbitmap;
If you want to drop the extension for roaring bitmap functions, execute the following statement:
DROP EXTENSION roaringbitmap;
ImportantWe do not recommend that you use the
DROP EXTENSION <extension_name> CASCADE;
statement to drop an extension. The CASCADE statement drops not only the specified extension but also the extension data and the objects that depend on the extension. The extension data includes the PostGIS data, roaring bitmap data, Proxima data, binary log data, and BSI data. The objects include metadata, tables, views, and server data.You cannot specify the fields that store roaring bitmaps as bitmap or dictionary indexes.
When you create a table that contains a roaring bitmap column, you must explicitly specify the data type of the column as RoaringBitmap (32-bit) or RoaringBitmap64 (64-bit). The two types of roaring bitmap data cannot be used in mixed calculations.
-- Create a table that contains a 32-bit roaring bitmap column CREATE TABLE t_rb_32 ( bucket int, x roaringbitmap ); -- Create a table that contains a 64-bit roaring bitmap column CREATE TABLE t_rb_64 ( bucket int, x roaringbitmap64 ); -- Mixed calculations of the two types of roaring bitmap data are not supported and an error is returned -- ERROR: operator does not exist: roaringbitmap & roaringbitmap64 SELECT a.x & b.x FROM t_rb_32 a JOIN t_rb_64 b ON a.bucket = b.bucket;
Operators
The following operators support processing data of the RoaringBitmap and RoaringBitmap64 types.
Operator | Input type | Output type | Description | Example | Remarks |
& | RoaringBitmap | RoaringBitmap64, RoaringBitmap | RoaringBitmap64 | Same as the input type. | Performs an AND operation. | rb_build('{1,2,3}') & rb_build('{3,4,5}') | None |
| | RoaringBitmap | RoaringBitmap64, RoaringBitmap | RoaringBitmap64 | Same as the input type. | Performs an OR operation. | rb_build('{1,2,3}') | rb_build('{3,4,5}') | None |
RoaringBitmap | RoaringBitmap64, INTEGER | RoaringBitmap | RoaringBitmap64 | rb_build('{1,2,3}') | 6 | Only Hologres V1.3.16 and later support this operator. | ||
INTEGER, RoaringBitmap | RoaringBitmap64 | RoaringBitmap | RoaringBitmap64 | 6 | rb_build('{1,2,3}') | Only Hologres V1.3.16 and later support this operator. | ||
# | RoaringBitmap | RoaringBitmap64, RoaringBitmap | RoaringBitmap64 | Same as the input type. | Performs an XOR operation. | rb_build('{1,2,3}') # rb_build('{3,4,5}') | None |
<< | RoaringBitmap | RoaringBitmap64, BIGINT | RoaringBitmap | RoaringBitmap64 | Shifts a value left by a specific number of places. | rb_build('{1,2,3}') << 2 | Only Hologres V1.3.16 and later support this operator. |
>> | RoaringBitmap | RoaringBitmap64, BIGINT | RoaringBitmap | RoaringBitmap64 | Shifts a value right by a specific number of places. | rb_build('{1,2,3}') >> 3 | Only Hologres V1.3.16 and later support this operator. |
- | RoaringBitmap | RoaringBitmap64, RoaringBitmap | RoaringBitmap64 | Same as the input type. | Performs an ANDNOT operation. | rb_build('{1,2,3}') - rb_build('{3,4,5}') | Only Hologres V1.3.16 and later support this operator. |
RoaringBitmap | RoaringBitmap64, INTEGER | RoaringBitmap | RoaringBitmap64 | rb_build('{1,2,3}') - 3 | None | ||
@> | RoaringBitmap | RoaringBitmap64, RoaringBitmap | RoaringBitmap64 | BOOLEAN | Checks whether A contains B. | rb_build('{1,2,3}') @> rb_build('{3,4,5}') | None |
RoaringBitmap | RoaringBitmap64, INTEGER | BOOLEAN | rb_build('{1,2,3}') @> 3 | None | ||
<@ | RoaringBitmap | RoaringBitmap64, RoaringBitmap | RoaringBitmap64 | BOOLEAN | Checks whether A is contained by B. | rb_build('{1,2,3}') <@ rb_build('{3,4,5}') | None |
integer, RoaringBitmap | RoaringBitmap64 | BOOLEAN | 3 <@ rb_build('{1,2,3}') | None | ||
&& | RoaringBitmap | RoaringBitmap64, RoaringBitmap | RoaringBitmap64 | BOOLEAN | Checks whether A intersects with B. | rb_build('{1,2,3}') && rb_build('{3,4,5}') | None |
= | RoaringBitmap | RoaringBitmap64, RoaringBitmap | RoaringBitmap64 | BOOLEAN | Checks whether two objects are equal. | rb_build('{1,2,3}') = rb_build('{3,4,5}') | None |
<> | RoaringBitmap | RoaringBitmap64, RoaringBitmap | RoaringBitmap64 | BOOLEAN | Checks whether two objects are not equal. | rb_build('{1,2,3}') <> rb_build('{3,4,5}') | None |
Roaring bitmap calculation functions
Functions that support processing data of the RoaringBitmap and RoaringBitmap64 types
Function
Input type
Output type
Description
Example
rb_build_agg
INTEGER|BIGINT
The input parameter data type is:
INTEGER: Returns the RoaringBitmap type.
BIGINT: Returns the RoaringBitmap64 type.
Aggregates offsets into a roaring bitmap.
NoteOnly Hologres V3.1 and later support the BIGINT input parameter type and return the RoaringBitmap64 type.
rb_build_agg(1)
rb_cardinality
RoaringBitmap | RoaringBitmap64
INTEGER
Calculates the cardinality.
rb_cardinality(rb_build('{1,2,3,4,5}'))
rb_and_cardinality
RoaringBitmap | RoaringBitmap64,RoaringBitmap | RoaringBitmap64
INTEGER
Calculates the cardinality by performing an AND operation on two roaring bitmaps.
rb_and_cardinality(rb_build('{1,2,3}'),rb_build('{3,4,5}'))
rb_or_cardinality
RoaringBitmap | RoaringBitmap64,RoaringBitmap | RoaringBitmap64
INTEGER
Calculates the cardinality by performing an OR operation on two roaring bitmaps.
rb_or_cardinality(rb_build('{1,2,3}'),rb_build('{3,4,5}'))
rb_range
RoaringBitmap | RoaringBitmap64,BIGINT,BIGINT
RoaringBitmap | RoaringBitmap64
Returns a new collection that ranges from the start position (included) to the end position (not included). The start position is 1.
NoteOnly Hologres V1.3.16 and later support this function.
rb_range(rb_build('{1,2,3}'), 2, 3)
rb_minimum
RoaringBitmap | RoaringBitmap64
INTEGER
Returns the minimum offset in a roaring bitmap. If the roaring bitmap is empty, -1 is returned.
rb_minimum(rb_build('{1,2,3}'))
rb_maximum
RoaringBitmap | RoaringBitmap64
INTEGER
Returns the maximum offset in a roaring bitmap. If the roaring bitmap is empty, 0 is returned.
rb_maximum(rb_build('{1,2,3}'))
rb_to_array
RoaringBitmap | RoaringBitmap64
INTEGER[]
Returns an integer array from which a roaring bitmap is created.
rb_to_array(rb_build('{1,2,3}'))
rb_to_array_string
RoaringBitmap | RoaringBitmap64, TEXT
TEXT
Returns a string that is generated by concatenating the integer array from which a roaring bitmap is created with input text.
rb_to_array_string(rb_build('{1,2,3}'),',')
Functions that support processing only data of the RoaringBitmap64 type
Function
Input type
Output type
Description
Example
rb64_build
BIGINT[]
RoaringBitmap64
Creates a 64-bit roaring bitmap from an array.
NoteHologres V3.1 and later support this function.
--Prepare data CREATE TABLE public.tn ( id INT, num BIGINT[] ); INSERT INTO public.tn ("id", "num") VALUES (01, '{1,2}'); SELECT rb64_build (num) rb_num,num FROM public.tn;
The following result is returned:
rb_num num \x030100000000000000000000003a30000001000000000001001000000001000200 {1,2}
Functions that support processing only data of the RoaringBitmap type
Function
Input type
Output type
Description
Example
rb_build
INTEGER[]
RoaringBitmap
Creates a 32-bit roaring bitmap from an array.
--The following result is returned: \x3a3000000100000000000000100000000100 SELECT rb_build_agg(1);
roaringbitmap_in
TEXT
RoaringBitmap
This function is used to convert data of the TEXT type into the roaring bitmap type.
NoteHologres V2.1.33 and later support this function.
--Create a sample table. CREATE TABLE rb_text ( id int, a text ); -- Insert data into the sample table. INSERT INTO rb_text VALUES (1, '\x3a300000010000000000090010000000010002000300040005000600070008000900c800'); --Convert the data type of field a into the roaring bitmap type and perform an AND operation. SELECT rb_and_cardinality_agg (roaringbitmap_in (a::cstring)) FROM rb_text; --The following result is returned: rb_and_cardinality_agg| ----------------------- 10|
rb_index
RoaringBitmap, INTEGER
BIGINT
Returns the index of which the element in the roaring bitmap data starts from 0. If the element does not exist, -1 is returned.
NoteOnly Hologres V1.3.16 and later support this function.
rb_index(rb_build('{1,2,3}'),3)
rb_and_null2empty
RoaringBitmap,RoaringBitmap
RoaringBitmap
Performs an AND operation. If the value of an input parameter is null, the function considers the parameter value as empty.
NoteOnly Hologres V1.1.42 and later support this function.
rb_and_null2empty(rb_build(null),rb_build('{3,4,5}'))
rb_or_null2empty
RoaringBitmap,RoaringBitmap
RoaringBitmap
Performs an OR operation. If the value of an input parameter is null, the function considers the parameter value as empty.
NoteOnly Hologres V1.1.42 and later support this function.
rb_or_null2empty(rb_build(null),rb_build('{3,4,5}'))
rb_andnot_null2empty
RoaringBitmap,RoaringBitmap
RoaringBitmap
Performs an ANDNOT operation. If the value of an input parameter is null, the function considers the parameter value as empty.
NoteOnly Hologres V1.1.42 and later support this function.
rb_andnot_null2empty(rb_build(null),rb_build('{3,4,5}'))
rb_and_null2empty_cardinality
RoaringBitmap,RoaringBitmap
INTEGER
Calculates the cardinality by performing an AND operation on two roaring bitmaps. If the value of an input parameter is null, the function considers the parameter value as empty.
NoteOnly Hologres V1.1.42 and later support this function.
rb_and_null2empty_cardinality(rb_build(null),rb_build('{3,4,5}'))
rb_or_null2empty_cardinality
RoaringBitmap,RoaringBitmap
INTEGER
Calculates the cardinality by performing an OR operation on two roaring bitmaps. If the value of an input parameter is null, the function considers the parameter value as empty.
NoteOnly Hologres V1.1.42 and later support this function.
rb_or_null2empty_cardinality(rb_build(null),rb_build('{3,4,5}'))
rb_xor_cardinality
RoaringBitmap,RoaringBitmap
INTEGER
Calculates the cardinality by performing an XOR operation on two roaring bitmaps.
rb_xor_cardinality(rb_build('{1,2,3}'),rb_build('{3,4,5}'))
rb_andnot_cardinality
RoaringBitmap,RoaringBitmap
INTEGER
Calculates the cardinality by performing an ANDNOT operation on two roaring bitmaps.
rb_andnot_cardinality(rb_build('{1,2,3}'),rb_build('{3,4,5}'))
rb_andnot_null2empty_cardinality
RoaringBitmap,RoaringBitmap
INTEGER
Calculates the cardinality by performing an ANDNOT operation on two roaring bitmaps. If the value of an input parameter is null, the function considers the parameter value as empty.
NoteOnly Hologres V1.1.42 and later support this function.
rb_andnot_null2empty_cardinality(rb_build(null),rb_build('{3,4,5}'))
rb_is_empty
RoaringBitmap
BOOLEAN
Checks whether a Roaring bitmap is empty.
rb_is_empty(rb_build('{1,2,3,4,5}'))
rb_fill
RoaringBitmap,BIGINT,BIGINT
RoaringBitmap
Fills in the specified range excluding the range end in a Roaring bitmap.
NoteOnly Hologres V1.3.16 and later support this function.
rb_fill(rb_build('{1,2,3}'), 5, 7)
rb_clear
RoaringBitmap,BIGINT,BIGINT
RoaringBitmap
Clears the specified range excluding the range end in a roaring bitmap.
NoteOnly Hologres V1.3.16 and later support this function.
rb_clear(rb_build('{1,2,3}'), 2, 3)
rb_contains_bitmap
RoaringBitmap,RoaringBitmap
BOOLEAN
Checks whether the first bitmap contains the second bitmap.
rb_contains(rb_build('{1,2,3}'),rb_build('{3}'))
rb_flip
RoaringBitmap,INTEGER,INTEGER
RoaringBitmap
Flips the specified offsets in a roaring bitmap.
rb_flip(rb_build('{1,2,3}'),2,3)
rb_range_cardinality
RoaringBitmap, BIGINT, BIGINT
BIGINT
Returns the cardinality of the range from the start position (included) to the end position (not included). The start position is 1.
NoteOnly Hologres V1.3.16 and later support this function.
rb_range_cardinality(rb_build('{1,2,3}'), 2, 3)
rb_rank
RoaringBitmap,INTEGER
INTEGER
Returns the number of elements that are less than or equal to a specific offset in a roaring bitmap.
rb_rank(rb_build('{1,2,3}'),3)
rb_jaccard_dist
RoaringBitmap,RoaringBitmap
DOUBLE PRECISION
Returns the Jaccard distance or the Jaccard similarity coefficient between two Roaring bitmaps.
NoteOnly Hologres V1.3.16 and later support this function.
rb_jaccard_dist(rb_build('{1,2,3}'), rb_build('{3,4}'))
rb_select
RoaringBitmap, bitset_limit bigint, bitset_offset bigint=0, reverse boolean=false, range_start bigint=-2147483648, range_end bigint=2147483647
RoaringBitmap
Returns a subset [bitset_offset,bitset_offset+bitset_limit) of the bitmap within the range [range_start,range_end).
rb_select(rb_build('{1,2,3,4,5,6,7,8,9}'), 5, 2)
rb_iterate
RoaringBitmap
Set of Integer
Returns a list of offsets from a Roaring bitmap.
rb_iterate(rb_build('{1,2,3}'))
Aggregate functions
Functions that support processing RoaringBitmap and RoaringBitmap64 types
Function name
Input type
Output type
Description
Example
rb_or_agg
RoaringBitmap | RoaringBitmap64
Consistent with the input parameter type.
Performs an OR aggregate operation.
rb_or_agg(rb_build('{1,2,3}'))
rb_and_agg
RoaringBitmap | RoaringBitmap64
Consistent with the input parameter type.
Performs an AND aggregate operation.
rb_and_agg(rb_build('{1,2,3}'))
rb_or_cardinality_agg
RoaringBitmap | RoaringBitmap64
INTEGER
Calculates the cardinality from an OR aggregate operation on two roaring bitmaps.
rb_or_cardinality_agg(rb_build('{1,2,3}'))
rb_and_cardinality_agg
RoaringBitmap | RoaringBitmap64
INTEGER
Calculates the cardinality from an AND aggregate operation on two roaring bitmaps.
rb_and_cardinality_agg(rb_build('{1,2,3}'))
Functions that only support processing RoaringBitmap type
Function name
Input type
Output type
Description
Example
rb_xor_agg
RoaringBitmap
RoaringBitmap
Performs an XOR aggregate operation.
rb_xor_agg(rb_build('{1,2,3}'))
rb_xor_cardinality_agg
RoaringBitmap
INTEGER
Calculates the cardinality from an XOR aggregate operation on two roaring bitmaps.
rb_xor_cardinality_agg(rb_build('{1,2,3}'))
Examples
The following functions only support processing data of the RoaringBitmap type.
Function name | Input type | Output type | Description | Example |
roaringbitmap_text | TEXT, BOOLEAN | RoaringBitmap | Deserializes binary RoaringBitmap data of TEXT type into a RoaringBitmap structure. The second parameter indicates whether to verify the format. We recommend that you set this parameter to true. Otherwise, invalid bitmap data will be returned. |
|
rb_to_text | RoaringBitmap | TEXT | Converts a RoaringBitmap structure to binary RoaringBitmap data of TEXT type for output. |
|
Operations
The following examples demonstrate how to use roaring bitmap functions.
Install an extension.
CREATE EXTENSION roaringbitmap;
Create a table that is used to store roaring bitmap data.
--Create a table named t1 CREATE TABLE public.t1 (id integer, bitmap roaringbitmap);
Use the rb_build function to insert roaring bitmap data into the table.
--Set the bit values at the array positions to 1 INSERT INTO public.t1 SELECT 1,RB_BUILD(ARRAY[1,2,3,4,5,6,7,8,9,200]); --Set the bit values of multiple elements to 1 and aggregate the bit values into a Roaring bitmap INSERT INTO public.t1 SELECT 2,RB_BUILD_AGG(e) FROM GENERATE_SERIES(1,100) e;
Perform bitwise operations such as OR, AND, XOR, and ANDNOT.
SELECT RB_OR(a.bitmap,b.bitmap) FROM ( SELECT bitmap FROM public.t1 WHERE id = 1 ) AS a ,( SELECT bitmap FROM public.t1 WHERE id = 2 ) AS b ;
Perform bitwise aggregate operations such as OR, AND, XOR, and BUILD to generate a new roaring bitmap.
SELECT RB_OR_AGG(bitmap) FROM public.t1; SELECT RB_AND_AGG(bitmap) FROM public.t1; SELECT RB_XOR_AGG(bitmap) FROM public.t1; SELECT RB_BUILD_AGG(id) FROM public.t1;
Calculate the cardinality of the roaring bitmap. The cardinality is the number of bits that are set to 1 in the roaring bitmap.
SELECT RB_CARDINALITY(bitmap) FROM public.t1;
Obtain the subscripts of the bits that are set to 1.
SELECT RB_ITERATE(bitmap) FROM public.t1 WHERE id = 1;
Convert a roaring bitmap into an array.
SELECT RB_TO_ARRAY(bitmap) FROM public.t1 WHERE id = 1;