All Products
Search
Document Center

AnalyticDB for PostgreSQL:Use efficient roaring bitmaps

Last Updated:Jan 23, 2024

A roaring bitmap is an efficient data structure. Compared with bitmaps, roaring bitmaps provide higher processing performance and use fewer memory resources. Roaring bitmaps are suitable for computing scenarios such as intersection, union, and difference set operations and deduplication. In most cases, roaring bitmaps are used for user profiling, personalized recommendation, and precision marketing.

Background information

A bitmap is a common data structure that consists of the values 0 and 1. A separate bitmap is created to house all possible values for each column. Each bit indicates whether a data row has a value in the corresponding column. Conventional bitmaps use large amounts of memory resources. We recommend that you use compressed bitmaps. Roaring bitmaps are efficient compressed bitmaps that are used in almost all popular programming languages on a variety of big data platforms.

Overview

In a roaring bitmap, each 32-bit integer contains 16 most significant bits and 16 least significant bits. The 16 most significant bits are stored in multiple chunks, and the 16 least significant bits are stored in a container. Each chunk can store up to 2^16 integers. A roaring bitmap stores containers in a dynamically expanding array and sorts the containers based on the 16 most significant bits. You can use the 16 most significant bits and the binary search method to quickly search for the containers. The following types of containers are used to store data based on data characteristics:

  • Array container: stores scattered and non-consecutive data. An array container can store fewer than 4,096 integers.

  • Bitmap container: stores a large number of consecutive integers. A bitmap container can store 4,096 or more integers.

  • Run container: stores a large number of consecutive increasing values. You can use a run container only if the size of the container is smaller than the size of an array container and the size of a bitmap container.  

Roaring bitmaps use the preceding storage structure to significantly improve the data compression ratio and rapidly retrieve specific values. Roaring bitmaps support bitmap operations such as AND, OR, and XOR between the three types of containers. Roaring bitmaps can deliver excellent storage and computing performance.

For more information about roaring bitmaps, visit GitHub.

Usage notes

Only AnalyticDB for PostgreSQL instances of V6.3.8.9 or later support roaring bitmaps. For information about how to view the minor version of an instance, see View the minor engine version.

Use roaring bitmaps

Create an extension

Before you can use roaring bitmaps, you must create the RoaringBitmap extension in databases. You cannot manually create the RoaringBitmap extension. To install or upgrade the extension, Submit a ticket.

CREATE EXTENSION roaringbitmap;

Create a table

Create a table that is used to store roaring bitmap data.

CREATE TABLE t1 (id integer, bitmap roaringbitmap);

Insert data

Call the rb_build function to insert roaring bitmap data.

-- Set the bit value of an array to 1. 
INSERT INTO t1 SELECT 1,RB_BUILD(ARRAY[1,2,3,4,5,6,7,8,9,200]);
-- Set the bit values of multiple elements to 1 and aggregate the bit values into a roaring bitmap.  
INSERT INTO t1 SELECT 2,RB_BUILD_AGG(e) FROM GENERATE_SERIES(1,100) e;

Query data

Obtain the subscripts of the bits that are set to 1.

SELECT RB_ITERATE(bitmap) FROM t1 WHERE id = 1;

Sample bitmap operations:

  • Perform bitmap operations such as OR, AND, XOR, and ANDNOT.

    SELECT RB_OR(a.bitmap,b.bitmap) FROM (SELECT bitmap FROM t1 WHERE id = 1) AS a,(SELECT bitmap FROM t1 WHERE id = 2) AS b;
  • Perform bitmap aggregate operations such as OR, AND, XOR, and BUILD to generate a new roaring bitmap.

    SELECT RB_OR_AGG(bitmap) FROM t1;
    SELECT RB_AND_AGG(bitmap) FROM t1;
    SELECT RB_XOR_AGG(bitmap) FROM t1;
    SELECT RB_BUILD_AGG(e) FROM GENERATE_SERIES(1,100) e;
  • Calculate the cardinality of the roaring bitmap. The cardinality is the number of bits that are set to 1 in the roaring bitmap.

    SELECT RB_CARDINALITY(bitmap) FROM t1;

Bitmap calculation functions

Function

Input

Output

Description

Example

rb_build

integer[]

roaringbitmap

Creates a roaring bitmap from an integer array.

rb_build('{1,2,3,4,5}')

rb_and

roaringbitmap,roaringbitmap

roaringbitmap

Performs an AND operation.

rb_and(rb_build('{1,2,3}'),rb_build('{3,4,5}'))

rb_or

roaringbitmap,roaringbitmap

roaringbitmap

Performs an OR operation.

rb_or(rb_build('{1,2,3}'),rb_build('{3,4,5}'))

rb_xor

roaringbitmap,roaringbitmap

roaringbitmap

Performs an XOR operation.

rb_xor(rb_build('{1,2,3}'),rb_build('{3,4,5}'))

rb_andnot

roaringbitmap,roaringbitmap

roaringbitmap

Performs an ANDNOT operation.

rb_andnot(rb_build('{1,2,3}'),rb_build('{3,4,5}'))

rb_cardinality

roaringbitmap

integer

Calculates the cardinality.

rb_cardinality(rb_build('{1,2,3,4,5}'))

rb_and_cardinality

roaringbitmap,roaringbitmap

integer

Calculates the cardinality by performing an AND operation on two roaring bitmaps.

rb_and_cardinality(rb_build('{1,2,3}'),rb_build('{3,4,5}'))

rb_or_cardinality

roaringbitmap,roaringbitmap

integer

Calculates the cardinality by performing an OR operation on two roaring bitmaps.

rb_or_cardinality(rb_build('{1,2,3}'),rb_build('{3,4,5}'))

rb_xor_cardinality

roaringbitmap,roaringbitmap

integer

Calculates the cardinality by performing an XOR operation on two roaring bitmaps.

rb_xor_cardinality(rb_build('{1,2,3}'),rb_build('{3,4,5}'))

rb_andnot_cardinality

roaringbitmap,roaringbitmap

integer

Calculates the cardinality by performing an ANDNOT operation on two roaring bitmaps.

rb_andnot_cardinality(rb_build('{1,2,3}'),rb_build('{3,4,5}'))

rb_is_empty

roaringbitmap

boolean

Checks whether a roaring bitmap is empty.

rb_is_empty(rb_build('{1,2,3,4,5}'))

rb_equals

roaringbitmap,roaringbitmap

boolean

Checks whether two roaring bitmaps are the same.

rb_equals(rb_build('{1,2,3}'),rb_build('{3,4,5}'))

rb_intersect

roaringbitmap,roaringbitmap

boolean

Checks whether two roaring bitmaps intersect.

rb_intersect(rb_build('{1,2,3}'),rb_build('{3,4,5}'))

rb_remove

roaringbitmap,integer

roaringbitmap

Removes a specific offset from a roaring bitmap.

rb_remove(rb_build('{1,2,3}'),3)

rb_flip

roaringbitmap,integer

roraingbitmap

Flips a specific offset in a roaring bitmap.

rb_flip(rb_build('{1,2,3}'),3)

rb_flip

roaringbitmap,integer,integer

roaringbitmap

Flips a specific offset range in a roaring bitmap.

rb_flip(rb_build('{1,2,3}'),2,3)

rb_minimum

roaringbitmap

integer

Returns the smallest offset in a roaring bitmap. If the roaring bitmap is empty, -1 is returned.

rb_minimum(rb_build('{1,2,3}'))

rb_maximum

roaringbitmap

integer

Returns the largest offset in a roaring bitmap. If the roaring bitmap is empty, 0 is returned.

rb_maximum(rb_build('{1,2,3}'))

rb_rank

roaringbitmap,integer

integer

Returns the number of elements that are less than or equal to a specific offset in a roaring bitmap.

rb_rank(rb_build('{1,2,3}'),3)

rb_iterate

roaringbitmap

setof integer

Returns a list of offsets from a roaring bitmap.

rb_iterate(rb_build('{1,2,3}'))

rb_contains

roaringbitmap, integer

bool

Checks whether a roaring bitmap contains a specific offset.

rb_contains(rb_build('{1,2,3}'),1)

rb_contains

roaringbitmap,integer,integer

bool

Checks whether a roaring bitmap contains a specific offset range.

rb_contains(rb_build('{1,2,3}'),rb_build('{3,4,5}'))

rb_contains

roaringbitmap, roaringbitmap

bool

Checks whether a roaring bitmap contains another roaring bitmap.

 rb_contains(rb_build('{1,2,3}'),rb_build('{3,4,5}'))

rb_becontained

integer, roaringbitmap

bool

Checks whether a specific offset is contained by a roaring bitmap.

rb_becontained(1, rb_build('{1,2,3}'))

rb_becontained

roaringbitmap,roaringbitmap

bool

Checks whether a roaring bitmap is contained by another roaring bitmap.

rb_becontained(rb_build('{1}'), rb_build('{1,2,3}'))

rb_add

roaringbitmap, integer

roaringbitmap

Adds a specific offset to a roaring bitmap.

rb_add(rb_build('{1,2,3,4}'), 5)

rb_add_2

integer, roaringbitmap

roaringbitmap

Add a specific offset to roaringbitmap.

rb_add_2(5, rb_build('{1,2,3,4}'))

rb_add

roaringbitmap,integer, integer

roaringbitmap

Adds a specific offset range to a roaring bitmap.

rb_add(rb_build('{1,2,3,4}'), 6,8)

rb_remove

roaringbitmap,integer, integer

roaringbitmap

Removes a specific offset from a roaring bitmap.

rb_remove(rb_build('{1,2,3,4,6,7,8}'), 6,8)

rb_jaccard_index

roaringbitmap,roaringbitmap

float8

Calculates the Jaccard similarity coefficient between two roaring bitmaps.

rb_jaccard_index(rb_build('{1,2,3,4}'), rb_build('{1,2}'))

rb_to_array

roaringbitmap

integer[]

Converts a roaring bitmap into an array.

rb_to_array(rb_build('{1,2,3,4}'))

rb_iterate_decrement

roaringbitmap

integer[]

Returns a list of offsets from a roaring bitmap in descending order.

rb_iterate_decrement(rb_build('{1,2,3,4}'))

Bitmap aggregate functions

Function

Input

Output

Description

Example

rb_build_agg

integer

roaringbitmap

Creates a roaring bitmap from a group of offsets.

rb_build_agg(1)

rb_or_agg

roaringbitmap

roaringbitmap

Performs an OR aggregate operation.

rb_or_agg(rb_build('{1,2,3}'))

rb_and_agg

roaringbitmap

roaringbitmap

Performs an AND aggregate operation.

rb_and_agg(rb_build('{1,2,3}'))

rb_xor_agg

roaringbitmap

roaringbitmap

Performs an XOR aggregate operation.

rb_xor_agg(rb_build('{1,2,3}'))

rb_or_cardinality_agg

roaringbitmap

integer

Calculates the cardinality by performing an OR aggregate operation on two roaring bitmaps.

rb_or_cardinality_agg(rb_build('{1,2,3}'))

rb_and_cardinality_agg

roaringbitmap

integer

Calculates the cardinality by performing an AND aggregate operation on two roaring bitmaps.

rb_and_cardinality_agg(rb_build('{1,2,3}'))

rb_xor_cardinality_agg

roaringbitmap

integer

Calculates the cardinality by performing an XOR aggregate operation on two roaring bitmaps.

rb_xor_cardinality_agg(rb_build('{1,2,3}'))

Operators

Operator

Left

Right

Output

Description

Example

&

roaringbitmap

roaringbitmap

roaringbitmap

Performs an AND operation on two roaring bitmaps.

rb_build('{1,2,3}') & rb_build('{1,2,4}')

|

roaringbitmap

roaringbitmap

roaringbitmap

Performs an OR operation on two roaring bitmaps.

rb_build('{1,2}') | rb_build('{1,3}')

#

roaringbitmap

roaringbitmap

roaringbitmap

Performs an XOR operation on two roaring bitmaps.

rb_build('{1,2}') # rb_build('{1,3}')

~

roaringbitmap

roaringbitmap

roaringbitmap

Performs an ANDNOT operation on two roaring bitmaps.

rb_build('{2,3}') ~ rb_build('{2,4}')

+

roraingbitmap

integer

roaringbitmap

Adds a specific offset to a roaring bitmap.

rb_build('{2,3}') + 1

-

roraingbitmap

integer

roaringbitmap

Removes a specific offset from a roaring bitmap.

rb_build('{1,2,3}') - 1

=

roaringbitmap

roaringbitmap

boolean

Checks whether two roaring bitmaps are the same.

rb_build('{2,3}') = rb_build('{2,3}')

<>

roaringbitmap

roaringbitmap

boolean

Checks whether two roaring bitmaps are different.

rb_build('{2,3}') <> rb_build('{1,2,3}')

&&

roaringbitmap

roaringbitmap

boolean

Checks whether two roaring bitmaps intersect.

rb_build('{2,3}') && rb_build('{3,4}')

@>

roaringbitmap

roaringbitmap

boolean

Checks whether a roaring bitmap contains another roaring bitmap.

rb_build('{2,3}') @> rb_build('{2}')

@>

roaringbitmap

integer

boolean

Checks whether a roaring bitmap contains a specific offset.

rb_build('{2,3}') @> 2

<@

roaringbitmap

roaringbitmap

boolean

Checks whether a roaring bitmap is contained by another roaring bitmap.

rb_build('{2,3}') <@ rb_build('{1,2,3}')

<@

integer

roaringbitmap

boolean

Checks whether a specific offset is contained by a roaring bitmap.

rb_build('{2,3}') <@ rb_build('{3}')