Realtime Compute for Apache Flink: BITMAP functions

Last Updated: Mar 26, 2026

BITMAP is a data type that stores a set of 32-bit unsigned integers. It is based on the RoaringBitmap standard and uses a compressed storage format that supports efficient storage, exact deduplication, and set operations on large-scale data. Its standardized serialization format enables binary-level interoperability with external systems.

Only Ververica Runtime (VVR) 11.5 and later versions of Realtime Compute for Apache Flink support the BITMAP data type and its related functions.

Limits

BITMAP only supports the INT type (32-bit integer). To store other data types:

  • BIGINT (64-bit integer): Create a dictionary to map values to INT based on data characteristics, or use bucketing.

  • Non-integer types: Create a dictionary to map values to INT.

BITMAP compresses continuous, dense data far more effectively than discrete, sparse data. Map sparse values to a dense INT range (for example, through a dictionary) to improve compression efficiency.
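
As one possible illustration of the bucketing option for BIGINT values, the sketch below splits each 64-bit ID into a bucket number and an offset that fits in INT, then counts distinct offsets per bucket. The table events and the column big_id are hypothetical. The bucket width of 2147483648 (2^31) and the restriction to non-negative IDs are assumptions made for this example, not requirements stated by the product.

```sql
-- Sketch only: `events` and `big_id` are hypothetical names.
-- Assumes big_id is a non-negative BIGINT.
SELECT
  bucket,
  BITMAP_BUILD_CARDINALITY_AGG(offset_in_bucket) AS distinct_ids
FROM (
  SELECT
    -- Subtracting the remainder first makes the division exact.
    CAST((big_id - MOD(big_id, 2147483648)) / 2147483648 AS BIGINT) AS bucket,
    -- The remainder lies in [0, 2147483647], so it fits in INT.
    CAST(MOD(big_id, 2147483648) AS INT) AS offset_in_bucket
  FROM events
) AS t
GROUP BY bucket;
```

Because the buckets are disjoint, the distinct count over all IDs is the sum of the per-bucket counts.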

How to get a BITMAP value

There are three ways to get a BITMAP value, each suited to a different scenario:

| Scenario | Function to use |
| --- | --- |
| Aggregate a column of integers during stream processing | BITMAP_BUILD_AGG |
| Create a bitmap from an existing array | BITMAP_BUILD |
| Deserialize a bitmap from an external system | BITMAP_FROM_BYTES |
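
A minimal sketch of the first scenario: it aggregates an INT column into one bitmap per group. The inline VALUES rows and the column names page and user_id are made up for illustration.

```sql
-- Sketch only: the inline rows and column names are illustrative.
SELECT
  page,
  BITMAP_TO_STRING(BITMAP_BUILD_AGG(user_id)) AS visitors
FROM (
  VALUES ('home', 1), ('home', 2), ('home', 2), ('cart', 3)
) AS t(page, user_id)
GROUP BY page;
-- Final result per group:
--   home -> {1,2}   (the duplicate 2 is stored once)
--   cart -> {3}
```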

Performance guidance for aggregate functions

Important

When using BITMAP aggregate functions, enable MiniBatch or use window aggregation. This reduces state access overhead and significantly improves performance.

  • BITMAP aggregate functions perform best with Append-Only input. Performance degrades significantly with Retraction input.

  • Avoid multi-level GroupBy aggregations on BITMAPs.
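
The guidance above could be applied roughly as follows. The table.exec.mini-batch.* keys are standard Apache Flink table options, but the values shown are illustrative, and whether you set them with SET statements or in the deployment configuration depends on your environment. The table events and its columns user_id and event_time are hypothetical.

```sql
-- Sketch only: option values are illustrative, not tuned recommendations.
SET 'table.exec.mini-batch.enabled' = 'true';
SET 'table.exec.mini-batch.allow-latency' = '5 s';
SET 'table.exec.mini-batch.size' = '5000';

-- Alternative: a window aggregation over a hypothetical append-only table.
-- Assumes event_time is the table's time attribute.
SELECT
  window_start,
  window_end,
  BITMAP_BUILD_CARDINALITY_AGG(user_id) AS distinct_users
FROM TABLE(
  TUMBLE(TABLE events, DESCRIPTOR(event_time), INTERVAL '1' MINUTE)
)
GROUP BY window_start, window_end;
```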

Function list

BITMAP functions fall into four categories: bitmap construction, logical operations, statistics, and format conversion.

Aggregate functions

Aggregate functions operate across rows within a group.

| SQL | Table API | Input type | Output type | Description |
| --- | --- | --- | --- | --- |
| BITMAP_BUILD_AGG(value) | value.bitmapBuildAgg() | INT | BITMAP | Aggregates 32-bit integers into a bitmap. |
| BITMAP_AND_AGG(bitmap) | bitmap.bitmapAndAgg() | BITMAP | BITMAP | Aggregates the intersection (AND) of multiple bitmaps. |
| BITMAP_OR_AGG(bitmap) | bitmap.bitmapOrAgg() | BITMAP | BITMAP | Aggregates the union (OR) of multiple bitmaps. |
| BITMAP_XOR_AGG(bitmap) | bitmap.bitmapXorAgg() | BITMAP | BITMAP | Aggregates the exclusive OR (XOR) of multiple bitmaps. |
| BITMAP_BUILD_CARDINALITY_AGG(value) | value.bitmapBuildCardinalityAgg() | INT | BIGINT | Aggregates 32-bit integers into a bitmap and returns its cardinality as a 64-bit integer. |
| BITMAP_AND_CARDINALITY_AGG(bitmap) | bitmap.bitmapAndCardinalityAgg() | BITMAP | BIGINT | Aggregates the intersection (AND) of multiple bitmaps and returns the cardinality. |
| BITMAP_OR_CARDINALITY_AGG(bitmap) | bitmap.bitmapOrCardinalityAgg() | BITMAP | BIGINT | Aggregates the union (OR) of multiple bitmaps and returns the cardinality. |
| BITMAP_XOR_CARDINALITY_AGG(bitmap) | bitmap.bitmapXorCardinalityAgg() | BITMAP | BIGINT | Aggregates the exclusive OR (XOR) of multiple bitmaps and returns the cardinality. |

Cardinality optimization: When you only need the count — not the bitmap itself — use BITMAP_XX_CARDINALITY_AGG() directly instead of wrapping BITMAP_CARDINALITY(BITMAP_XX_AGG()). Both return the same result, but the direct form avoids materializing an intermediate bitmap and offers better performance.

-- Preferred: returns cardinality directly
SELECT BITMAP_OR_CARDINALITY_AGG(bitmap_col) FROM my_table;

-- Equivalent but less efficient: materializes the bitmap first
SELECT BITMAP_CARDINALITY(BITMAP_OR_AGG(bitmap_col)) FROM my_table;
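
As a further illustration of the AND and OR aggregates, the sketch below assumes a hypothetical table daily_active_users with one precomputed bitmap of user IDs per day (columns dt STRING and users BITMAP). The dates are placeholders.

```sql
-- Sketch only: `daily_active_users`, `dt`, and `users` are hypothetical.
SELECT
  -- Users present in every daily bitmap of the range.
  BITMAP_AND_CARDINALITY_AGG(users) AS active_every_day,
  -- Users present in at least one daily bitmap of the range.
  BITMAP_OR_CARDINALITY_AGG(users)  AS active_any_day
FROM daily_active_users
WHERE dt BETWEEN '2026-03-01' AND '2026-03-07';
```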

Scalar functions

Scalar functions operate on individual BITMAP values.

BITMAP_BUILD(array)

Creates a BITMAP from an array of 32-bit integers. Returns NULL if the input is NULL.

Table API: array.bitmapBuild()
Input type: ARRAY<INT>
Output type: BITMAP

SELECT BITMAP_TO_STRING(BITMAP_BUILD(ARRAY[1, 2, 3, 4, 5]));
-- Result: {1,2,3,4,5}

-- NULL input returns NULL
SELECT BITMAP_BUILD(NULL);
-- Result: NULL

BITMAP_CARDINALITY(bitmap)

Returns the number of distinct elements in a bitmap as a 64-bit integer. Duplicate values are counted once. Returns NULL if the input is NULL.

Table API: bitmap.bitmapCardinality()
Input type: BITMAP
Output type: BIGINT

SELECT BITMAP_CARDINALITY(BITMAP_BUILD(ARRAY[1, 2, 3, 3, 5]));
-- Result: 4  (3 appears twice but is counted once)

Tip: In aggregation pipelines, use BITMAP_XX_CARDINALITY_AGG() directly instead of BITMAP_CARDINALITY(BITMAP_XX_AGG()) for better performance.

BITMAP_AND(bitmap1, bitmap2)

Returns the intersection (AND) of two bitmaps — elements present in both. Returns NULL if either input is NULL.

Table API: bitmap1.bitmapAnd(bitmap2)
Input type: BITMAP, BITMAP
Output type: BITMAP

-- Normal case
SELECT BITMAP_TO_STRING(
  BITMAP_AND(BITMAP_BUILD(ARRAY[1, 2, 3]), BITMAP_BUILD(ARRAY[2, 3, 4]))
);
-- Result: {2,3}

-- NULL input returns NULL
SELECT BITMAP_TO_STRING(
  BITMAP_AND(BITMAP_BUILD(ARRAY[1, 2, 3]), NULL)
);
-- Result: NULL

BITMAP_OR(bitmap1, bitmap2)

Returns the union (OR) of two bitmaps — all distinct elements from both. Returns NULL if either input is NULL.

Table API: bitmap1.bitmapOr(bitmap2)
Input type: BITMAP, BITMAP
Output type: BITMAP

-- Normal case
SELECT BITMAP_TO_STRING(
  BITMAP_OR(BITMAP_BUILD(ARRAY[1, 2, 3]), BITMAP_BUILD(ARRAY[3, 4, 5]))
);
-- Result: {1,2,3,4,5}

-- NULL input returns NULL
SELECT BITMAP_TO_STRING(
  BITMAP_OR(BITMAP_BUILD(ARRAY[1, 2, 3]), NULL)
);
-- Result: NULL

BITMAP_XOR(bitmap1, bitmap2)

Returns the exclusive OR (XOR) of two bitmaps — elements in either bitmap but not both. Returns NULL if either input is NULL.

Table API: bitmap1.bitmapXor(bitmap2)
Input type: BITMAP, BITMAP
Output type: BITMAP

SELECT BITMAP_TO_STRING(
  BITMAP_XOR(BITMAP_BUILD(ARRAY[1, 2, 3]), BITMAP_BUILD(ARRAY[2, 3, 4]))
);
-- Result: {1,4}

BITMAP_ANDNOT(bitmap1, bitmap2)

Returns the difference (AND NOT) of two bitmaps — elements in bitmap1 that are not in bitmap2. Returns NULL if either input is NULL.

Table API: bitmap1.bitmapAndnot(bitmap2)
Input type: BITMAP, BITMAP
Output type: BITMAP

SELECT BITMAP_TO_STRING(
  BITMAP_ANDNOT(BITMAP_BUILD(ARRAY[1, 2, 3]), BITMAP_BUILD(ARRAY[2, 3, 4]))
);
-- Result: {1}

BITMAP_FROM_BYTES(bytes)

Deserializes a byte array into a BITMAP. The byte array must follow the 32-bit RoaringBitmap format specification. Use this function to import bitmaps produced by external systems. Returns NULL if the input is NULL.

Table API: bytes.bitmapFromBytes()
Input type: BYTES
Output type: BITMAP
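
A minimal sketch of importing serialized bitmaps, assuming a hypothetical source table external_segments whose users_bytes column holds bitmaps that an external system wrote in the 32-bit RoaringBitmap format:

```sql
-- Sketch only: `external_segments`, `segment_id`, and `users_bytes`
-- are hypothetical names.
SELECT
  segment_id,
  BITMAP_CARDINALITY(BITMAP_FROM_BYTES(users_bytes)) AS segment_size
FROM external_segments;
```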

BITMAP_TO_BYTES(bitmap)

Serializes a BITMAP into a byte array following the 32-bit RoaringBitmap format specification. Use this function to export bitmaps to external systems that support RoaringBitmap. Returns NULL if the input is NULL.

Table API: bitmap.bitmapToBytes()
Input type: BITMAP
Output type: BYTES
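
Since BITMAP_TO_BYTES and BITMAP_FROM_BYTES use the same format, a round trip should return the original set. The following sketch illustrates this with a literal array:

```sql
-- Serialize a bitmap, deserialize it again, and print the contents.
SELECT BITMAP_TO_STRING(
  BITMAP_FROM_BYTES(
    BITMAP_TO_BYTES(BITMAP_BUILD(ARRAY[1, 2, 3]))
  )
);
-- Result: {1,2,3}
```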

BITMAP_TO_ARRAY(bitmap)

Converts a BITMAP into an ARRAY<INT>. Values are sorted by Integer.compareUnsigned. Returns NULL if the input is NULL.

Table API: bitmap.bitmapToArray()
Input type: BITMAP
Output type: ARRAY<INT>

SELECT BITMAP_TO_ARRAY(BITMAP_BUILD(ARRAY[3, 1, 2]));
-- Result: [1, 2, 3]  (sorted by Integer.compareUnsigned)

BITMAP_TO_STRING(bitmap)

Converts a BITMAP into a comma-separated string enclosed in curly braces. Values are sorted by Integer.compareUnsigned. If the resulting string is too long, it is truncated and ends with an ellipsis (...). Returns NULL if the input is NULL.

Table API: bitmap.bitmapToString()
Input type: BITMAP
Output type: STRING

Output examples:

| Input | Output |
| --- | --- |
| Empty bitmap | {} |
| BITMAP_BUILD(ARRAY[1, 2, 3, 4, 5]) | {1,2,3,4,5} |
| Bitmap containing 0, 1, and the unsigned equivalents of -2 and -1 | {0,1,4294967294,4294967295} |
| Bitmap with many elements | {1,2,3,...} |

BITMAP stores 32-bit unsigned integers. Values that appear negative as signed 32-bit integers (such as -1 and -2) are represented as their unsigned equivalents (4294967295 and 4294967294) in the output.
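
The unsigned representation can be checked with a small query. This sketch assumes that BITMAP_BUILD accepts negative INT literals. The expected output follows from the unsigned ordering and conversion described in this section.

```sql
-- -1 is stored as its unsigned equivalent, 4294967295,
-- and therefore sorts after 1 under Integer.compareUnsigned.
SELECT BITMAP_TO_STRING(BITMAP_BUILD(ARRAY[-1, 1]));
-- Expected result: {1,4294967295}
```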

Storing non-INT data in BITMAP

Since BITMAP only accepts INT (32-bit integers), use the following patterns to handle other data types:

| Data type | Approach |
| --- | --- |
| BIGINT (64-bit) | Build a dictionary mapping each BIGINT value to a unique INT, or partition data into INT-range buckets. |
| String or other non-integer types | Build a dictionary mapping each distinct value to a unique INT. |

For continuous, dense integer ranges (such as sequential user IDs), BITMAP compression is highly effective. For sparse or non-sequential data, dictionary mapping improves compression efficiency.
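
A dictionary mapping could look roughly like the sketch below. It assumes a hypothetical dictionary table device_dict that assigns each STRING device_id a unique, densely numbered INT (device_int), and a hypothetical fact table events. How the dictionary is built and maintained is outside the scope of this example.

```sql
-- Sketch only: `events`, `device_dict`, and all column names are hypothetical.
SELECT
  e.page,
  BITMAP_BUILD_CARDINALITY_AGG(d.device_int) AS distinct_devices
FROM events AS e
JOIN device_dict AS d
  ON e.device_id = d.device_id
GROUP BY e.page;
```

In a streaming job, a lookup join against the dictionary table is a common alternative to the regular join used here for brevity.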