BITMAP is a data type that stores a set of 32-bit unsigned integers. Based on the RoaringBitmap standard, it uses an adaptive compression format for efficient storage, exact deduplication, and set operations on large-scale data. Its standardized serialization format enables binary-level interoperability with external systems.
Only Ververica Runtime (VVR) 11.5 and later versions of Realtime Compute for Apache Flink support the BITMAP data type and its related functions.
Limits
BITMAP only supports the INT type (32-bit integer). To store other data types:
BIGINT (64-bit integer): Create a dictionary to map values to INT based on data characteristics, or use bucketing.
Non-integer types: Create a dictionary to map values to INT.
BITMAP compresses continuous, dense data far more effectively than discrete, sparse data. Use data mapping to improve compression efficiency.
How to get a BITMAP value
There are three ways to get a BITMAP value, each suited to a different scenario:
| Scenario | Function to use |
|---|---|
| Aggregate a column of integers during stream processing | BITMAP_BUILD_AGG |
| Create a bitmap from an existing array | BITMAP_BUILD |
| Deserialize a bitmap from an external system | BITMAP_FROM_BYTES |
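The three routes in the table can be sketched side by side. This is a minimal illustration, not a reference pipeline: the tables `events`, `snapshots`, and `imported` and all of their columns are hypothetical, and each result is wrapped in BITMAP_TO_BYTES or BITMAP_CARDINALITY so the query returns an ordinary SQL type.

```sql
-- Hypothetical tables (assumptions, not part of the product):
--   events(user_id INT, page_id STRING)
--   snapshots(tag STRING, ids ARRAY<INT>)
--   imported(tag STRING, payload BYTES)

-- 1. Aggregate an INT column into one bitmap per group,
--    then serialize it for a downstream sink.
SELECT page_id, BITMAP_TO_BYTES(BITMAP_BUILD_AGG(user_id)) AS user_bitmap_bytes
FROM events
GROUP BY page_id;

-- 2. Build a bitmap from an existing array value.
SELECT tag, BITMAP_CARDINALITY(BITMAP_BUILD(ids)) AS id_count
FROM snapshots;

-- 3. Deserialize bytes written by an external RoaringBitmap producer.
SELECT tag, BITMAP_CARDINALITY(BITMAP_FROM_BYTES(payload)) AS id_count
FROM imported;
```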
Performance guidance for aggregate functions
When using BITMAP aggregate functions, enable MiniBatch or use window aggregation. This reduces state access overhead and significantly improves performance.
BITMAP aggregate functions perform best with Append-Only input. Performance degrades significantly with Retraction input.
Avoid multi-level GroupBy aggregations on BITMAPs.
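As a rough sketch of the first recommendation, the statements below enable MiniBatch through the standard Flink options and then show a windowed alternative. The latency and batch-size values are illustrative only, and the `page_views` table with its `event_time`, `page_id`, and `user_id` columns is hypothetical. Depending on how the job is deployed, these options may need to be set in the job configuration rather than with SET statements.

```sql
-- Option A: MiniBatch, so state is accessed once per batch instead of once per record.
-- The values below are placeholders to tune for your workload.
SET 'table.exec.mini-batch.enabled' = 'true';
SET 'table.exec.mini-batch.allow-latency' = '5s';
SET 'table.exec.mini-batch.size' = '5000';

-- Option B: window aggregation over an append-only source.
-- page_views(event_time, page_id, user_id) is a hypothetical table
-- whose event_time column is a time attribute.
SELECT window_start,
       window_end,
       page_id,
       BITMAP_BUILD_CARDINALITY_AGG(user_id) AS unique_visitors
FROM TABLE(
  TUMBLE(TABLE page_views, DESCRIPTOR(event_time), INTERVAL '1' MINUTE))
GROUP BY window_start, window_end, page_id;
```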
Function list
BITMAP functions fall into four categories: bitmap construction, logical operations, statistics, and format conversion.
Aggregate functions
Aggregate functions operate across rows within a group.
| SQL | Table API | Input type | Output type | Description |
|---|---|---|---|---|
| BITMAP_BUILD_AGG(value) | value.bitmapBuildAgg() | INT | BITMAP | Aggregates 32-bit integers into a bitmap. |
| BITMAP_AND_AGG(bitmap) | bitmap.bitmapAndAgg() | BITMAP | BITMAP | Aggregates the intersection (AND) of multiple bitmaps. |
| BITMAP_OR_AGG(bitmap) | bitmap.bitmapOrAgg() | BITMAP | BITMAP | Aggregates the union (OR) of multiple bitmaps. |
| BITMAP_XOR_AGG(bitmap) | bitmap.bitmapXorAgg() | BITMAP | BITMAP | Aggregates the exclusive OR (XOR) of multiple bitmaps. |
| BITMAP_BUILD_CARDINALITY_AGG(value) | value.bitmapBuildCardinalityAgg() | INT | BIGINT | Aggregates 32-bit integers into a bitmap and returns its cardinality as a 64-bit integer. |
| BITMAP_AND_CARDINALITY_AGG(bitmap) | bitmap.bitmapAndCardinalityAgg() | BITMAP | BIGINT | Aggregates the intersection (AND) of multiple bitmaps and returns the cardinality. |
| BITMAP_OR_CARDINALITY_AGG(bitmap) | bitmap.bitmapOrCardinalityAgg() | BITMAP | BIGINT | Aggregates the union (OR) of multiple bitmaps and returns the cardinality. |
| BITMAP_XOR_CARDINALITY_AGG(bitmap) | bitmap.bitmapXorCardinalityAgg() | BITMAP | BIGINT | Aggregates the exclusive OR (XOR) of multiple bitmaps and returns the cardinality. |
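To illustrate how an intersection aggregate can be used, the sketch below counts the IDs that appear under every selected tag. It assumes a hypothetical table `tag_bitmaps` that holds one serialized bitmap of user IDs per tag; the table, its columns, and the tag values are invented for the example.

```sql
-- Hypothetical table (assumption): tag_bitmaps(tag STRING, payload BYTES),
-- where payload is a serialized 32-bit RoaringBitmap of user IDs.
SELECT BITMAP_AND_CARDINALITY_AGG(BITMAP_FROM_BYTES(payload)) AS users_with_both_tags
FROM tag_bitmaps
WHERE tag IN ('vip', 'active');
```

If only one of the two tags has a row, the aggregate sees a single bitmap and the result is simply that bitmap's cardinality, so the presence of both rows is worth checking separately.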
Cardinality optimization: When you only need the count — not the bitmap itself — use BITMAP_XX_CARDINALITY_AGG() directly instead of wrapping BITMAP_CARDINALITY(BITMAP_XX_AGG()). Both return the same result, but the direct form avoids materializing an intermediate bitmap and offers better performance.
-- Preferred: returns cardinality directly
SELECT BITMAP_OR_CARDINALITY_AGG(bitmap_col) FROM my_table;
-- Equivalent but less efficient: materializes the bitmap first
SELECT BITMAP_CARDINALITY(BITMAP_OR_AGG(bitmap_col)) FROM my_table;
Scalar functions
Scalar functions operate on individual BITMAP values.
BITMAP_BUILD(array)
Creates a BITMAP from an array of 32-bit integers. Returns NULL if the input is NULL.
| Table API | array.bitmapBuild() |
|---|---|
| Input type | ARRAY<INT> |
| Output type | BITMAP |
SELECT BITMAP_TO_STRING(BITMAP_BUILD(ARRAY[1, 2, 3, 4, 5]));
-- Result: {1,2,3,4,5}
-- NULL input returns NULL
SELECT BITMAP_BUILD(NULL);
-- Result: NULL
BITMAP_CARDINALITY(bitmap)
Returns the number of distinct elements in a bitmap as a 64-bit integer. Duplicate values are counted once. Returns NULL if the input is NULL.
| Table API | bitmap.bitmapCardinality() |
|---|---|
| Input type | BITMAP |
| Output type | BIGINT |
SELECT BITMAP_CARDINALITY(BITMAP_BUILD(ARRAY[1, 2, 3, 3, 5]));
-- Result: 4 (3 appears twice but is counted once)
Tip: In aggregation pipelines, use BITMAP_XX_CARDINALITY_AGG() directly instead of BITMAP_CARDINALITY(BITMAP_XX_AGG()) for better performance.
BITMAP_AND(bitmap1, bitmap2)
Returns the intersection (AND) of two bitmaps — elements present in both. Returns NULL if either input is NULL.
| Table API | bitmap1.bitmapAnd(bitmap2) |
|---|---|
| Input type | BITMAP, BITMAP |
| Output type | BITMAP |
-- Normal case
SELECT BITMAP_TO_STRING(
BITMAP_AND(BITMAP_BUILD(ARRAY[1, 2, 3]), BITMAP_BUILD(ARRAY[2, 3, 4]))
);
-- Result: {2,3}
-- NULL input returns NULL
SELECT BITMAP_TO_STRING(
BITMAP_AND(BITMAP_BUILD(ARRAY[1, 2, 3]), NULL)
);
-- Result: NULL
BITMAP_OR(bitmap1, bitmap2)
Returns the union (OR) of two bitmaps — all distinct elements from both. Returns NULL if either input is NULL.
| Table API | bitmap1.bitmapOr(bitmap2) |
|---|---|
| Input type | BITMAP, BITMAP |
| Output type | BITMAP |
-- Normal case
SELECT BITMAP_TO_STRING(
BITMAP_OR(BITMAP_BUILD(ARRAY[1, 2, 3]), BITMAP_BUILD(ARRAY[3, 4, 5]))
);
-- Result: {1,2,3,4,5}
-- NULL input returns NULL
SELECT BITMAP_TO_STRING(
BITMAP_OR(BITMAP_BUILD(ARRAY[1, 2, 3]), NULL)
);
-- Result: NULL
BITMAP_XOR(bitmap1, bitmap2)
Returns the exclusive OR (XOR) of two bitmaps — elements in either bitmap but not both. Returns NULL if either input is NULL.
| Table API | bitmap1.bitmapXor(bitmap2) |
|---|---|
| Input type | BITMAP, BITMAP |
| Output type | BITMAP |
SELECT BITMAP_TO_STRING(
BITMAP_XOR(BITMAP_BUILD(ARRAY[1, 2, 3]), BITMAP_BUILD(ARRAY[2, 3, 4]))
);
-- Result: {1,4}
BITMAP_ANDNOT(bitmap1, bitmap2)
Returns the difference (AND NOT) of two bitmaps — elements in bitmap1 that are not in bitmap2. Returns NULL if either input is NULL.
| Table API | bitmap1.bitmapAndnot(bitmap2) |
|---|---|
| Input type | BITMAP, BITMAP |
| Output type | BITMAP |
SELECT BITMAP_TO_STRING(
BITMAP_ANDNOT(BITMAP_BUILD(ARRAY[1, 2, 3]), BITMAP_BUILD(ARRAY[2, 3, 4]))
);
-- Result: {1}
BITMAP_FROM_BYTES(bytes)
Deserializes a byte array into a BITMAP. The byte array must follow the 32-bit RoaringBitmap format specification. Use this function to import bitmaps produced by external systems. Returns NULL if the input is NULL.
| Table API | bytes.bitmapFromBytes() |
|---|---|
| Input type | BYTES |
| Output type | BITMAP |
BITMAP_TO_BYTES(bitmap)
Serializes a BITMAP into a byte array following the 32-bit RoaringBitmap format specification. Use this function to export bitmaps to external systems that support RoaringBitmap. Returns NULL if the input is NULL.
| Table API | bitmap.bitmapToBytes() |
|---|---|
| Input type | BITMAP |
| Output type | BYTES |
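A quick way to see how BITMAP_TO_BYTES and BITMAP_FROM_BYTES fit together is a round trip inside a single query. This is only a sanity-check sketch; in practice the bytes would usually come from, or go to, an external system.

```sql
-- Serialize a small bitmap and immediately deserialize it again.
-- A lossless round trip should print the same set that went in.
SELECT BITMAP_TO_STRING(
  BITMAP_FROM_BYTES(
    BITMAP_TO_BYTES(BITMAP_BUILD(ARRAY[1, 2, 3]))
  )
);
```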
BITMAP_TO_ARRAY(bitmap)
Converts a BITMAP into an ARRAY<INT>. Values are sorted by Integer.compareUnsigned. Returns NULL if the input is NULL.
| Table API | bitmap.bitmapToArray() |
|---|---|
| Input type | BITMAP |
| Output type | ARRAY<INT> |
SELECT BITMAP_TO_ARRAY(BITMAP_BUILD(ARRAY[3, 1, 2]));
-- Result: [1, 2, 3] (sorted by Integer.compareUnsigned)
BITMAP_TO_STRING(bitmap)
Converts a BITMAP into a comma-separated string enclosed in curly braces. Values are sorted by Integer.compareUnsigned. If the resulting string is too long, it is truncated and ends with an ellipsis (...). Returns NULL if the input is NULL.
| Table API | bitmap.bitmapToString() |
|---|---|
| Input type | BITMAP |
| Output type | STRING |
Output examples:
| Input | Output |
|---|---|
| Empty bitmap | {} |
| BITMAP_BUILD(ARRAY[1, 2, 3, 4, 5]) | {1,2,3,4,5} |
| Bitmap containing 0, 1, and the signed values -2 and -1 | {0,1,4294967294,4294967295} |
| Bitmap with many elements | {1,2,3,...} |
BITMAP stores 32-bit unsigned integers. Values that appear negative as signed 32-bit integers (such as -1 and -2) are represented as their unsigned equivalents (4294967295 and 4294967294) in the output.
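The unsigned representation follows from simple arithmetic: a negative signed 32-bit value maps to its unsigned equivalent by adding 2^32 (4294967296). The sketch below shows that arithmetic and then the same values passing through a bitmap. The input array is an assumption chosen to match the output examples, and the comments on the bitmap queries describe what the documented ordering implies rather than captured output.

```sql
-- Signed-to-unsigned arithmetic for a negative 32-bit value: add 2^32.
SELECT CAST(-1 AS BIGINT) + 4294967296;
-- Result: 4294967295

-- The same values stored in a bitmap; the output is expected to match
-- the row in the output examples: {0,1,4294967294,4294967295}
SELECT BITMAP_TO_STRING(BITMAP_BUILD(ARRAY[-2, -1, 0, 1]));

-- BITMAP_TO_ARRAY returns ARRAY<INT>, so elements stay signed, but with
-- unsigned ordering -1 (4294967295 unsigned) would sort after 0 and 1.
SELECT BITMAP_TO_ARRAY(BITMAP_BUILD(ARRAY[-1, 0, 1]));
```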
Storing non-INT data in BITMAP
Since BITMAP only accepts INT (32-bit integers), use the following patterns to handle other data types:
| Data type | Approach |
|---|---|
| BIGINT (64-bit) | Build a dictionary mapping each BIGINT value to a unique INT, or partition data into INT-range buckets. |
| String or other non-integer types | Build a dictionary mapping each distinct value to a unique INT. |
For continuous, dense integer ranges (such as sequential user IDs), BITMAP compression is highly effective. For sparse or non-sequential data, dictionary mapping improves compression efficiency.
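One way to apply the dictionary approach is to join incoming records against a mapping table before aggregating. The sketch below is illustrative only: `events`, `device_dict`, and their columns are hypothetical, and it assumes the dictionary already assigns each distinct string a dense, sequential INT.

```sql
-- Hypothetical tables (assumptions):
--   events(device_id STRING, page_id STRING)
--   device_dict(device_id STRING, device_int INT)  -- one dense INT per device
SELECT e.page_id,
       BITMAP_BUILD_CARDINALITY_AGG(d.device_int) AS unique_devices
FROM events AS e
JOIN device_dict AS d
  ON e.device_id = d.device_id
GROUP BY e.page_id;
```

Because this is an inner join, events whose device_id has no dictionary entry are dropped from the count, so the dictionary needs to be populated before the events that reference it arrive.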