This topic describes the BITMAP data type and its related functions supported by Realtime Compute for Apache Flink.
Limits
Only Ververica Runtime (VVR) 11.5 and later versions of Realtime Compute for Apache Flink support the BITMAP data type and its related functions.
BITMAP data type
BITMAP is a data type that stores a set of 32-bit unsigned integers. It is stored in an automatically compressed format based on the RoaringBitmap standard, which makes it well suited to efficient storage, exact deduplication, and set operations on large-scale data.
The standardized serialization format allows for seamless, binary-level interoperability with external systems.
Usage recommendations
When you use BITMAP aggregate functions, enable MiniBatch or use window aggregation. Both approaches reduce state access overhead and significantly improve performance.
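The following is a minimal Python sketch of why batching reduces state access, not the Flink implementation. It assumes a simplified model in which each update to keyed state counts as one access. The function names and the sample records are hypothetical.

```python
from collections import defaultdict


def per_record(records):
    """Update keyed state once for every incoming record."""
    state = defaultdict(set)
    accesses = 0
    for key, value in records:
        state[key].add(value)  # one state access per record
        accesses += 1
    return dict(state), accesses


def mini_batch(records, batch_size):
    """Buffer records in memory, then update state once per key per batch."""
    state = defaultdict(set)
    accesses = 0
    for start in range(0, len(records), batch_size):
        buffered = defaultdict(set)
        for key, value in records[start:start + batch_size]:
            buffered[key].add(value)  # in-memory buffer, no state access
        for key, values in buffered.items():
            state[key] |= values  # one state access per distinct key
            accesses += 1
    return dict(state), accesses


records = [("a", 1), ("a", 2), ("b", 7), ("a", 3), ("b", 8), ("a", 1)]
state_plain, accesses_plain = per_record(records)
state_batched, accesses_batched = mini_batch(records, batch_size=3)
```

Both functions produce the same per-key sets. In this model, the batched version touches state fewer times whenever a batch contains repeated keys.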
BITMAP aggregate functions perform best on append-only input. Their performance degrades significantly on input that contains retractions. Avoid multi-level GROUP BY aggregations on BITMAPs.
BITMAP currently supports only the INT type (32-bit integer). To store other data types:
For the BIGINT type (64-bit integer), you can create a dictionary that maps values to the INT type based on the characteristics of your data, or use bucketing to store the values.
For non-integer types, you can create a dictionary to map values to the INT type.
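The two mapping approaches above can be sketched in Python as follows. This is an illustration under stated assumptions, not a prescribed implementation. `IntDictionary`, `bucket_bigint`, and the bucket size of 2^31 are hypothetical choices. A production dictionary would also need to be persisted and shared across jobs.

```python
INT_MAX = 2**31 - 1      # largest value of a 32-bit signed INT
BUCKET_SIZE = 2**31      # assumed bucket width, so offsets fit in an INT


class IntDictionary:
    """Assigns consecutive INT ids to arbitrary hashable values."""

    def __init__(self):
        self._ids = {}

    def encode(self, value):
        if value not in self._ids:
            if len(self._ids) > INT_MAX:
                raise OverflowError("dictionary exceeds the INT range")
            self._ids[value] = len(self._ids)  # next unused id
        return self._ids[value]


def bucket_bigint(value):
    """Split a 64-bit integer into (bucket, offset) with 0 <= offset < 2**31."""
    return divmod(value, BUCKET_SIZE)


# Dictionary mapping: works for strings, BIGINT values, or any hashable key.
dictionary = IntDictionary()
alice_id = dictionary.encode("alice")
bob_id = dictionary.encode("bob")
alice_again = dictionary.encode("alice")

# Bucketing: the bucket becomes an extra grouping key and the offset is
# the INT value stored in that bucket's bitmap.
bucket, offset = bucket_bigint(5_000_000_000)
```

With bucketing, values in different buckets never collide, so the total distinct count is the sum of the per-bucket cardinalities. The original value can be recovered as `bucket * BUCKET_SIZE + offset`.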
The BITMAP type compresses continuous, dense data much more effectively than discrete, sparse data. Mapping values to a dense range, for example with a dictionary, can improve BITMAP compression efficiency.
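The effect of density can be approximated with a short Python sketch. The RoaringBitmap format partitions the 32-bit space into chunks of 2^16 values keyed by the high 16 bits, and stores each non-empty chunk in its own container. The sketch counts the chunks that a set of values touches as a rough proxy for storage overhead. It does not measure the serialized size. `container_count`, `densify`, and the sample data are hypothetical.

```python
def container_count(values):
    """Number of distinct 2**16-wide chunks (high 16 bits) the values touch."""
    return len({v >> 16 for v in values})


def densify(values):
    """Map each distinct value to its rank, giving consecutive ids from 0."""
    return {v: rank for rank, v in enumerate(sorted(set(values)))}


# 1,000 sparse ids spaced further apart than one chunk (65,536 values).
sparse_ids = [i * 1_000_003 for i in range(1000)]

mapping = densify(sparse_ids)
dense_ids = [mapping[v] for v in sparse_ids]

sparse_chunks = container_count(sparse_ids)
dense_chunks = container_count(dense_ids)
```

In this example every sparse id lands in its own chunk, while the remapped ids all fit in a single chunk.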
Function list
The supported BITMAP functions fall into four main categories: bitmap construction, logical operations, statistics, and format conversion.
Aggregate functions
SQL | Table API | Input type | Output type | Description |
BITMAP_BUILD_AGG(value) | value.bitmapBuildAgg() | INT | BITMAP | Aggregates 32-bit integers into a bitmap. |
BITMAP_AND_AGG(bitmap) | bitmap.bitmapAndAgg() | BITMAP | BITMAP | Aggregates the intersection (AND) of multiple bitmaps. |
BITMAP_OR_AGG(bitmap) | bitmap.bitmapOrAgg() | BITMAP | BITMAP | Aggregates the union (OR) of multiple bitmaps. |
BITMAP_XOR_AGG(bitmap) | bitmap.bitmapXorAgg() | BITMAP | BITMAP | Aggregates the exclusive OR (XOR) of multiple bitmaps. |
BITMAP_BUILD_CARDINALITY_AGG(value) | value.bitmapBuildCardinalityAgg() | INT | BIGINT | Aggregates 32-bit integers into a bitmap and returns its cardinality as a 64-bit integer. |
BITMAP_AND_CARDINALITY_AGG(bitmap) | bitmap.bitmapAndCardinalityAgg() | BITMAP | BIGINT | Aggregates the intersection (AND) of multiple bitmaps and returns its cardinality as a 64-bit integer. |
BITMAP_OR_CARDINALITY_AGG(bitmap) | bitmap.bitmapOrCardinalityAgg() | BITMAP | BIGINT | Aggregates the union (OR) of multiple bitmaps and returns its cardinality as a 64-bit integer. |
BITMAP_XOR_CARDINALITY_AGG(bitmap) | bitmap.bitmapXorCardinalityAgg() | BITMAP | BIGINT | Aggregates the exclusive OR (XOR) of multiple bitmaps and returns its cardinality as a 64-bit integer. |
If you need only the cardinality and not the bitmap itself, use BITMAP_XX_CARDINALITY_AGG() instead of BITMAP_CARDINALITY(BITMAP_XX_AGG()). The two forms return the same result, but the former performs better.
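The aggregate semantics can be modeled in Python with plain sets. This is a sketch of what the functions compute on non-empty input, not of how they are implemented, and it does not reproduce the performance difference between the two forms. The Python function names mirror the SQL names for readability only.

```python
from functools import reduce


def bitmap_build_agg(values):
    """Collect integers into a set, modeling BITMAP_BUILD_AGG."""
    return frozenset(values)


def bitmap_and_agg(bitmaps):
    """Intersection of all input bitmaps, modeling BITMAP_AND_AGG."""
    return reduce(lambda a, b: a & b, bitmaps)


def bitmap_or_agg(bitmaps):
    """Union of all input bitmaps, modeling BITMAP_OR_AGG."""
    return reduce(lambda a, b: a | b, bitmaps)


def bitmap_xor_agg(bitmaps):
    """Running symmetric difference, modeling BITMAP_XOR_AGG."""
    return reduce(lambda a, b: a ^ b, bitmaps)


daily_users = [
    bitmap_build_agg([1, 2, 3, 3]),  # duplicates collapse
    bitmap_build_agg([2, 3, 4]),
    bitmap_build_agg([3, 4, 5]),
]

# The value that BITMAP_XX_CARDINALITY_AGG returns corresponds to the
# size of the matching aggregated bitmap.
and_cardinality = len(bitmap_and_agg(daily_users))
or_cardinality = len(bitmap_or_agg(daily_users))
xor_cardinality = len(bitmap_xor_agg(daily_users))
```

In this model, `or_cardinality` is the number of distinct users seen on any day and `and_cardinality` is the number seen on every day.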
Scalar functions
SQL | Table API | Input type | Output type | Description |
BITMAP_BUILD(array) | array.bitmapBuild() | ARRAY<INT> | BITMAP | Creates a bitmap from an array of 32-bit integers. If the parameter is NULL, this function returns NULL. |
BITMAP_CARDINALITY(bitmap) | bitmap.bitmapCardinality() | BITMAP | BIGINT | Returns the cardinality of a bitmap as a 64-bit integer. If the parameter is NULL, this function returns NULL. |
BITMAP_AND(bitmap1, bitmap2) | bitmap1.bitmapAnd(bitmap2) | BITMAP, BITMAP | BITMAP | Calculates the intersection (AND) of two bitmaps. If either parameter is NULL, this function returns NULL. |
BITMAP_OR(bitmap1, bitmap2) | bitmap1.bitmapOr(bitmap2) | BITMAP, BITMAP | BITMAP | Calculates the union (OR) of two bitmaps. If either parameter is NULL, this function returns NULL. |
BITMAP_XOR(bitmap1, bitmap2) | bitmap1.bitmapXor(bitmap2) | BITMAP, BITMAP | BITMAP | Calculates the exclusive OR (XOR) of two bitmaps. If either parameter is NULL, this function returns NULL. |
BITMAP_ANDNOT(bitmap1, bitmap2) | bitmap1.bitmapAndnot(bitmap2) | BITMAP, BITMAP | BITMAP | Calculates the difference (AND NOT) of two bitmaps. If either parameter is NULL, this function returns NULL. |
BITMAP_FROM_BYTES(bytes) | bytes.bitmapFromBytes() | BYTES | BITMAP | Transforms a byte array into a bitmap. This function follows the format defined in the 32-bit RoaringBitmap format specification. If the parameter is NULL, this function returns NULL. |
BITMAP_TO_BYTES(bitmap) | bitmap.bitmapToBytes() | BITMAP | BYTES | Transforms a bitmap into a byte array. This function follows the format defined in the 32-bit RoaringBitmap format specification. If the parameter is NULL, this function returns NULL. |
BITMAP_TO_ARRAY(bitmap) | bitmap.bitmapToArray() | BITMAP | ARRAY<INT> | Transforms a bitmap into an array of 32-bit integers. The values are sorted. If the parameter is NULL, this function returns NULL. |
BITMAP_TO_STRING(bitmap) | bitmap.bitmapToString() | BITMAP | STRING | Transforms a bitmap into a string. The values are sorted. If the parameter is NULL, this function returns NULL. |
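The scalar set operations and their NULL handling can be modeled in Python as follows. This is a sketch under assumptions: `None` stands in for SQL NULL, a `frozenset` stands in for a bitmap, and the examples use only non-negative values so that the sort order is unambiguous. The Python function names mirror the SQL names for readability only.

```python
def _null_safe(op):
    """Wrap a binary set operation so that a None operand yields None."""
    def apply(left, right):
        if left is None or right is None:
            return None
        return op(left, right)
    return apply


bitmap_and = _null_safe(lambda a, b: a & b)     # intersection
bitmap_or = _null_safe(lambda a, b: a | b)      # union
bitmap_xor = _null_safe(lambda a, b: a ^ b)     # symmetric difference
bitmap_andnot = _null_safe(lambda a, b: a - b)  # difference


def bitmap_build(array):
    """Build a bitmap from a list of integers, or None for a None input."""
    return None if array is None else frozenset(array)


def bitmap_cardinality(bitmap):
    """Number of distinct values, or None for a None input."""
    return None if bitmap is None else len(bitmap)


def bitmap_to_array(bitmap):
    """Sorted list of values, or None for a None input."""
    return None if bitmap is None else sorted(bitmap)


a = bitmap_build([1, 2, 3, 3])
b = bitmap_build([3, 4])

only_in_a = bitmap_to_array(bitmap_andnot(a, b))
in_exactly_one = bitmap_to_array(bitmap_xor(a, b))
with_null = bitmap_or(a, None)
```

Here `only_in_a` is `[1, 2]`, `in_exactly_one` is `[1, 2, 4]`, and `with_null` is `None` because one operand is `None`.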