All Products
Search
Document Center

AnalyticDB for MySQL:Roaring bitmap functions

Last Updated:Apr 08, 2024

Roaring bitmaps are efficiently compressed bitmaps that are widely used in various programming languages and big data platforms for deduplication, tag-based filtering, and computing of time series data. This topic describes how to use roaring bitmap functions.

Supported versions

Only AnalyticDB for MySQL clusters of V3.1.6.4 or later support roaring bitmaps and roaring bitmap functions.

Note

For information about how to query the minor version of a cluster, see How do I query the version of an AnalyticDB for MySQL cluster? To update the minor version of a cluster, contact technical support.

Limits

When you use roaring bitmap functions in AnalyticDB for MySQL, take note of the following limits:

  • You must use the UNNEST operator to query the elements in a roaring bitmap. Sample statement:

    SELECT * FROM unnest(rb_build(array[1,2,3]));
  • Roaring bitmaps are supported only in Object Storage Service (OSS) external tables and cannot be created in AnalyticDB for MySQL internal tables. If you want to use roaring bitmap functions on AnalyticDB for MySQL internal tables, first use the rb_build_varbinary function to convert VARBINARY data into the roaringbitmap type. Sample statement:

    // Create an AnalyticDB for MySQL internal table.
    CREATE TABLE test_rb_cstore (id int, rb varbinary);
    
    // Invoke roaring bitmap functions.
    SELECT rb_cardinality(rb_build_varbinary(rb)) FROM test_rb_cstore;

Functions

Roaring bitmap functions include scalar and aggregate functions.

Scalar functions

Function

Input data type

Output data type

Description

Example

rb_build

int

roaringbitmap

Generates a roaring bitmap from an integer array.

rb_build(array[1,2,3])

rb_build_varbinary

varbinary

roaringbitmap

Generates a roaring bitmap from VARBINARY data.

rb_build_varbinary(rb_to_varbinary (rb_build(array[1,2,3])))

rb_cardinality

roaringbitmap

bigint

Calculates the cardinality of a roaring bitmap.

rb_cardinality(rb_build(array[1,2,3]))

rb_contains

roaringbitmap, int

boolean

Checks whether a roaring bitmap contains integers.

rb_contains(rb_build(array[1,2,3]), 3)

rb_and

roaringbitmap, roaringbitmap

roaringbitmap

Obtains the intersection of two roaring bitmaps.

rb_and(rb_build(array[1,2,3]), rb_build(array[2,3,4]))

rb_or

roaringbitmap, roaringbitmap

roaringbitmap

Obtains the union of two roaring bitmaps.

rb_or(rb_build(array[1,2,3]), rb_build(array[2,3,4]))

rb_xor

roaringbitmap, roaringbitmap

roaringbitmap

Obtains the XOR value of two roaring bitmaps.

rb_xor(rb_build(array[1,2,3]), rb_build(array[2,3,4]))

rb_and_null2empty

roaringbitmap, roaringbitmap

roaringbitmap

Performs an AND operation. If the input value of a parameter is null, the function uses the value of another parameter as the output. If the input value of a parameter is a pair of empty braces ({}), the function uses a pair of empty braces ({}) as the output.

rb_cardinality(rb_and_null2empty(rb_build(null),rb_build(array[3,4,5])))

rb_or_null2empty

roaringbitmap, roaringbitmap

roaringbitmap

Performs an OR operation. If the input value is null, the function uses a pair of empty braces ({}) as the input.

rb_cardinality(rb_or_null2empty(rb_build(null),rb_build(array[3,4,5])))

rb_andnot_null2empty

roaringbitmap, roaringbitmap

roaringbitmap

Performs an ANDNOT operation. If the input value is null, the function uses a pair of empty braces ({}) as the input.

rb_cardinality(rb_andnot_null2empty(rb_build(null),rb_build(array[3,4,5])))

rb_and_cardinality

roaringbitmap, roaringbitmap

integer

Calculates the cardinality by performing an AND operation on two roaring bitmaps.

rb_and_cardinality(rb_build(array[1,2,3]),rb_build(array[3,4,5]))

rb_and_null2empty_cardinality

roaringbitmap, roaringbitmap

integer

Calculates the cardinality by performing an AND operation on two roaring bitmaps. If the input value is null, the function uses a pair of empty braces ({}) as the input.

rb_and_null2empty_cardinality(rb_build(null),rb_build(array[3,4,5]))

rb_or_cardinality

roaringbitmap, roaringbitmap

integer

Calculates the cardinality by performing an OR operation on two roaring bitmaps.

rb_or_cardinality(rb_build(array[1,2,3]),rb_build(array[3,4,5]))

rb_or_null2empty_cardinality

roaringbitmap, roaringbitmap

integer

Calculates the cardinality by performing an OR operation on two roaring bitmaps. If the input value is null, the function uses a pair of empty braces ({}) as the input.

rb_or_null2empty_cardinality(rb_build(null),rb_build(array[3,4,5]))

rb_xor_cardinality

roaringbitmap, roaringbitmap

integer

Calculates the cardinality by performing an XOR operation on two roaring bitmaps.

rb_xor_cardinality(rb_build(array[1,2,3]),rb_build(array[3,4,5]))

rb_andnot_cardinality

roaringbitmap, roaringbitmap

integer

Calculates the cardinality by performing an ANDNOT operation on two roaring bitmaps.

rb_andnot_cardinality(rb_build(array[1,2,3]),rb_build(array[3,4,5]))

rb_andnot_null2empty_cardinality

roaringbitmap, roaringbitmap

integer

Calculates the cardinality by performing an ANDNOT operation on two roaring bitmaps. If the input value is null, the function uses a pair of empty braces ({}) as the input.

rb_andnot_null2empty_cardinality(rb_build(array[1,2,3]),rb_build(array[3,4,5]))

rb_is_empty

roaringbitmap

boolean

Checks whether a roaring bitmap is empty.

rb_is_empty(rb_build(array[]))

rb_clear

roaringbitmap, bigint, bigint

roaringbitmap

Clears the specified range excluding the range end in a roaring bitmap.

rb_clear(rb_build('{1,2,3}'), 2, 3)

rb_contains

roaringbitmap, roaringbitmap

boolean

Checks whether the first roaring bitmap contains the second roaring bitmap.

rb_contains(rb_build(array[1,2,3]),rb_build(array[3]))

rb_flip

roaringbitmap, integer, integer

roaringbitmap

Flips the specified offsets in a roaring bitmap.

rb_flip(rb_build(array[1,2,3,4,5]), 2, 5)

rb_minimum

roaringbitmap

integer

Returns the smallest offset in a roaring bitmap. If the roaring bitmap is empty, an error is returned.

rb_minimum(rb_build(array[1,2,3]))

rb_maximum

roaringbitmap

integer

Returns the largest offset in a roaring bitmap. If the roaring bitmap is empty, an error is returned.

rb_maximum(rb_build(array[1,2,3]))

rb_rank

roaringbitmap,integer

integer

Returns the number of elements that are less than or equal to a specified offset in a roaring bitmap.

rb_rank(rb_build(array[1,2,3]),2)

rb_to_array

roaringbitmap

integer

Returns an integer array from which a roaring bitmap is created.

rb_to_array(rb_build(array[1,2,3]))

rb_to_varbinary

roaringbitmap

varbinary

Returns a VARBINARY array from which a roaring bitmap is created.

rb_build_varbinary(rb_to_varbinary(rb_build(array[1,2,3])))

rb_range_cardinality

roaringbitmap, integer, integer

integer

Returns the cardinality of the range from the start position (included) to the end position (not included). The start position is 1.

Important

Only AnalyticDB for MySQL V3.1.10.0 or later supports this function.

rb_range_cardinality(rb_build(ARRAY [1,2,3]),2,3)

rb_select

roaringbitmap, integer, integer

roaringbitmap

Returns the offsets of a roaring bitmap in the range from the start position (included) to the end position (not included).

Important

Only AnalyticDB for MySQL V3.1.10.0 or later supports this function.

rb_select(rb_build(ARRAY [1,3,4,5,7,9]),2, 3)

Aggregate functions

Function

Input data type

Output data type

Description

Example

rb_build_agg

integer

roaringbitmap

Creates a roaring bitmap from a group of offsets.

rb_cardinality(rb_build_agg(1))

rb_or_agg

roaringbitmap

roaringbitmap

Performs an OR aggregate operation.

rb_cardinality(rb_or_agg(rb_build(array[1,2,3])))

rb_and_agg

roaringbitmap

roaringbitmap

Performs an AND aggregate operation.

rb_cardinality(rb_and_agg(rb_build(array[1,2,3])))

rb_xor_agg

roaringbitmap

roaringbitmap

Performs an XOR aggregate operation.

rb_cardinality(rb_xor_agg(rb_build(array[1,2,3])))

rb_or_cardinality_agg

roaringbitmap

integer

Calculates the cardinality by performing an OR aggregate operation on two roaring bitmaps.

rb_or_cardinality_agg(rb_build(array[1,2,3]))

rb_and_cardinality_agg

roaringbitmap

integer

Calculates the cardinality by performing an AND aggregate operation on two roaring bitmaps.

rb_and_cardinality_agg(rb_build(array[1,2,3]))

rb_xor_cardinality_agg

roaringbitmap

integer

Calculates the cardinality by performing an XOR aggregate operation on two roaring bitmaps.

rb_xor_cardinality_agg(rb_build(array[1,2,3]))

Examples

The following examples describe how to use roaring bitmap functions:

  1. Create an external table that supports roaring bitmaps.

    CREATE TABLE `test_rb` (
      `id` int,
      `rb` roaringbitmap
      ) engine = 'oss'
    TABLE_PROPERTIES = '{
    "endpoint":"oss-cn-zhangjiakou.aliyuncs.com",
    "accessid":"LTAIF****5FsE"",
    "accesskey":"Ccw****iWjv",
    "url":"oss://testBucketName/roaringbitmap/test_for_user/",
    "format":"parquet"
    }';
    Note

    For information about the parameters of an external table, see the "Create an OSS external table for a non-partitioned object" section of the Use external tables to import data to Data Warehouse Edition topic.

  2. Write data to the table.

    Important

    Writing data by using INSERT INTO statements is not a performant method. If you want to write a large amount of data, we recommend that you use an extract-transform-load (ETL) tool to generate data files in the parquet format and upload them to an OSS directory.

    INSERT INTO test_rb SELECT 1, rb_build(array[1,2,3]);
    INSERT INTO test_rb SELECT 2, rb_build(array[2,3,4,5]);
  3. Obtain the cardinality of the roaring bitmaps.

    SELECT id, rb_cardinality(rb) FROM test_rb;

    Sample result:

    +------+--------------------+
    | id   | rb_cardinality(rb) |
    +------+--------------------+
    |    2 |                  4 |
    |    1 |                  3 |
    +------+--------------------+
  4. Perform aggregation on the roaring bitmaps.

    SELECT rb_or_cardinality_agg(rb) FROM test_rb;

    Sample result:

    +---------------------------+
    | rb_or_cardinality_agg(rb) |
    +---------------------------+
    |                         5 |
    +---------------------------+

Example: Identification of target customers

In this example, a tag table is converted into a roaring bitmap table, and analysis is performed based on the roaring bitmaps, as shown in the following figure. 1

Step 1: Prepare a tag table

  1. Create a tag table named users_base for testing.

    CREATE TABLE users_base(
       uid INT,
       tag1 STRING, // Valid values of tag1: x, y, and z 
       tag2 STRING, // Valid values of tag2: a and b 
       tag3 INT // Valid values of tag3: 1 to 10 
    );
  2. Insert data to generate 100 million rows of tag data.

    SUBMIT JOB
    INSERT OVERWRITE users_base
    SELECT CAST(ROW_NUMBER() OVER (ORDER BY c1) AS INT) AS uid, SUBSTRING('xyz', FLOOR(RAND() * 3) + 1, 1) AS tag1, SUBSTRING('ab', FLOOR(RAND() * 2) + 1, 1) AS tag2, CAST(FLOOR(RAND() * 10) + 1 AS INT) as tag3 FROM  
    (
    SELECT A.c1 FROM
    UNNEST(RB_BUILD_RANGE(0, 10000)) AS A(c1)
      JOIN
      (SELECT c1 FROM
    UNNEST(RB_BUILD_RANGE(0, 10000)) AS B(c1)
    ));
  3. Query 10 rows of data from the users_base table.

    SELECT * FROM users_base LIMIT 10;

    Sample result:

    +--------+------+------+------+
    | uid    | tag1 | tag2 | tag3 |
    +--------+------+------+------+
    |  74526 | y    | b    |    3 |
    |  75611 | z    | b    |   10 |
    |  80850 | x    | b    |    5 |
    |  81656 | z    | b    |    7 |
    | 163845 | x    | b    |    2 |
    | 167007 | y    | b    |    4 |
    | 170541 | y    | b    |    9 |
    | 213108 | x    | a    |   10 |
    |  66056 | y    | b    |    4 |
    |  67761 | z    | a    |    2 |
    +--------+------+------+------+

Step 2: Group the tag table

To utilize the concurrency capability of the distributed computing engines of AnalyticDB for MySQL, we recommend that you add columns that can group the data by uid. In this example, a column named user-group is created for grouping. The group sizes vary based on the total number of computing and storage AnalyticDB compute units (ACUs) in your cluster.

  • A large number of groups bring high computing efficiency. However, if each roaring bitmap contains only a small number of elements, the system cannot take full advantage of roaring bitmaps.

  • In actual scenarios, we recommend that you limit the number of records within each group to less than 100 million. For example, if the total number of uid values is 10 billion, you can create 100 groups that each have 100 million rows of data.

In this example, each group has 16 uid values. In the user_group column, the values are the uid values divided by 16. In the offset column, the values are the remainders of the division. The offset column is used for the computation of roaring bitmaps. A uid can be calculated based on the following formula: uid = 16 × offset + user_group.

This grouping method is for reference only. You need to design your grouping based on your business requirements.

  1. Create a tag table named users that contains the column for grouping.

    CREATE TABLE users(
       uid INT,
       tag1 STRING,
       tag2 STRING,
       tag3 INT,
       user_group INT, // The grouping field.
       offset INT // The offset field.
    );
  2. Write the data of the users_base table to the users table.

    SUBMIT JOB INSERT OVERWRITE users SELECT uid, tag1, tag2, tag3, CAST(uid%16 AS INT), CAST(FLOOR(uid/16) AS INT) FROM users_base;
  3. Query 10 rows of data from the users table.

    SELECT * FROM users LIMIT 10;

    Sample result:

    +---------+------+------+------+------------+--------+
    | uid     | tag1 | tag2 | tag3 | user_group | offset |
    +---------+------+------+------+------------+--------+
    |  377194 | z    | b    |   10 |         10 |  23574 |
    |  309440 | x    | a    |    1 |          0 |  19340 |
    |  601745 | z    | a    |    7 |          1 |  37609 |
    |  753751 | z    | b    |    3 |          7 |  47109 |
    |  988186 | y    | a    |   10 |         10 |  61761 |
    |  883822 | x    | a    |    9 |         14 |  55238 |
    |  325065 | x    | b    |    6 |          9 |  20316 |
    | 1042875 | z    | a    |   10 |         11 |  65179 |
    |  928606 | y    | b    |    5 |         14 |  58037 |
    |  990858 | z    | a    |    8 |         10 |  61928 |
    +---------+------+------+------+------------+--------+

Step 3: Build roaring bitmaps

  1. Create a roaring bitmap table named tag_tbl_1 for tag 1.

    CREATE TABLE `tag_tbl_1` (
      `tag1` STRING,
      `rb` ROARINGBITMAP,
      `user_group` INT
     ) engine = 'oss'
    TABLE_PROPERTIES = '{
    "endpoint":"oss-cn-zhangjiakou.aliyuncs.com",
    "accessid":"LTAIF****5FsE"",
    "accesskey":"Ccw****iWjv",
    "url":"oss://testBucketName/roaringbitmap/tag_tbl_1/",
    "format":"parquet"
    }';
  2. Write the data of the users table to the tag_tbl_1 table.

    INSERT OVERWRITE tag_tbl_1 SELECT tag1, RB_BUILD_AGG(offset), user_group FROM users GROUP BY tag1, user_group;
  3. View the data of tag_tbl_1.

    SELECT tag1, user_group, RB_CARDINALITY(rb) FROM tag_tbl_1;

    Sample result:

    +------+------------+--------------------+
    | tag1 | user_group | rb_cardinality(rb) |
    +------+------------+--------------------+
    | z    |          7 |            2082608 |
    | x    |         10 |            2082953 |
    | y    |          7 |            2084730 |
    | x    |         14 |            2084856 |
    .....
    | z    |         15 |            2084535 |
    | z    |          5 |            2083204 |
    | x    |         11 |            2085239 |
    | z    |          1 |            2084879 |
    +------+------------+--------------------+
  4. Create a roaring bitmap table named tag_tbl_2 for tag 2.

    CREATE TABLE `tag_tbl_2` (
      `tag2` STRING,
      `rb` ROARINGBITMAP,
      `user_group` INT
     ) engine = 'oss'
    TABLE_PROPERTIES = '{
    "endpoint":"oss-cn-zhangjiakou.aliyuncs.com",
    "accessid":"LTAIF****5FsE"",
    "accesskey":"Ccw****iWjv",
    "url":"oss://testBucketName/roaringbitmap/tag_tbl_2/",
    "format":"parquet"
    }';
  5. Write the data of the users table to the tag_tbl_2 table.

    INSERT OVERWRITE tag_tbl_2 SELECT tag2, RB_BUILD_AGG(offset), user_group FROM users GROUP BY tag2, user_group;
  6. View the data of tag_tbl_2.

    SELECT tag2, user_group, RB_CARDINALITY(rb) FROM tag_tbl_2;

    Sample result:

    +------+------------+--------------------+
    | tag2 | user_group | rb_cardinality(rb) |
    +------+------------+--------------------+
    | b    |         11 |            3121361 |
    | a    |          6 |            3124750 |
    | a    |          1 |            3125433 |
    ....
    | b    |          2 |            3126523 |
    | b    |         12 |            3123452 |
    | a    |          4 |            3126111 |
    | a    |         13 |            3123316 |
    | a    |          2 |            3123477 |
    +------+------------+--------------------+

Step 4: Perform analysis based on the roaring bitmaps

Scenario 1: Filtering analysis

In this example, the number of users for each value of tag2 that meet tag1 in (x, y) is obtained.

  1. Query the users that meet tag1 in (x, y).

    SELECT tag2, t1.user_group AS user_group, RB_CARDINALITY(RB_AND(t2.rb, rb1)) AS rb FROM tag_tbl_2 AS t2
    JOIN (
    SELECT user_group, rb AS rb1 FROM tag_tbl_1 WHERE tag1 IN ('x', 'y'))
    AS t1
    ON t1.user_group = t2.user_group;

    Sample result:

    +------+------------+---------+
    | tag2 | user_group | rb      |
    +------+------------+---------+
    | b    |          3 | 1041828 |
    | a    |          15| 1039859 |
    | a    |          9 | 1039140 |
    | b    |          1 | 1041524 |
    | a    |          4 | 1041599 |
    | b    |          1 | 1041381 |
    | b    |          10| 1041026 |
    | b    |          6 | 1042289 |
    +------+------------+---------+
  2. Obtain the number of users for each value of tag2 that meet tag1 in (x, y).

    SELECT tag2, SUM(cnt) FROM ( 
    SELECT tag2, t1.user_group AS user_group, RB_CARDINALITY(RB_AND(t2.rb, rb1)) AS cnt FROM tag_tbl_2 AS t2
    JOIN (
    SELECT user_group, rb AS rb1 FROM tag_tbl_1 WHERE tag1 IN ('x', 'y'))
    AS t1
    ON t1.user_group = t2.user_group
    ) GROUP BY tag2;

    Sample result:

    +------+----------+
    | tag2 | sum(cnt) |
    +------+----------+
    | a    | 33327868 |
    | b    | 33335220 |
    +------+----------+

Scenario 2: Calculate the intersection, union, and XOR of roaring bitmaps

Retrieve data from the tag_tbl_1 table that meets tag1 = 'x' or tag1 = 'y' and data from the tag_tbl_2 table that meets tag2 = 'b', and obtain the data that meets both conditions.

SELECT user_group, RB_CARDINALITY(rb) FROM (
    SELECT
      t1.user_group AS user_group,
      RB_AND(rb1, rb2) AS rb
    FROM
      (
        SELECT
          user_group,
          RB_OR_AGG(rb) AS rb1
        FROM
          tag_tbl_1
        WHERE
          tag1 = 'x'
          OR tag1 = 'y'
        GROUP BY
          user_group
      ) AS t1
      JOIN (
        SELECT
          user_group,
          RB_OR_AGG(rb) AS rb2
        FROM
          tag_tbl_2
        WHERE
          tag2 = 'b'
        GROUP BY
          user_group
      ) AS t2 ON t1.user_group = t2.user_group
  GROUP BY user_group);

Sample result:

+------------+--------------------+
| user_group | rb_cardinality(rb) |
+------------+--------------------+
|         10 |            2083679 |
|          3 |            2082370 |
|          9 |            2082847 |
|          2 |            2086511 |
...
|          1 |            2082291 |
|          4 |            2083290 |
|         14 |            2083581 |
|         15 |            2084110 |
+------------+--------------------+

Scenario 3: Calculate the intersection, union, and XOR of roaring bitmaps and original tag tables

Retrieve data from the tag_tbl_1 table that meets tag1 = 'x' or tag1 = 'y' and data from the users table that meets tag2 = 'b', and obtain the data that meets both conditions.

SELECT user_group, RB_CARDINALITY(rb) FROM (
    SELECT
      t1.user_group AS user_group,
      RB_AND(rb1, rb2) AS rb
    FROM
      (
        SELECT
          user_group,
          RB_OR_AGG(rb) AS rb1
        FROM
          tag_tbl_1
        WHERE
          tag1 = 'x'
          OR tag1 = 'y'
        GROUP BY
          user_group
      ) AS t1
      JOIN (
        SELECT
          user_group,
          RB_BUILD_AGG(offset) AS rb2
        FROM
          users
        WHERE
          tag2 = 'b'
        GROUP BY
          user_group
      ) AS t2 ON t1.user_group = t2.user_group
  GROUP BY user_group);

Sample result:

+------------+--------------------+
| user_group | rb_cardinality(rb) |
+------------+--------------------+
|          3 |            2082370 |
|          1 |            2082291 |
|          0 |            2082383 |
|          4 |            2083290 |
|         11 |            2081662 |
|         13 |            2085280 |
...
|         14 |            2083581 |
|         15 |            2084110 |
|          9 |            2082847 |
|          8 |            2084860 |
|          5 |            2083056 |
|          7 |            2083275 |
+------------+--------------------+

Scenario 4: Export results to OSS

  1. Create a tag table named tag_tbl_3 to which the calculation results are exported.

    CREATE TABLE `tag_tbl_3` (
      `user_group` INT,
      `rb` ROARINGBITMAP
      )engine = 'oss'
    TABLE_PROPERTIES = '{
    "endpoint":"oss-cn-zhangjiakou.aliyuncs.com",
    "accessid":"LTAIF****5FsE"",
    "accesskey":"Ccw****iWjv",
    "url":"oss://testBucketName/roaringbitmap/tag_tbl_3/",
    "format":"parquet"
    }';
  2. Export the results in Scenario 2 to tag_tbl_3.

    INSERT OVERWRITE tag_tbl_3
        SELECT
          t1.user_group AS user_group,
          RB_AND(rb1, rb2) AS rb
        FROM
          (
            SELECT
              user_group,
              RB_OR_AGG(rb) AS rb1
            FROM
              tag_tbl_1
            WHERE
              tag1 = 'x'
              OR tag1 = 'y'
            GROUP BY
              user_group
          ) AS t1
          JOIN (
            SELECT
              user_group,
              RB_OR_AGG(rb) AS rb2
            FROM
              tag_tbl_2
            WHERE
              tag2 = 'b'
            GROUP BY
              user_group
          ) AS t2 ON t1.user_group = t2.user_group;
    Note

    After the statement is executed, the file is stored as a parquet file in the oss://testBucketName/roaringbitmap/tag_tbl_3/ directory.

Scenario 5: Import data from the tag_tbl_1 table to an internal table for better query performance

  1. Create an internal table named tag_tbl_1_cstore and set the data type of the rb field to VARBINARY.

    CREATE TABLE `tag_tbl_1_cstore` (
     `tag1` VARCHAR,
     `rb` VARBINARY,
     `user_group` INT
    );
  2. Import data of the tag_tbl_1 table from OSS to tag_tbl_1_cstore.

    INSERT INTO tag_tbl_1_cstore SELECT tag1, RB_TO_VARBINARY(rb), user_group FROM tag_tbl_1;
  3. Query the data of the tag_tbl_1_cstore table.

    SELECT tag1, user_group, RB_CARDINALITY(RB_OR_AGG(RB_BUILD_VARBINARY(rb))) FROM tag_tbl_1_cstore GROUP BY tag1, user_group;

    Sample result:

    +------+------------+---------------------------------------------------+
    | tag1 | user_group | rb_cardinality(rb_or_agg(rb_build_varbinary(rb))) |
    +------+------------+---------------------------------------------------+
    | y    |          3 |                                           2082919 |
    | x    |          9 |                                           2083085 |
    | x    |          3 |                                           2082140 |
    | y    |         11 |                                           2082268 |
    | z    |          4 |                                           2082451 |
    ...
    | z    |          2 |                                           2081560 |
    | y    |          6 |                                           2082194 |
    | z    |          7 |                                           2082608 |
    +------+------------+---------------------------------------------------+