All Products
Search
Document Center

MaxCompute:Bitmap index (Beta)

Last Updated:Dec 27, 2024

Bitmap indexes are a new type of index. You can create bitmap indexes on columns with high frequency of duplicates. Bitmap indexes can improve the efficiency of point queries and range queries. This topic describes how to use bitmap indexes in MaxCompute.

Background information

MaxCompute provides four types of indexes: hash-clustered index, range-clustered index, Bloom filter index, and bitmap index. All types of indexes provide data filtering for point queries. Range-clustered indexes are suitable for range filtering in scenarios where the leftmost matching principle is applied. Bloom filter indexes are suitable for point queries. Bitmap indexes are more suitable for range queries and can filter out more than 60% of data in range filtering scenarios to accelerate queries.

Bitmap indexes provide the following benefits:

  • Each bitmap index-based query is independent and is not subject to the leftmost matching principle.

  • Multiple columns are supported. You can create a bitmap index for one or more columns in a table. You can also use bitmap indexes with other indexes to accelerate queries.

  • High filtering performance is delivered. Bitmap indexes allow you to create a bitmap for each value. This way, the filter results can be accurate to rows. Bitmap indexes provide high filtering performance in high frequency of duplicates and data-intensive scenarios.

  • Bitmap indexes are suitable for the intersection and union operations (OR and AND operations) and can optimize multidimensional queries.

Scenarios

  • Bitmap indexes are suitable for low-cardinality columns that have a limited number of distinct values, such as the gender and city columns.

  • Bitmap indexes are suitable for performing logical operations, such as AND and OR. Bitmap indexes can efficiently perform bitwise operations on bitmaps.

Note

If columns have an excessively large number of duplicate values or the query filtering effect is poor, a large number of storage resources may be consumed, and the read performance may deteriorate. Therefore, we recommend that you do not create bitmap indexes for the following columns:

  • The columns that have a large number of distinct values, such as the columns that list ID card numbers or mobile numbers.

  • The columns that need to be frequently updated.

Billing

  • Storage: Additional storage space is occupied after bitmap indexes are created. The occupied storage space is calculated based on the actual storage size of the indexes. The occupied storage space and storage size of the project data are consolidated for billing. The pricing is the same as that for standard storage.

  • Computing: The creation of bitmap indexes triggers additional computing workloads. This increases the consumption of computing resources. If the subscription billing method is used, the subscription resources of the projects are directly used. If the pay-as-you-go billing method is used, the cost of index-related computing jobs is calculated based on the following formula: Unit price of pay-as-you-go SQL jobs × Complexity 1 × Input data amount of index-related jobs.

Limits

  • Bitmap indexes can be created only for non-nested columns of the STRING, CHAR, VARCHAR, TINYINT, SMALLINT, INT, BIGINT and DOUBLE type.

  • After bitmap indexes are created, the queries that involve the following logical operators can be accelerated: <=, <,=, >, >=, IN, BETWEEN, and IS NULL.

Usage notes

  • Before you create a bitmap index, you must run the setproject odps.schema.evolution.enable=true; command to enable schema evolution.

  • You can create a bitmap index for only one column at a time. If you need to create a bitmap index for multiple columns of a table, you need to execute the bitmap index creation statement multiple times.

  • If you insert data into a table after you create a bitmap index for the table, the system automatically applies the created bitmap index to the inserted data.

  • To apply a bitmap index to existing data, you must rebuild the bitmap index.

Sample data

Note

The examples of using the syntaxes related to bitmap indexes are based on the data in the emp and sale_detail tables.

Non-partitioned table

  1. Create a table named emp.

    CREATE TABLE IF NOT EXISTS emp( 
      empno BIGINT, 
      ename STRING, 
      job STRING, 
      mgr BIGINT, 
      sex STRING 
    );
  2. Insert data into the table.

    INSERT INTO emp(empno,ename,job,mgr,sex) VALUES  
    (7369,'smith','clerk',7902,'Male'),
    (7499,'allen','salesman',7698,'Female'),
    (7521,'ward','salesman',7698,'Male'),
    (7654,'martin','salesman',7698,'Male'),
    (7698,'blake','manager',7839,'Male'),
    (7782,'clark','manager',7839,'Male'),
    (7788,'scott','analyst',7566,'Male'),
    (7839,'king','president',NULL,'Male'),
    (7844,'turner','salesman',7698,'Female'),
    (7876,'adams','clerk',7788,'Female'),
    (7900,'james','clerk',7698,'Male'),
    (7902,'ford','analyst',7566,'Male'),
    (7934,'miller','clerk',7782,'Female');

Partitioned table

  1. Create a partitioned table named sale_detail.

    CREATE TABLE IF NOT EXISTS sale_detail(
     shop_name     STRING,
     customer_id   STRING,
     total_price   DOUBLE)
    PARTITIONED BY (sale_date STRING, region STRING);
  2. Create partitions.

    ALTER TABLE sale_detail ADD PARTITION (sale_date='2023', region='china') PARTITION (sale_date='2024', region='shanghai');

Create a bitmap index

Syntax

CREATE BITMAP INDEX <index_name> ON TABLE <table_name> FOR COLUMNS(<col_name>) [COMMENT 'indexcomment'];

Parameters

  • index_name: required. The name of the index.

  • table_name: required. The name of the table.

  • col_name: required. The name of the column on which you want to create a bitmap index.

Examples

  • Create the empno_index bitmap index for the empno column of the emp table.

    CREATE BITMAP INDEX empno_index 
    ON TABLE emp 
    FOR COLUMNS(empno) 
    COMMENT 'idxcomment';
  • Create the job_index bitmap index for the job column of the emp table.

    CREATE BITMAP INDEX job_index 
    ON TABLE emp 
    FOR COLUMNS(job) 
    COMMENT 'idxcomment';
  • Create the shop_name_index bitmap index for the shop_name column of the sale_detail table.

    CREATE BITMAP INDEX shop_name_index  
    ON TABLE sale_detail 
    FOR COLUMNS(shop_name) 
    COMMENT 'indexcomment';

Rebuild a bitmap index

Syntax

Execute one of the following statements to rebuild a bitmap index for existing data:

  • Rebuild a bitmap index for a non-partitioned table.

    ALTER TABLE <table_name> REBUILD BITMAP INDEX;
  • Rebuild a bitmap index for a partitioned table.

    ALTER TABLE <table_name> PARTITION 
    (<partition_name1=value1>[, partition_name2=value2, ...]) REBUILD BITMAP INDEX;
    Note

    You can rebuild a bitmap index for only one partition in a partitioned table at a time.

Parameters

  • Non-partitioned table

    table_name: required. The name of the table.

  • Partitioned table

    • table_name: required. The name of the table.

    • partition_name: required. The name of the partition.

    • value: required. The partition value.

Examples

  • Example 1: Rebuild a bitmap index for a non-partitioned table.

    ALTER TABLE emp REBUILD BITMAP INDEX;
  • Example 2: Rebuild a bitmap index for a partitioned table.

    ALTER TABLE sale_detail PARTITION (sale_date='2023', region='china') REBUILD BITMAP INDEX;

Query bitmap indexes

Syntax

SHOW INDEXES ON <table_name>;

Parameters

table_name: required. The name of the table.

Examples

-- Query indexes of the emp table
SHOW INDEXES ON emp;

The following result is returned:

{"Indexes": [{
            "id": "00c84b0e9e6e4097bdfe3a01b91848ac",
            "indexColumns": [{"name": "job"}],
            "name": "job_index",
            "properties": {"comment": "jobidx"},
            "type": "BITMAP"},
           {
            "id": "18a9755c7a8a4182a6b51165e786aa62",
            "indexColumns": [{"name": "empno"}],
            "name": "empno_index",
            "properties": {"comment": "idxcomment"},
            "type": "BITMAP"}]}

Query data and view the filtering effect of bitmap indexes

  1. Examples

    • Example 1: Perform point queries on the job column.

      SELECT * FROM emp WHERE job = 'clerk';

      The following result is returned:

      +------------+-------+-----+------------+------------+-----+
      | empno      | ename | job | mgr        | hiredate   | sex |
      +------------+-------+-----+------------+------------+-----+
      | 7369       | smith | clerk | 7902       | NULL       | Male |
      | 7876       | adams | clerk | 7788       | NULL       | Female |
      | 7900       | james | clerk | 7698       | NULL       | Male |
      | 7934       | miller | clerk | 7782       | NULL       | Female |
      +------------+-------+-----+------------+------------+-----+
    • Example 2: Perform range queries on the empno column.

      SELECT * FROM emp WHERE empno BETWEEN 7300 AND 7800;

      The following result is returned:

      +------------+------------+------------+------------+------------+
      | empno      | ename      | job        | mgr        | sex        | 
      +------------+------------+------------+------------+------------+
      | 7369       | smith      | clerk      | 7902       | Male       | 
      | 7499       | allen      | salesman   | 7698       | Female     | 
      | 7521       | ward       | salesman   | 7698       | Male       | 
      | 7654       | martin     | salesman   | 7698       | Male       | 
      | 7698       | blake      | manager    | 7839       | Male       | 
      | 7782       | clark      | manager    | 7839       | Male       | 
      | 7788       | scott      | analyst    | 7566       | Male       | 
      +------------+------------+------------+------------+------------+
  2. Query the filtering effect of bitmap indexes.

    1. In the operational logs that are displayed in the console, click the LogView URL to go to the LogView page.

      image

    2. On the Json Summary tab of the LogView page, search for the bitmap keyword and view the filtering result based on bitmap indexes and the additional time consumed for creating bitmap indexes. Note that the time consumed by I/O operations is not included.

      image

Drop a bitmap index

Syntax

DROP INDEX [IF EXISTS] <index_name> ON TABLE <table_name>;

Parameters

  • index_name: required. The name of the index.

  • table_name: required. The name of the table.

Examples

DROP INDEX IF EXISTS job_index ON TABLE emp;