Note Currently, this feature is only available for AnalyticDB for PostgreSQL V4.3, but not V6.0 nor others.

AnalyticDB for PostgreSQL supports column-oriented storage and features a high data compression ratio and improved query performance. However, it must read data from the entire column or create a B-tree index when processing query criteria with a high filter rate. There are two potential issues with indexing. One is that indexes are not compressed, which results in serious data bloat. The other is that large result sets cause indexes to fail and the cost to be higher than that of a table scan. To address the issues, AnalyticDB for PostgreSQL provides the MetaScan feature, which offers excellent filter performance and occupies minimal storage space.

Enable MetaScan

The MetaScan feature is controlled by the following parameters:

  • RDS_ENABLE_CS_ENHANCEMENT

    Specifies whether to collect metadata. If it is set to ON, metadata collection is enabled. If it is set to OFF, metadata collection is disabled. By default, this parameter is set to ON. In this situation, if the data in column-oriented tables changes, the system automatically collects the metadata of the tables. This parameter is an instance-level parameter. If you want to change its setting, contact technical support by submitting a ticket.

  • RDS_ENABLE_COLUMN_META_SCAN
    Specifies whether to enable the MetaScan feature for a query. If it is set to ON, the feature is enabled for a query. If it is set to OFF, the feature is disabled for a query. By default, this parameter is set to OFF. This parameter is a session-level parameter. You can view or change its setting by executing the following SQL statements:
    --- Check whether MetaScan is enabled.
    Show RDS_ENABLE_COLUMN_META_SCAN;
    --- Disable MetaScan.
    Set RDS_ENABLE_COLUMN_META_SCAN = OFF;
    --- Enable MetaScan.
    Set RDS_ENABLE_COLUMN_META_SCAN = ON;
Notice After you set RDS_ENABLE_CS_ENHANCEMENT to OFF, metadata collection is stopped for all column-oriented tables. If you want to use the MetaScan feature after setting RDS_ENABLE_CS_ENHANCEMENT to ON, execute the ALTER TABLE table_name SET WITH (REORGANIZE=TRUE) statement to collect the metadata of the tables again.

Check whether MetaScan is used for a query

You can use the EXPLAIN statement to check whether a SELECT statement uses MetaScan.The "Append-only Columnar Meta Scan" node displayed is the MetaScan node. This indicates that the MetaScan feature is used in the query.

Data types and operators supported by MetaScan

MetaScan supports the following data types:
  • INT2/INT4/INT8
  • FLOAT4/FLOAT8
  • TIME/TIMETZ/TIMESTAMP/TIMESTAMPTZ
  • VARCHAR/TEXT/BPCHAR
  • CASH
MetaScan supports the following operators:
  • Equal to (=), less than (<), less than or equal to (≤), greater than (>), and greater than or equal to (≥)
  • Logical AND operator

Upgrade existing instances to use MetaScan

If you want to use MetaScan for existing instances, perform the following operations:

  • Update the kernel of your AnlayticDB for PostgreSQL instance

    In the AnalyticDB for PostgreSQL console, find the instance whose kernel you want to update and click its ID. In the upper-right corner of the page that appears, click Upgrade Minor Version.

  • Update the metadata of tables

    Update the metadata of tables to its latest version after you update the kernel. If RDS_ENABLE_CS_ENHANCEMENT is set to OFF, contact technical support by submitting a ticket to update the metadata. Otherwise, follow these steps:

    1. Use the administrator account to create an update function.
      CREATE OR REPLACE FUNCTION UPGRADE_AOCS_TABLE_META(tname TEXT) RETURNS BOOL AS $$
      DECLARE
          tcount INT := 0;
      BEGIN
          -- CHECK TABLE NAME
          EXECUTE 'SELECT COUNT(1) FROM PG_AOCSMETA WHERE RELID = ''' || tname || '''::REGCLASS' INTO tcount;
          IF tcount IS NOT NULL THEN
              IF tcount > 1 THEN
                  RAISE EXCEPTION 'found more than one table of name %', tname;
              ELSEIF tcount = 0 THEN
                  RAISE EXCEPTION 'not found target table in pg_aocsmeta, table name:%', tname;
              END IF;
          END IF;
      
          EXECUTE 'ALTER TABLE ' || tname || ' SET WITH(REORGANIZE=TRUE)';
          RETURN TRUE;
      END;
      $$  LANGUAGE PLPGSQL;
    2. Execute the following SQL statement to update the target table as the administrator or table owner:
      SELECT UPGRADE_AOCS_TABLE_META(table_name);
    3. Execute the following SQL statement to check the metadata version:
      SELECT version = 4 AS is_latest_version  FROM pg_appendonly WHERE relid = 'test'::REGCLASS

Enhance MetaScan performance by using SortKey

SortKey is a feature of AnlayticDB for PostgreSQL. It sorts data in tables by a specific column. Using MetaScan together with SortKey improves the performance of MetaScan. Column-oriented tables store data to blocks. MetaScan uses metadata to check whether blocks meet conditions. It skips blocks that do not meet conditions. This reduces I/O and improves scanning performance. If data in filtered columns is distributed in different blocks, all blocks need to be scanned even though most required data is filtered. If you create a SortKey for each filtered column, the same data in a column is combined into several consecutive blocks. This way, MetaScan can filter out blocks that do not meet conditions to improve scanning performance.

For more information about how to create a SortKey, see Use SortKey.

Limits

MetaScan is incompatible with the ORCA optimizer and is unavailable if the ORCA optimizer is used. You can execute the following statement to check whether the ORCA optimizer is used:
SHOW OPTIMIZER;
If ON is returned, the ORCA optimizer is used.