AnalyticDB for PostgreSQL supports the column store model to provide a higher data compression ratio and query performance. However, when AnalyticDB for PostgreSQL processes a query that may return just a small amount of data, it must still read the data of an entire column or create B-tree indexes. B-tree indexes also have potential issues. For example, severe data bloat may occur because indexes are not compressed. If large amounts of query results are returned, indexes may cause higher costs than sequential scans or even fail. To resolve these issues, AnalyticDB for PostgreSQL provides the MetaScan feature that offers excellent filter performance and occupies minimal storage space.

Notice
  • The MetaScan feature is supported only for AnalyticDB for PostgreSQL instances in reserved storage mode that are created with the minor version of 20200826 or later.
  • This feature is not supported for AnalyticDB for PostgreSQL instances in elastic storage mode.

Enable MetaScan

The MetaScan feature is controlled by the following parameters:

  • RDS_ENABLE_CS_ENHANCEMENT

    Specifies whether to collect metadata. If it is set to ON, metadata collection is enabled. If it is set to OFF, metadata collection is disabled. By default, this parameter is set to ON. When the value is ON, if the data in column-oriented tables changes, the system automatically collects the metadata of the tables. This parameter is an instance-level parameter. If you want to change its value, submit a ticket.

  • RDS_ENABLE_COLUMN_META_SCAN
    Specifies whether to enable the MetaScan feature for a query. If it is set to ON, the feature is enabled for a query. If it is set to OFF, the feature is disabled for a query. By default, this parameter is set to OFF. This parameter is a session-level parameter. You can view and change its value by executing the following SQL statements:
    • Check whether MetaScan is enabled.
      SHOW RDS_ENABLE_COLUMN_META_SCAN;
    • Disable MetaScan.
      SET RDS_ENABLE_COLUMN_META_SCAN = OFF;
    • Enable MetaScan.
      SET RDS_ENABLE_COLUMN_META_SCAN = ON;
Notice When RDS_ENABLE_CS_ENHANCEMENT is set to OFF, metadata collection is disabled for all column-oriented tables. If you want to use the MetaScan feature when RDS_ENABLE_CS_ENHANCEMENT is set to ON, execute the following statement to re-collect the metadata of tables:
ALTER TABLE table_name SET WITH (REORGANIZE=TRUE);

Check whether MetaScan is used for a query

You can use the EXPLAIN statement to check whether a SELECT statement uses MetaScan, as shown in the following figure.The "Append-only Columnar Meta Scan" node displayed in the result is the MetaScan node. This indicates that the MetaScan feature is used for the query.

Data types and operators supported by MetaScan

MetaScan supports the following data types:
  • INT2, INT4, and INT8
  • FLOAT4 and FLOAT8
  • TIME, TIMETZ, TIMESTAMP, and TIMESTAMPTZ
  • VARCHAR, TEXT, and BPCHAR
  • CASH
MetaScan supports the following operators:
  • =, <, <=, >, and >=
  • The AND logical operator

Update existing instances to use MetaScan

If you want to use MetaScan for existing instances, perform the following operations:

  1. Update the minor version of your AnalyticDB for PostgreSQL instance

    In the AnalyticDB for PostgreSQL console, find the instance whose minor version you want to update and click its ID. In the upper-right corner of the page that appears, click Upgrade Minor Version.

  2. Update the metadata of tables

    Update the metadata of tables to its latest version after you update the minor version. If RDS_ENABLE_CS_ENHANCEMENT is set to OFF, submit a ticket. If RDS_ENABLE_CS_ENHANCEMENT is set to ON, perform the following operations:

    1. Create an update function as an administrator.
      CREATE OR REPLACE FUNCTION UPGRADE_AOCS_TABLE_META(tname TEXT) RETURNS BOOL AS $$
      DECLARE
          tcount INT := 0;
      BEGIN
          -- CHECK TABLE NAME
          EXECUTE 'SELECT COUNT(1) FROM PG_AOCSMETA WHERE RELID = ''' || tname || '''::REGCLASS' INTO tcount;
          IF tcount IS NOT NULL THEN
              IF tcount > 1 THEN
                  RAISE EXCEPTION 'found more than one table of name %', tname;
              ELSEIF tcount = 0 THEN
                  RAISE EXCEPTION 'not found target table in pg_aocsmeta, table name:%', tname;
              END IF;
          END IF;
      
          EXECUTE 'ALTER TABLE ' || tname || ' SET WITH(REORGANIZE=TRUE)';
          RETURN TRUE;
      END;
      $$  LANGUAGE PLPGSQL;
    2. Execute the following SQL statement to update the metadata of tables as an administrator or a table owner:
      SELECT UPGRADE_AOCS_TABLE_META(table_name);
    3. Execute the following SQL statement to check the metadata version:
      SELECT version = 4 AS is_latest_version  FROM pg_appendonly WHERE relid = 'test'::REGCLASS

Improve MetaScan performance by using sort keys

The sort key feature of AnalyticDB for PostgreSQL sorts data in tables by a specific column. A combination of MetaScan with the sort key feature improves the performance of MetaScan. Column-oriented tables store data in blocks. MetaScan uses metadata to check whether blocks meet query conditions and skips blocks that do not meet query conditions. This reduces I/O operations and improves scanning performance. If data in filtered columns is distributed across all blocks, all blocks must be scanned even though a query may return a small amount of data. If you create a sort key for each filtered column, the same data in a column is combined into several consecutive blocks. This way, MetaScan can improve scanning performance by filtering out blocks that do not meet the query conditions.

For more information about how to create a sort key, see Use sort keys and rough set indexes to accelerate queries in column-oriented tables.

Limits

MetaScan is unavailable when the ORCA optimizer is used. You can execute the following statement to check whether the ORCA optimizer is used:
SHOW OPTIMIZER;
If on is returned, the ORCA optimizer is used. For more information about the ORCA optimizer, see Choose a query optimizer.