AnalyticDB for PostgreSQL supports the column store model to provide a higher data compression ratio and query performance. However, when AnalyticDB for PostgreSQL processes a query that may return just a small amount of data, it must still read the data of an entire column or create B-tree indexes. B-tree indexes also have potential issues. For example, severe data bloat may occur because indexes are not compressed. If large amounts of query results are returned, indexes may cause higher costs than sequential scans or even fail. To resolve these issues, AnalyticDB for PostgreSQL provides the MetaScan feature that offers excellent filter performance and occupies minimal storage space.
- The MetaScan feature is supported only for AnalyticDB for PostgreSQL instances in reserved storage mode that are created with the minor version of 20200826 or later.
- This feature is not supported for AnalyticDB for PostgreSQL instances in elastic storage mode.
Enable MetaScan
The MetaScan feature is controlled by the following parameters:
- RDS_ENABLE_CS_ENHANCEMENT
Specifies whether to collect metadata. If it is set to ON, metadata collection is enabled. If it is set to OFF, metadata collection is disabled. By default, this parameter is set to ON. When the value is ON, if the data in column-oriented tables changes, the system automatically collects the metadata of the tables. This parameter is an instance-level parameter. If you want to change its value, submit a ticket.
- RDS_ENABLE_COLUMN_META_SCAN
Specifies whether to enable the MetaScan feature for a query. If it is set to ON, the feature is enabled for a query. If it is set to OFF, the feature is disabled for a query. By default, this parameter is set to OFF. This parameter is a session-level parameter. You can view and change its value by executing the following SQL statements:
- Check whether MetaScan is enabled.
SHOW RDS_ENABLE_COLUMN_META_SCAN;
- Disable MetaScan.
SET RDS_ENABLE_COLUMN_META_SCAN = OFF;
- Enable MetaScan.
SET RDS_ENABLE_COLUMN_META_SCAN = ON;
- Check whether MetaScan is enabled.
ALTER TABLE table_name SET WITH (REORGANIZE=TRUE);
Check whether MetaScan is used for a query
Data types and operators supported by MetaScan
- INT2, INT4, and INT8
- FLOAT4 and FLOAT8
- TIME, TIMETZ, TIMESTAMP, and TIMESTAMPTZ
- VARCHAR, TEXT, and BPCHAR
- CASH
- =, <, <=, >, and >=
- The AND logical operator
Update existing instances to use MetaScan
If you want to use MetaScan for existing instances, perform the following operations:
- Update the minor version of your AnalyticDB for PostgreSQL instance
In the AnalyticDB for PostgreSQL console, find the instance whose minor version you want to update and click its ID. In the upper-right corner of the page that appears, click Upgrade Minor Version.
- Update the metadata of tables
Update the metadata of tables to its latest version after you update the minor version. If RDS_ENABLE_CS_ENHANCEMENT is set to OFF, submit a ticket. If RDS_ENABLE_CS_ENHANCEMENT is set to ON, perform the following operations:
- Create an update function as an administrator.
CREATE OR REPLACE FUNCTION UPGRADE_AOCS_TABLE_META(tname TEXT) RETURNS BOOL AS $$ DECLARE tcount INT := 0; BEGIN -- CHECK TABLE NAME EXECUTE 'SELECT COUNT(1) FROM PG_AOCSMETA WHERE RELID = ''' || tname || '''::REGCLASS' INTO tcount; IF tcount IS NOT NULL THEN IF tcount > 1 THEN RAISE EXCEPTION 'found more than one table of name %', tname; ELSEIF tcount = 0 THEN RAISE EXCEPTION 'not found target table in pg_aocsmeta, table name:%', tname; END IF; END IF; EXECUTE 'ALTER TABLE ' || tname || ' SET WITH(REORGANIZE=TRUE)'; RETURN TRUE; END; $$ LANGUAGE PLPGSQL;
- Execute the following SQL statement to update the metadata of tables as an administrator
or a table owner:
SELECT UPGRADE_AOCS_TABLE_META(table_name);
- Execute the following SQL statement to check the metadata version:
SELECT version = 4 AS is_latest_version FROM pg_appendonly WHERE relid = 'test'::REGCLASS
- Create an update function as an administrator.
Improve MetaScan performance by using sort keys
The sort key feature of AnalyticDB for PostgreSQL sorts data in tables by a specific column. A combination of MetaScan with the sort key feature improves the performance of MetaScan. Column-oriented tables store data in blocks. MetaScan uses metadata to check whether blocks meet query conditions and skips blocks that do not meet query conditions. This reduces I/O operations and improves scanning performance. If data in filtered columns is distributed across all blocks, all blocks must be scanned even though a query may return a small amount of data. If you create a sort key for each filtered column, the same data in a column is combined into several consecutive blocks. This way, MetaScan can improve scanning performance by filtering out blocks that do not meet the query conditions.
For more information about how to create a sort key, see Use sort keys and rough set indexes to accelerate queries in column-oriented tables.
Limits
SHOW OPTIMIZER;
If on is returned, the ORCA optimizer is used. For more information about the ORCA
optimizer, see Choose a query optimizer.