This topic describes how to use the Beam storage engine.
Usage notes
Only AnalyticDB for PostgreSQL V7.0.x instances in elastic storage mode support the Beam storage engine.
The Beam storage engine is officially available after the public preview is complete on AnalyticDB for PostgreSQL V7.0.6.2 in elastic storage mode. This version fixes the issues that occur during the public preview. We recommend that you update your AnalyticDB for PostgreSQL instance to V7.0.6.2 or later.
Create a Beam table
To create a table with the Beam engine, specify the USING beam clause in your `CREATE TABLE` statement. The following example creates a Beam table named test.
CREATE TABLE test(a INT, b INT)
USING beam
DISTRIBUTED BY (a);By default, Beam tables use LZ4 9-level compression for all columns.
Specify a sort key
You can specify a sort key to define how data is physically aggregated. Beam collects the maximum (Max) and minimum (Min) values for the specified sort key and primary key columns. This accelerates queries that contain filter conditions on the sort key. For example, if an SQL statement contains the sale_date='20230715' condition, sorting data by the sale_date column significantly improves query performance.
You can specify one or more sort keys in the ORDER BY clause. Beam automatically re-sorts the data in the background based on the specified sort keys for optimization.
Example
Create a Beam table that contains a sort key.
CREATE TABLE beam_example (
id integer,
name text,
ftime timestamp
)
USING beam
DISTRIBUTED BY (id)
ORDER BY(id);Write 10,000,000 rows of data to the Beam table and query data using a sort key filter condition.
INSERT INTO beam_example
SELECT r, md5((r*random())::text), now() + interval '1 seconds' *(r*random())::int
FROM generate_series(1,10000000)r;SELECT * FROM beam_example WHERE id = 100000;Specify a compression algorithm
Beam supports multiple compression algorithms, such as ZSTD, LZ4, AUTO, and GDICT. You can select a compression algorithm as needed.
Default compression algorithm:
For AnalyticDB for PostgreSQL instances that run kernel version V7.1.1.4 or later, AUTO Level 1 compression is used by default.
For AnalyticDB for PostgreSQL instances that run a kernel version earlier than V7.1.1.4, LZ4 Level 1 compression is used by default.
ZSTD
ZSTD provides a higher compression ratio, but lower compression and decompression performance than LZ4.
LZ4
LZ4 provides high compression and decompression performance at the expense of compression ratios.
AUTO
AUTO is a self-developed adaptive compression algorithm of the Beam storage engine. For numeric columns, the Beam storage engine provides a higher compression ratio and higher compression and decompression performance than general compression algorithms based on data layout. For other types of columns, the Beam storage engine uses LZ4 to compress data.
GDICT
GDICT is a self-developed global dictionary encoding compression algorithm of the Beam storage engine. GDICT provides high compression ratios and decompression performance for low-cardinality columns that have less than 256 unique values. In addition, GDICT uses the filter condition pushdown feature in specific scenarios to provide up to 100 times the scanning performance of general compression algorithms.
Example
Create a Beam table that uses ZSTD level 9 compression.
CREATE TABLE beam_example (
id integer,
name text,
ftime timestamp
) USING beam
WITH(compresstype='zstd', compresslevel=9) ;Auto-optimize
Multiple write, update, or delete operations on a Beam table can create a large amount of expired data over time, which degrades scan performance. Auto-optimization is a background process that optimizes your data. It automatically reclaims expired data, merges small files, and aggregates data by the sort key to maintain query performance. This process runs automatically by default. You can also manually trigger an optimization process by running the OPTIMIZE beam_example; command.