Beam is a columnar storage engine for AnalyticDB for PostgreSQL. It uses a Min/Max index on sort key columns to skip irrelevant data segments during scans, and runs background auto-optimization to keep tables compact over time. Use Beam when you need fast analytical queries on large datasets without manual tuning.
Supported versions
Beam is available only on AnalyticDB for PostgreSQL V7.0.x instances in elastic storage mode. General availability (GA) begins with V7.0.6.2. Update your instance to V7.0.6.2 or later to receive fixes applied during the public preview period.
Create a Beam table
Add the USING beam clause to your CREATE TABLE statement. The sort key and compression clauses are optional:
CREATE TABLE table_name (
    column1 type1,
    column2 type2,
    ...
)
USING beam
[WITH (compresstype = 'algorithm', compresslevel = N)]
DISTRIBUTED BY (column1)
[ORDER BY (sort_key_column)];
The following example creates a minimal Beam table:
CREATE TABLE test(a INT, b INT)
USING beam
DISTRIBUTED BY (a);
The default compression algorithm depends on the kernel version: instances running kernel V7.1.1.4 or later use AUTO Level 1, and earlier versions use LZ4 Level 1. To override the default, see the "Specify a compression algorithm" section.
Specify a sort key
A sort key defines how Beam physically orders data on disk. Beam maintains a Min/Max index on sort key and primary key columns, recording the minimum and maximum values in each data segment. When a query filters on a sort key column, Beam uses this index to skip entire segments that cannot contain matching rows.
Specify one or more sort key columns in the ORDER BY clause of CREATE TABLE:
CREATE TABLE beam_example (
    id integer,
    name text,
    ftime timestamp
)
USING beam
DISTRIBUTED BY (id)
ORDER BY (id);
To verify the performance benefit, insert 10,000,000 rows and run a point query:
INSERT INTO beam_example
SELECT r, md5((r * random())::text), now() + interval '1 seconds' * (r * random())::int
FROM generate_series(1, 10000000) r;
SELECT * FROM beam_example WHERE id = 100000;
With the id sort key, Beam uses the Min/Max index to skip segments that cannot contain id = 100000, scanning only a small fraction of the 10,000,000 rows.
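One way to check this on your own instance is to prefix the query with EXPLAIN ANALYZE, a standard PostgreSQL command that executes the statement and reports the actual plan and execution time. The following is a minimal sketch. How Beam reports skipped segments in the plan output is not described here, so the comparison relies only on the reported execution times. The literal 'no-such-value' is an arbitrary placeholder.

```sql
-- Filter on the sort key column: Beam can consult the Min/Max index.
EXPLAIN ANALYZE
SELECT * FROM beam_example WHERE id = 100000;

-- Filter on a column that is not a sort key, for comparison.
-- The md5 values in name are not ordered on disk, so Beam is not
-- expected to skip segments for this predicate.
EXPLAIN ANALYZE
SELECT * FROM beam_example WHERE name = 'no-such-value';
```

If the sort key is effective, the first statement should report a noticeably shorter execution time than the second.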
Beam automatically re-sorts data in the background as new data arrives, keeping the index accurate over time.
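Because ORDER BY accepts one or more columns, a table that is usually filtered by time can list the timestamp column in its sort key. The sketch below uses a hypothetical table named beam_events. The comma-separated form of the sort key list is an assumption based on standard SQL column lists and is not shown in the examples above.

```sql
-- Hypothetical table; the multi-column sort key syntax is an assumption.
CREATE TABLE beam_events (
    id integer,
    name text,
    ftime timestamp
)
USING beam
DISTRIBUTED BY (id)
ORDER BY (ftime, id);

-- A range filter on a sort key column, the kind of predicate
-- that the Min/Max index is meant to serve.
SELECT count(*)
FROM beam_events
WHERE ftime >= now() - interval '1 day';
```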
Specify a compression algorithm
Beam supports four compression algorithms: ZSTD, LZ4, AUTO, and GDICT. The default depends on the kernel version:
- Kernel V7.1.1.4 or later: AUTO Level 1
- Earlier kernel versions: LZ4 Level 1
Choose an algorithm
| Algorithm | Best for | Strength | Trade-off |
|---|---|---|---|
| AUTO | Numeric columns | Adapts to the data layout to deliver a higher compression ratio and faster compression and decompression than general-purpose algorithms; falls back to LZ4 for non-numeric columns | Beam-specific; no gain over LZ4 on non-numeric columns |
| GDICT | Low-cardinality columns (fewer than 256 unique values) | High compression ratio; filter condition pushdown delivers up to 100 times the scan performance of general-purpose algorithms in specific scenarios | Only effective for low-cardinality data |
| ZSTD | General-purpose high compression | Higher compression ratio than LZ4 | Slower compression and decompression than LZ4 |
| LZ4 | High-throughput workloads | Fast compression and decompression | Lower compression ratio than ZSTD |
Quick reference:
- Numeric columns: AUTO
- Low-cardinality string or categorical columns: GDICT
- Storage-constrained workloads where CPU is not the bottleneck: ZSTD
- Write-heavy or latency-sensitive workloads: LZ4
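When the choice between ZSTD and LZ4 is not obvious, one approach is to load the same sample data into two tables and compare their sizes. The sketch below uses hypothetical table names (sample_lz4, sample_zstd) and assumes that 'lz4' is written in lowercase like 'zstd'. pg_total_relation_size and pg_size_pretty are standard PostgreSQL functions. Whether they account for all Beam storage is an assumption to verify on your instance.

```sql
-- Two hypothetical tables that differ only in compression settings.
CREATE TABLE sample_lz4 (id integer, name text, ftime timestamp)
USING beam
WITH (compresstype = 'lz4', compresslevel = 1);

CREATE TABLE sample_zstd (id integer, name text, ftime timestamp)
USING beam
WITH (compresstype = 'zstd', compresslevel = 9);

-- Load identical sample data into both tables.
INSERT INTO sample_lz4
SELECT r, md5(r::text), now() + interval '1 second' * r
FROM generate_series(1, 1000000) r;

INSERT INTO sample_zstd
SELECT * FROM sample_lz4;

-- Compare the on-disk size of the two tables.
SELECT pg_size_pretty(pg_total_relation_size('sample_lz4'))  AS lz4_size,
       pg_size_pretty(pg_total_relation_size('sample_zstd')) AS zstd_size;
```

Use a sample drawn from real data where possible, because compression ratios depend heavily on the values being compressed.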
Algorithm details
ZSTD
ZSTD provides a higher compression ratio than LZ4, at the cost of slower compression and decompression.
LZ4
LZ4 prioritizes compression and decompression speed over compression ratio.
AUTO
AUTO is an adaptive compression algorithm developed in-house for Beam. For numeric columns, it adapts to the data layout and delivers a higher compression ratio and faster compression and decompression than general-purpose algorithms. For all other column types, it falls back to LZ4.
GDICT
GDICT is a global dictionary encoding algorithm developed in-house for Beam, designed for low-cardinality columns with fewer than 256 unique values. In specific scenarios, GDICT uses filter condition pushdown to scan up to 100 times faster than general-purpose compression algorithms.
Example
Create a Beam table with ZSTD level 9 compression:
CREATE TABLE beam_example (
    id integer,
    name text,
    ftime timestamp
)
USING beam
WITH (compresstype = 'zstd', compresslevel = 9);
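The same WITH clause can select GDICT for a table whose columns hold only a small set of distinct values. This is a sketch under stated assumptions: order_status is a hypothetical table, the literal 'gdict' is assumed to follow the lowercase form used for 'zstd', and compresslevel is omitted because this page does not state which levels GDICT accepts.

```sql
-- Hypothetical table containing only low-cardinality columns.
CREATE TABLE order_status (
    status text,   -- a handful of distinct values
    region text    -- fewer than 256 distinct values
)
USING beam
WITH (compresstype = 'gdict');   -- assumed spelling of the option value

-- An equality filter on a dictionary-encoded column, the kind of
-- predicate that filter condition pushdown is described as serving.
SELECT count(*)
FROM order_status
WHERE status = 'shipped';
```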
Auto-optimize
Write, update, and delete operations accumulate expired data over time, which degrades scan performance. Auto-optimization is a background process that reclaims expired data, merges small files, and re-aggregates data by the sort key to maintain query performance. It is enabled by default and requires no configuration.
To trigger optimization immediately, run:
OPTIMIZE beam_example;
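A manual run is most useful directly after a bulk change. The sketch below reuses the beam_example table from the sort key section. The row ranges are arbitrary and serve only to generate expired data.

```sql
-- Bulk changes leave expired data behind in beam_example.
DELETE FROM beam_example
WHERE id <= 1000000;

UPDATE beam_example
SET name = upper(name)
WHERE id BETWEEN 2000000 AND 3000000;

-- Reclaim the expired data now instead of waiting for the background process.
OPTIMIZE beam_example;
```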