All Products
Search
Document Center

Hologres:Dictionary encoding

Last Updated:Mar 26, 2026

Dictionary encoding compresses low-cardinality columns by replacing repeated string values with compact numeric codes, reducing storage and accelerating GROUP BY and FILTER queries.

How it works

Hologres encodes raw column data as NUMERIC values and maintains a corresponding encoding schema. Reads decode the data on the fly using that schema.

This approach is most effective for TEXT columns where values repeat frequently—for example, status fields, category labels, or region codes. When a column's recurrence rate reaches 90% or higher, encoding delivers a significant reduction in storage and faster string comparisons, at the cost of a small decoding overhead.

For columns with a low recurrence rate (roughly half the values are unique) or columns used in join operations, the encoding and decoding overhead outweighs the benefit. Avoid enabling dictionary encoding for those columns.

When auto is set and a column's recurrence rate stays below 90%, Hologres does not enable encoding—the column is stored without dictionary mapping.

Dictionary encoding diagram

Configure dictionary encoding

Dictionary encoding applies per column through the dictionary_encoding_columns table property. Two syntax forms are available depending on your Hologres version.

Hologres V2.1 and later — WITH clause

CREATE TABLE <table_name> (...) WITH (dictionary_encoding_columns = '[<columnName>{:[on|off|auto]}[,...]]');

All Hologres versions — CALL statement

CREATE TABLE <table_name> (...);
CALL set_table_property('table_name', 'dictionary_encoding_columns', '[<columnName>{:[on|off|auto]}[,...]]');

The dictionary_encoding_columns property accepts a comma-separated list of column names, each with an optional value:

ValueBehavior
onEnables dictionary encoding for the column
offDisables dictionary encoding for the column
autoHologres evaluates the column's recurrence rate at write time. If the rate reaches 90% or higher, encoding is enabled; otherwise, the column is stored without encoding

Default behavior by version:

  • Hologres V0.8 and earlier: Dictionary encoding is enabled for all TEXT columns.

  • Hologres V0.9 and later: All TEXT columns default to auto. Encoding activates only when the recurrence rate reaches 90% or higher.

To modify dictionary mappings after a table is created, use ALTER TABLE or update_table_property. Changes take effect asynchronously—mappings are built or deleted in the background.

Supported storage types

Dictionary encoding is supported only for tables that use column store or row-column hybrid storage. The columns specified by dictionary_encoding_columns can be null.

Best practices

Use auto as the default. It avoids both the storage penalty of missed encoding opportunities and the decoding overhead of unnecessary encoding. Disabling dictionary encoding entirely increases storage consumption.

Target high-recurrence TEXT columns. Dictionary encoding delivers the most benefit for columns like status codes, category labels, and region identifiers—fields with a small number of distinct values that repeat across many rows.

Avoid encoding JSON stored as TEXT. JSON values are rarely repeated verbatim, so encoding adds overhead without reducing storage.

Don't encode every column. Applying dictionary encoding to all columns—including those with high cardinality or join columns—generates unnecessary encoding and decoding overhead that can slow queries rather than accelerate them.

Examples

Hologres V2.1 and later:

CREATE TABLE tbl (
    a int NOT NULL,
    b text NOT NULL,
    c text NOT NULL
)
WITH (
    dictionary_encoding_columns = 'a:on,b:off,c:auto'
);

-- Modify dictionary mappings for all columns.
ALTER TABLE tbl SET (dictionary_encoding_columns = 'a:off');

All Hologres versions:

-- Create a table and set dictionary mappings.
BEGIN;
CREATE TABLE tbl (
    a int NOT NULL,
    b text NOT NULL,
    c text NOT NULL
);
CALL set_table_property('tbl', 'dictionary_encoding_columns', 'a:on,b:off,c:auto');
COMMIT;

-- Modify dictionary mappings for all columns.
-- Columns b and c are TEXT columns, so they retain dictionary mappings by default.
CALL set_table_property('tbl', 'dictionary_encoding_columns', 'a:off');

-- Modify dictionary mappings for a single column only.
CALL update_table_property('tbl', 'dictionary_encoding_columns', 'c:off');
ALTER TABLE and set_table_property apply changes to all columns. To update a single column without affecting others, use update_table_property.

What's next