All Products
Search
Document Center

Hologres:PK

Last Updated:Dec 20, 2023

You can configure a primary key (PK) for a table to ensure the uniqueness of each record in the table and data consistency. The PK facilitates data management. The characteristics of PKs in Hologres are the same as those in traditional databases. PKs uniquely identify each data record in tables. Therefore, the values of a column that is used as a PK or multiple columns that constitute a PK must be unique and cannot be null. Multiple columns can be specified to constitute a PK. This topic describes how to configure a PK for a table in Hologres.

Overview

Hologres automatically stores a PK index file at the underlying layer. This index file uses the row-oriented storage mode to provide high-speed key-value services. The key indicates the PK of a table, and the value indicates the row identifier (RID) and the clustering key. The RID was originally known as unique_id. RID values are automatically generated each time when you execute the UPSERT statement, and the values monotonically increase. The PK index file can help you determine a PK conflict and locate data files. If you configure a PK for a table, you can quickly locate the RID and clustering key based on the PK in the PK index file and locate the required data file based on the RID and clustering key.

Therefore, the following requirements can be met in Hologres after you configure a PK:

  • High-performance UPSERT or DELETE operations

    Hologres allows you to write data in Append Only mode, and write or update data in an entire row or specific columns in a row based on a PK. When you execute the UPSERT statement to insert data into a table, Hologres updates the table based on the PK without the need to scan the entire table. This achieves high-performance UPSERT operations and ensures data uniqueness.

  • PK-based queries with high queries per second (QPS)

    After you configure a PK for a table, you can quickly locate an entire row of data based on the PK when you perform a PK-based query. This improves query performance. For more information, see Storage modes of tables: row-oriented storage, column-oriented storage, and row-column hybrid storage. If the table uses the row-oriented storage mode, the PK is used as the clustering key and distribution key by default. In this case, you can locate the required data file based on the PK. This achieves point queries with high QPS and millisecond-level delay and applies to online application scenarios such as real-time risk control and real-time recommendations.

Usage notes

We recommend that you configure informative columns to constitute a PK. We recommend that you do not configure columns of the SERIAL data type to constitute a PK. If you write data of the SERIAL data type to a table, a table-level lock is acquired. This results in write performance degradation. As the data grows, the SERIAL data length may exceed the upper limit.

Limits

  • The values of a column that is used as a PK or multiple columns that constitute a PK must be unique and cannot be null. You can specify multiple columns to constitute a PK for a table in only one statement.

  • You can configure up to 32 columns to constitute a PK.

  • The columns of the following data types cannot be configured as PKs: FLOAT, DOUBLE, NUMERIC, ARRAY, JSON, DATE, and complex data types. Hologres V1.3.22 and later allow you to configure columns of the DATE data type as PKs. To configure a column of the DATE data type as a PK, check the version of your Hologres instance. If the version of your Hologres instance is earlier than V1.3.22, upgrade your Hologres instance. For more information, see Instance configurations and Instance upgrades.

  • You must specify a PK for a row-oriented table or a row-column hybrid table. A PK is optional for a column-oriented table.

  • A PK cannot be modified. If you want to change the PK of a table, you must create another table.

Examples

This section provides examples for Hologres V2.1 and later. If the version of your Hologres instance is V2.0 or earlier, change the WITH (property = 'value') statement to the CALL set_table_property statement. For more information, see Overview.

  • Create a column-oriented table and configure a column as the PK of the table.

    • Syntax supported by Hologres V2.1 and later:

      CREATE TABLE tbl_1 (
          id bigint NOT NULL,
          name text NOT NULL,
          age bigint NOT NULL,
          class text,
          reg_timestamp timestamptz NOT NULL,
          PRIMARY KEY (id)
      )
      WITH (
          orientation = 'column',
          distribution_key = 'id',
          clustering_key = 'age',
          event_time_column = 'reg_timestamp',
          bitmap_columns = 'name,class',
          dictionary_encoding_columns = 'class:auto'
      );
    • Syntax supported by all Hologres versions:

      BEGIN;
      CREATE TABLE tbl_1 (
       id bigint NOT NULL,
       name text NOT NULL,
       age bigint,
       class text,
       reg_timestamp timesatmptz, 
      PRIMARY KEY (id)
      );
      CALL set_table_property('tbl_1', 'orientation', 'column');
      CALL set_table_property('tbl_1', 'distribution_key', 'id');
      CALL set_table_property('tbl_1', 'clustering_key', 'age');
      CALL set_table_property('tbl_1', 'event_time_column', 'reg_timestamp');
      CALL set_table_property('tbl_1', 'bitmap_columns', 'name,class');
      CALL set_table_property('tbl_1', 'dictionary_encoding_columns', 'class:auto');
      COMMIT;
  • Create a column-oriented table and configure two columns to constitute a PK for the table.

    • Syntax supported by Hologres V2.1 and later:

      CREATE TABLE tbl_1 (
          id bigint NOT NULL,
          name text NOT NULL,
          age bigint NOT NULL,
          class text NOT NULL,
          reg_timestamp timestamptz NOT NULL,
          PRIMARY KEY (id,age)
      )
      WITH (
          orientation = 'column',
          distribution_key = 'id',
          clustering_key = 'age',
          event_time_column = 'reg_timestamp',
          bitmap_columns = 'name,class',
          dictionary_encoding_columns = 'class:auto'
      );
    • Syntax supported by all Hologres versions:

      BEGIN;
      CREATE TABLE tbl_2 (
       id bigint NOT NULL,
       name text NOT NULL,
       age bigint NOT NULL,
       class text NOT NULL,
       reg_timestamp timestamptz NOT NULL,
      PRIMARY KEY (id,age)
      );
      call set_table_property('tbl_2', 'orientation', 'column');
      call set_table_property('tbl_2', 'distribution_key', 'id');
      call set_table_property('tbl_2', 'clustering_key', 'age');
      call set_table_property('tbl_2', 'event_time_column', 'reg_timestamp');
      call set_table_property('tbl_2', 'bitmap_columns', 'name,class');
      call set_table_property('tbl_2', 'dictionary_encoding_columns', 'class:auto');
      COMMIT;
  • Create a row-oriented table and configure a PK for the table.

    • Syntax supported by Hologres V2.1 and later:

      CREATE TABLE public.tbl_row (
          id text NOT NULL,
          name text NOT NULL,
          class text,
          PRIMARY KEY (id)
      )
      WITH (
          orientation = 'row',
          distribution_key = 'id',
          clustering_key = 'id'
      );
    • Syntax supported by all Hologres versions:

      begin;
      create table public.tbl_row (
          id text NOT NULL,
          name text NOT NULL,
          class text ,
      PRIMARY KEY (id)
      );
      call set_table_property('public.tbl_row', 'orientation', 'row');
      call set_table_property('public.tbl_row', 'clustering_key', 'id');
      call set_table_property('public.tbl_row', 'distribution_key', 'id');
      commit;
  • Create a partitioned table and configure a PK for the table.

    • Syntax supported by Hologres V2.1 and later:

      BEGIN;
      CREATE TABLE public.tbl_parent(
        a text , 
        b int, 
        c timestamp, 
        d text,
        ds text,
        PRIMARY KEY (ds,b)
        )
       PARTITION BY LIST(ds)
       WITH ( orientation = 'column');
      CREATE TABLE public.tbl_child_1 PARTITION OF public.tbl_parent FOR VALUES IN('20221207');
      CREATE TABLE public.tbl_child_2 PARTITION OF public.tbl_parent FOR VALUES IN('20221208');
      COMMIT;
    • Syntax supported by all Hologres versions:

      BEGIN;
      CREATE TABLE public.tbl_parent(
        a text , 
        b int, 
        c timestamp, 
        d text,
        ds text,
        PRIMARY KEY (ds,b)
        )
        PARTITION BY LIST(ds);
      CALL set_table_property('public.tbl_parent', 'orientation', 'column');
      CREATE TABLE public.tbl_child_1 PARTITION OF public.tbl_parent FOR VALUES IN('20221207');
      CREATE TABLE public.tbl_child_2 PARTITION OF public.tbl_parent FOR VALUES IN('20221208');
      COMMIT;

References