AnalyticDB for PostgreSQL supports three table distribution schemes: hash distribution, random distribution, and replicated distribution.

Syntax

CREATE TABLE table_name (...) [ DISTRIBUTED BY (column  [,..] ) | DISTRIBUTED RANDOMLY | DISTRIBUTED REPLICATED ]

Hash distribution

DISTRIBUTED BY (column, [ ... ])

Data rows are distributed across compute nodes based on the hash values of the distribution column values. Rows with identical hash values are assigned to the same compute node. To ensure that data is evenly distributed across compute nodes, we recommend that you use a unique key as the distribution key, such as the primary key.

The default table distribution scheme of AnalyticDB for PostgreSQL is hash distribution. If you do not specify a DISTRIBUTED clause when you create a table, the table uses its primary key or the first key that the system considers suitable as the distribution key. If no suitable distribution key is identified, the system uses random distribution.

Execute the following statement to create a table:

CREATE TABLE products 
(name varchar(40),                   
 prod_id integer,                   
 supplier_id integer) DISTRIBUTED BY (prod_id);   

Random distribution

DISTRIBUTED RANDOMLY

Data rows are evenly distributed across all compute nodes by using a round-robin algorithm. Rows with identical hash values may be assigned to different compute nodes.

We recommend that you use random distribution only if a table contains no columns that are suitable for implementing hash distribution.

Execute the following statement to create a table:

CREATE TABLE random_stuff (things text,
                           doodads text,
                           etc text)  DISTRIBUTED RANDOMLY;

Replicated distribution

DISTRIBUTED REPLICATED

Each compute node stores a copy of the full table data.

If large tables are frequently joined with small tables in your use cases, you can specify replicated distribution for small tables to increase join performance.

Execute the following statement to create a table:

CREATE TABLE replicated_stuff (things text,
                               doodads text,
                               etc text) DISTRIBUTED REPLICATED;

In the preceding example, a table that uses replicated distribution is created. Each compute node stores a copy of the full table data.