All Products
Search
Document Center

PolarDB:UNI_HASH

Last Updated:Jul 24, 2023

This topic describes how to use the UNI_HASH function.

Background information

  • You can use the UNI_HASH function to perform database sharding. The function divides each value of the database shard key by the number of database shards to obtain remainders. As a result, data is partitioned into database shards based on the remainders. If the key values are strings, the strings are converted to hash values. Then, the hash values are calculated to complete route computing. For example, UNI_HASH('8') is equivalent to 8 % D. D specifies the number of database shards into which data is partitioned.

    Note

    If the remainder is obtained based on a key value or a hash value, the key value or hash value is not truncated.

  • If you run the UNI_HASH function to use the same shard key to perform database sharding and table sharding, the values of the database shard key are divided by the number of database shards to obtain the remainders. Then, data is evenly partitioned into each table shard of the database shard.

Precautions

The UNI_HASH function performs simple modulo operations. The hash output can be evenly distributed only when the values in the sharding column are evenly distributed.

Limits

  • A shard key must be of an INTEGER or STRING type.

  • The shard key that is of the STRING type is case-sensitive when data routing is performed.

Scenarios

  • The UNI_HASH function is suitable for scenarios in which database sharding is performed based on user IDs or order IDs.

  • A shard key is of the INTEGER or STRING type.

  • Database sharding needs to be performed on two logical tables based on the same shard key. The number of table shards for one logical table is different from that of the other logical table. The two logical tables are frequently joined by using the shard key.

Examples

You want to run the UNI_HASH function to perform sharding based on the values of the ID column and each database shard contains four table shards. In this case, you can execute the following DDL statement:

create table test_hash_tb (
    id int,
    name varchar(30) DEFAULT NULL,  
    create_time datetime DEFAULT NULL,
    primary key(id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
dbpartition by UNI_HASH(id)
tbpartition by UNI_HASH(id) tbpartitions 4;

Comparison between UNI_HASH and HASH

Item

UNI_HASH

HASH

Database sharding is performed and table sharding is not performed.

The two functions use the same method to partition data into database shards. The values of the database shard key are divided by the number of database shards to obtain the remainders. Then, data is evenly partitioned into database shards based on the remainders.

The same shard key is used to perform database sharding and table sharding.

The results that are partitioned into database shards by using the same key value do not vary based on the number of table shards.

If the number of table shards in each database shard of one logical table is different from that of the other logical table, the routing result of database shards for one logical table is different from that for the other logical table.

Sharding needs to be performed on two logical tables based on the same shard key. In each database shard, the number of table shards for one logical table is different from that of the other logical table.

One logical table can be joined with the other logical table based on the shard key in the same database shard.

One logical table can be joined with the other logical table based on the shard key across database shards.