This topic describes how to use the HASH function.

Description

  • If the HASH function uses different shard keys for database sharding and table sharding, the values of the database shard key are divided by the number of database shards to obtain the remainders into which data is partitioned. If the key values are strings, the strings are first converted into hash values and then used for route calculation.

    For example, HASH(8) is equivalent to 8%D. D specifies the number of database shards. HASH("ABC") is equivalent to hashcode("ABC").abs()%D. D specifies the number of database shards into which data is partitioned.

  • If the HASH function uses the same shard key for database sharding and table sharding, the values of the shard key are divided by the total number of table shards to obtain the remainders.

    For example, two database shards are created. Each database shard has four table shards. Database shard 0 stores Table shards 0 to 3 and Database shard 1 stores Table shards 4 to 7. If a key value is 15, the record for the key value 15 is distributed to Table shard 7 in Database shard 1 based on the following equation: 15 % (2 * 4) = 7.

Precautions

The output of the HASH function can be evenly distributed only when the values in a shard key column are evenly distributed.

Limits

  • A shard key must be of an INTEGER or STRING type.
  • The shard key that is of the STRING type is case-sensitive when data routing is performed.

Common scenarios

The HASH function can be used in the following scenarios:
  • Database sharding is performed based on user IDs or order IDs.
  • Values of shard keys are strings.
Example

If you want to use the HASH function only for database sharding based on the values of the ID column, you can execute the following DDL statement:

create table test_hash_tb (
    id int,
    name varchar(30) DEFAULT NULL,  
    create_time datetime DEFAULT NULL,
    primary key(id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 dbpartition by HASH(ID);