You can execute the CREATE TABLE statement to create a partitioned table. This topic describes the syntax of the CREATE TABLE statement and the clauses, parameters, and basic methods that can be used in the CREATE TABLE statement.

Usage notes

Before you execute the CREATE TABLE statement to create a partitioned table in a logical database, make sure that the logical database is created by using the auto partitioning mode. The CREATE TABLE statement cannot be used to create a partitioned table in databases that are created by using the drds partitioning mode. You can execute the SHOW CREATE DATBASE db_name statement to view the partitioning mode of the logical database in which you want to create a table. Example:

CREATE DATABASE part_db mode='auto';
Query OK, 1 row affected (4.29 sec)

SHOW CREATE DATABASE part_db;
+----------+-----------------------------------------------+
| DATABASE | CREATE DATABASE                               |
+----------+-----------------------------------------------+
| part_db  | CREATE DATABASE `part_db` /* MODE = 'auto' */ |
+----------+-----------------------------------------------+
1 row in set (0.18 sec)

For more information about the statement and syntax that are used to create a database, see CREATE DATABASE.

Syntax

CREATE [PARTITION] TABLE [IF NOT EXISTS] tbl_name
    (create_definition, ...)
    [table_options]
    [table_partition_definition]
    [local_partition_definition]

create_definition:
    col_name column_definition
  | mysql_create_definition
  | [UNIQUE] GLOBAL INDEX index_name [index_type] (index_sharding_col_name,...)
      [global_secondary_index_option]
      [index_option] ...

index_sharding_col_name:
    col_name [(length)] [ASC | DESC]

index_option:
    KEY_BLOCK_SIZE [=] value
  | index_type
  | WITH PARSER parser_name
  | COMMENT 'string'

index_type:
    USING {BTREE | HASH}

# Create a global secondary index.
global_secondary_index_option:
    [COVERING (col_name,...)]
    [partition_options]

table_options:
    table_option [[,] table_option] ...

table_option: {
# Specify the table group to which the table that you want to create belongs.
    TABLEGROUP [=] value,...,}

# Specify the type of the table that you want to create.
table_partition_definition:
        single
  |    broadcast
  | partition_options


# Specify the partitioning policy.
partition_options:
    PARTITION BY
          HASH({column_name | partition_func(column_name)})
        | KEY(column_list)
        | RANGE{({column_name | partition_func(column_name)})
        | RANGE COLUMNS(column_list)}
        | LIST{({column_name | partition_func(column_name)})
        | LIST COLUMNS(column_list)} }
    partition_list_spec

# Specify the partitioning function.
partition_func:
    YEAR
  | TO_DAYS
  | TO_SECOND
  | UNIX_TIMESTAMP
  | MONTH

# Specify the partitioning type.
partition_list_spec:
    hash_partition_list
  | range_partition_list
  | list_partition_list

# Specify the hash or key by which data is partitioned in HASH partitioning or KEY partitioning.
hash_partition_list:
    PARTITIONS partition_count

# Specify the ranges by which data is partitioned in RANGE partitioning or RANGE COLUMNS partitioning.
range_partition_list:
    range_partition [, range_partition ...]

range_partition:
    PARTITION partition_name VALUES LESS THAN {(expr | value_list)} [partition_spec_options]

# Specify the lists by which data is partitioned in LIST partitioning or LIST COLUMNS partitioning.
list_partition_list:
    list_partition [, list_partition ...]

list_partition:
    PARTITION partition_name VALUES IN (value_list) [partition_spec_options]

partition_spec_options:
        [[STORAGE] ENGINE [=] engine_name]
        [COMMENT [=] 'string']
        [{CHARSET | CHARACTER SET} [=] charset]
        [COLLATE [=] collation]
        [TABLEGROUP [=] table_group_id]
        [LOCALITY [=] locality_option]

locality_option:
    'dn=storage_inst_id_list'

storage_inst_id_list:
    storage_inst_id[,storage_inst_id_list]

local_partition_definition:
    LOCAL PARTITION BY RANGE (column_name)
    [STARTWITH 'yyyy-MM-dd']
    INTERVAL interval_count [YEAR|MONTH|DAY]
    [EXPIRE AFTER expire_after_count]
    [PRE ALLOCATE pre_allocate_count]
    [PIVOTDATE pivotdate_func]
    [DISABLE SCHEDULE]

pivotdate_func:
    NOW()
  | DATE_ADD(...)
  | DATE_SUB(...)
Note The DDL syntax of PolarDB-X is based on the MySQL syntax. The preceding code lists the syntax that is different from the MySQL syntax. For more information about the syntax, see MySQL documentation.

Terms

  • Partition key: the column or columns based on which a table is horizontally partitioned.
  • Partition column: the column that is used to calculate how data is routed to the partitions of the table. A partition column is a part of the partition key. A partition key can contain one or more partition columns.
  • Vector partition key: the partition key that contains one or more partition columns.
  • Single-column partition key: the partition key that contains only one partition column.
  • Prefix partition column: If a vector partition key contains N (N>1) partition columns, the first K (1<=K<=N) partition columns are defined as a prefix partition column of the vector partition key.
  • Partitioning function: the function that uses the partition column as the input parameter and returns the results based on which data is routed to the partitions of the table.
  • Partition pruning: a feature that optimizes queries by filtering out the partitions that do not need to be scanned based on the partitioning type and query conditions.
  • Hot partition splitting: a feature that is used to balance the workload of partitions. When access to a partition focus on hot data in a prefix partition column of the vector partition key, you can split the partition based on the next partition key column.

Parameters

Parameter Description
CHARSET | CHARACTER SET Specifies the default character set that is used for columns in a table. You can use one of the following character sets:
  • utf8
  • utf8mb4
  • gbk
COLLATE Specifies the default collation for columns in a table. You can use one of the following collations:
  • utf8_bin
  • utf8_general_ci
  • utf8_unicode_ci
  • gbk_bin
  • gbk_chinese_ci
  • utf8mb4_general_ci
  • utf8mb4__general_cs
  • utf8mb4_bin
  • utf8mb4_unicode_ci
TABLEGROUP Specifies the table group to which the partitioned table belongs. If this parameter is not specified, PolarDB-X automatically assigns the partitioned table to a table group or creates a table group. In the table group, all tables are partitioned based on the same partitioning method.
LOCALITY Specifies the data node on which the partitioned table is deployed.

Default automatic partitioning

  • If you create a partitioned table without specifying a partition key, PolarDB-X automatically uses KEY partitioning to partition the table based on the primary key. If no primary key is specified for the table, PolarDB-X uses the implicit primary key to partition the table.

    The number of partitions into which a table is partitioned by using default automatic partitioning can be calculated by by using the following formula: Number of partitions = Number of the logical nodes of the PolarDB-X instance x 8. For example, you set the number of logical nodes to 2 when you create a PolarDB-X instance. In this case, when you create a table on the instance without specifying a partition key, the table is automatically partitioned to 16 partitions by default.

  • By default, the primary table is partitioned based on the primary key. All indexes in the primary table are automatically partitioned based on the index column and primary key column.

The following example shows the standard syntax of the CREATE TABLE statement. In the example, id indicates the primary key and name indicates the index column.

CREATE TABLE auto_part_tbl(
 id bigint not null auto_increment,
 bid int,
 name varchar(30),
 primary key(id),
 index idx_name (name)
);

If you execute the SHOW CREATE TABLE statement to query the details of a CREATE TABLE statement, the standard MySQL syntax of the CREATE TABLE statement is displayed. The information about the partitions are not displayed. The following example shows the execution results of the SHOW CREATE TABLE statement:

show create table auto_part_tbl;
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE         | CREATE TABLE                                                                                                                                                                                                                     |
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| auto_part_tbl | CREATE TABLE `auto_part_tbl` (
    `id` bigint(20) NOT NULL AUTO_INCREMENT,
    `bid` int(11) DEFAULT NULL,
    `name` varchar(30) DEFAULT NULL,
    PRIMARY KEY (`id`),
    INDEX `idx_name` (`name`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8 |
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.06 sec)

If you execute the SHOW FULL CREATE TABLE statement to query the details of a CREATE TABLE statement, the information about all partitions of the primary table and the index table are displayed. The following example shows the execution results of the SHOW FULL CREATE TABLE statement:

show full create table auto_part_tbl;
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE         | CREATE TABLE                                                                                                                                                                                                                                                                                                                                                                                                              |
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| auto_part_tbl | CREATE PARTITION TABLE `auto_part_tbl` (
    `id` bigint(20) NOT NULL AUTO_INCREMENT,
    `bid` int(11) DEFAULT NULL,
    `name` varchar(30) DEFAULT NULL,
    PRIMARY KEY (`id`),
    GLOBAL INDEX /* idx_name_$a870 */ `idx_name` (`name`) PARTITION BY KEY (`name`, `id`) PARTITIONS 16,
    LOCAL KEY `_local_idx_name` (`name`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8
PARTITION BY KEY(`id`)
PARTITIONS 16
/* tablegroup = `tg108` */ |
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.03 sec)
The preceding execution results show that:
  • The primary table named auto_part_tbl is automatically partitioned based on the id column into 16 partitions by using KEY partitioning.
  • The index table named idx_name uses the global index by default and is partitioned into 16 partitions based on the 'name' and 'id' partition keys.

Create a non-partitioned table and a broadcast table

PolarDB-X allows you to create a non-partitioned table by specifying the SINGLE keyword in the CREATE TABLE statement. The following example shows how to create a non-partitioned table:
CREATE TABLE single_tbl(
 id bigint not null auto_increment, 
 bid int, 
 name varchar(30), 
 primary key(id)
) SINGLE;
PolarDB-X allows you to create a broadcast table by specifying the BROADCAST keyword in the CREATE TABLE statement. When you create a broadcast table on a PolarDB-X instance, the broadcast table is replicated on each data node of the PolarDB-X instance. The following example shows how to create a broadcast table:
CREATE TABLE broadcast_tbl(
 id bigint not null auto_increment, 
 bid int, 
 name varchar(30), 
 primary key(id)
) BROADCAST;

Partitioning types

PolarDB-X allows you to create a partitioned table by specifying a PARTITION BY clause in the CREATE TABLE statement. This way, you can create a partitioned table based on your business requirements. PolarDB-X supports the following partitioning types:
  • HASH partitioning: This type of partitioning routes data rows to partitions based on the values that are hashed by using the built-in consistent hashing algorithm. The hash values are calculated based on the values of a specified partition column or the values that are returned by a specified expression that contains a partitioning function. A variant of HASH partitioning is KEY partitioning. HASH partitioning and KEY partitioning are different in usage. In HASH partitioning, you can use an expression that contains a partitioning function. In KEY partitioning, you can use multiple partitioning columns as the partition key.
  • RANGE partitioning: This type of partitioning routes data rows to partitions by evaluating whether the values of the specified partitioning column or the values that are returned by the specified expression that contains a partitioning function are within the specified ranges. A variant of RANGE partitioning is RANGE COLUMNS partitioning. RANGE partitioning and RANGE COLUMNS partitioning are different in usage. For range partitioning, you can use an expression that contains a partitioning function. For range columns partitioning, you can use multiple partitioning columns as the partition key.
  • List partitioning: This type of partitioning is similar to range partitioning. List partitioning routes data rows to partitions by evaluating whether the values of a specified partitioning column or the values that are returned by a specified expression that contains a partitioning function are within the given lists of discrete values. The list type is divided into two types of partitioning policies: List Columns and List Partition, depending on whether multiple partition key column are used as partition keys and how they are used.

HASH partitioning

In PolarDB-X, a variant of HASH partitioning is KEY partitioning. HASH partitioning and KEY partitioning are standard partitioning syntaxes supported by the native MySQL. PolarDB-X provides flexible and powerful capabilities, such as splitting, merging, and migration, to manage partitions and supports hot partition splitting based on vector partition keys. PolarDB-X supports the HASH partitioning and KEY partitioning syntaxes of the native MySQL and further defines different routing behaviors for the two partitioning types. PolarDB-X supports only the syntaxes of HASH partitioning and KEY partitioning of the native MySQL but uses different manners to route data to different partitions. The following table lists the differences between KEY partitioning and HASH partitioning.

Table 1. Comparison between KEY partitioning and Hash partitioning
Partitioning type Supported partition key Partitioning function Syntax sample Features and limits Routing policy (point query)
KEY partitioning (default) Single-column partition key Not supported PARTITION BY KEY(c1)
  • Hot partition splitting is not supported.
  • Up to one partition column is used to calculate how data is routed to partitions.
  1. The hash value c1_hash is calculated by using the consistent hashing algorithm based on the value of the c1 column.
  2. Data is routed to different partitions based on the value of c1_hash.
Vector partition key Not supported PARTITION BY KEY(c1,c2,...,cn)
  • Hot partition splitting is supported.
  • By default, only the first partition column c1 is used to calculate how data is routed to partitions. Other columns can be used for hot partition splitting.
  • A maximum of n partition columns can be used at the same time to calculate how data is routed to partitions.
  • By default, up to five partition columns are supported.
  1. The values of the partition columns (c1,c2,...,cn) are used as a vector partition key to calculate the hashes of each column in the vector by using the consistent hashing algorithm. A vector of hashes (c1_hash,c2_hash,...,cn_hash) corresponding to the vector partition key is obtained.
  2. The vector of hashes (c1_hash,c2_hash,...,cn_hash) is used to route data to partitions by range.
HASH partitioning Single-column partition key Not supported PARTITION BY HASH(c1)
  • Hot partition splitting is not supported.
  • One partition column is used to calculate how to route data to partitions.
  • Only the following five partitioning functions are supported:
    • YEAR
    • TO_DAYS
    • TO_SECOND
    • MONTH
    • UNIX_TIMESTAMP
The routing policy of PARTITION BY HASH(c1) is the same as that of PARTITION BY KEY(c1).
Supported PARTITION BY HASH(YEAR(c1))
  1. The YEAR function is used to calculate the value of year based on the value of the c1 column.
  2. The value of year is used to calculate the hash value year_hash by using the consistent hashing algorithm.
  3. Data is routed to partitions based on the value of year_hash.
Vector partition key Not supported PARTITIONBY HASH(c1,c2,...,cn)
  • Hot partition splitting is not supported.
  • After a table is created, multiple partition columns are used at the same time to calculate how data in the table is routed to partitions.
  • By default, up to five partition columns are supported.
  1. The values of the partition columns (c1,c2,...,cn) are used as a vector partition key to calculate the hashes of each column in the vector by using the consistent hashing algorithm. A vector of hashes hashVal corresponding to the vector partition key is obtained.
  2. Data is routed to partitions based on the value of hashVal.
Example 1-1:KEY partitioning
KEY partitioning is the default partitioning type for PolarDB-X. KEY partitioning supports vector partition keys. For example, if you want to use the name and id columns as the partition key, you can set the number of partitions to 8 and execute the following statement to create the table:
CREATE TABLE key_tbl(
 id bigint not null auto_increment,
 bid int,
 name varchar(30),
 birthday datetime not null,
 primary key(id)
)
PARTITION BY KEY(name, id)
PARTITIONS 8;
In this example, a vector partition key (name, id) is used to partition the table. By default, only the first column name in the vector is used to calculate how data is routed to partitions. Therefore, to optimize a SQL statement by using partition pruning, you need only to specify a condition based on the first column in the vector partition key in the WHERE clause of the SQL query.
## The following SQL statement meets the partition pruning condition and scans data only in one partition.
SELECT id from key_tbl where name='Jack';

If the data in the name column is not evenly distributed or specific data is frequently accessed, you can use another partition columns in the vector, such as the id column, to split partitions for optimization. For more information, see ALTER TABLEGROUP.

If a vector contains N partition columns and the first K (1 ≤ K ≤ N) partition columns are used to calculate how data is routed to partitions, you need only to specify a condition based on the first K columns in the vector partition key in the WHERE clause of the SQL query for partition pruning.

Example 1-2: HASH partitioning
If you want to use the id column as the partition key to horizontally partition the table, you can set the number of partitions to 8 and execute the following statement to create the table by using HASH partitioning:
CREATE TABLE hash_tbl(
 id bigint not null auto_increment,
 bid int,
 name varchar(30),
 birthday datetime not null,
 primary key(id)
)
partition by hash(id)
partitions 8;
In HASH partitioning, you can use partitioning functions, such as YEAR and TO_DAYS, as expressions to convert data in the time format to integers. Therefore, to use the birthday column as the partition key, you can set the number of partitions to 8 and execute the following statement to create the table:
CREATE TABLE hash_tbl_todays(
 id bigint not null auto_increment,
 bid int,
 name varchar(30),
 birthday datetime not null,
 primary key(id)
)
PARTITION BY HASH(TO_DAYS(birthday))
PARTITIONS 8;

PolarDB-X supports the following partitioning functions:

  • YEAR
  • TO_DAYS
  • TO_SECOND
  • MONTH
  • UNIX_TIMESTAMP

Therefore, if you want to use partitioning functions in the statement, data in the partition key must be in the following formats: DATE, DATETIME, and TIMESTAMP.

Example 1-3: Expanded HASH partitioning
PolarDB-X expands the syntax of HASH partitioning and allows you to use vector partition keys that are supported by the native MySQL syntax. For example, you can execute the following statement to use a vector partition key (name, birthday) for HASH partitioning:
CREATE TABLE hash_tbl2(
 id bigint not null auto_increment,
 bid int,
 name varchar(30),
 birthday datetime not null,
 primary key(id)
)
PARTITION BY HASH(name, birthday)
PARTITIONS 8;

Different from KEY partitioning, HASH partitioning uses vector partition keys. All partition columns are used to calculate how data is routed to partitions. Therefore, you must specify a condition based on all partition columns in the WHERE clause of a SQL statement for partition pruning. In the following examples, SQL1 can meet the partition pruning condition of hash_tbl2, and SQL2 cannot meet the partition pruning condition of hash_tbl2.

## The following SQL statement SQL1 meets the partition pruning condition and scans data only in one partition.
SELECT id from hash_tbl2 where name='Jack' and birthday='1990-11-11';

## The following SQL statement SQL2 does not meet the partition pruning condition and scans data in all partitions.
SELECT id from hash_tbl2 where name='Jack';

In HASH partitioning, all partition keys are used to calculate the hashes. Therefore, data is more evenly distributed in HASH partitioning than KEY partitioning in which only part of columns in the vector partition key are used to calculate the hashes. However, HASH partitioning does not support hot partition splitting because all partition columns are used to calculate the hashes and no columns are available for hot partition splitting.

Limits
  • Limits on data type
    • Only the following integer formats are supported: BIGINT, BIGINT UNSINGED, INT UNSINGED, INT, MEDIUMINT, MEDIUMINT UNSINGED, SMALLINT, SMALLINT UNSINGED, TINYINT, and TINYINT UNSINGED.
    • Only the following date and time formats are supported: DATETIME, DATE, and TIMESTAMP.
    • Only the following string formats are supported: CHAR and VARCHAR.
  • Limits on syntax
    • HASH partitioning supports partitioning functions in single-column partition keys that are in the time formats.
    • HASH partitioning does not support partitioning functions in vector partition keys and does not support hot partition splitting.
    • By default, the number of partitions cannot exceed 8192.
    • By default, up to five partition columns are supported.
Data distribution balancing
  • KEY partitioning and HASH partitioning both use the built-in consistent hashing algorithm MurmurHash3. This algorithm is widely tested in the industry and is proven to have low data collision and high performance.
  • When you use KEY partitioning or HASH partitioning, data distribution across different partitions become balanced when the number of different values in the partition key is larger than 3000 based on the MurmurHash3 algorithm. The more the number of different values in the partition key, the more balanced the data distribution is.

RANGE partitioning

In PolarDB-X, a variant of RANGE partitioning is RANGE COLUMNS partitioning. RANGE partitioning and RANGE COLUMNS partitioning are standard partitioning syntaxes supported by the native MySQL. The following table lists the difference between RANGE partitioning and RANGE COLUMNS partitioning.

Table 2. Comparison between RANGE partitioning and RANGE COLUMNS partitioning
Partitioning type Supported partition key Partitioning function Syntax sample Features and limits Routing policy (point query)
RANGE COLUMNS partitioning Single-column partition key and vector partition key Not supported PARTITION BY RANGE COLUMNS (c1,c2,...,cn) ( PARTITION p1 VALUES LESS THAN (1,10,...,1000), PARTITION p2 VALUES LESS THAN (2,20,...,2000) ...) Hot partition splitting is supported. For example, if a large number of rows in the c1 partition column are the same value, such as 88, you can use the c2 partition column to perform hot partition splitting.
  1. The values of the partition columns (c1,c2,...,cn) are used as a vector partition key.
  2. This partition key is used to determine the partition to which data is routed based on the binary search algorithm.
RANGE partitioning Single-column partition key Supported PARTITION BY RANGE(YEAR(c1)) ( PARTITION p1 VALUES LESS THAN (2019), PARTITION p2 VALUES LESS THAN (2021) ...)
  • Hot partition splitting is not supported.
  • Only the following five partitioning functions are supported:
    • YEAR
    • TO_DAYS
    • TO_SECOND
    • MONTH
    • UNIX_TIMESTAMP
  1. The YEAR function is used to calculate the value of year based on the value of the c1 column.
  2. The value of year is used to determine the partition to which data is routed based on the binary search algorithm.
Example 2-1: RANGE COLUMNS partitioning

RANGE COLUMNS partitioning supports vector partition keys but does not support partitioning functions. For example, you can execute the following statement to create a table by using the vector partition key (order_id, order_time):

CREATE TABLE orders(
 order_id int,
 order_time datetime not null)
PARTITION BY range columns(order_id,order_time)
(
  PARTITION p1 VALUES LESS THAN (10000,'2021-01-01'),
  PARTITION p2 VALUES LESS THAN (20000,'2021-01-01'),
  PARTITION p3 VALUES LESS THAN (30000,'2021-01-01'),
  PARTITION p4 VALUES LESS THAN (40000,'2021-01-01'),
  PARTITION p5 VALUES LESS THAN (50000,'2021-01-01'),
  PARTITION p6 VALUES LESS THAN (MAXVALUE,MAXVALUE)
);
Notice RANGE COLUMNS partitioning does not support partition keys that contains data in the TIMESTAMP or TIME format.
Example 2-2:RANGE partitioning

RANGE partitioning supports only single-column partition keys. If you want to use a column that contains date and time values as the partition key, you must use an expression that contains a partitioning function such as YEAR, TO_DAYS, TO_SECONDS, or MONTH to generate integer values based on the date and time values in the column.

Notice Columns that contain string values cannot be used to as partitioning columns in RANGE partitioning.

For example, you can execute the following statement to create a partitioned table that is partitioned based on the order_time column and stores data in different partitions by quarter:

CREATE TABLE orders_todays(
 id int,
 order_time datetime not null)
PARTITION BY RANGE(to_days(order_time))
(
  PARTITION p1 VALUES LESS THAN (to_days('2021-01-01')),
  PARTITION p2 VALUES LESS THAN (to_days('2021-04-01')),
  PARTITION p3 VALUES LESS THAN (to_days('2021-07-01')),
  PARTITION p4 VALUES LESS THAN (to_days('2021-10-01')),
  PARTITION p5 VALUES LESS THAN (to_days('2022-01-01')),
  PARTITION p6 VALUES LESS THAN (MAXVALUE)
);
Notice RANGE partitioning supports only columns that contain integer values as partition keys. In RANGE partitioning, a partition key can contain only one partition column.
Limits
  • Limits on data type
    • Only the following integer formats are supported: BIGINT, BIGINT UNSINGED, INT UNSINGED, INT, MEDIUMINT, MEDIUMINT UNSINGED, SMALLINT, SMALLINT UNSINGED, TINYINT, and TINYINT UNSINGED.
    • Only the following date and time formats are supported: DATETIME and DATE.
    • Only the following string formats are supported: CHAR and VARCHAR.
  • Limits on syntax
    • RANGE COLUMNS partitioning and RANGE partitioning do not support the value of null as the boundary of a value range.
    • RANGE COLUMNS partitioning does not support columns that contains timestamp values as partition keys.
    • RANGE partitioning supports only columns that contain integer values as partition keys. To use columns that contain timestamp values as partition keys, you must use the UNIX_TIMSTAMP partitioning function to ensure that the time zones of the data are consistent.
    • RANGE partitioning does not support hot partition splitting.
    • If you query a value of null, the data is routed as the minimum value.
    • By default, the number of partitions cannot exceed 8192.
    • By default, up to five partition columns are supported.

LIST partitioning

In PolarDB-X, a variant of LIST partitioning is LIST COLUMNS partitioning. LIST partitioning and LIST COLUMNS partitioning are standard partitioning syntaxes supported by the native MySQL. The following table lists the difference between LIST partitioning and LIST COLUMNS partitioning.

Table 3. Comparison between LIST partitioning and LIST COLUMNS partitioning
Partitioning type Supported partition key Partitioning function Syntax sample Features and limits Routing policy (point query)
LIST COLUMNS partitioning Single-column partition key and vector partition key Not supported PARTITION BY LIST COLUMNS (c1,c2,...,cn) ( PARTITION p1 VALUES IN ((1,10,...,1000),(2,20,...,2000) ), PARTITION p2 VALUES IN ((3,30,...,3000),(3,30,...,3000) ), ...) Hot partition splitting is not supported.
  1. The values of the partition columns (c1,c2,...,cn) are used as a vector partition key.
  2. This partition key is used to determine the partition to which data is routed based on the binary search algorithm.
LIST partitioning Single-column partition key Supported PARTITION BY LIST(YEAR(c1)) ( PARTITION p1 VALUES IN (2018,2019), PARTITION p2 VALUES IN (2020,2021) ...) Hot partition splitting is not supported.
Example 3-1: LIST COLUMNS partitioning

LIST COLUMNS partitioning supports vector partition keys. For example, you can execute the following statement to create a table by using the vector partition key (country, city):

CREATE TABLE orders_region(
 id int,
 country varchar(64),
 city varchar(64),
 order_time datetime not null)
PARTITION BY LIST COLUMNS(country,city)
(
  PARTITION p1 VALUES IN (('China','Hangzhou'), ('China','Beijing')),
  PARTITION p2 VALUES IN (('United States','NewYork'),('United States','Chicago')),
  PARTITION p3 VALUES IN (('Russian','Moscow'))
);
Notice LIST COLUMNS partitioning does not support partition keys that contains data in the TIMESTAMP or TIME format.
Example 3-2: LIST partitioning

LIST partitioning supports only single-column partition keys. If you want to use a column that contains date and time values as the partition key, you must use an expression that contains a partitioning function such as YEAR, TO_DAYS, TO_SECONDS, or MONTH to generate integer values based on the date and time values in the column.

For example, you can execute the following statement to create a partitioned table that is partitioned based on the order_time column and stores data in different partitions by year:

CREATE TABLE orders_years(
 id int,
 country varchar(64),
 city varchar(64),
 order_time datetime not null)
PARTITION BY LIST(YEAR(order_time))
(
  PARTITION p1 VALUES IN (1990,1991,1992,1993,1994,1995,1996,1997,1998,1999),
  PARTITION p2 VALUES IN (2000,2001,2002,2003,2004,2005,2006,2007,2008,2009),
  PARTITION p3 VALUES IN (2010,2011,2012,2013,2014,2015,2016,2017,2018,2019)
);
Notice LIST partitioning supports only columns that contain integer values as partition keys. Columns that contain string values cannot be directly used as partition keys.
Limits
  • Limits on data type
    • Only the following integer formats are supported: BIGINT, BIGINT UNSINGED, INT UNSINGED, INT, MEDIUMINT, MEDIUMINT UNSINGED, SMALLINT, SMALLINT UNSINGED, TINYINT, and TINYINT UNSINGED.
    • Only the following date and time formats are supported: DATETIME and DATE.
    • Only the following string formats are supported: CHAR and VARCHAR.
  • Limits on syntax
    • LIST COLUMNS partitioning does not support columns that contains timestamp values as partition keys.
    • LIST partitioning supports only columns that contain integer values as partition keys.
    • LIST COLUMNS partitioning and LIST partitioning do not support hot partition splitting.
    • By default, the number of partitions cannot exceed 8192.
    • By default, up to five partition columns are supported.

Support data types

Table 4. Data types supported by the partition keys in different partitioning types
Data type HASH partitioning RANGE partitioning LIST partitioning
HASH partitioning KEY partitioning RANGE partitioning RANGE COLUMNS partitioning LIST partitioning LIST COLUMNS partitioning
Single partition column Multiple partition columns
Integer TINYINT 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported
TINYINT UNSIGNED 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported
SMALLINT 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported
SMALLINT UNSIGNED 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported
MEDIUMINT 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported
MEDIUMINT UNSIGNED 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported
INT 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported
INT UNSIGNED 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported
BIGINT 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported
BIGINT UNSIGNED 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported 456789Supported
Date and time DATE 456789(Partitioning functions are supported by partition keys that contain this type of data) 456789Supported 456789Supported 456789(Partitioning functions are supported by partition keys that contain this type of data) 456789Supported 456789(Partitioning functions are supported by partition keys that contain this type of data) 456789Supported
DATETIME 456789(Partitioning functions are supported by partition keys that contain this type of data) 456789Supported 456789Supported 456789(Partitioning functions are supported by partition keys that contain this type of data) 456789Supported 456789(Partitioning functions are supported by partition keys that contain this type of data) 456789Supported
TIMESTAMP 456789(Partitioning functions are supported by partition keys that contain this type of data) 456789Supported 456789Supported 456789Not supported 456789Not supported 456789Not supported 456789Not supported
String CHAR 456789Supported 456789Supported 456789Supported 456789Not supported 456789Supported 456789Not supported 456789Supported
VARCHAR 456789Supported 456789Supported 456789Supported 456789Not supported 456789Supported 456789Not supported 456789Supported

Data types of partition columns and routing policies

The routing policy in partitioning is determined by the data type of the partition columns, especially in HASH partitioning and KEY partitioning. Therefore, when you use partition columns that contain different types of data in partitioning, data is routed to partitions in different manners based on different hashing or comparison algorithms. For example, the comparison algorithms used to route data may be either case-sensitive or case-insensitive based on the data type of partition columns. The routing policy of MySQL is also determined by data types.

In the following example, a table named tbl_int is partitioned into 1024 partitions based on a partition column that contains INT values, and a table named tbl_bigint is partitioned into 1024 partitions based on a partition column that contains BIGINT values The data types of the partition columns of the two tables are different. Therefore, data is routed to different partitions when the same value (12345678) is queried in the two tables.

show create table tbl_int;
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE   | CREATE TABLE                                                                                                                                                                |
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tbl_int | CREATE TABLE `tbl_int` (
    `a` int(11) NOT NULL,
    KEY `auto_shard_key_a` USING BTREE (`a`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4
PARTITION BY KEY(`a`)
PARTITIONS 1024 |
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.02 sec)

show create table tbl_bigint;
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE      | CREATE TABLE                                                                                                                                                                      |
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tbl_bigint | CREATE TABLE `tbl_bigint` (
    `a` bigint(20) NOT NULL,
    KEY `auto_shard_key_a` USING BTREE (`a`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4
PARTITION BY KEY(`a`)
PARTITIONS 1024 |
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.10 sec)mysql> create table if not exists tbl_bigint(a bigint not null)
    -> partition by key(a) partitions 1024;
Query OK, 0 rows affected (28.41 sec)

explain select * from tbl_int where a=12345678;
+---------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN                                                                             |
+---------------------------------------------------------------------------------------------------+
| LogicalView(tables="tbl_int[p260]", sql="SELECT `a` FROM `tbl_int` AS `tbl_int` WHERE (`a` = ?)") |
| HitCache:false                                                                                    |
| Source:PLAN_CACHE                                                                                 |
| TemplateId: c90af636                                                                              |
+---------------------------------------------------------------------------------------------------+
4 rows in set (0.45 sec)

explain select * from tbl_bigint where a=12345678;
+------------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN                                                                                      |
+------------------------------------------------------------------------------------------------------------+
| LogicalView(tables="tbl_bigint[p477]", sql="SELECT `a` FROM `tbl_bigint` AS `tbl_bigint` WHERE (`a` = ?)") |
| HitCache:false                                                                                             |
| Source:PLAN_CACHE                                                                                          |
| TemplateId: 9b2fa47c                                                                                       |
+------------------------------------------------------------------------------------------------------------+
4 rows in set (0.02 sec)
            
Change of the charset and collation of partition columns

The routing policy of a partitioned table is determined by the data type of partition columns. If you change the charset and collation of the partition columns, the distribution of all data in the table also changes. Therefore, exercise caution when you change the charset and collation of partition columns.

Data type truncation and conversion of partition columns

Data type truncation of partition columns

When a SQL statement is executed to query or insert data, if the values specified by the constant expression in the partition columns exceed the valid range of the partition column data type, PolarDB-X truncates the specified values based on the data types of the partition columns and then uses the truncated values to calculate the routing policy.

For example, the data type of the partition column of a table named tbl_smallint is SMALLINT whose valid range is [-32768, 32767]. In this case, if you insert a value that exceeds the valid range, such as 12345678 or -12345678, the value is truncated to 32767 or -32768. The following example shows how the value is truncated.

show create table tbl_smallint;
+--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE        | CREATE TABLE                                                                                                                                                                        |
+--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tbl_smallint | CREATE TABLE `tbl_smallint` (
    `a` smallint(6) NOT NULL,
    KEY `auto_shard_key_a` USING BTREE (`a`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4
PARTITION BY KEY(`a`)
PARTITIONS 128 |
+--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.06 sec)
set sql_mode='';
Query OK, 0 rows affected (0.00 sec)

insert into tbl_smallint values (12345678),(-12345678);
Query OK, 2 rows affected (0.07 sec)

select * from tbl_smallint;
+--------+
| a      |
+--------+
| -32768 |
|  32767 |
+--------+
2 rows in set (3.51 sec)

explain select * from tbl_smallint where a=12345678;
+------------------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN                                                                                            |
+------------------------------------------------------------------------------------------------------------------+
| LogicalView(tables="tbl_smallint[p117]", sql="SELECT `a` FROM `tbl_smallint` AS `tbl_smallint` WHERE (`a` = ?)") |
| HitCache:false                                                                                                   |
| Source:PLAN_CACHE                                                                                                |
| TemplateId: afb464d5                                                                                             |
+------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.16 sec)

explain select * from tbl_smallint where a=32767;
+------------------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN                                                                                            |
+------------------------------------------------------------------------------------------------------------------+
| LogicalView(tables="tbl_smallint[p117]", sql="SELECT `a` FROM `tbl_smallint` AS `tbl_smallint` WHERE (`a` = ?)") |
| HitCache:true                                                                                                    |
| Source:PLAN_CACHE                                                                                                |
| TemplateId: afb464d5                                                                                             |
+------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.03 sec)

If you query a value that exceeds the valid range of the partition column data type, the value is truncated and then used to calculate the routing policy. Therefore, when you perform a query on the tbl_smallint table, the returned results are the same whether you set the value of the partition column to 12345678 or 32767 in the query.

Data type truncation of partition columns

When a SQL statement is executed to query or insert data, if the data types specified by the constant expression in the partition columns are different from the actual data types of the partition columns, PolarDB-X implicitly converts the data types of the constant expression and then calculate the routing policy based on the converted data types. However, the conversion of data types may fail. For example, the string 'abc' cannot be converted to an integer.

When PolarDB-X converts the data types of partition columns, different operations are performed for DQL, DML, and DDL statements.

  • DQL statements in which the data types of partition columns in the WHERE clause are converted
    • If the data types are converted, data is routed to partitions based on the converted data types.
    • If the data types fail to be converted, the condition specified based on the partition columns are ignored, and all data in the table is queried.
  • DML statements such as INSERT and REPLACE
    • If the data types are converted, data is routed to partitions based on the converted data types.
    • If the data types fail to be converted, an error is returned and the statement is not executed.
  • DDL statements related to partitioned tables such as CREATE TABLE and ALTER TABLE
    • An error is returned and the statement is not executed even if the data types are converted because data type conversion is not supported in DDL statements.
    • If the data types fail to be converted, an error is returned and the statement is not executed.

Difference between CREATE TABLE syntaxes in MySQL and PolarDB-X

Difference MySQL PolarDB-X
Whether the partition key is required to contain the primary key Required Not required
KEY partitioning Routing algorithm: modulo by the number of partitions Routing algorithm: consistent hashing
HASH partitioning
  1. Routing algorithm: modulo by the number of partitions
  2. Multiple partition columns are not supported
  3. LINEAR HASH partitioning is supported.
  4. The routing algorithms of Hash(col) and Key(col) are different when a single partition column is used.
  1. Routing algorithm: consistent hashing
  2. Multiple partition columns are supported (expanded syntax)
  3. LINEAR HASH partitioning is not supported.
  4. The routing algorithms of Hash(col) and Key(col) are the same when a single partition column is used.
Partitioning function Supported In the PARTITION BY HASH(expr(col)) statement, expr can be a common expression, such as YEAR(col) + 1. Supported with limits In the PARTITION BY HASH(expr(col)) statement, expr can be only one of the following functions. Other operations, such as addition, subtraction, multiplication, and division cannot be used in the expression.
  • YEAR
  • TO_DAYS
  • TO_SECOND
  • MONTH
  • UNIX_TIMESTAMP
Data type of partition column In KEY partitioning, columns that contain any type of values can be used as partition columns. In KEY partitioning, only columns that contain integer, time, and string values can be used as partition columns.
Character set of partition column Partition columns support all common character sets. Partition columns support only the following three character sets:
  • utf8
  • utf8mb4
  • gbk
Subpartitioning Supported Not supported