You can execute the CREATE TABLE statement to create a partitioned table. This topic describes the syntax of the CREATE TABLE statement and the clauses, parameters, and basic methods that can be used in the CREATE TABLE statement.
Usage notes
Before you execute the CREATE TABLE statement to create a partitioned table in a logical
database, make sure that the logical database is created by using the auto partitioning
mode. The CREATE TABLE statement cannot be used to create a partitioned table in databases
that are created by using the drds partitioning mode. You can execute the SHOW CREATE DATBASE db_name
statement to view the partitioning mode of the logical database in which you want
to create a table. Example:
CREATE DATABASE part_db mode='auto';
Query OK, 1 row affected (4.29 sec)
SHOW CREATE DATABASE part_db;
+----------+-----------------------------------------------+
| DATABASE | CREATE DATABASE |
+----------+-----------------------------------------------+
| part_db | CREATE DATABASE `part_db` /* MODE = 'auto' */ |
+----------+-----------------------------------------------+
1 row in set (0.18 sec)
For more information about the statement and syntax that are used to create a database, see CREATE DATABASE.
Syntax
CREATE [PARTITION] TABLE [IF NOT EXISTS] tbl_name
(create_definition, ...)
[table_options]
[table_partition_definition]
[local_partition_definition]
create_definition:
col_name column_definition
| mysql_create_definition
| [UNIQUE] GLOBAL INDEX index_name [index_type] (index_sharding_col_name,...)
[global_secondary_index_option]
[index_option] ...
index_sharding_col_name:
col_name [(length)] [ASC | DESC]
index_option:
KEY_BLOCK_SIZE [=] value
| index_type
| WITH PARSER parser_name
| COMMENT 'string'
index_type:
USING {BTREE | HASH}
# Create a global secondary index.
global_secondary_index_option:
[COVERING (col_name,...)]
[partition_options]
table_options:
table_option [[,] table_option] ...
table_option: {
# Specify the table group to which the table that you want to create belongs.
TABLEGROUP [=] value,...,}
# Specify the type of the table that you want to create.
table_partition_definition:
single
| broadcast
| partition_options
# Specify the partitioning policy.
partition_options:
PARTITION BY
HASH({column_name | partition_func(column_name)})
| KEY(column_list)
| RANGE{({column_name | partition_func(column_name)})
| RANGE COLUMNS(column_list)}
| LIST{({column_name | partition_func(column_name)})
| LIST COLUMNS(column_list)} }
partition_list_spec
# Specify the partitioning function.
partition_func:
YEAR
| TO_DAYS
| TO_SECOND
| UNIX_TIMESTAMP
| MONTH
# Specify the partitioning type.
partition_list_spec:
hash_partition_list
| range_partition_list
| list_partition_list
# Specify the hash or key by which data is partitioned in HASH partitioning or KEY partitioning.
hash_partition_list:
PARTITIONS partition_count
# Specify the ranges by which data is partitioned in RANGE partitioning or RANGE COLUMNS partitioning.
range_partition_list:
range_partition [, range_partition ...]
range_partition:
PARTITION partition_name VALUES LESS THAN {(expr | value_list)} [partition_spec_options]
# Specify the lists by which data is partitioned in LIST partitioning or LIST COLUMNS partitioning.
list_partition_list:
list_partition [, list_partition ...]
list_partition:
PARTITION partition_name VALUES IN (value_list) [partition_spec_options]
partition_spec_options:
[[STORAGE] ENGINE [=] engine_name]
[COMMENT [=] 'string']
[{CHARSET | CHARACTER SET} [=] charset]
[COLLATE [=] collation]
[TABLEGROUP [=] table_group_id]
[LOCALITY [=] locality_option]
locality_option:
'dn=storage_inst_id_list'
storage_inst_id_list:
storage_inst_id[,storage_inst_id_list]
local_partition_definition:
LOCAL PARTITION BY RANGE (column_name)
[STARTWITH 'yyyy-MM-dd']
INTERVAL interval_count [YEAR|MONTH|DAY]
[EXPIRE AFTER expire_after_count]
[PRE ALLOCATE pre_allocate_count]
[PIVOTDATE pivotdate_func]
[DISABLE SCHEDULE]
pivotdate_func:
NOW()
| DATE_ADD(...)
| DATE_SUB(...)
Terms
- Partition key: the column or columns based on which a table is horizontally partitioned.
- Partition column: the column that is used to calculate how data is routed to the partitions of the table. A partition column is a part of the partition key. A partition key can contain one or more partition columns.
- Vector partition key: the partition key that contains one or more partition columns.
- Single-column partition key: the partition key that contains only one partition column.
- Prefix partition column: If a vector partition key contains N (N>1) partition columns, the first K (1<=K<=N) partition columns are defined as a prefix partition column of the vector partition key.
- Partitioning function: the function that uses the partition column as the input parameter and returns the results based on which data is routed to the partitions of the table.
- Partition pruning: a feature that optimizes queries by filtering out the partitions that do not need to be scanned based on the partitioning type and query conditions.
- Hot partition splitting: a feature that is used to balance the workload of partitions. When access to a partition focus on hot data in a prefix partition column of the vector partition key, you can split the partition based on the next partition key column.
Parameters
Parameter | Description |
---|---|
CHARSET | CHARACTER SET | Specifies the default character set that is used for columns in a table. You can use
one of the following character sets:
|
COLLATE | Specifies the default collation for columns in a table. You can use one of the following
collations:
|
TABLEGROUP | Specifies the table group to which the partitioned table belongs. If this parameter is not specified, PolarDB-X automatically assigns the partitioned table to a table group or creates a table group. In the table group, all tables are partitioned based on the same partitioning method. |
LOCALITY | Specifies the data node on which the partitioned table is deployed. |
Default automatic partitioning
- If you create a partitioned table without specifying a partition key, PolarDB-X automatically uses KEY partitioning to partition the table based on the primary key.
If no primary key is specified for the table, PolarDB-X uses the implicit primary
key to partition the table.
The number of partitions into which a table is partitioned by using default automatic partitioning can be calculated by by using the following formula: Number of partitions = Number of the logical nodes of the PolarDB-X instance x 8. For example, you set the number of logical nodes to 2 when you create a PolarDB-X instance. In this case, when you create a table on the instance without specifying a partition key, the table is automatically partitioned to 16 partitions by default.
- By default, the primary table is partitioned based on the primary key. All indexes in the primary table are automatically partitioned based on the index column and primary key column.
The following example shows the standard syntax of the CREATE TABLE statement. In the example, id indicates the primary key and name indicates the index column.
CREATE TABLE auto_part_tbl(
id bigint not null auto_increment,
bid int,
name varchar(30),
primary key(id),
index idx_name (name)
);
If you execute the SHOW CREATE TABLE
statement to query the details of a CREATE TABLE statement, the standard MySQL syntax
of the CREATE TABLE statement is displayed. The information about the partitions are
not displayed. The following example shows the execution results of the SHOW CREATE
TABLE statement:
show create table auto_part_tbl;
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE | CREATE TABLE |
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| auto_part_tbl | CREATE TABLE `auto_part_tbl` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`bid` int(11) DEFAULT NULL,
`name` varchar(30) DEFAULT NULL,
PRIMARY KEY (`id`),
INDEX `idx_name` (`name`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8 |
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.06 sec)
If you execute the SHOW FULL CREATE TABLE
statement to query the details of a CREATE TABLE statement, the information about
all partitions of the primary table and the index table are displayed. The following
example shows the execution results of the SHOW FULL CREATE TABLE statement:
show full create table auto_part_tbl;
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE | CREATE TABLE |
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| auto_part_tbl | CREATE PARTITION TABLE `auto_part_tbl` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`bid` int(11) DEFAULT NULL,
`name` varchar(30) DEFAULT NULL,
PRIMARY KEY (`id`),
GLOBAL INDEX /* idx_name_$a870 */ `idx_name` (`name`) PARTITION BY KEY (`name`, `id`) PARTITIONS 16,
LOCAL KEY `_local_idx_name` (`name`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8
PARTITION BY KEY(`id`)
PARTITIONS 16
/* tablegroup = `tg108` */ |
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.03 sec)
- The primary table named auto_part_tbl is automatically partitioned based on the id column into 16 partitions by using KEY partitioning.
- The index table named idx_name uses the global index by default and is partitioned into 16 partitions based on the 'name' and 'id' partition keys.
Create a non-partitioned table and a broadcast table
CREATE TABLE single_tbl(
id bigint not null auto_increment,
bid int,
name varchar(30),
primary key(id)
) SINGLE;
CREATE TABLE broadcast_tbl(
id bigint not null auto_increment,
bid int,
name varchar(30),
primary key(id)
) BROADCAST;
Partitioning types
- HASH partitioning: This type of partitioning routes data rows to partitions based on the values that are hashed by using the built-in consistent hashing algorithm. The hash values are calculated based on the values of a specified partition column or the values that are returned by a specified expression that contains a partitioning function. A variant of HASH partitioning is KEY partitioning. HASH partitioning and KEY partitioning are different in usage. In HASH partitioning, you can use an expression that contains a partitioning function. In KEY partitioning, you can use multiple partitioning columns as the partition key.
- RANGE partitioning: This type of partitioning routes data rows to partitions by evaluating whether the values of the specified partitioning column or the values that are returned by the specified expression that contains a partitioning function are within the specified ranges. A variant of RANGE partitioning is RANGE COLUMNS partitioning. RANGE partitioning and RANGE COLUMNS partitioning are different in usage. For range partitioning, you can use an expression that contains a partitioning function. For range columns partitioning, you can use multiple partitioning columns as the partition key.
- List partitioning: This type of partitioning is similar to range partitioning. List partitioning routes data rows to partitions by evaluating whether the values of a specified partitioning column or the values that are returned by a specified expression that contains a partitioning function are within the given lists of discrete values. The list type is divided into two types of partitioning policies: List Columns and List Partition, depending on whether multiple partition key column are used as partition keys and how they are used.
HASH partitioning
In PolarDB-X, a variant of HASH partitioning is KEY partitioning. HASH partitioning and KEY partitioning are standard partitioning syntaxes supported by the native MySQL. PolarDB-X provides flexible and powerful capabilities, such as splitting, merging, and migration, to manage partitions and supports hot partition splitting based on vector partition keys. PolarDB-X supports the HASH partitioning and KEY partitioning syntaxes of the native MySQL and further defines different routing behaviors for the two partitioning types. PolarDB-X supports only the syntaxes of HASH partitioning and KEY partitioning of the native MySQL but uses different manners to route data to different partitions. The following table lists the differences between KEY partitioning and HASH partitioning.
Partitioning type | Supported partition key | Partitioning function | Syntax sample | Features and limits | Routing policy (point query) |
---|---|---|---|---|---|
KEY partitioning (default) | Single-column partition key | Not supported | PARTITION BY KEY(c1) |
|
|
Vector partition key | Not supported | PARTITION BY KEY(c1,c2,...,cn) |
|
|
|
HASH partitioning | Single-column partition key | Not supported | PARTITION BY HASH(c1) |
|
The routing policy of PARTITION BY HASH(c1) is the same as that of PARTITION BY KEY(c1). |
Supported | PARTITION BY HASH(YEAR(c1)) |
|
|||
Vector partition key | Not supported | PARTITIONBY HASH(c1,c2,...,cn) |
|
|
CREATE TABLE key_tbl(
id bigint not null auto_increment,
bid int,
name varchar(30),
birthday datetime not null,
primary key(id)
)
PARTITION BY KEY(name, id)
PARTITIONS 8;
## The following SQL statement meets the partition pruning condition and scans data only in one partition.
SELECT id from key_tbl where name='Jack';
If the data in the name column is not evenly distributed or specific data is frequently accessed, you can use another partition columns in the vector, such as the id column, to split partitions for optimization. For more information, see ALTER TABLEGROUP.
If a vector contains N partition columns and the first K (1 ≤ K ≤ N) partition columns are used to calculate how data is routed to partitions, you need only to specify a condition based on the first K columns in the vector partition key in the WHERE clause of the SQL query for partition pruning.
Example 1-2: HASH partitioningCREATE TABLE hash_tbl(
id bigint not null auto_increment,
bid int,
name varchar(30),
birthday datetime not null,
primary key(id)
)
partition by hash(id)
partitions 8;
CREATE TABLE hash_tbl_todays(
id bigint not null auto_increment,
bid int,
name varchar(30),
birthday datetime not null,
primary key(id)
)
PARTITION BY HASH(TO_DAYS(birthday))
PARTITIONS 8;
PolarDB-X supports the following partitioning functions:
- YEAR
- TO_DAYS
- TO_SECOND
- MONTH
- UNIX_TIMESTAMP
Therefore, if you want to use partitioning functions in the statement, data in the partition key must be in the following formats: DATE, DATETIME, and TIMESTAMP.
Example 1-3: Expanded HASH partitioningCREATE TABLE hash_tbl2(
id bigint not null auto_increment,
bid int,
name varchar(30),
birthday datetime not null,
primary key(id)
)
PARTITION BY HASH(name, birthday)
PARTITIONS 8;
Different from KEY partitioning, HASH partitioning uses vector partition keys. All partition columns are used to calculate how data is routed to partitions. Therefore, you must specify a condition based on all partition columns in the WHERE clause of a SQL statement for partition pruning. In the following examples, SQL1 can meet the partition pruning condition of hash_tbl2, and SQL2 cannot meet the partition pruning condition of hash_tbl2.
## The following SQL statement SQL1 meets the partition pruning condition and scans data only in one partition.
SELECT id from hash_tbl2 where name='Jack' and birthday='1990-11-11';
## The following SQL statement SQL2 does not meet the partition pruning condition and scans data in all partitions.
SELECT id from hash_tbl2 where name='Jack';
In HASH partitioning, all partition keys are used to calculate the hashes. Therefore, data is more evenly distributed in HASH partitioning than KEY partitioning in which only part of columns in the vector partition key are used to calculate the hashes. However, HASH partitioning does not support hot partition splitting because all partition columns are used to calculate the hashes and no columns are available for hot partition splitting.
Limits- Limits on data type
- Only the following integer formats are supported: BIGINT, BIGINT UNSINGED, INT UNSINGED, INT, MEDIUMINT, MEDIUMINT UNSINGED, SMALLINT, SMALLINT UNSINGED, TINYINT, and TINYINT UNSINGED.
- Only the following date and time formats are supported: DATETIME, DATE, and TIMESTAMP.
- Only the following string formats are supported: CHAR and VARCHAR.
- Limits on syntax
- HASH partitioning supports partitioning functions in single-column partition keys that are in the time formats.
- HASH partitioning does not support partitioning functions in vector partition keys and does not support hot partition splitting.
- By default, the number of partitions cannot exceed 8192.
- By default, up to five partition columns are supported.
- KEY partitioning and HASH partitioning both use the built-in consistent hashing algorithm MurmurHash3. This algorithm is widely tested in the industry and is proven to have low data collision and high performance.
- When you use KEY partitioning or HASH partitioning, data distribution across different partitions become balanced when the number of different values in the partition key is larger than 3000 based on the MurmurHash3 algorithm. The more the number of different values in the partition key, the more balanced the data distribution is.
RANGE partitioning
In PolarDB-X, a variant of RANGE partitioning is RANGE COLUMNS partitioning. RANGE partitioning and RANGE COLUMNS partitioning are standard partitioning syntaxes supported by the native MySQL. The following table lists the difference between RANGE partitioning and RANGE COLUMNS partitioning.
Partitioning type | Supported partition key | Partitioning function | Syntax sample | Features and limits | Routing policy (point query) |
---|---|---|---|---|---|
RANGE COLUMNS partitioning | Single-column partition key and vector partition key | Not supported | PARTITION BY RANGE COLUMNS (c1,c2,...,cn) ( PARTITION p1 VALUES LESS THAN (1,10,...,1000), PARTITION p2 VALUES LESS THAN (2,20,...,2000) ...) | Hot partition splitting is supported. For example, if a large number of rows in the c1 partition column are the same value, such as 88, you can use the c2 partition column to perform hot partition splitting. |
|
RANGE partitioning | Single-column partition key | Supported | PARTITION BY RANGE(YEAR(c1)) ( PARTITION p1 VALUES LESS THAN (2019), PARTITION p2 VALUES LESS THAN (2021) ...) |
|
|
RANGE COLUMNS partitioning supports vector partition keys but does not support partitioning functions. For example, you can execute the following statement to create a table by using the vector partition key (order_id, order_time):
CREATE TABLE orders(
order_id int,
order_time datetime not null)
PARTITION BY range columns(order_id,order_time)
(
PARTITION p1 VALUES LESS THAN (10000,'2021-01-01'),
PARTITION p2 VALUES LESS THAN (20000,'2021-01-01'),
PARTITION p3 VALUES LESS THAN (30000,'2021-01-01'),
PARTITION p4 VALUES LESS THAN (40000,'2021-01-01'),
PARTITION p5 VALUES LESS THAN (50000,'2021-01-01'),
PARTITION p6 VALUES LESS THAN (MAXVALUE,MAXVALUE)
);
RANGE partitioning supports only single-column partition keys. If you want to use a column that contains date and time values as the partition key, you must use an expression that contains a partitioning function such as YEAR, TO_DAYS, TO_SECONDS, or MONTH to generate integer values based on the date and time values in the column.
For example, you can execute the following statement to create a partitioned table that is partitioned based on the order_time column and stores data in different partitions by quarter:
CREATE TABLE orders_todays(
id int,
order_time datetime not null)
PARTITION BY RANGE(to_days(order_time))
(
PARTITION p1 VALUES LESS THAN (to_days('2021-01-01')),
PARTITION p2 VALUES LESS THAN (to_days('2021-04-01')),
PARTITION p3 VALUES LESS THAN (to_days('2021-07-01')),
PARTITION p4 VALUES LESS THAN (to_days('2021-10-01')),
PARTITION p5 VALUES LESS THAN (to_days('2022-01-01')),
PARTITION p6 VALUES LESS THAN (MAXVALUE)
);
- Limits on data type
- Only the following integer formats are supported: BIGINT, BIGINT UNSINGED, INT UNSINGED, INT, MEDIUMINT, MEDIUMINT UNSINGED, SMALLINT, SMALLINT UNSINGED, TINYINT, and TINYINT UNSINGED.
- Only the following date and time formats are supported: DATETIME and DATE.
- Only the following string formats are supported: CHAR and VARCHAR.
- Limits on syntax
- RANGE COLUMNS partitioning and RANGE partitioning do not support the value of null as the boundary of a value range.
- RANGE COLUMNS partitioning does not support columns that contains timestamp values as partition keys.
- RANGE partitioning supports only columns that contain integer values as partition keys. To use columns that contain timestamp values as partition keys, you must use the UNIX_TIMSTAMP partitioning function to ensure that the time zones of the data are consistent.
- RANGE partitioning does not support hot partition splitting.
- If you query a value of null, the data is routed as the minimum value.
- By default, the number of partitions cannot exceed 8192.
- By default, up to five partition columns are supported.
LIST partitioning
In PolarDB-X, a variant of LIST partitioning is LIST COLUMNS partitioning. LIST partitioning and LIST COLUMNS partitioning are standard partitioning syntaxes supported by the native MySQL. The following table lists the difference between LIST partitioning and LIST COLUMNS partitioning.
Partitioning type | Supported partition key | Partitioning function | Syntax sample | Features and limits | Routing policy (point query) |
---|---|---|---|---|---|
LIST COLUMNS partitioning | Single-column partition key and vector partition key | Not supported | PARTITION BY LIST COLUMNS (c1,c2,...,cn) ( PARTITION p1 VALUES IN ((1,10,...,1000),(2,20,...,2000) ), PARTITION p2 VALUES IN ((3,30,...,3000),(3,30,...,3000) ), ...) | Hot partition splitting is not supported. |
|
LIST partitioning | Single-column partition key | Supported | PARTITION BY LIST(YEAR(c1)) ( PARTITION p1 VALUES IN (2018,2019), PARTITION p2 VALUES IN (2020,2021) ...) | Hot partition splitting is not supported. |
LIST COLUMNS partitioning supports vector partition keys. For example, you can execute the following statement to create a table by using the vector partition key (country, city):
CREATE TABLE orders_region(
id int,
country varchar(64),
city varchar(64),
order_time datetime not null)
PARTITION BY LIST COLUMNS(country,city)
(
PARTITION p1 VALUES IN (('China','Hangzhou'), ('China','Beijing')),
PARTITION p2 VALUES IN (('United States','NewYork'),('United States','Chicago')),
PARTITION p3 VALUES IN (('Russian','Moscow'))
);
LIST partitioning supports only single-column partition keys. If you want to use a column that contains date and time values as the partition key, you must use an expression that contains a partitioning function such as YEAR, TO_DAYS, TO_SECONDS, or MONTH to generate integer values based on the date and time values in the column.
For example, you can execute the following statement to create a partitioned table that is partitioned based on the order_time column and stores data in different partitions by year:
CREATE TABLE orders_years(
id int,
country varchar(64),
city varchar(64),
order_time datetime not null)
PARTITION BY LIST(YEAR(order_time))
(
PARTITION p1 VALUES IN (1990,1991,1992,1993,1994,1995,1996,1997,1998,1999),
PARTITION p2 VALUES IN (2000,2001,2002,2003,2004,2005,2006,2007,2008,2009),
PARTITION p3 VALUES IN (2010,2011,2012,2013,2014,2015,2016,2017,2018,2019)
);
- Limits on data type
- Only the following integer formats are supported: BIGINT, BIGINT UNSINGED, INT UNSINGED, INT, MEDIUMINT, MEDIUMINT UNSINGED, SMALLINT, SMALLINT UNSINGED, TINYINT, and TINYINT UNSINGED.
- Only the following date and time formats are supported: DATETIME and DATE.
- Only the following string formats are supported: CHAR and VARCHAR.
- Limits on syntax
- LIST COLUMNS partitioning does not support columns that contains timestamp values as partition keys.
- LIST partitioning supports only columns that contain integer values as partition keys.
- LIST COLUMNS partitioning and LIST partitioning do not support hot partition splitting.
- By default, the number of partitions cannot exceed 8192.
- By default, up to five partition columns are supported.
Support data types
Data type | HASH partitioning | RANGE partitioning | LIST partitioning | |||||
---|---|---|---|---|---|---|---|---|
HASH partitioning | KEY partitioning | RANGE partitioning | RANGE COLUMNS partitioning | LIST partitioning | LIST COLUMNS partitioning | |||
Single partition column | Multiple partition columns | |||||||
Integer | TINYINT | ![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
TINYINT UNSIGNED | ![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
|
SMALLINT | ![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
|
SMALLINT UNSIGNED | ![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
|
MEDIUMINT | ![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
|
MEDIUMINT UNSIGNED | ![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
|
INT | ![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
|
INT UNSIGNED | ![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
|
BIGINT | ![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
|
BIGINT UNSIGNED | ![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
|
Date and time | DATE | ![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
DATETIME | ![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
|
TIMESTAMP | ![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
|
String | CHAR | ![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
VARCHAR | ![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
Data types of partition columns and routing policies
The routing policy in partitioning is determined by the data type of the partition columns, especially in HASH partitioning and KEY partitioning. Therefore, when you use partition columns that contain different types of data in partitioning, data is routed to partitions in different manners based on different hashing or comparison algorithms. For example, the comparison algorithms used to route data may be either case-sensitive or case-insensitive based on the data type of partition columns. The routing policy of MySQL is also determined by data types.
In the following example, a table named tbl_int is partitioned into 1024 partitions based on a partition column that contains INT values, and a table named tbl_bigint is partitioned into 1024 partitions based on a partition column that contains BIGINT values The data types of the partition columns of the two tables are different. Therefore, data is routed to different partitions when the same value (12345678) is queried in the two tables.
show create table tbl_int;
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE | CREATE TABLE |
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tbl_int | CREATE TABLE `tbl_int` (
`a` int(11) NOT NULL,
KEY `auto_shard_key_a` USING BTREE (`a`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4
PARTITION BY KEY(`a`)
PARTITIONS 1024 |
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.02 sec)
show create table tbl_bigint;
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE | CREATE TABLE |
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tbl_bigint | CREATE TABLE `tbl_bigint` (
`a` bigint(20) NOT NULL,
KEY `auto_shard_key_a` USING BTREE (`a`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4
PARTITION BY KEY(`a`)
PARTITIONS 1024 |
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.10 sec)mysql> create table if not exists tbl_bigint(a bigint not null)
-> partition by key(a) partitions 1024;
Query OK, 0 rows affected (28.41 sec)
explain select * from tbl_int where a=12345678;
+---------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN |
+---------------------------------------------------------------------------------------------------+
| LogicalView(tables="tbl_int[p260]", sql="SELECT `a` FROM `tbl_int` AS `tbl_int` WHERE (`a` = ?)") |
| HitCache:false |
| Source:PLAN_CACHE |
| TemplateId: c90af636 |
+---------------------------------------------------------------------------------------------------+
4 rows in set (0.45 sec)
explain select * from tbl_bigint where a=12345678;
+------------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN |
+------------------------------------------------------------------------------------------------------------+
| LogicalView(tables="tbl_bigint[p477]", sql="SELECT `a` FROM `tbl_bigint` AS `tbl_bigint` WHERE (`a` = ?)") |
| HitCache:false |
| Source:PLAN_CACHE |
| TemplateId: 9b2fa47c |
+------------------------------------------------------------------------------------------------------------+
4 rows in set (0.02 sec)
Change of the charset and collation of partition columnsThe routing policy of a partitioned table is determined by the data type of partition columns. If you change the charset and collation of the partition columns, the distribution of all data in the table also changes. Therefore, exercise caution when you change the charset and collation of partition columns.
Data type truncation and conversion of partition columns
Data type truncation of partition columnsWhen a SQL statement is executed to query or insert data, if the values specified by the constant expression in the partition columns exceed the valid range of the partition column data type, PolarDB-X truncates the specified values based on the data types of the partition columns and then uses the truncated values to calculate the routing policy.
For example, the data type of the partition column of a table named tbl_smallint is SMALLINT whose valid range is [-32768, 32767]. In this case, if you insert a value that exceeds the valid range, such as 12345678 or -12345678, the value is truncated to 32767 or -32768. The following example shows how the value is truncated.
show create table tbl_smallint;
+--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE | CREATE TABLE |
+--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tbl_smallint | CREATE TABLE `tbl_smallint` (
`a` smallint(6) NOT NULL,
KEY `auto_shard_key_a` USING BTREE (`a`)
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4
PARTITION BY KEY(`a`)
PARTITIONS 128 |
+--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.06 sec)
set sql_mode='';
Query OK, 0 rows affected (0.00 sec)
insert into tbl_smallint values (12345678),(-12345678);
Query OK, 2 rows affected (0.07 sec)
select * from tbl_smallint;
+--------+
| a |
+--------+
| -32768 |
| 32767 |
+--------+
2 rows in set (3.51 sec)
explain select * from tbl_smallint where a=12345678;
+------------------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN |
+------------------------------------------------------------------------------------------------------------------+
| LogicalView(tables="tbl_smallint[p117]", sql="SELECT `a` FROM `tbl_smallint` AS `tbl_smallint` WHERE (`a` = ?)") |
| HitCache:false |
| Source:PLAN_CACHE |
| TemplateId: afb464d5 |
+------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.16 sec)
explain select * from tbl_smallint where a=32767;
+------------------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTIONPLAN |
+------------------------------------------------------------------------------------------------------------------+
| LogicalView(tables="tbl_smallint[p117]", sql="SELECT `a` FROM `tbl_smallint` AS `tbl_smallint` WHERE (`a` = ?)") |
| HitCache:true |
| Source:PLAN_CACHE |
| TemplateId: afb464d5 |
+------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.03 sec)
If you query a value that exceeds the valid range of the partition column data type, the value is truncated and then used to calculate the routing policy. Therefore, when you perform a query on the tbl_smallint table, the returned results are the same whether you set the value of the partition column to 12345678 or 32767 in the query.
Data type truncation of partition columnsWhen a SQL statement is executed to query or insert data, if the data types specified by the constant expression in the partition columns are different from the actual data types of the partition columns, PolarDB-X implicitly converts the data types of the constant expression and then calculate the routing policy based on the converted data types. However, the conversion of data types may fail. For example, the string 'abc' cannot be converted to an integer.
When PolarDB-X converts the data types of partition columns, different operations are performed for DQL, DML, and DDL statements.
- DQL statements in which the data types of partition columns in the WHERE clause are
converted
- If the data types are converted, data is routed to partitions based on the converted data types.
- If the data types fail to be converted, the condition specified based on the partition columns are ignored, and all data in the table is queried.
- DML statements such as INSERT and REPLACE
- If the data types are converted, data is routed to partitions based on the converted data types.
- If the data types fail to be converted, an error is returned and the statement is not executed.
- DDL statements related to partitioned tables such as CREATE TABLE and ALTER TABLE
- An error is returned and the statement is not executed even if the data types are converted because data type conversion is not supported in DDL statements.
- If the data types fail to be converted, an error is returned and the statement is not executed.
Difference between CREATE TABLE syntaxes in MySQL and PolarDB-X
Difference | MySQL | PolarDB-X |
---|---|---|
Whether the partition key is required to contain the primary key | Required | Not required |
KEY partitioning | Routing algorithm: modulo by the number of partitions | Routing algorithm: consistent hashing |
HASH partitioning |
|
|
Partitioning function | Supported In the PARTITION BY HASH(expr(col)) statement, expr can be a common expression, such as YEAR(col) + 1. | Supported with limits In the PARTITION BY HASH(expr(col)) statement, expr can be only
one of the following functions. Other operations, such as addition, subtraction, multiplication,
and division cannot be used in the expression.
|
Data type of partition column | In KEY partitioning, columns that contain any type of values can be used as partition columns. | In KEY partitioning, only columns that contain integer, time, and string values can be used as partition columns. |
Character set of partition column | Partition columns support all common character sets. | Partition columns support only the following three character sets:
|
Subpartitioning | Supported | Not supported |