This topic answers frequently asked questions about partitioned tables in PolarDB for MySQL.
Are partitioned tables supported in PolarDB for MySQL?
Yes. PolarDB for MySQL is fully compatible with MySQL and provides enhanced features and performance. For more information, see Overview of partitioned tables.
PolarDB for MySQL How to transform a standard table into a partitioned table?
You can use one of the following three methods to convert a standard table to a partitioned table:
Convert a standard table to a partitioned table
Create a partitioned table. Use Data Transmission Service (DTS) to perform an online data migration from the standard table to the partitioned table within the same instance. After the data is synchronized, rename the tables to complete the switch.
Before you switch the tables, briefly pause your services to ensure that all incremental data is synchronized.
For example, to convert the standard table t1 to a partitioned table, create a partitioned table named t1_partition. Then, perform an incremental migration from t1 to t1_partition using DTS. The statement to switch the table names is as follows:
rename table t1 to t1_bak, t1_partition to t1;You can use a copy DDL statement, such as
alter table ... partition by, to convert the table. During the DDL execution, write operations to the table are blocked.ALTER TABLE t1 PARTITION BY RANGE COLUMNS(create_time) ( PARTITION p0 VALUES LESS THAN ('2025-01-01'), PARTITION p1 VALUES LESS THAN ('2025-02-01'), PARTITION p2 VALUES LESS THAN ('2025-03-01') );When you quickly convert a standard table to a partitioned table, only the metadata is modified. All historical data is placed into the first partition. This process does not validate or redistribute data. Therefore, make sure that you correctly define the partition boundaries.
Compared with the copy DDL syntax, this method adds the WITHOUT VALIDATION option.
ALTER TABLE t1 PARTITION BY RANGE COLUMNS(create_time) ( PARTITION p0 VALUES LESS THAN ('2025-01-01'), PARTITION p1 VALUES LESS THAN ('2025-02-01'), PARTITION p2 VALUES LESS THAN ('2025-03-01') ) WITHOUT VALIDATION;
How many partitions can a table in PolarDB for MySQL support?
A table can have a maximum of 8,192 partitions. If you define subpartitions, the total number of subpartitions cannot exceed 8,192.
How much data is appropriate for partitioning a table?
Minimum data volume: No lower limit is imposed on the amount of data required for a table to be partitioned. You can even partition an empty table. However, partitioning a small table is not necessary.
Maximum data volume: A PolarDB for MySQL table can store up to 64 TB of data. Therefore, you must partition tables that are larger than 64 TB.
Other considerations: Unlike traditional MySQL databases, PolarDB for MySQL is optimized to support large tables. Online clusters have non-partitioned tables that are larger than 40 TB, and their access performance does not significantly decrease. For data volumes under 64 TB, partitioning is not strictly required. You can decide whether to create a partitioned table based on data growth projections and data management needs.
Data growth
When converting a standard table to a partitioned table, consider the time required for the conversion, except when quickly converting to a RANGE-partitioned table. Typically, you can use DTS to perform an online conversion by fully reading and rewriting the data. This takes about 5.8 hours for 1 TB of data. Then, you synchronize the incremental data. Alternatively, you can use a DDL operation to convert the table, but this blocks online write operations to the table during execution. Therefore, plan your partitions in advance. For example, if you expect a large data volume in the future that will require partitioning, do not wait until the data volume exceeds 10 TB. Plan the partitioning ahead of time. An online conversion for 5 TB of data takes more than a day.
Data management requirements
If you use partitioned tables mainly for data management, you can ignore the data volume in the following scenarios.
Historical data is deleted or archived by month. Only the data from the most recent 12 months is kept online. If you use a non-partitioned table, you must run a large DELETE transaction based on a time condition to clear one month of data, and then run the
OPTIMIZE TABLEcommand to release the space from the deleted data. By contrast, if you create a range-partitioned table by month, you can DROP a partition in seconds, which makes data cleanup highly efficient. Similarly, if your business requires managing data by day, week, quarter, or year, you also do not need to consider the data volume when deciding whether to use a partitioned table.Partitioned tables also simplify data management for Software as a Service (SaaS) customers who use HASH partitioning or LIST DEFAULT HASH partitioning by tenant.
Whether to partition a table mainly depends on the data size. However, in business scenarios, partitioning is often based on the number of rows. The data volume depends on the length of a single row and must be calculated for each specific case. Generally, one billion rows of 1 KB each are estimated to be 1 TB. You can consider partitioning a table when it reaches one billion rows. PolarDB for MySQL has online clusters with non-partitioned tables containing tens of billions of rows that perform without issues.
How many partitions are appropriate for a partitioned table?
The number of partitions depends on your business scenario and data volume, as long as it does not exceed 8,192.
How do I use partitioned tables?
Partitioning is usually closely related to your business. If your business data is time-related, use range partitioning. If your business data is related to regions or tenants, you can use LIST partitioning, HASH partitioning, or LIST DEFAULT HASH partitioning. If a primary partition contains too much data, you can use subpartitions. For more information, see Partitioning strategies.
Is database or table sharding required for PolarDB for MySQL databases?
No. Consider using partitioned tables instead of sharding. PolarDB for MySQL is a centralized database with a storage-compute decoupled architecture that uses shared storage, one writer node, and multiple reader nodes. A single partition or table can store up to 64 TB of data, so you do not need to consider sharding in the early stages.
How do I use table sharding in a PolarDB for MySQL database if a single table has too much data?
Use partitioned tables. For more information about partitioned tables, see Overview of partitioned tables.
After partitioning a table in a PolarDB for MySQL database, are the partition fragments stored on different nodes or on the same node?
A partitioned table splits data into small, independently managed chunks. The data is stored on the same node. For more information about partitioned tables, see Overview of partitioned tables.
If a single table in PolarDB for MySQL reaches hundreds of millions of records, is sharding required, or should I choose a partitioned table?
Use a partitioned table. For more information about partitioned tables, see Overview of partitioned tables.
Does a PolarDB for MySQL database support table partitioning? Is partitioning meaningful in PolarDB for MySQL?
Yes. A partitioned table splits data into small, independently managed chunks. This allows tables with large data volumes to maintain high performance and high availability. For more information about partitioned tables, see Overview of partitioned tables.
If a single table is estimated to have 2 TB of data, should I choose PolarDB for MySQL or PolarDB-X?
A single table in PolarDB for MySQL supports up to 64 TB. A data volume of 2 TB is relatively small, so we recommend PolarDB for MySQL. Because the data volume exceeds 1 TB, use a partitioned table.
Does PolarDB for MySQL support local indexes for partitioned tables? Can I add a secondary index to a specific primary partition or subpartition?
Yes, it does. For more information, see Partial indexes.
Are full-text indexes supported for partitioned tables?
No.
Are spatial data types, such as POINT or GEOMETRY, supported for partitioned tables?
Not supported
Are user-defined temporary tables supported for partitioned tables?
Not supported
Are foreign keys supported for partitioned tables?
Not supported
Are RANGE and LIST subpartitioning supported?
Yes. For more information, see Partitioned table types and usage.
Is the columnstore index supported for partitioned tables?
Support
Is X-Engine supported for partitioned tables?
Support
Is the table locked when I create or delete partitions?
PolarDB for MySQL 8.0.2 supports partition-level locks. It locks only the partition being operated on. This prevents large transactions from blocking DDL operations and does not affect DML operations on other partitions. For more information, see Online partition maintenance.
How do I delete a partition without a name?
For RANGE-LIST or LIST partitions, first view the table schema using the
SHOW CREATE TABLEstatement. Then, remove the partition using theALTER TABLE ... DROP PARTITIONcommand.For HASH partitions, first view the partition name using the
EXPLAIN SELECT * FROM ***statement. HASH partitions do not supportDROP PARTITION. To adjust the number of HASH buckets, you must re-partition the table using theALTER TABLE XXX PARTITION BY HASH(XXX) PARTITIONS NUM;statement.NoteThis operation can take a long time. Run it during off-peak hours.
Does partitioning in PolarDB for MySQL cause performance degradation?
Compared with a non-partitioned table, scanning the same amount of data in a partitioned table incurs a performance overhead due to switching between partitions. For the same amount of data, a non-partitioned table has only one B+ tree, while a partitioned table has a B+ tree for each partition. The tree depth is relatively lower, which results in better insert performance. For queries that use a `where` clause for partition pruning, a partitioned table can improve performance by reducing data scanning and computation. Compared with database and table sharding, a partitioned table also has performance advantages for JOIN and DDL operations.
Is there a separate charge for the partitioned table feature in PolarDB for MySQL?
The partitioned table feature is a built-in capability of the kernel and is free of charge.
Do I need to adjust any parameters when using partitioned tables?
Enable partition-level metadata locks (MDLs). For more information, see Online partition maintenance.
Can I migrate data from a non-partitioned table in a source database to a partitioned table in a destination database using DTS?
Yes. In the data synchronization task, manually create the partitioned table schema in the destination database. Then, configure the mapping relationship to synchronize the data.
Can I convert a standard table to a partitioned table during a major version upgrade?
Yes. Perform the following steps:
First, add a helper table without a primary key to the source database, such as
create table t1 (a int). This ensures that the task is interrupted if the precheck fails during the major version upgrade.Start the major version upgrade task and wait for the precheck to fail.
Create the partitioned table in the destination database. To do this, go to the Quota Center. In the row for the PolarDB for MySQL Major Version Upgrade quota, click Apply in the Operation column.
Delete the helper table, such as
t1. Click Continue Upgrade without skipping the precheck. The DTS precheck will again report an error that a table with the same name exists in the destination database.In the DTS console, ignore the error and restart the precheck. After the precheck succeeds, the synchronization task starts.
As long as the data storage format of the partitioned table is identical to that of the standard table, the subsequent synchronization tasks will complete successfully.
After setting up a partitioned table, how do I view information about all partitions?
To query partition information, you can query the information_schema.PARTITIONS table.
Can a table be partitioned only on integer columns?
You can use KEY, RANGE COLUMN, and LIST COLUMN syntax to partition columns with non-integer data. For more information, see KEY, RANGE, and LIST. You can also use a partitioning function to convert a data column into an integer column to use HASH, RANGE, or LIST partitioning.
What are the limits on partitioned tables?
The maximum number of partitions cannot exceed 8,192.
A single partition can store a maximum of 64 TB of data.
Foreign keys are not supported.
Full-text indexes are not supported.
How do I create a partitioned table?
Use the PARTITION BY syntax when you create the table. For information about how to create different types of partitioned tables, see Partitioned table types and usage.
How do I specify a partition key?
Specify the partition key using part_expr. For more information, see Partitioned table types and usage.
Is the partitioning support in PolarDB for MySQL the same as in PolarDB for PostgreSQL?
In PolarDB for PostgreSQL, partitions of a partitioned table are child tables, and each partition is an independent table. In PolarDB for MySQL, partitions are InnoDB tables. From the server layer's perspective, each partition is not an independent table.
Is there a significant performance improvement when partitioning a data table in PolarDB for MySQL?
Query performance is significantly improved for queries that use partition pruning based on partition key filter conditions. In addition, PolarDB for MySQL has optimized the performance of partitioned tables. For more information, see Overview of partitioned tables.
How does PolarDB handle large volumes of data, for example, a table that receives 10 million new rows per day, totaling 300 million rows per month?
For more information, see INTERVAL RANGE and Automated management of partitions. You can enable the online partition maintenance feature so that automatically adding or deleting partitions does not block DML operations on other partitions.
Are transactions supported for partitioned tables?
This is supported.
Are distributed transactions for database and table sharding required when using partitioned tables?
No.
Does partitioning improve write performance? Do data writes to different partitions block each other?
For large data volumes, write performance can be improved. Data writes to different partitions do not block each other.
Does adding a partition cause a lock timeout?
You can use the online partition maintenance feature to avoid this situation.
Why does dropping a partition block business operations on the entire table?
In native MySQL, dropping a partition acquires an MDL on the entire table, which blocks all write access. PolarDB for MySQL supports partition-level MDLs. Therefore, it only blocks DML operations on the partition being dropped and does not block DML operations on other partitions. This minimizes the impact on your business.
Does partitioning affect query and read/write performance?
Specify the partition key in your SQL statements to reduce the impact on performance.
How does OPTIMIZE TABLE work on a partitioned table?
OPTIMIZE TABLE acquires an MDL on the entire partitioned table, which blocks DML operations on all partitions. Use the REBUILD PARTITION command together with the online partition maintenance feature. This way, only DML operations on the partition being rebuilt are blocked, not on other partitions. This minimizes the impact on your business.
What is a safe way to delete data from a partitioned table?
You can create a new, empty temporary table with the same definition. Then, perform an EXCHANGE PARTITION operation on the partition from which you want to delete data. Finally, delete the temporary table.
Why is the query plan for my partitioned table inaccurate?
An inaccurate query plan for a partitioned table is mainly caused by inaccurate statistics. This issue has been addressed in version 8.0.2 with optimizations for partition-level statistics. Upgrade your kernel to version 8.0.2.
What should I do if partitions are unevenly distributed during partitioning?
Execute the conversion from a standard table to a partitioned table again, but do not add the WITHOUT VALIDATION keyword. After you re-run the partitioning operation, the system automatically re-validates the data and adjusts the partitions.
If you have a large amount of data, re-validating the data and setting up partitions can take a long time. Perform this operation during off-peak hours.
How do I partition a physical table in PolarDB for MySQL?
Generally, each partition is an InnoDB table. Hybrid partitions can be placed on other storage engines.
Can I restore data from a partition if I accidentally delete it?
Currently, only database-level and table-level data restoration is supported. Partition-level data restoration is not supported.
How do I resolve memory exhaustion caused by too many partitions?
This issue does not exist in PolarDB for MySQL versions 8.0.1 and 8.0.2 because the memory for partitions is shared. Upgrade your kernel version.
Why does the ADD PARTITION operation take a long time, and how can I avoid it?
The operation takes a long time because a large transaction is running on the partitioned table. PolarDB for MySQL version 8.0.2 supports partition-level MDLs, which block DML operations only on the partition being added, not on other partitions. This minimizes the impact on your business.
Can PolarDB automatically create table partitions?
Yes. For more information, see Introduction to INTERVAL RANGE and Automated management of partitions.
When using the automated partition management feature, is an event created on the RW node also executed on the RO node?
PolarDB for MySQL uses a shared storage architecture. An event created on the RW node is not executed on the RO node. Simply set the parameter to ENABLE.
When using the automated partition management feature, can the new RW node continue to execute an event after a primary/secondary failover occurs in the cluster?
After an HA failover, the new RW node can continue to execute the event.
When using automated partition management, does the event_scheduler parameter need to be set to ON on both the RW and RO nodes?
The parameter does not need to be set to ON on the RO node. You only need to set it to ON on the RW node.
How long does it take to convert a standard table to a partitioned table, for example, a table with 100 GB of data?
If you use the Quickly convert a standard table to a RANGE-partitioned table feature, the conversion can be completed in seconds. If the quick conversion feature is not suitable for your business scenario, you can use ALTER TABLE PARTITION BY to rewrite all the data in the table. This takes about 1 to 2 hours. The actual time depends on your cluster specifications and business workload.
What is the difference between LINEAR HASH partitioning and HASH partitioning?
HASH partitioning
This is also known as modulo hashing. It is the most common type of partitioning and routes data to partitions based on a modulo operation on the number of partitions.
LINEAR HASH partitioning is a hashing algorithm that performs calculations based on the properties of powers of two. Compared with HASH partitioning, it has the following advantages and disadvantages:
Advantages: When you add a new partition, it can only be created by splitting the previous specific partition. The proportion of data that needs to be read and written when adding or deleting partitions is very small.
Disadvantages: The data distribution is less uniform.
In a scenario where data is partitioned by tenant ID, how do I choose the partition type and number of partitions?
For scenarios where you partition by tenant ID, you can choose one of the following two partition types:
HASH partitioning
This is suitable for scenarios where tenant IDs are randomly generated and the data distribution is relatively scattered. You can set the number of partitions based on the following three scenarios:
For randomly generated tenant IDs, the amount of data in a single partition is generally calculated based on the total data volume. A single partition can contain between 5 million and 50 million rows. Because the data distribution may be uneven, the amount of data in a single partition is not absolute.
If the tenant IDs follow a certain pattern, such as splitting the data using 100, 200, or 500 partitions, and the data is unevenly distributed across the partitions, you can try using a prime number for the number of partitions.
If you have 1 billion rows of data, create 100 to 200 partitions. The average amount of data in a single partition will be about 5 million to 50 million rows.
HASH partitioning has two subtypes, HASH and KEY. They share the same principle and both use a modulo hashing algorithm.
If the partition key (tenant ID) is a numeric type, choose HASH partitioning.
If the partition key (tenant ID) is a character type, choose KEY partitioning.
LIST DEFAULT HASH partitioning
This is suitable for long-tail business scenarios where the data distribution is uneven and follows the 80/20 rule. For example, there are few large tenants with a large amount of data, but many small and medium-sized tenants with less data. Or, new small tenants may be added at any time and cannot all be enumerated when the table is created. In this scenario, you can use two partition types in one partitioned table. Use a separate LIST partition for each large tenant, or combine multiple large tenants into one LIST partition. The number of partitions depends on the number of large tenants and their data volume. Use HASH partitioning for other small and medium-sized tenants. The method for determining the number of HASH partitions is the same as for a HASH-partitioned table.
How do I choose an index when using a partitioned table?
For PolarDB for MySQL partitioned tables, you can choose one of the following types of indexes:
You can create different indexes on different partitions of a partitioned table based on your business scenario to meet the query needs of different partitions.
For equality query scenarios where the query condition does not include the partition key, you can use a global secondary index.
How do I migrate data from historical partitions of a partitioned table to partitions of another new partitioned table in the same database?
Assume that the partitioned table t1 has historical partitions p0 and p1. You need to migrate the data from these two historical partitions to the p0 and p1 partitions of a new partitioned table t2. Perform the following steps:
Create the new partitioned table
t2with the same schema ast1, including thep0andp1partitions.Create a standard table
tempwith the same schema as the partitioned table to be migrated.Execute EXCHANGE PARTITION to migrate data from the historical partition
p0of thet1table to thetemptable.ALTER TABLE t1 EXCHANGE PARTITION p0 WITH TABLE temp;If the data range of partition
p0int1is exactly the same as the data range of partitionp0int2, you can use the WITHOUT VALIDATION option in the data migration SQL statement to speed up the migration. For example:ALTER TABLE t1 EXCHANGE PARTITION p0 WITH TABLE temp WITHOUT VALIDATION;Execute the ALTER TABLE EXCHANGE PARTITION statement to migrate data from the
temptable to thep0partition of thet2table.ALTER TABLE t2 EXCHANGE PARTITION p0 WITH TABLE temp;Or, use the following SQL statement:
ALTER TABLE t2 EXCHANGE PARTITION p0 WITH TABLE temp WITHOUT VALIDATION;After the migration is complete, the data from the historical partition
p0of tablet1has been migrated to partitionp0of the new partitioned tablet2.Use the methods in Step 3 and Step 4 to migrate data from the historical partition
p1of tablet1to partitionp1of tablet2.After all migrations are complete, delete the
temptable.If other partitions in table
t1still need to be migrated tot2, you can use ADD PARTITION in tablet2to add new empty partitions, and then follow the steps above to migrate the data.
What should I do if an error occurs when I run an EXCHANGE PARTITION operation after running an INSTANT operation to add or modify columns in a partitioned or standard table?
When you run an EXCHANGE PARTITION operation after running an INSTANT operation to add or modify columns in a partitioned or standard table, the following error message is reported:
ERROR 1731 (HY000): Non matching attribute 'INSTANT COLUMN(s)' between partition and tableSolution: Rewrite the standard or partitioned table to remove the INSTANT information before performing the exchange. The rewrite commands are as follows:
For a standard or partitioned table in the InnoDB engine, the rewrite command is as follows:
ALTER TABLE table_name ENGINE=InnoDB;For a standard or partitioned table in the X-Engine engine, the rewrite command is as follows:
ALTER TABLE table_name ENGINE=xengine;
The rewrite operation is an online DDL and does not affect DML operations or queries, but it consumes resources. If the table has a large amount of data, run the rewrite operation during off-peak hours. You can use the following SQL statement to query the execution progress and estimated remaining time of the rewrite DDL:
SELECT
pl.ID,
pl.INFO,
esc.THREAD_ID,
esc.EVENT_NAME,
(esc.WORK_COMPLETED / esc.WORK_ESTIMATED) * 100 as PROGRESS,
pl.TIME / 60 AS `EXECUTED TIME(min)`,
ROUND(
(
esc.WORK_ESTIMATED * pl.TIME / esc.WORK_COMPLETED - pl.TIME
) / 60,
2
) AS `ESTIMATED REMAINING TIME(min)`
FROM
performance_schema.events_stages_current esc
LEFT JOIN performance_schema.threads th ON esc.thread_id = th.thread_id
LEFT JOIN information_schema.PROCESSLIST pl ON th.PROCESSLIST_ID = pl.ID;How do I query data from a partitioned table by specifying the partition name?
In PolarDB for MySQL, the syntax for querying data by specifying the partition name is as follows:
Syntax:
PARTITION (partition_names)In this syntax, partition_names represents a list of one or more partition names, in the following format:
partition_name, partition_name, ...Examples:
Query data from a specific partition
To query data from partition
p0of tablet1, you can use the following statement:SELECT * FROM t1 PARTITION (p0);
Query data from multiple partitions
To query data from multiple partitions at the same time, you can separate the partition names with commas:
SELECT * FROM t1 PARTITION (p0, p1, p2);
When you run a partition query, you must ensure that the specified partition names are exactly the same as the partition names in the actual table schema.
The partition query syntax is applicable to table types that support partitioned tables, and the database version must support the corresponding feature.