Core features and common usage scenarios of the AUTO mode - PolarDB

Split hot data to handle hot data

PolarDB-X provides the following methods to allow you to handle hot data.

Migrate data from the partition in which hot data is stored to a specific data node. This way, the dedicated storage resources in the data node can be used for the hot data to serve business. The hot data does not affect the business of non-hot data. To handle hot data, perform the following steps:
1. Execute the following statement to extract specific hot data to an independent partition:
```
ALTER TABLEGROUP #tgName EXTRACT to PARTITION #hotPartitionName BY HOT VALUE(#keyVal)
```
2. Execute the following statement to move an independent partition to a specified physical resource:
```
ALTER TABLEGROUP #tgName MOVE PARTITIONS #hotPartitionName TO #dn
```
If hot data exceeds the single-point performance of the machine, you can use the following method in PolarDB-X. To linearly scale your business in a more effective manner, execute the following statement to use the hash method to distribute hot data:
```
ALTER TABLEGROUP #tgName SPLIT INTO PARTITIONS #hotPartitionNamePrefix #N BY HOT VALUE(#keyVal);              
```
The preceding statement can split the hot data that corresponds to the value of the keyVal parameter into N parts, add the specified prefix to the partition names, and evenly distribute the partitions to different data nodes. This way, the hot data is linearly distributed among different data nodes.

Schedule partitions to balance data in a more flexible manner

A modulo operation is performed based on hash values to implement sharding in databases in DRDS mode. Make sure that a database shard is mapped to a table shard. For example, Database shard N must include Table shard m. You cannot modify the mapping between database shards and table shards. If you want to split, merge, or migrate a table shard, you need to rehash all table data.

You can migrate only database shards to a data node because table shards cannot be migrated to other data nodes. If you migrate a database shard that stores a large volume of data, data among data nodes cannot be balanced.

In AUTO mode, the consistent hashing routing algorithm is used for table partitioning. This way, you can merge, split, and migrate partitions in a flexible manner. For example, you can migrate your partitions to specified data nodes based on your business requirements. This operation does not affect irrelevant data in partitions. PolarDB-X can evenly distribute data to different data nodes by merging, splitting, and migrating data. This way, data can be balanced among data nodes.

Use TTL to automatically clean up historical data

In some business scenarios, a sudden increase occurs in the volume of business data, and the frequency in which business data are queried decreases significantly over a period of time. If the data is stored in PolarDB-X, storage space cannot be released and data queries become less efficient/are slowed down. In this case, customers choose to delete historical data from PolarDB-X after the data is archived. The historical data can be quickly deleted without affecting the existing business. This result meets customer requirements.

If DDL statements can be executed in PolarDB-X to delete historical data, data is cleared at a higher speed. PolarDB-X provides the time-to-live (TTL) feature in AUTO mode to allow you to quickly delete historical data.

If you use the syntax that is related to TTL when you create a table, a TTL table is created. In the following example, the t_order table is partitioned by time that is specified in the gmt_modified column. The table is partitioned at an interval of one month. Each partition expires 12 months after the partition is created. Create partitions three months before you use the partitions.

CREATE TABLE t_order (
  id bigint NOT NULL AUTO_INCREMENT,
  gmt_modified DATETIME NOT NULL,
  PRIMARY KEY (id, gmt_modified)
)
PARTITION BY HASH(id)
PARTITIONS 16
-- The following sample code provides the TTL syntax:
LOCAL PARTITION BY RANGE (gmt_modified)  -- The table is partitioned based on the values in the gmt_modified column.
INTERVAL 1 MONTH -- The table is partitioned at an interval of one month.
EXPIRE AFTER 12 -- Each partition expires 12 months after the partition is created.
PRE ALLOCATE 3;  -- The partitions are created three months before you use the partitions.

Use the LOCALITY attribute to specify the location where data is stored

The LOCALITY attribute in multiple dimensions is provided in databases in AUTO mode. You can use this attribute to manually specify the location where data is stored.

You can set the LOCALITY attribute of a database to the location where data is stored. For example, you can execute the following statement to specify the location:
```
CREATE DATABASE db1 MODE='AUTO' LOCALITY='dn=pxc-xdb-s-pxcexample'
```
You can set the LOCALITY attribute of a table to the location where data is stored. For example, you can execute the following statement to specify the location: For example, a non-partitioned table is stored in a data node. You can set the LOCALITY attribute to the physical location where the table is stored.
```
CREATE TABLE tb (a INT, b INT, PRIMARY KEY(a)) LOCALITY='dn=pxc-xdb-s-pxcexample'
```

The LOCALITY attribute for partitions will be supported in PolarDB-X V5.4.14. For example, in business scenarios in multiple regions, if your database is in AUTO mode and list partitioning is performed, data in a region can be stored in a single data node.

CREATE TABLE orders_region(
 id int,
 country varchar(64),
 city varchar(64),
 order_time datetime not null)
PARTITION BY LIST COLUMNS(country,city)
(
  PARTITION p1 VALUES IN (('China','Hangzhou'), ('China','Beijing')) LOCALITY='dn=pxc-xdb-s-pxcexample1',
  PARTITION p2 VALUES IN (('United States','NewYork'),('United States','Chicago')) LOCALITY='dn=pxc-xdb-s-pxcexample2',
  PARTITION p3 VALUES IN (('Russian','Moscow')) LOCALITY='dn=pxc-xdb-s-pxcexample3'
);