This topic describes databases in AUTO mode, databases in DRDS mode, and the differences between the modes.

AUTO mode

PolarDB-X V5.4.13 and later support databases in AUTO mode, which are also known as databases in automatic partitioning mode. Databases in AUTO mode support automatic partitioning. When you create a table, you do not need to specify a partition key, and data can be automatically and evenly distributed in your cluster. You can also use standard MySQL syntax to manually partition the table. The AUTO mode provides benefits of distributed databases, such as transparent distribution, auto scaling, and partition management.

Databases in PolarDB-X versions earlier than V5.4.13 are databases in DRDS mode. Databases in DRDS mode do not support automatic partitioning. When you create a table, you can use the sharding syntax that is specific to PolarDB-X to specify a database shard key and a table shard key. If no database shard key or table shard key is specified, a non-partitioned table is created.

PolarDB-X V5.4.13 and later support both databases in AUTO mode and databases in DRDS mode. Databases in the two modes can be deployed in the same PolarDB-X instance.

The MODE parameter

PolarDB-X allows you to specify the MODE parameter when you create a database to specify whether the database is created in AUTO mode or DRDS mode. The following list describes the valid values of the MODE parameter.

Note After a database is created, its mode cannot be changed.

  • MODE='AUTO': The database is created in AUTO mode.

    Example:

    CREATE DATABASE auto_db MODE='AUTO';

    For more information, see CREATE DATABASE.

    Tables created in a database in AUTO mode are called partitioned tables. These tables are created by using standard MySQL partitioning syntax. For more information, see MySQL syntax of partitioning tables.
  • MODE='DRDS' or MODE not specified ('DRDS' is the default value): The database is created in DRDS mode.

    Examples:

    CREATE DATABASE drds_db MODE='DRDS';
    CREATE DATABASE drds_db;

    For more information, see CREATE DATABASE.

    Tables created in a database in DRDS mode are called table shards. For more information, see Sharding syntax of DRDS databases.

Precautions

  • To create a database in AUTO mode, specify MODE='AUTO' in the CREATE DATABASE statement.
  • If you do not specify the value of the MODE parameter in the CREATE DATABASE statement, a database in DRDS mode is created by default.
  • If your database is in AUTO mode, you cannot use the sharding syntax of DRDS databases to create tables. In AUTO mode, you can create only partitioned tables.
  • If your database is in DRDS mode, you cannot use the table partitioning syntax to create tables. In DRDS mode, you can create only table shards.

Automatic partitioning and manual partitioning

Automatic partitioning

For automatic partitioning, you do not need to specify partitioning-related configuration items, such as a partition key and a partition policy, when you create a table. PolarDB-X automatically selects the partition key and horizontally partitions the table and its indexes. Databases in AUTO mode support automatic partitioning. Databases in DRDS mode do not support this feature.

Example:

Use the standard MySQL syntax to create a table named tb. The syntax does not include partitioning-related configuration items.

CREATE TABLE tb(a INT, b INT, PRIMARY KEY(a));
  • If the preceding DDL statement is executed in a database in DRDS mode, a non-partitioned table is created. By default, the table is not sharded.
    Execute the SHOW statement to view the complete CREATE TABLE statement:
    SHOW FULL CREATE TABLE tb \G
    *************************** 1. row ***************************
           Table: tb
    Create Table: CREATE TABLE `tb` (
        `a` int(11) NOT NULL,
        `b` int(11) DEFAULT NULL,
        PRIMARY KEY (`a`)
    ) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4
    1 row in set (0.02 sec)
  • If the preceding DDL statement is executed in a database in AUTO mode, a partitioned table is created. By default, the table is automatically partitioned based on the primary key.
    Execute the SHOW statement to view the complete CREATE TABLE statement:
    SHOW FULL CREATE TABLE tb \G
    *************************** 1. row ***************************
           TABLE: tb
    CREATE TABLE: CREATE PARTITION TABLE `tb` (
        `a` int(11) NOT NULL,
        `b` int(11) DEFAULT NULL,
        PRIMARY KEY (`a`)
    ) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4
    PARTITION BY KEY(`a`)
    PARTITIONS 16
    1 row in set (0.01 sec)

In a database in AUTO mode, you create tables by using only standard MySQL syntax, including the syntax for creating indexes. The automatic partitioning feature of PolarDB-X provides benefits of distributed databases, such as transparent distribution, auto scaling, and partition management.
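
For example, the following statements use only standard MySQL index syntax on the tb table from the preceding example; the index names are chosen only for illustration. In a database in AUTO mode, a global secondary index can also be created without any partitioning clause, because the index table can be automatically partitioned based on the indexed column (see the default GSI feature in the comparison later in this topic).

-- Standard MySQL syntax: no partitioning clause is required for the index.
CREATE INDEX idx_b ON tb(b);

-- A global secondary index can also omit partitioning-related clauses.
-- The GLOBAL keyword is a PolarDB-X extension; the index name is illustrative.
ALTER TABLE tb ADD GLOBAL INDEX g_idx_b(b);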

Manual partitioning

When you manually partition a table, you must explicitly specify partitioning-related configuration items, such as the partition key and partition policy, when you create the table. The table creation syntax that is used in databases in AUTO mode differs from the syntax that is used in databases in DRDS mode.

  • AUTO mode: You can use the standard MySQL syntax to create partitioned tables. Multiple partitioning policies, such as hash partitioning, range partitioning, and list partitioning, are supported. A range partitioning sketch is also provided after this list.

    In the following example, the PARTITION BY HASH(a) syntax is used to create the tb table, with column a as the partition key and hash as the partitioning policy.

    CREATE TABLE tb (a INT, b INT, PRIMARY KEY(a))
        -> PARTITION by HASH(a) PARTITIONS 4;
    Query OK, 0 rows affected (0.83 sec)
    
    SHOW FULL CREATE TABLE tb\G
    *************************** 1. row ***************************
           TABLE: tb
    CREATE TABLE: CREATE TABLE `tb` (
        `a` int(11) NOT NULL,
        `b` int(11) DEFAULT NULL,
        PRIMARY KEY (`a`)
    ) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4
    PARTITION BY KEY(`a`)
    PARTITIONS 4
    1 row in set (0.02 sec)
  • DRDS mode: You can use the sharding syntax of databases in DRDS mode to create table shards. Only hash-based sharding is supported.

    In the following example, the DBPARTITION BY HASH(a) TBPARTITION BY HASH(a) syntax is used to create the tb table, with column a as both the database shard key and the table shard key.

    CREATE TABLE tb (a INT, b INT, PRIMARY KEY(a))
        -> DBPARTITION by HASH(a)
        -> TBPARTITION by HASH(a)
        -> TBPARTITIONS 4;
    Query OK, 0 rows affected (1.16 sec)
    
    SHOW FULL CREATE TABLE tb\G
    *************************** 1. row ***************************
           Table: tb
    Create Table: CREATE TABLE `tb` (
        `a` int(11) NOT NULL,
        `b` int(11) DEFAULT NULL,
        PRIMARY KEY (`a`)
    ) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4  dbpartition by hash(`a`) tbpartition by hash(`a`) tbpartitions 4
    1 row in set (0.02 sec)
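
As a sketch of a partitioning policy other than hash in AUTO mode, the following statement creates a range-partitioned table by using standard MySQL syntax. The table name, columns, and boundary values are hypothetical and used only for illustration.

-- Hypothetical range-partitioned table in a database in AUTO mode.
-- Rows are routed to partitions based on the value of order_year.
CREATE TABLE orders_range (
  id bigint NOT NULL,
  order_year int NOT NULL,
  PRIMARY KEY (id, order_year)
)
PARTITION BY RANGE (order_year) (
  PARTITION p2021 VALUES LESS THAN (2022),
  PARTITION p2022 VALUES LESS THAN (2023),
  PARTITION pmax VALUES LESS THAN (MAXVALUE)
);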

Comparison between routing algorithms for table partitioning and sharding

The major difference between table partitioning and sharding lies in the routing algorithms that they use.
  • The routing algorithm that is used for sharding performs a modulo operation on hash values based on the number of physical table shards. If you want to change the number of shards, for example from four to five, all data must be rehashed. Therefore, the routing algorithm that is used for sharding in DRDS mode is not suitable for table partitioning.
  • The default routing algorithm that is used for table partitioning is a range-based consistent hashing algorithm. This algorithm allows you to change the number of partitions by splitting or merging partitions without rehashing all data. Therefore, in AUTO mode, you can flexibly change the number of partitions of a partitioned table, as shown in the sketch after this list.
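
For example, in AUTO mode, the number of partitions can be increased by splitting a single partition, so only the data in that partition is redistributed. The following statement is a hypothetical sketch; the table group name tg_orders and the partition name p3 are illustrative, and the exact split syntax may differ by version.

-- Split one partition of a table group; only the rows in p3 are moved,
-- and the other partitions are not rehashed.
ALTER TABLEGROUP tg_orders SPLIT PARTITION p3;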

Core features and common usage scenarios of the AUTO mode

Split hot data to handle hotspot issues

PolarDB-X provides the following methods to handle hot data. A concrete example is provided after the list.

  • Migrate the partition in which the hot data is stored to a dedicated data node. This way, the dedicated storage resources of that data node serve the hot data, and the hot data does not affect the business that uses non-hot data. Perform the following steps:
    1. Execute the following statement to extract specific hot data to an independent partition:
      ALTER TABLEGROUP #tgName EXTRACT to PARTITION #hotPartitionName BY HOT VALUE(#keyVal)
    2. Execute the following statement to move an independent partition to a specified physical resource:
      ALTER TABLEGROUP #tgName MOVE PARTITIONS #hotPartitionName TO #dn
  • If the hot data exceeds the performance of a single data node, you can distribute the hot data by using the hash method so that your business can be scaled linearly. Execute the following statement:
    ALTER TABLEGROUP #tgName SPLIT INTO PARTITIONS #hotPartitionNamePrefix #N BY HOT VALUE(#keyVal);              

    The preceding statement can split the hot data that corresponds to the value of the keyVal parameter into N parts, add the specified prefix to the partition names, and evenly distribute the partitions to different data nodes. This way, the hot data is linearly distributed among different data nodes.
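
For example, the following hypothetical statements apply the two methods to a table group named tg_orders, a hot key value 88, and a data node named pxc-xdb-s-pxcexample. All names and values are illustrative and follow the syntax templates shown above.

-- Method 1: extract the hot value into a dedicated partition,
-- then move that partition to a specific data node.
ALTER TABLEGROUP tg_orders EXTRACT TO PARTITION p_hot_88 BY HOT VALUE(88);
ALTER TABLEGROUP tg_orders MOVE PARTITIONS p_hot_88 TO 'pxc-xdb-s-pxcexample';

-- Method 2: split the hot value into 4 partitions whose names start with
-- p_hot_88_ and distribute them evenly across data nodes.
ALTER TABLEGROUP tg_orders SPLIT INTO PARTITIONS p_hot_88_ 4 BY HOT VALUE(88);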

Schedule partitions to balance data in a more flexible manner

In databases in DRDS mode, sharding is implemented by performing a modulo operation on hash values, and each table shard is bound to a fixed database shard. For example, table shard m must reside in database shard n, and this mapping cannot be modified. If you want to split, merge, or migrate a table shard, all table data must be rehashed.

In addition, only entire database shards can be migrated to other data nodes because individual table shards cannot be migrated. If a database shard stores a large volume of data, data cannot be evenly balanced among data nodes.

In AUTO mode, the consistent hashing routing algorithm is used for table partitioning. This allows you to merge, split, and migrate partitions in a flexible manner. For example, you can migrate partitions to specified data nodes based on your business requirements without affecting data in other partitions. By merging, splitting, and migrating partitions, PolarDB-X can evenly distribute data across data nodes so that data is balanced.
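
As a hypothetical sketch, the following statements merge two small partitions and then migrate a partition to a specified data node to balance data. The table group, partition, and data node names are illustrative, and the exact syntax may differ by version.

-- Merge two partitions of a table group into one partition.
ALTER TABLEGROUP tg_orders MERGE PARTITIONS p2,p3 TO p23;
-- Migrate a partition to a specified data node.
ALTER TABLEGROUP tg_orders MOVE PARTITIONS p4 TO 'pxc-xdb-s-pxcexample';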

Use TTL to automatically clean up historical data

In some business scenarios, the volume of business data grows rapidly, while the frequency at which older data is queried decreases significantly over time. If this data remains in PolarDB-X, storage space cannot be released and queries become slower. In this case, customers typically archive the historical data and then delete it from PolarDB-X. The historical data must be deleted quickly and without affecting the existing business.

Historical data can be cleared at a much higher speed if it can be deleted by executing DDL statements. Therefore, PolarDB-X provides the time-to-live (TTL) feature for databases in AUTO mode so that you can quickly delete historical data.

If you use the TTL-related syntax when you create a table, a TTL table is created. In the following example, the t_order table is partitioned by time based on the gmt_modified column at an interval of one month. Each partition expires 12 months after it is created, and partitions are pre-allocated three months in advance.

CREATE TABLE t_order (
  id bigint NOT NULL AUTO_INCREMENT,
  gmt_modified DATETIME NOT NULL,
  PRIMARY KEY (id, gmt_modified)
)
PARTITION BY HASH(id)
PARTITIONS 16
-- The following sample code provides the TTL syntax:
LOCAL PARTITION BY RANGE (gmt_modified)  -- The table is partitioned based on the values in the gmt_modified column.
INTERVAL 1 MONTH -- The table is partitioned at an interval of one month.
EXPIRE AFTER 12 -- Each partition expires 12 months after the partition is created.
PRE ALLOCATE 3;  -- The partitions are created three months before you use the partitions.

Use the LOCALITY attribute to specify the location where data is stored

Databases in AUTO mode provide the LOCALITY attribute at multiple levels. You can use this attribute to manually specify the location where data is stored.

  • You can set the LOCALITY attribute of a database to the location where data is stored. For example, you can execute the following statement to specify the location:
    CREATE DATABASE db1 MODE='AUTO' LOCALITY='dn=pxc-xdb-s-pxcexample'
  • You can set the LOCALITY attribute of a table to specify the location where the table data is stored. For example, a non-partitioned table is stored on a single data node, and you can use the LOCALITY attribute to specify the data node on which the table is stored:
    CREATE TABLE tb (a INT, b INT, PRIMARY KEY(a)) LOCALITY='dn=pxc-xdb-s-pxcexample'
  • The LOCALITY attribute for partitions will be supported in PolarDB-X V5.4.14. For example, in multi-region business scenarios, if your database is in AUTO mode and list partitioning is used, the data of each region can be stored on a separate data node:
    CREATE TABLE orders_region(
     id int,
     country varchar(64),
     city varchar(64),
     order_time datetime not null)
    PARTITION BY LIST COLUMNS(country,city)
    (
      PARTITION p1 VALUES IN (('China','Hangzhou'), ('China','Beijing')) LOCALITY='dn=pxc-xdb-s-pxcexample1',
      PARTITION p2 VALUES IN (('United States','NewYork'),('United States','Chicago')) LOCALITY='dn=pxc-xdb-s-pxcexample2',
      PARTITION p3 VALUES IN (('Russian','Moscow')) LOCALITY='dn=pxc-xdb-s-pxcexample3'
    );

Comparison between features of databases in AUTO mode and databases in DRDS mode

Compared with databases in DRDS mode, databases in AUTO mode support new features such as automatic partitioning, hot data splitting, partition scheduling, and TTL tables. Databases in AUTO mode are also optimized based on various aspects such as partition management and partition change.

The following list compares the main features of databases in AUTO mode and databases in DRDS mode.

Transparent distribution
  • Default primary key-based partitioning: AUTO mode: supported. If you do not specify partitioning-related configuration items when you create a table, the table is automatically partitioned based on the primary key. DRDS mode: not supported.
  • Default global secondary indexes (GSIs): AUTO mode: supported. If partitioning-related configuration items are not specified for an index table, the index table is automatically partitioned based on the indexed columns. DRDS mode: not supported.
  • Load balancing: AUTO mode: supported. DRDS mode: not supported.
  • Hot data distribution: AUTO mode: supported. DRDS mode: not supported.

Partitioning policy
  • Hash partitioning and key partitioning: AUTO mode: supported. The consistent hashing routing algorithm is used, and hot data can be distributed. DRDS mode: supported. Data is routed by performing a modulo operation based on the number of shards, and hot data cannot be distributed.
  • Range partitioning and range columns partitioning: AUTO mode: supported. Hot data can be distributed. DRDS mode: not supported.
  • List partitioning and list columns partitioning: AUTO mode: supported. DRDS mode: not supported.
  • Vector partition key (multiple columns used as the partition key): AUTO mode: supported. For example, you can use PARTITION BY KEY(c1,c2,c3) to partition a table. DRDS mode: not supported.
  • Collation of the partition key: AUTO mode: supported. DRDS mode: not supported.
  • Non-partitioned tables and broadcast tables: AUTO mode: supported. DRDS mode: supported.

Partition management
  • Partition creation, deletion, and modification: AUTO mode: supported. DRDS mode: not supported.
  • Partition splitting and merging: AUTO mode: supported. DRDS mode: not supported.
  • Partition migration: AUTO mode: supported. DRDS mode: not supported.
  • Partition truncation: AUTO mode: supported. DRDS mode: not supported.
  • Partition transparency (automatic analysis of hot data partitions): AUTO mode: this feature will be released soon. DRDS mode: not supported.

Partitioning change
  • Change the table type, for example, to a non-partitioned table, broadcast table, or partitioned table: AUTO mode: supported. DRDS mode: supported.
  • Adjust partitioning-related configuration items, including the number of partitions, the partition key type, and the partition policy: AUTO mode: supported. DRDS mode: supported.

Auto scaling
  • Whether writes are temporarily blocked: AUTO mode: no. DRDS mode: yes. Writes are blocked for a short period of time.
  • Whether other DDL statements are allowed during scaling: AUTO mode: yes. DRDS mode: no.

Locality
  • Static storage isolation: AUTO mode: supported. You can specify physical storage resources when you create databases, tables, and partitions. DRDS mode: supported. You can specify physical storage resources when you create databases and tables.
  • Dynamic storage isolation: AUTO mode: supported. The physical storage resources used by databases and tables can be dynamically adjusted. DRDS mode: not supported.
  • Compatibility with auto scaling: AUTO mode: yes. DRDS mode: no.

Partition pruning
  • Partition pruning based on prefixes: AUTO mode: supported. For example, if you use PARTITION BY KEY(a,b,c) to partition a table, columns a, b, and c form the vector partition key, and conditions such as a=100 AND b=100, or a=100 alone, can be used for partition pruning. DRDS mode: not supported.
  • Constant folding for calculation expressions: AUTO mode: supported. For example, partition pruning is performed for a condition such as pk = POW(2, 4) that contains a calculation expression. DRDS mode: not supported. The condition on the shard key must be a constant, such as pk = 123. If the condition contains a calculation expression, such as pk = POW(2, 4), a full table scan is performed.
  • Case sensitivity and trailing spaces in partition routing: AUTO mode: supported. You can specify the collation of the partition key to determine whether partition routing is case-sensitive and whether trailing spaces are ignored. DRDS mode: not supported. Collations cannot be used for shard key columns. The hash algorithm is case-sensitive and does not ignore trailing spaces.
  • Pushdown of join operations: AUTO mode: supported. Partition operations, such as splitting, merging, and migration, do not affect the pushdown of join operations. DRDS mode: supported.
  • Partition selection: AUTO mode: supported. Partition selection syntax, such as SELECT * FROM tb PARTITIONS (p1), can be used to query specific partitions. DRDS mode: not supported.

TTL
  • TTL tables: AUTO mode: supported. DRDS mode: not supported.
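
As a hedged illustration of the pruning behaviors described above, the following hypothetical AUTO-mode table uses a vector partition key, and EXPLAIN can be used to check which partitions the queries are routed to. The table name and columns are chosen only for illustration.

-- Hypothetical table with a vector partition key (a, b, c).
CREATE TABLE tb_prune (
  a INT, b INT, c INT, v VARCHAR(64),
  PRIMARY KEY (a, b, c)
) PARTITION BY KEY(a, b, c) PARTITIONS 16;

-- Prefix-based pruning: a condition on a leading prefix of the partition key is enough.
EXPLAIN SELECT * FROM tb_prune WHERE a = 100 AND b = 100;

-- Constant folding: the expression is evaluated before routing, so pruning still applies.
EXPLAIN SELECT * FROM tb_prune WHERE a = POW(2, 4);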

Comparison between the performance of databases in DRDS mode and databases in AUTO mode

The routing algorithm that is used for sharding in DRDS mode is different from the routing algorithm that is used for table partitioning in AUTO mode. Sysbench is used to perform benchmark tests on PolarDB-X to obtain the queries per second (QPS) of the two routing algorithms in different Sysbench test scenarios. You can use the test results to evaluate the performance differences between the routing algorithms.

Test environment
PolarDB-X instance type: polarx.x4.2xlarge.2e
  • CN (16C64G) × 2
  • DN (16C64G) × 2

Version: 5.4.13-16415631

  • Table partitioning:
    • 32 partitions
    • Partitioning statement: partition by hash(id) partitions 32
    • Total volume of table data: 160 million rows
  • Sharding:
    • 32 physical table shards
    • Sharding statement: dbpartition by hash(id) tbpartition by hash(id) tbpartitions 2
    • Total volume of table data: 160 million rows

Test scenarios
The following list describes the Sysbench test scenarios:
  • oltp_point_select: point queries that use only the partition key.
  • oltp_read_only: read-only transactions that consist of point queries and small-range queries, such as BETWEEN queries, on the partition key.
  • oltp_read_write: read/write transactions that consist of point queries, small-range queries, and writes on the partition key.

Test results
The following conclusions can be drawn from the test results:
  • The consistent hashing routing algorithm that is used for table partitioning is more complex than the modulo-based hash routing algorithm that is used for sharding. In the oltp_point_select scenario, the QPS of table partitioning is slightly lower than, but close to, the QPS of sharding.
  • In the oltp_read_only and oltp_read_write scenarios, the SQL statements contain more complex expressions than in the oltp_point_select scenario because small-range queries are included. In these scenarios, the overall QPS of table partitioning is 33% higher than that of sharding because partition pruning is performed on partitioned tables.