All Products
Search
Document Center

PolarDB:Primary keys and unique keys (for AUTO mode)

Last Updated:Apr 11, 2024

This topic describes how to determine in AUTO databases whether the primary key of a table is global or a local, and whether the unique key of a table is global or local.

Primary keys

In PolarDB-X, primary keys include global primary keys and local primary keys.

  • If a primary key is globally unique, it is a global primary key.

  • If a primary key is unique within a partition, it is a local primary key.

Common tables and broadcast tables

The primary key in a common table or a broadcast table is a global primary key because it is globally unique.

Example 1: Global primary keys in a common table and a broadcast table

## A common table
CREATE TABLE single_tbl(
 id bigint NOT NULL AUTO_INCREMENT, 
 name varchar(30), 
 PRIMARY KEY(id)
) SINGLE;

## A broadcast table
CREATE TABLE brd_tbl(
 id bigint NOT NULL AUTO_INCREMENT, 
 name varchar(30), 
 PRIMARY KEY(id)
) BROADCAST;

Partitioned tables

In an AUTO database, if you do not specify the partition key or partition algorithm when you create a table, an automatic partitioned table is created. If you specify the partition key or partition algorithm when you create a table, a manual partitioned table is created.

Automatic partitioned tables

All primary keys in an automatic partitioned table are global primary keys because they are globally unique.

Example 2: A global primary key in an automatic partitioned table

## An automatic partitioned table
CREATE TABLE auto_tbl(
 id bigint NOT NULL AUTO_INCREMENT, 
 name varchar(30), 
 PRIMARY KEY(id)
);

Manual partitioned tables

Global primary keys

In a manual partitioned table, if the primary key columns contain all partition key columns, the primary key is a global primary key because it is globally unique.

Example 3: A global primary key in a manual partitioned table

The primary key columns of the key_tbl table are (id, name, addr), which contain all partition key columns (id, addr). Therefore, the primary key of the table is a global primary key because it is globally unique.

CREATE TABLE key_tbl(
 id bigint,
 name varchar(10),
 addr varchar(30),
 PRIMARY KEY(id, name, addr)
) PARTITION BY KEY(id, addr);

Local primary keys

In a manual partitioned table, if the primary key columns do not contain all partition key columns, the primary key is a local primary key.

Example 4: A local primary key in a manual partitioned table

The partition key column city of the list_tbl table is not included in the primary key columns. Therefore, the primary key of the table is a local primary key because it is only unique within a partition, but not globally unique.

CREATE TABLE list_tbl(
 order_id bigint,
 city varchar(50),
 name text,
 PRIMARY KEY(order_id)
) PARTITION BY LIST(city)
(
 PARTITION p1 VALUES IN ("Beijing"),
 PARTITION p2 VALUES IN ("Shanghai"),
 PARTITION p3 VALUES IN ("Guangzhou"),
 PARTITION p4 VALUES IN ("Shenzhen"),
 PARTITION p5 VALUES IN(DEFAULT)
);

Example 5: A local primary key is not globally unique

Local primary keys are unique only within a partition, but not globally unique. Therefore, duplicate primary keys may occur. The list_tbl table in Example 4 is used here. The city column is a partition key column in the table. Therefore, data in the city column is stored in different partitions.

  1. Insert a row into the list_tbl table. After the statement is executed, the data is stored in the p1 partition.

    INSERT INTO list_tbl(order_id, city, name) VALUES (10001, "Beijing", "phone");
    Query OK, 1 row affected
  2. Insert a row with the same order_id value and the same city value into the list_tbl table. Because the city value is the same, the row is expected to be stored in the p1 partition. When the statement is executed, the row cannot be inserted and an error of primary key conflict is reported. The row of the same primary key value cannot be stored in the same partition. This indicates that the local primary key can be unique only within a partition.

    INSERT INTO list_tbl(order_id, city, name) VALUES (10001, "Beijing", "book");
    (1062, "ERR-CODE: [TDDL-4614][ERR_EXECUTE_ON_MYSQL] Error occurs when execute on GROUP 'TEST_DB_P00000_GROUP' ATOM 'dskey_test_db_p00000_group#polardbx-storage-0-master#11.167.60.147-1766#test_db_p00000': Duplicate entry '10001' for key 'PRIMARY' ")
  3. Insert a row with the same order_id value but a different city value (Shanghai) into the list_tbl table. The row is expected to be stored in the p4 partition. The statement is executed. Two rows with the same primary key exist in the list_tbl table. This indicates that the local primary key is not globally unique.

    INSERT INTO list_tbl (order_id, city, name) VALUES (10001, "Shenzhen", "camera");
    Query OK, 1 row affected
    
    SELECT * FROM list_tbl;
    +----------+----------+--------+
    | order_id | city | name |
    +----------+----------+--------+
    | 10001 | Beijing | phone |
    | 10001 | Shenzhen | camera |
    +----------+----------+--------+
    2 rows in set

Example 6: If you execute a DDL statement on a table that has duplicate primary keys, an error of primary key conflict may be reported.

Because duplicate local primary keys may be used in a table, an error of primary key conflict may be reported when you perform operations related to data redistribution on the table, such as the DDL statement of modifying partitioning policies or synchronizing tables to downstream components.

The list_tbl table in Example 5 is used here. The table already has two rows with the same primary key. One row is stored in the partition whose city value is Beijing, and the other row is stored in the partition whose city value is Shenzhen. Execute the following DDL statement to modify the partitioning policy so that the rows whose city values are Beijing and Shenzhen are stored in the same partition. The DDL statement fails and an error of primary key conflict is reported. This is because this DDL statement tries to store rows with duplicate primary keys in the same partition, which violates the rule that local primary keys must be unique within a partition.

ALTER TABLE list_tbl 
PARTITION BY LIST (city)
(
 PARTITION p1 VALUES IN ("Beijing", "Shenzhen"),
 PARTITION p2 VALUES IN ("Shanghai"),
 PARTITION p3 VALUES IN ("Guangzhou"),
 PARTITION p5 VALUES IN(DEFAULT)
);
(4700, "ERR-CODE: [TDDL-4700][ERR_SERVER] server error by Failed to execute the DDL task. Caused by: ERR-CODE: [TDDL-5321][ERR_GLOBAL_SECONDARY_INDEX_BACKFILL_DUPLICATE_ENTRY] Duplicated entry '10001' for key 'PRIMARY' ")

The following methods can be used to avoid local primary key conflict when a table uses local primary keys:

  • Use the AUTO_INCREMENT attribute to allow PolarDB-X to generate primary keys.

  • You do not manually specify primary key values.

    Important

    If a table that uses local primary keys already has duplicate primary keys, you must avoid primary key conflict when you synchronize data to downstream components. For example, when you synchronize a table that uses local primary keys to AnalyticDB for MySQL by using DTS and if AnalyticDB for MySQL uses the primary keys of the PolarDB-X table, primary key conflict may occur. In this case, we recommend that you set the primary keys of AnalyticDB for MySQL to the full set of primary key columns and partition key column of the PolarDB-X table.

Unique keys

Similar to primary keys, unique keys in PolarDB-X include global unique keys and local unique keys

  • If a unique key is globally unique, it is a global unique key.

  • If a unique key is unique within a partition, it is a local unique key.

Common tables and broadcast tables

The unique key in a common table or a broadcast table is a global unique key because it is globally unique.

Example 7: Global unique keys in a common table and a broadcast table

## A common table
CREATE TABLE single_tbl(
 serial_id bigint,
 name varchar(30), 
 UNIQUE KEY(serial_id)
) SINGLE;

## A broadcast table
CREATE TABLE brd_tbl(
 serial_id bigint,
 name varchar(30), 
 	UNIQUE KEY(serial_id)
) BROADCAST;

Partitioned tables

In an AUTO database, if you do not specify the partition key or partition algorithm when you create a table, an automatic partitioned table is created. If you specify the partition key or partition algorithm when you create a table, a manual partitioned table is created.

Automatic partitioned tables

All unique keys in an automatic partitioned table are global unique keys because they are globally unique.

Example 8: A global unique key in an automatic partitioned table

## An automatic partitioned table
CREATE TABLE auto_tbl(
 serial_id bigint,
 name varchar(30), 
 UNIQUE KEY(serial_id)
);

Manual partitioned tables

Global unique keys

In a manual partitioned table, if the unique key columns contain all partition key columns, the unique key is a global unique key because it is globally unique.

Example 9: A global unique key in a manual partitioned table

The unique key columns of the hash_tbl table are (inner_id, type_id), which contain all partition key column (type_id). Therefore, the unique key of the table is a global unique key because it is globally unique.

CREATE TABLE hash_tbl(
 type_id int,
 inner_id int,
 UNIQUE KEY(inner_id, type_id)
) PARTITION BY HASH(type_id);

Example 10: A globally unique key that consists of unique global indexes in a manual partitioned table

In a manual partitioned table, the global secondary index is also a globally unique key because it is globally unique. The key_tbl table contains the unique global index of the serial_id index column. It is a globally unique key because it ensures that the serial_id index column is globally unique.

CREATE TABLE key_tbl(
 type_id int,
 serial_id int,
 UNIQUE GLOBAL INDEX u_sid(serial_id) PARTITION BY HASH(serial_id)
) PARTITION BY HASH(type_id);

Local unique keys

Example 11: A local unique key in a manual partitioned table

In a manual partitioned table, if the unique key columns do not contain all partition key columns, the unique key is a local unique key.

The unique key column of the range_tbl table is serial_id and does not contain the partition key column order_time. Therefore, it is a local unique key because it is only unique within a partition, but not globally unique.

CREATE TABLE range_tbl(
 id int primary key auto_increment,
 serial_id int,
 	order_time datetime NOT NULL,
 UNIQUE KEY(serial_id)
) PARTITION BY RANGE(order_time)
(
 PARTITION p1 VALUES LESS THAN ('2022-12-31'),
 PARTITION p2 VALUES LESS THAN ('2023-12-31'),
 PARTITION p3 VALUES LESS THAN (MAXVALUE)
);

Example 12: A local unique key is not globally unique

Local unique keys are unique only within a partition, but not globally unique. Therefore, duplicate unique keys may occur.

  1. The range_tbl table in Example 11 is used here. Insert a row into the range_tbl table. After the statement is executed, the data is stored in the p1 partition.

    INSERT INTO range_tbl(serial_id, order_time) VALUES (20001, '2022-01-01');
    Query OK, 1 row affected
  2. Insert a row whose serial_id value is the same as the preceding row and whose order_time value is 2022-01-02 into the range_tbl table. The order_time value indicates that the row is still expected to be stored in the p1 partition. When the statement is executed, the row cannot be inserted and an error of unique key conflict is reported. The row of the same unique key value cannot be stored in the same partition. This indicates that the local unique key can be unique only within a partition.

    INSERT INTO info_tbl(serial_id, order_time) VALUES (20001, '2022-01-02');
    (1062, "ERR-CODE: [TDDL-4614][ERR_EXECUTE_ON_MYSQL] Error occurs when execute on GROUP 'D25_000001_GROUP' ATOM 'dskey_d25_000001_group#polardbx-storage-1-master#11.167.60.147-1766#d25_000001': Duplicate entry '20001' for key 'serial_id' ")
  3. Insert a row whose serial_id value is the same as the preceding row and whose order_time value is 2023-01-01 into the range_tbl table. The order_time value indicates the row is expected to be stored in the p3 partition. The statement is executed. Two rows with the same unique key exist in the range_tbl table. This indicates that the local unique key is not globally unique.

    INSERT INTO range_tbl(serial_id, order_time) VALUES (20001, '2024-01-01');
    Query OK, 1 row affected
    
    SELECT * FROM range_tbl;
    +----+-----------+---------------------+
    | id | serial_id | order_time |
    +----+-----------+---------------------+
    | 2 | 20001 | 2024-01-01 00:00:00 |
    | 1 | 20001 | 2022-01-01 00:00:00 |
    +----+-----------+---------------------+
    2 rows in set

Example 13: If you execute a DDL statement on a table that has duplicate unique keys, an error of unique key conflict may be reported.

Because duplicate local unique keys may be used in a table, an error of unique key conflict may be reported when you perform operations related to data redistribution on the table, such as the DDL statement of modifying partitioning policies or synchronizing tables to downstream components.

The range_tbl table in Example 12 is used here. The table already has two rows with the same serial_id value. One row is stored in the p1 partition, and the other row is stored in the p2 partition. Execute the following DDL statement to change the range_tbl table from a manual partitioned table into a common table. This triggers data redistribution in the range_tbl table.

ALTER TABLE range_tbl SINGLE;
(4700, "ERR-CODE: [TDDL-4700][ERR_SERVER] server error by Failed to execute the DDL task. Caused by: ERR-CODE: [TDDL-5321][ERR_GLOBAL_SECONDARY_INDEX_BACKFILL_DUPLICATE_ENTRY] Duplicated entry '200001' for key 'PRIMARY' ")

The DDL statement fails and an error of unique key conflict is reported. This is because this DDL statement tries to change the range_tbl table into a common table which contain duplicate unique keys, which violates the rule that unique keys must be unique within a common table.

To avoid local unique key conflict when a table uses local unique keys, you must ensure that local unique keys are not duplicate.

Important

If a table that uses local unique keys already has duplicate unique keys, we recommend that you manually modify source data to avoid unique key conflict when you synchronize data to downstream components.

FAQ

  • Is a separate statement used to create a global primary key or a global unique key?

    No. You can use the same MySQL statement to create a primary key or a unique key. Make sure that the primary key or unique key that you create meet the requirements for a global primary key or a global unique key.

  • A local primary key is currently used in the table. How can I make sure that the local primary key is globally unique?

    You can use the sequence feature to generate globally unique primary keys. For more information, see Sequence.