All Products
Search
Document Center

PolarDB:Instantly change the character set of a column

Last Updated:Aug 29, 2024

If you specify a UTF-8 character set for a column in MySQL, the UTF8MB3 character set is used by default. This character set uses up to three bytes to store each character. If you need to store information such as emoji, you must change the column character set to UTF8MB4. In most cases, if you want to change a column character set, the entire table must be rebuilt. This process is time-consuming and has a significant impact on your business. PolarDB for MySQL allows you to instantly change a column character set without affecting your business.

Prerequisites

Your cluster meets one of the following version requirements:

  • A PolarDB for MySQL 8.0.1 cluster whose revision version is 8.0.1.1.40 or later. For information about how to view the version of your cluster, see the Query the engine version section of the "Engine versions" topic.

    Note

    To use this feature in clusters of PolarDB for MySQL 8.0.1, you must first configure the loose_innodb_support_instant_modify_charset parameter.

  • PolarDB for MySQL 8.0.2.

    Note

    This feature is provided out-of-the-box with PolarDB for MySQL 8.0.2 clusters. No additional configurations are required.

Limits

  • You can change the character sets for columns only of the CHAR, VARCHAR, ENUM, or TEXT data type.

  • You can only change the character set of a column from UTF8MB3 to UTF8MB4, or change the data type of a column from a random type to BINARY.

  • No index is created on the column whose character set needs to be changed.

  • The length of the column whose character set needs to be changed must meet the following requirements:

    • Before and after you change the character set of a column, the maximum storage length of the column is less than 256 bytes or greater than 255 bytes. The storage length is the number of bytes that the character set actually occupies.

      For example, if a column of the VARCHAR(85) data type uses the UTF8MB3 character set, the maximum storage length of the column is calculated in the following way: 85 × 3 = 255 bytes. This is because the UTF8MB3 character set uses up to three bytes to store each character. If the character set of the column is changed to UTF8MB4, the maximum storage length of the column is calculated in the following way: 85 × 4 = 340 bytes. Because the maximum storage length after change is greater than 256 bytes, the character set of the column cannot be instantly changed.

    • The maximum storage length of a column after change cannot be less than that before change.

      For example, if a column of the CHAR(120) data type uses the UTF8MB3 character set, the maximum storage length of the column is calculated in the following way: 120 × 3 = 360 bytes. If you change the data type of the column to BINARY, the maximum storage length of the column is calculated in the following way: 120 × 1 = 120 bytes. In this case, the character set of the column cannot be instantly changed.

Use the feature to instantly change a column character set

  1. Configure parameters.

    For clusters of PolarDB for MySQL 8.0.1, you must set the loose_innodb_support_instant_modify_charset parameter to ON to enable the feature. For more information about how to configure parameters, see Configure cluster and node parameters.

    Parameter

    Level

    Description

    loose_innodb_support_instant_modify_charset

    Global

    Specifies whether to enable the feature to instantly change a column character set. Default value: OFF. Valid values:

    • OFF

    • ON

  2. Execute statements.

    • Set the ALGORITHM parameter to INPLACE to forcibly enable the feature to instantly change a column character set. Example:

      ALTER TABLE tablename MODIFY COLUMN test_column varchar(60) CHARACTER SET utf8mb4, ALGORITHM = INPLACE;

      If the INPLACE algorithm is unavailable, the following message is returned. In this case, we recommend that you check whether the loose_innodb_support_instant_modify_charset parameter is set to ON and check the rules described in the "Limits" section of this topic.

      ERROR 1846 (0A000): ALGORITHM=INPLACE is not supported. Reason: Cannot change column type INPLACE. Try ALGORITHM=COPY.
    • If you set the ALGORITHM parameter to DEFAULT or do not specify the ALGORITHM parameter, PolarDB selects the fastest algorithm to change a column character set. Example:

      ALTER TABLE tablename MODIFY COLUMN test_column varchar(60) CHARACTER SET utf8mb4, ALGORITHM = DEFAULT;
      ALTER TABLE tablename MODIFY COLUMN test_column varchar(60) CHARACTER SET utf8mb4;

View execution results

In this example, two tables named t1 and t2 are used and each table contains 10 million rows of data. The column character set is changed from UTFMB3 to UTF8MB4 when the feature to instantly change a column character set is enabled and disabled. The t1 and t2 tables have the same table structure and data size.

  • When the feature is disabled, the entire table must be rebuilt if you want to change a column character set. It takes approximately 1 minute to recreate the entire table and change the column character set, and only query operations can be performed during the entire DDL process.

    ALTER TABLE t1 MODIFY COLUMN c varchar(2)  CHARACTER SET utf8mb4 NOT NULL DEFAULT '';

    Sample result:

    Query OK, 10000000 rows affected (59.66 sec)
    Records: 10000000  Duplicates: 0  Warnings: 0
  • When the feature is enabled, you can instantly change a column character set regardless of the data size of the table.

    ALTER TABLE t2 MODIFY COLUMN c varchar(2)  CHARACTER SET utf8mb4 NOT NULL DEFAULT '',ALGORITHM = INPLACE;

    Sample result:

    Query OK, 0 rows affected (0.03 sec)
    Records: 0  Duplicates: 0  Warnings: 0

Contact Us

If you have any questions about DDL operations, please feel free to reach out to Contact us.