All Products
Search
Document Center

PolarDB:Character types

Last Updated:Mar 28, 2024

This topic describes the definition and syntax of the character types.

Character types

Name

Description

character varying(n), varchar(n)

A character whose length is variable and limited.

character(n), char(n)

A character whose length is fixed and padded with spaces.

text

A character whose length is variable and unlimited.

CLOB

A large character string whose length is variable and can be up to 1 GB.

LONG

A character string whose length is variable and unlimited.

NVARCHAR(n)

A character string whose length is variable and limited.

NVARCHAR2(n)

A character string whose length is variable and limited.

STRING

An alias of the VARCHAR2 type.

VARCHAR2(n)

A character string whose length is variable and limited.

SQL defines two basic character types: character varying(``n``) and character(``n``). n is a positive integer. Both types can store strings of up to n characters in length. An error occurs if a longer string is stored into a column of these types unless the excess characters are all spaces. If the excess characters are all spaces, the string is truncated to the maximum length. This exception is required by the SQL standard. If the length of the string to be stored is shorter than the specified length, values of the character type are padded with spaces, while values of the character varying type simply store the shorter string.

However, if you explicitly cast a value to the character varying(``n``) or character(``n``) type, a value whose length is longer than the specified length is truncated to n characters and no error is reported. This is also required by the SQL standard.

varchar(``n``) is an alias of character varying(n), and char(``n``) is an alias of character(``n``). The character type without a length specifier is equivalent to the character(1) type. If the character varying type lacks a length specifier, the type supports strings of any length. The character(1) type is an extension of PolarDB for PostgreSQL (Compatible with Oracle).

In addition, PolarDB for PostgreSQL (Compatible with Oracle) provides the text type, which stores strings of any length. Although the text type is not in the SQL standard, several SQL database management systems support the text type.

Values of the character type are physically padded with spaces to the specified length of n characters, and are stored and displayed this way. However, trailing spaces are considered as semantically insignificant and disregarded when two values of the character type are compared. In collations where space is significant, this behavior can produce unexpected results. For example, SELECT 'a '::CHAR(2) collate "C" < E'a\n'::CHAR(2) returns true, even if the C locale considers a space to be greater than a newline. Trailing spaces are removed if a character value is converted to one of the other string types. Trailing spaces are semantically significant in character varying and text values, and when you perform pattern matching by using the LIKE operator or a regular expression.

The storage of a string of these types requires 4 bytes plus the actual length of the string. For a string of the character type, the padded spaces are also included. Long strings are automatically compressed by the system. Therefore, less physical space on disks may be required. Long values are also stored in background tables. This way, long values do not interfere with quick access to shorter column values. A character string that can be stored can be up to 1 GB in length. The maximum value that is allowed for n in the data type declaration is less than 1 GB. It is not useful to change this because the number of characters and bytes may be quite different in a multibyte encoding. If you want to store long strings without a specific upper limit, use the text type or the character varying type without a length specifier instead of specifying an arbitrary length limit. The storage of a short string (up to 126 bytes in length) requires 1 byte plus the actual length of the string, which includes the padded spaces if the string is of the character type. Longer strings require an overhead of 4 bytes instead of 1 byte. Long strings are automatically compressed by the system. Therefore, less physical space on disks may be required. Long values are also stored in background tables. This way, long values do not interfere with quick access to shorter column values. A character string that can be stored can be up to 1 GB in length. The maximum value that is allowed for n in the data type declaration is less than 1 GB. It is not useful to change this because the number of characters and bytes may be totally different in a multibyte character encoding. If you want to store long strings without a specific upper limit, use the text type or the character varying type without a length specifier instead of specifying an arbitrary length limit.

The three types have no performance difference, except increased storage space when the space-padded type is used, and a small number of extra CPU cycles to check the length when values of the types are stored into a length-constrained column. Although the character(n) type has performance advantages in some database systems, such advantages do not apply in PolarDB for PostgreSQL (Compatible with Oracle). In fact, the character(n) type has the slowest performance among the three types because it requires additional storage costs. In most cases, the text or character varying type is preferred.

The following sample code provides an example on how to use character types:

    CREATE TABLE test1 (a character(4));
    INSERT INTO test1 VALUES ('ok');
    SELECT a, char_length(a) FROM test1; -- (1)

      a   | char_length
    ------+-------------
     ok   |           2


    CREATE TABLE test2 (b varchar(5));
    INSERT INTO test2 VALUES ('ok');
    INSERT INTO test2 VALUES ('good      ');
    INSERT INTO test2 VALUES ('too long');
    ERROR:  value too long for type character varying(5)
    INSERT INTO test2 VALUES ('too long'::varchar(5)); -- explicit truncation
    SELECT b, char_length(b) FROM test2;

       b   | char_length
    -------+-------------
     ok    |           2
     good  |           5
     too l |           5

PolarDB for PostgreSQL (Compatible with Oracle) supports another two fixed-length character types. The name type is used only to store identifiers in the internal system catalogs and is not for general-purpose use. The length of the name type is defined as 64 bytes, which includes 63 usable characters and the terminator. The name type must be referenced by using the NAMEDATALEN constant in C source code. The length is specified when compiling is performed and therefore can be adjusted for special purposes. The default maximum length may change in a future release. The "char" type (note the double quotation marks) is different from the char(1) type. The "char" type uses only one byte of storage. Therefore, the "char" type is used in the system catalogs as a simplistic enumeration type.

Special character types

Name

Storage size

Description

"char"

1 byte

A single-byte internal type.

name

64 bytes

An internal type for object names.