NCHAR data type

Last Updated: Jun 18, 2021

The NCHAR data type stores fixed-length Unicode character data. The national character set, which is chosen when you create the database, determines the maximum column length. When you create a table that contains an NCHAR column, you specify the column length as a number of characters. The maximum column length is 2,000 bytes.

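If you want to use less space to store Chinese characters, choose the NCHAR data type. The saving applies when the national character set encodes these characters in fewer bytes than the database character set does, for example two bytes in UTF-16 compared with three bytes in UTF-8.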
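The storage difference can be checked outside the database. The following is a minimal sketch that uses Python's built-in codecs to count the bytes a short Chinese string needs in UTF-8 and in UTF-16. It assumes, for illustration only, that CHAR data is stored in a UTF-8 database character set and NCHAR data in a UTF-16 national character set, which this page does not state.

```python
# Compare the storage needed for a Chinese string under two encodings.
# Assumption (not stated on this page): CHAR data is stored as UTF-8
# and NCHAR data is stored as UTF-16.
text = "数据库"  # three Chinese characters

utf8_bytes = len(text.encode("utf-8"))       # 3 bytes per character here
utf16_bytes = len(text.encode("utf-16-be"))  # 2 bytes per character, no BOM

print(f"characters:   {len(text)}")    # 3
print(f"UTF-8 bytes:  {utf8_bytes}")   # 9
print(f"UTF-16 bytes: {utf16_bytes}")  # 6
```

For ASCII text the result is reversed: each character takes one byte in UTF-8 and two bytes in UTF-16, so the saving depends on the data you store.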

When you store a value in an NCHAR column and the value is shorter than the specified length, the database automatically pads the value with spaces to the specified length. The length is always measured in characters. You cannot specify another unit.
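The padding rule can be emulated in a few lines of Python. This is only a sketch of the semantics, not database code: pad_nchar is a hypothetical helper, the padding is assumed to go on the right as is usual for fixed-length character types, and rejecting an over-long value is an assumption about how the database reacts.

```python
def pad_nchar(value: str, size: int) -> str:
    """Hypothetical helper: emulate blank-padding of an NCHAR(size) value."""
    if size < 1:
        raise ValueError("size must be at least one character")
    if len(value) > size:
        # Assumption: a value longer than the column length is rejected.
        raise ValueError(f"value has {len(value)} characters; column holds {size}")
    # Pad on the right with spaces; the length is counted in characters
    # (len() counts Unicode code points), not in bytes.
    return value.ljust(size)


stored = pad_nchar("北京", 5)
print(repr(stored))  # '北京   '
print(len(stored))   # 5
```

Because the stored value carries the padding, a comparison against the unpadded string "北京" in application code fails unless the trailing spaces are stripped first.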

Notice

You cannot insert a CHAR value into an NCHAR column or insert an NCHAR value into a CHAR column.

Syntax

NCHAR[(size)]

Parameters

  • size: The length of the fixed-length character string, in characters. The maximum length is 2,000 bytes, so the largest number of characters that you can specify depends on the national character set. The default and minimum length is one character.
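Because size is given in characters and the limit is given in bytes, the largest usable size depends on how many bytes the national character set needs per character. The sketch below shows the arithmetic only. max_nchar_size is a hypothetical helper, and the bytes-per-character values are assumptions about the encoding, not values taken from this page.

```python
MAX_NCHAR_BYTES = 2000  # byte limit stated on this page


def max_nchar_size(bytes_per_char: int) -> int:
    """Hypothetical helper: largest size (in characters) that fits the byte limit."""
    if bytes_per_char < 1:
        raise ValueError("bytes_per_char must be positive")
    return MAX_NCHAR_BYTES // bytes_per_char


print(max_nchar_size(2))  # 1000, assuming a fixed 2 bytes per character
print(max_nchar_size(3))  # 666, assuming up to 3 bytes per character
```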

More information

Unicode character set

Unicode is a character set that assigns a unique code point to each character. It provides UTF-8, UTF-16, UTF-32, and other encoding methods for converting code points to bytes. The encoding method determines the space required to store a character, so Chinese characters and English characters take up different amounts of space depending on the encoding method that is used.

Comparison of three encoding methods

UTF-8

  • Number of bytes for encoding characters: A variable-length encoding method of one to four bytes per character that provides single-byte encoding for ASCII characters and multibyte encoding for non-ASCII characters. The minimum code unit is eight bits.

  • BOM: Not required. If a byte stream starts with EF BB BF at the beginning of a text, the text is encoded in UTF-8.

  • Advantage: An ideal Unicode encoding method. This method is fully compatible with ASCII encoding, requires no BOM, features strong self-synchronization and error detection capabilities for network transmission and communication, and provides high scalability.

  • Disadvantage: The variable-length encoding makes internal processing of the program more difficult.

UTF-16

  • Number of bytes for encoding characters: Two or four bytes. The minimum code unit is 16 bits.

  • BOM: Required. UTF-16LE (little-endian) is represented by FF FE, and UTF-16BE (big-endian) is represented by FE FF.

  • Advantage: One of the earliest Unicode encoding methods, applied in a wide range of scenarios. This method is suitable for Unicode processing in memory, and is used to encode strings in APIs across multiple programming languages.

  • Disadvantage: This method is not compatible with ASCII encoding, and has poor scalability. The encoding is complex when surrogate pairs are used to encode code points in the supplementary planes.

UTF-32

  • Number of bytes for encoding characters: A fixed length of four bytes. The minimum code unit is 32 bits.

  • BOM: Required. UTF-32LE (little-endian) is represented by FF FE 00 00, and UTF-32BE (big-endian) is represented by 00 00 FE FF.

  • Advantage: A fixed-length encoding that is simple for programs to read and process internally. This method provides a one-to-one mapping between Unicode code points and code units.

  • Disadvantage: All characters are encoded in a fixed length of four bytes, which wastes storage space and bandwidth. This method is not compatible with ASCII encoding, has poor scalability, and is not used in most cases.
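The per-character sizes and byte order marks in this comparison can be reproduced with Python's standard codecs module. The following is a minimal sketch rather than a complete encoding detector. detect_bom is a hypothetical helper, and the three sample characters are chosen only to cover ASCII, a common Chinese character, and a supplementary-plane character.

```python
import codecs

# BOM constants from the standard library. The UTF-32 marks come first so
# that the UTF-32LE mark (FF FE 00 00) is not mistaken for UTF-16LE (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Hypothetical helper: return the encoding implied by a leading BOM, or None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def encoded_sizes(ch: str) -> dict:
    """Bytes needed for one character in each encoding (no BOM included)."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}


for ch in ("A", "中", "😀"):
    print(repr(ch), encoded_sizes(ch))

print(detect_bom(b"\xef\xbb\xbfhello"))                            # utf-8
print(detect_bom(codecs.BOM_UTF16_BE + "hi".encode("utf-16-be")))  # utf-16-be
print(detect_bom(b"plain ascii"))                                  # None
```

The loop shows "A" at 1, 2, and 4 bytes, "中" at 3, 2, and 4 bytes, and "😀" at 4 bytes in all three encodings, where the UTF-16 form is a surrogate pair.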

Database character set

  • Used to store data of types such as CHAR, VARCHAR2, and CLOB

  • Used for identifiers such as table names, column names, and PL/SQL variables

  • Used to store SQL and PL/SQL program units

National character set

  • Used to store data of types such as NCHAR, NVARCHAR2, and NCLOB

  • The national character set is an additional character set that is selected for ApsaraDB for OceanBase, mainly to enhance the character processing capability of the database. The NCHAR data type uses the national character set, whereas the CHAR data type uses the database character set. NCHAR therefore provides an alternative to the database character set for storing character data.