UTL_I18N is a PL/SQL package that provides globalization utilities for PolarDB for Oracle applications. It includes functions to convert between strings and character references, and to convert strings and RAW data across character sets.
Subprograms
| Subprogram | Description |
|---|---|
| ESCAPE_REFERENCE | Converts a text string to HTML/XML character references |
| UNESCAPE_REFERENCE | Converts character references back to a text string |
| STRING_TO_RAW | Converts a VARCHAR2 or NVARCHAR2 string to RAW data in a target character set |
| RAW_TO_CHAR | Converts RAW data to a VARCHAR2 string using a specified source character set |
ESCAPE_REFERENCE
Converts a text string to character references for use in HTML and XML documents. Character references represent characters independently of the document encoding, and appear in two forms:
Numeric character reference: Uses the Unicode code point. For example, &#xe5; represents the letter "å" (small "a" with a ring above).
Character entity reference: Uses a symbolic name. For example, &aring; represents the same character. Special characters such as < are also escaped: writing &lt; avoids confusion with HTML/XML tag syntax.
Supported character sets: SQL_ASCII, UTF8, EUC_CN, GB18030, ISO_8859_5, LATIN1.
Syntax
UTL_I18N.ESCAPE_REFERENCE(
str IN VARCHAR2 CHARACTER SET ANY_CS,
page_cs_name IN VARCHAR2 DEFAULT NULL)
RETURN VARCHAR2 CHARACTER SET str%CHARSET;
Parameters
| Parameter | Description |
|---|---|
| str | The string to escape. |
| page_cs_name | (Optional) The character set of the document. |
Return values
| Return value | Description |
|---|---|
| VARCHAR2 | The escaped string, with applicable characters replaced by character references. |
Example
SELECT UTL_I18N.ESCAPE_REFERENCE('hello < ' || chr(229), 'sql_ascii') FROM dual;
Output:
escape_reference
-------------------
hello &lt; &#xe5;
(1 row)
The < character is escaped to &lt; (a character entity reference), and chr(229) (å) is escaped to &#xe5; (a numeric character reference), because the sql_ascii character set cannot represent å.
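Because UNESCAPE_REFERENCE decodes what ESCAPE_REFERENCE produces, one quick sanity check is to escape a string and then unescape the result. The query below is a minimal sketch that uses only the signatures documented on this page. It assumes that chr(229) yields å in the current database encoding, and the column aliases are illustrative.

```sql
-- Sketch: escape a string for an ASCII-only page, then reverse the escaping.
-- Assumption: chr(229) yields 'å' in the current database encoding.
SELECT UTL_I18N.ESCAPE_REFERENCE('hello < ' || chr(229), 'sql_ascii') AS escaped,
       UTL_I18N.UNESCAPE_REFERENCE(
           UTL_I18N.ESCAPE_REFERENCE('hello < ' || chr(229), 'sql_ascii')
       ) AS round_trip
FROM dual;
```

If the two functions behave as inverses for this input, the round_trip column should match the original string.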
UNESCAPE_REFERENCE
Decodes each character reference in an input string to its corresponding character value, returning the original unescaped text.
Syntax
UTL_I18N.UNESCAPE_REFERENCE(
str IN VARCHAR2 CHARACTER SET ANY_CS)
RETURN VARCHAR2 CHARACTER SET str%CHARSET;
Parameters
| Parameter | Description |
|---|---|
| str | The string containing character references to unescape. |
Return values
| Return value | Description |
|---|---|
| VARCHAR2 | The unescaped string with character references decoded to their character values. |
Example
SELECT UTL_I18N.UNESCAPE_REFERENCE('hello &lt; &#xe5;') FROM dual;
Output:
unescape_reference
--------------------
hello < å
(1 row)
STRING_TO_RAW
Converts a VARCHAR2 or NVARCHAR2 string to a target character set and returns the result as RAW data.
Syntax
UTL_I18N.STRING_TO_RAW(
data IN VARCHAR2 CHARACTER SET ANY_CS,
dst_charset IN VARCHAR2 DEFAULT NULL)
RETURN RAW;
Parameters
| Parameter | Description |
|---|---|
| data | The VARCHAR2 or NVARCHAR2 string to convert. |
| dst_charset | (Optional) The destination character set. |
Return values
| Return value | Description |
|---|---|
| RAW | The input string encoded in the destination character set, returned as RAW data. |
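To see how dst_charset affects the returned bytes, it can help to encode the same non-ASCII character in two character sets. The following is only a sketch. It assumes that chr(229) yields å (U+00E5) in the current database encoding and that STRING_TO_RAW accepts both character set names shown, which this page does not confirm. The column aliases are illustrative.

```sql
-- Sketch: encode the same character in two destination character sets.
-- Assumptions: chr(229) yields 'å' (U+00E5) in the current database encoding,
-- and both 'utf8' and 'latin1' are accepted as dst_charset values.
SELECT utl_i18n.string_to_raw(chr(229), 'utf8')   AS utf8_bytes,
       utl_i18n.string_to_raw(chr(229), 'latin1') AS latin1_bytes
FROM dual;
```

In UTF-8, å is the two-byte sequence C3 A5, while in LATIN1 it is the single byte E5, so the two columns would be expected to differ in length.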
Example
SELECT utl_i18n.string_to_raw('abcdef', 'utf8') FROM dual;
Output:
string_to_raw
----------------
\x616263646566
(1 row)
RAW_TO_CHAR
Converts RAW data that is encoded in a specified character set, which can differ from the database character set, to a VARCHAR2 string.
Syntax
UTL_I18N.RAW_TO_CHAR(
data IN RAW,
src_charset IN VARCHAR2 DEFAULT NULL)
RETURN VARCHAR2;
Parameters
| Parameter | Description |
|---|---|
| data | The RAW data to convert. |
| src_charset | (Optional) The character set of the RAW data. |
Return values
| Return value | Description |
|---|---|
| VARCHAR2 | The VARCHAR2 string decoded from the RAW data using the specified source character set. |
Example
Convert RAW data (UTF-8 encoded hex bytes for the uppercase alphabet) to a VARCHAR2 string:
SELECT utl_i18n.raw_to_char(
'\x4142434445464748494a4b4c4d4e4f505152535455565758595a'::bytea,
'UTF8'
) FROM dual;
Output:
raw_to_char
--------------
ABCDEFGHIJKLMNOPQRSTUVWXYZ
(1 row)
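STRING_TO_RAW and RAW_TO_CHAR can also be chained to check that a string survives conversion to a given character set and back. The query below is a minimal sketch built from the two signatures documented on this page. The round_trip alias is illustrative.

```sql
-- Sketch: encode a string as UTF-8 bytes, then decode those bytes again.
SELECT utl_i18n.raw_to_char(
           utl_i18n.string_to_raw('abcdef', 'utf8'),
           'utf8'
       ) AS round_trip
FROM dual;
```

When the same character set is passed to both calls, the result should be the original string. Passing mismatched character sets is a common cause of garbled text.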