All Products
Search
Document Center

MaxCompute:RLIKE

Last Updated:Mar 26, 2026

RLIKE is a MaxCompute SQL operator based on the Perl Compatible Regular Expressions (PCRE) standard. Use it to perform pattern matching directly in SQL queries.

Support scope: RLIKE supports PCRE syntax, including the Perl backslash sequences \d, \D, and back references (\1\9). For the complete PCRE specification, see the PCRE documentation.

Known limitation: MaxCompute does not support the escape sequences \f (form feed) and \v (vertical tab). For supported escape sequences, see Escape characters.

Metacharacters

The following table lists the metacharacters supported by RLIKE.

Metacharacter Description
^ Matches the beginning of a string.
$ Matches the end of a string.
. Matches any single character.
* Matches the preceding character or pattern zero or more times.
+ Matches the preceding character or pattern one or more times.
? Matches the preceding character or pattern zero or one time. When placed after a quantifier (*, +, ?, {n}, {n,}, or {n,m}), switches that quantifier to non-greedy mode. Non-greedy mode matches as few characters as possible; the default greedy mode matches as many as possible.
A|B Matches A or B.
(abc)* Matches the sequence abc zero or more times.
{n} or {m,n} {n} matches exactly n times. {m,n} matches between m and n times.
[ab] Matches any character listed in the brackets.
[a-d] Matches any character in the range a through d (a, b, c, or d).
[^ab] ^ means NOT. Matches any character that is not a or b.
[::] POSIX character group syntax. See POSIX character groups.
\\ Escape character. See Escape character rules.
\\n Back reference, where n is a digit from 1 to 9. Refers to the nth capturing group.
\\d Matches any digit (0–9).
\\D Matches any non-digit character.

Escape character rules

RLIKE uses the backslash (\) as an escape character. Because a regular expression pattern is written as a SQL string literal, escaping happens at two layers:

  1. SQL string layer: The SQL parser processes the string first. A single \ in source code becomes a literal backslash in memory.

  2. Regex engine layer: The PCRE engine then processes the resulting string. A \ before a special character escapes it in the regex.

Because of this two-layer processing, every backslash in a RLIKE pattern must be written twice in SQL source code.

Goal Regex engine sees SQL source code
Match the literal + \+ \\+
Match a literal backslash \ \\ \\\\
Match a digit (\d) \d \\\d
Match a back reference to group 2 (\2) \2 \\\2
Tab characters (\t) are an exception. When MaxCompute reads a tab from a string literal, it stores it as a single character. The RLIKE pattern a\tb matches a tab directly—no extra escaping is needed.

Examples

General rule matching

-- Match a string that begins with 'a'. Returns true.
SELECT 'aa123bb' RLIKE '^a';

-- Match a string that begins with 'a' and ends with 'b'. Returns true.
SELECT 'aa123bb' RLIKE '^a.*b$';

-- Match a string that contains 'foo' followed by 'bar'. Returns true.
SELECT 'footerbar' RLIKE 'foo(.*?)(bar)';

-- Match a string that begins with 'foo' and ends with 'bar'. Returns true.
SELECT 'footerbar' RLIKE '^foo(.*?)(bar)$';

-- Match a string whose first character is in the range a-d. Returns true.
SELECT 'cc123bb' RLIKE '^[a-d]';

-- Match '12', then one or more characters from a-d, then '34'. Returns true.
SELECT '12abc34' RLIKE '^12[a-d]+34$';

Escape character matching

Matching a special regex character (`+`)

To match the literal string a+b, escape + for the regex engine (\+), then escape the backslash again for the SQL string layer (\\+):

SELECT 'a+b' RLIKE 'a\\+b';
-- Returns: true

Matching a literal backslash

To match \, the regex engine needs \\. With the SQL string layer, write \\\\:

-- Wrong: only one layer of escaping. Returns false.
SELECT 'a\\b', 'a\\b' RLIKE 'a\\\b';
-- _c0: a\b | _c1: false

-- Correct: two layers of escaping. Returns true.
SELECT 'a\\b', 'a\\b' RLIKE 'a\\\\b';
-- _c0: a\b | _c1: true
In MaxCompute SQL, writing a\\b in source code displays as a\b in query output, because MaxCompute processes the SQL string escape before displaying the value.

Matching a tab character

Tab characters need no extra escaping:

SELECT 'a\tb', 'a\tb' RLIKE 'a\tb';
-- _c0: a	b | _c1: true

Matching digits, non-digits, and back references

All Perl backslash sequences need the extra SQL-layer escape:

-- Match a string that begins with a digit. Returns true.
SELECT '2025maxcompute' RLIKE '^\\\d';

-- Match a string that begins with a non-digit character. Returns true.
SELECT 'maxcompute2025test' RLIKE '^\\\D';

-- Match a string containing 'MC' followed by any digit. Returns true.
SELECT 'alibaba-cloud-MC2025-test' RLIKE 'MC\\\d';

-- Match a string containing 'MC' followed by exactly four digits and '-'. Returns true.
SELECT 'alibaba-cloud-MC2025-test' RLIKE 'MC\\\d{4}-';

-- Match a string containing 'abcdefdef' (back reference to capturing group 2).
-- Group 1: (abc), Group 2: (def), \2 matches 'def' again. Returns true.
SELECT 'mmabcdefdefgg' RLIKE '(abc)(def)\\\2';

Character groups

POSIX character groups

Character group Description Equivalent
[[:alnum:]] Letters and digits [a-zA-Z0-9]
[[:alpha:]] Letters [a-zA-Z]
[[:ascii:]] ASCII characters [\x00-\x7F]
[[:blank:]] Space and tab [ \t]
[[:cntrl:]] Control characters [\x00-\x1F\x7F]
[[:digit:]] Digits [0-9]
[[:graph:]] Printable characters except space [\x21-\x7E]
[[:lower:]] Lowercase letters [a-z]
[[:print:]] [:graph:] plus space [\x20-\x7E]
[[:punct:]] Punctuation [][!"#$%&'()*+,./:;<=>?@\^_{|}~-]`
[[:space:]] Whitespace characters [ \t\r\n\v\f]
[[:upper:]] Uppercase letters [A-Z]
[[:xdigit:]] Hexadecimal digits [A-Fa-f0-9]

Chinese character groups

Chinese punctuation has no unified Unicode range. To match a specific punctuation character, look up its Unicode code point and use it directly.

Character group Pattern
Double-byte characters (e.g., Chinese characters) [^\\\x{00}-\\\x{ff}]
Chinese characters [\\x{4e00}-\\x{9fa5}]
Chinese punctuation No unified range. Search for the Unicode code point and use [\\x{NNNN}].

Examples

Match rows containing the Chinese period (Unicode U+3002):

SELECT * FROM VALUES ('Hello.'),('nihao!') t(d) WHERE d RLIKE '[\\x{3002}]';

-- Returns:
-- +--------+
-- | d      |
-- +--------+
-- | Hello. |
-- +--------+

To match a single quotation mark, use its Unicode code point U+0027: [\\x{0027}].