All Products
Search
Document Center

MaxCompute:RLIKE

Last Updated:Dec 28, 2023

RLIKE is an operator that is provided by MaxCompute SQL based on the Perl Compatible Regular Expressions (PCRE) standards. You can use the RLIKE operator to perform more accurate and complex pattern matching or replacement in MaxCompute SQL. This topic describes metacharacters supported by RLIKE, POSIX character groups, Chinese character groups, and escape characters. This topic also provides examples of the related characters.

Metacharacter

The RLIKE operator supports various metacharacters. The following table describes commonly used metacharacters that are supported by RLIKE.

Metacharacter

Description

^

Matches the beginning of a string.

$

Matches the end of a string.

.

Matches any single character.

*

Matches the preceding character or character pattern zero or more times.

+

Matches the preceding character or character pattern one or more times.

?

Matches the preceding character or character pattern zero or one time.

?

Matches a modifier. If this character follows one of other characters (*, +, ?, {n}, {n,}, or {n,m}), the match pattern is non-greedy. In the non-greedy algorithm, a character string matches as few characters as possible. In the greedy algorithm, a character string matches as many characters as possible. By default, the greedy algorithm is used.

A|B

Matches A or B.

(abc)*

Matches the abc sequence zero or more times.

{n} or {m,n}

The number of matches.

[ab]

Matches any character in the brackets. Fuzzy match is used.

[a-d]

Matches one of the following characters: a, b, c, and d.

[^ab]

Matches any character except those in the brackets.

[::]

For more information, see POSIX character groups.

\

The escape character. For more information, see Escape characters.

\n

n is a digit ranging from 1 to 9 and indicates back reference.

\d

A digit.

\D

A non-digit character.

  • For more information about regular expressions, visit pcre2syntax man page.

  • If the matching result does not meet your expectations, escape characters may be required. For more information, see Escape characters.

Examples

  • Example 1: Match the beginning and end of a string.

    select 'aa123bb' rlike '^a';
    -- Match the beginning of the string aa123bb with a. true is returned.
    
    select 'aa123bb' rlike '^a.*b$';
    -- Match the beginning of the string aa123bb with a and the end of the string aa123bb with b. true is returned.
    
    select 'footerbar' rlike 'foo(.*?)(bar)';
    -- true is returned.
  • Example 2: Use an escape character to match a string that has special characters.

    select 'a+b' rlike 'a\\\+b';
    -- true is returned.
  • Example 3: Match any single character.

    select 'cc123bb' rlike '^[a-d]';
    -- Match the beginning of the string cc123bb with one of the characters from a to d. true is returned.
  • Example 4: Match any digit and any non-digit character.

    select '123bb' rlike '^\\\d';
    -- Match the beginning of the string 123bb with any digit. true is returned.
    
    select 'cc123bb' rlike '^\\\D';
    -- Match the beginning of the string cc123bb with any non-digit character. true is returned.

Character groups

POSIX character groups

Character group

Description

Valid value

[[:alnum:]]

Letters and digits

[a-zA-Z0-9]

[[:alpha:]]

Letters

[a-zA-Z]

[[:ascii:]]

ASCII characters

[\x00-\x7F]

[[:blank:]]

Spaces and tab characters

[ \t]

[[:cntrl:]]

Control characters

[\x00-\x1F\x7F]

[[:digit:]]

Digits

[0-9]

[[:graph:]]

Characters other than whitespace characters

[\x21-\x7E]

[[:lower:]]

Lowercase letters

[a-z]

[[:print:]]

[:graph:] and whitespace characters

[\x20-\x7E]

[[:punct:]]

Punctuations

[][!”#$%&’()*+,./:;<=>? @\^_`{|}~-]

[[:space:]]

Whitespace characters

[ \t\r\n\v\f]

[[:upper:]]

Uppercase letters

[A-Z]

[[:xdigit:]]

Hexadecimal characters

[A-Fa-f0-9]

Note

MaxCompute does not support the \f and \v escape characters. For more information about the escape characters that are supported by MaxCompute, see Escape characters.

Chinese character groups

Character group

Valid value

Double-byte characters such as Chinese characters

[^\\x{00}-\\x{ff}]

Chinese characters

[\\x{4e00}-\\x{9fa5}]

Chinese punctuations

Chinese punctuations do not have unified Unicode range. You can use a search engine to search for the Unicode of a Chinese punctuation and use operators to match Chinese punctuations.

For example, the Chinese period (。) is expressed as [\\x{3002}]. If you execute the select * from values ('你好。 '),('nihao! t(d) where d rlike '[\\x{3002}]'; statement, the returned result is 你好。. .

Note

If you want to match Chinese single quotation marks (’), you can use unicode0027. The Chinese single quotation mark (’) is expressed as [\\x{0027}].

Escape characters

RLIKE allows you to use backslashes (\) as escape characters. Therefore, backslashes (\) in regular expressions must also be escaped.

Examples

  • Example 1

    Use a regular expression to match the a+b string. The plus sign (+) in the expression is a special character in the expression and must be escaped. In the regular expression engine, the string is expressed as a\\+b. The expression must be escaped again. Therefore, the expression that can match the string is a\\\+b.

    select 'a+b' rlike 'a\\\+b';
    
    +------+
    | _c1  |
    +------+
    | true |
    +------+
  • Example 2

    Match the backslash (\) character. In specific cases, The backslash (\) character is a special character in the regular expression engine and must be expressed as \\ in the engine. Then, the expression must be escaped again. As a result, the backslash (\) character is expressed as \\\\.

    select 'a\\b', 'a\\b' rlike 'a\\\b';
    +-----+------+
    | _c0 | _c1  |
    +-----+------+
    | a\b | false |
    +-----+------+
    
    select 'a\\b', 'a\\b' rlike 'a\\\\b';
    +-----+------+
    | _c0 | _c1  |
    +-----+------+
    | a\b | true |
    +-----+------+
    Note

    If a MaxCompute SQL statement contains a\\b, MaxCompute returns a\b because MaxCompute escapes the expression.

  • Example 3

    Match a tab character. If a string contains tab characters, MaxCompute stores \t as one character when MaxCompute reads this expression. Therefore, \t is a common character in regular expressions.

    select 'a\tb', 'a\tb' rlike 'a\tb';
    
    +---------+------+
    | _c0     | _c1  |
    +---------+------+
    | a     b | true |
    +---------+------+