MaxCompute:REGEXP_EXTRACT

Last Updated:Jul 24, 2023

Splits the source string into groups based on the groups defined in a given pattern and returns the substring that falls in the group specified by groupid.

Syntax

string regexp_extract(string <source>, string <pattern>[, bigint <groupid>])

Parameters

  • source: required. A value of the STRING type. This parameter specifies the string that you want to split.

  • pattern: required. A constant of the STRING type or a regular expression. This parameter specifies the pattern based on which you split a string. For more information about regular expressions, see Regular expressions.

  • groupid: optional. A constant of the BIGINT type. The value of this parameter must be greater than or equal to 0.
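The following is a minimal usage sketch for the common case in which groupid is omitted. The literal values are illustrative. The backslash in \d is written twice on the assumption that backslashes must be escaped inside a SQL string literal.

```sql
-- groupid is omitted, so the string in the first group, (\d+), is returned.
select regexp_extract('8d99d8', '8d(\\d+)d8');
-- Expected result: 99
```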

Note

Data is stored in the UTF-8 format. Chinese characters can be represented in hexadecimal in a pattern. Their code points fall in the range of [\\x{4e00},\\x{9fa5}].

Return value

A value of the STRING type is returned. The return value varies based on the following rules:

  • If pattern is an empty string or no group is specified in pattern, an error is returned.

  • If the value of groupid is not of the BIGINT type or is less than 0, an error is returned. If you do not specify this parameter, the default value 1 is used, which indicates that the string in the first group is returned. If groupid is set to 0, the substring that matches the entire pattern is returned.

  • If the value of source, pattern, or groupid is null, null is returned.
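The preceding rules can be sketched with the following statements. The source string and pattern are illustrative values. The pattern foo(.*?)(bar) defines two groups: (.*?) and (bar).

```sql
-- groupid is 1: the string in the first group is returned.
select regexp_extract('foothebar', 'foo(.*?)(bar)', 1);
-- Expected result: the

-- groupid is 2: the string in the second group is returned.
select regexp_extract('foothebar', 'foo(.*?)(bar)', 2);
-- Expected result: bar

-- groupid is 0: the substring that matches the entire pattern is returned.
select regexp_extract('foothebar', 'foo(.*?)(bar)', 0);
-- Expected result: foothebar

-- pattern is null: null is returned.
select regexp_extract('foothebar', null, 2);

-- No group is specified in pattern: an error is returned.
select regexp_extract('foothebar', 'foothebar');
```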

Related functions

REGEXP_EXTRACT is a string function. For more information about functions related to string searches and conversion, see String functions.