The REGEXP_EXTRACT function searches the source string for a match to the regular expression pattern and returns the content of the group specified by group_id.
Usage notes
The REGEXP_EXTRACT function follows Java regular expression specifications for the Hive-compatible data type edition. For the 1.0 and 2.0 data type editions, it follows MaxCompute specifications.
Syntax
STRING REGEXP_EXTRACT(STRING <source>, STRING <pattern>[, BIGINT <group_id>])Parameters
source: Required. A value of the STRING type. This parameter specifies the string to be searched.
pattern: Required. A constant or regular expression of the STRING type. This parameter specifies the pattern for matching. For more information about how to write regular expressions, see Regular expression specifications.
group_id: Optional. A value of the BIGINT type. The value must be greater than or equal to 0.
If group_id is not specified, the default value is 1, and the function returns the first matched group.
If group_id is specified, the function returns the content of the matched group group_id.
If group_id is 0, the function returns the substring that matches the entire pattern.
Data is stored in UTF-8 format. The hexadecimal code range for Chinese characters is [\\x{4e00},\\x{9fa5}].
Return value
A value of the STRING type is returned. The following rules apply:
If pattern is an empty string, or if pattern contains no capturing groups and group_id is not specified, an error is returned.
NoteRun
SET odps.sql.bigquery.compatible=true;to enable BigQuery compatible mode. If pattern contains no groups and group_id is not specified, the function returns the substring that matches the entire pattern.If group_id is not of the BIGINT type or is less than 0, an error is returned.
If source, pattern, or group_id is NULL, NULL is returned.
Related functions
REGEXP_EXTRACT is a string function. For more information about other string functions, see String functions.