REGEXP_EXTRACT searches a source string for a substring that matches a regular expression pattern and returns the content of the capturing group specified by group_id.
Syntax
STRING REGEXP_EXTRACT(STRING <source>, STRING <pattern>[, BIGINT <group_id>])
Overloads:
-
REGEXP_EXTRACT(source, pattern)— returns the first capturing group (equivalent togroup_id = 1) -
REGEXP_EXTRACT(source, pattern, group_id)— returns the capturing group at positiongroup_id
Parameters
-
source: Required. STRING. The string to search.
-
pattern: Required. STRING constant or regular expression. The pattern to match against
source. For syntax details, see Regular expression specifications. -
group_id: Optional. BIGINT. The index of the capturing group to return. Must be ≥ 0.
-
If omitted, defaults to
1and returns the first capturing group. -
If
0, returns the substring matching the entire pattern. -
If a positive integer
n, returns the nth capturing group.
-
Data is stored in UTF-8. The hexadecimal range for Chinese characters is [\x{4e00},\x{9fa5}].
Return value
Returns a STRING value. The following conditions apply:
-
Returns NULL if
source,pattern, orgroup_idis NULL. -
Returns an error if
patternis an empty string. -
Returns an error if
patterncontains no capturing groups andgroup_idis not specified. -
Returns an error if
group_idis not a BIGINT or is less than 0.
To change the behavior whenpatternhas no capturing groups andgroup_idis not specified, runSET odps.sql.bigquery.compatible=true;to enable BigQuery compatible mode. In this mode, the function returns the substring matching the entire pattern instead of an error.
Usage notes
Regex specification varies by data type edition:
| Data type edition | Regex specification |
|---|---|
| Hive-compatible data type edition | Java regular expression |
| 1.0 data type edition | MaxCompute |
| 2.0 data type edition | MaxCompute |
For more information, see Hive-compatible data type edition, 1.0 data type edition, and 2.0 data type edition.
Related functions
REGEXP_EXTRACT is a string function. For other string functions, see String functions.