The REGEXP_EXTRACT_ALL function extracts all substrings from the source string that match the pattern and returns them as an array.
Syntax
ARRAY<T> REGEXP_EXTRACT_ALL(STRING <source>, STRING <pattern>[, BIGINT <group_id>])Parameters
source: Required. A STRING value that specifies the string to search.
pattern: Required. A STRING constant or a regular expression that specifies the pattern to match. For more information about regular expressions, see Regular expression specifications.
group_id: Optional. A BIGINT value that must be greater than or equal to 0.
If you do not specify group_id, the system uses a default value of 1 and returns an array of all matching results where group_id is 1.
If you specify a group_id, the function returns an array of all results that match that group_id.
If group_id is 0, the entire pattern is treated as a single group.
Return value
Returns a value of the ARRAY<STRING> type. The following rules apply:
An error is returned if pattern is an empty string, or if pattern contains no capturing groups and group_id is not specified.
NoteYou can run the
SET odps.sql.bigquery.compatible=true;command to enable BigQuery compatibility mode. In this mode, if the pattern contains no capturing groups and group_id is not specified, the function returns an array of substrings that match the entire pattern.An error is returned if group_id is not a BIGINT value or is less than 0.
If source, pattern, or group_id is NULL, the function returns NULL.
Examples
Example 1: If you do not specify group_id, results for the first group_id are returned by default.
-- Returns ["100","300"]. SELECT REGEXP_EXTRACT_ALL('100-200, 300-400', '(\\d+)-(\\d+)');Example 2: Set the group_id value to
2. This returns the results for the second group_id.-- Returns ["200","400"]. SELECT REGEXP_EXTRACT_ALL('100-200, 300-400', '(\\d+)-(\\d+)',2);Example 3: The pattern contains no capturing groups. The following scenarios describe the command usage:
Scenario 1: If pattern contains no capturing groups and group_id is not specified, an error is returned. In BigQuery compatibility mode, the function returns an array of substrings that match the entire pattern.
-- FAILED: ODPS-0130071:[0,0] Semantic analysis exception - physical plan generation failed: SQL Runtime Unretryable Error: ODPS-0121065:Argument value out of range - Regex group count is 0, but the specified group index is 1 SELECT REGEXP_EXTRACT_ALL('100-200, 300-200', '\\d+-\\d+'); -- Enable BigQuery compatibility mode. SET odps.sql.bigquery.compatible=true; -- Returns ["100-200","300-200"]. SELECT REGEXP_EXTRACT_ALL('100-200, 300-200', '\\d+-\\d+');Scenario 2: If pattern contains no capturing groups and group_id is set to 0, the function returns an array of substrings that match the entire pattern.
-- Returns ["100-200","300-200"]. SELECT REGEXP_EXTRACT_ALL('100-200, 300-200', '\\d+-\\d+',0);
Related functions
REGEXP_EXTRACT_ALL is a string function. For more information about string functions, see String functions.