All Products
Search
Document Center

MaxCompute:REGEXP_EXTRACT_ALL

Last Updated:Oct 23, 2025

The REGEXP_EXTRACT_ALL function extracts all substrings from the source string that match the pattern and returns them as an array.

Syntax

ARRAY<T> REGEXP_EXTRACT_ALL(STRING <source>, STRING <pattern>[, BIGINT <group_id>])

Parameters

  • source: Required. A STRING value that specifies the string to search.

  • pattern: Required. A STRING constant or a regular expression that specifies the pattern to match. For more information about regular expressions, see Regular expression specifications.

  • group_id: Optional. A BIGINT value that must be greater than or equal to 0.

    • If you do not specify group_id, the system uses a default value of 1 and returns an array of all matching results where group_id is 1.

    • If you specify a group_id, the function returns an array of all results that match that group_id.

    • If group_id is 0, the entire pattern is treated as a single group.

Return value

Returns a value of the ARRAY<STRING> type. The following rules apply:

  • An error is returned if pattern is an empty string, or if pattern contains no capturing groups and group_id is not specified.

    Note

    You can run the SET odps.sql.bigquery.compatible=true; command to enable BigQuery compatibility mode. In this mode, if the pattern contains no capturing groups and group_id is not specified, the function returns an array of substrings that match the entire pattern.

  • An error is returned if group_id is not a BIGINT value or is less than 0.

  • If source, pattern, or group_id is NULL, the function returns NULL.

Examples

  • Example 1: If you do not specify group_id, results for the first group_id are returned by default.

    -- Returns ["100","300"].
    SELECT REGEXP_EXTRACT_ALL('100-200, 300-400', '(\\d+)-(\\d+)');
  • Example 2: Set the group_id value to 2. This returns the results for the second group_id.

    -- Returns ["200","400"].
    SELECT REGEXP_EXTRACT_ALL('100-200, 300-400', '(\\d+)-(\\d+)',2);
  • Example 3: The pattern contains no capturing groups. The following scenarios describe the command usage:

    • Scenario 1: If pattern contains no capturing groups and group_id is not specified, an error is returned. In BigQuery compatibility mode, the function returns an array of substrings that match the entire pattern.

      -- FAILED: ODPS-0130071:[0,0] Semantic analysis exception - physical plan generation failed: SQL Runtime Unretryable Error: ODPS-0121065:Argument value out of range - Regex group count is 0, but the specified group index is 1
      SELECT REGEXP_EXTRACT_ALL('100-200, 300-200', '\\d+-\\d+'); 
      
      -- Enable BigQuery compatibility mode.
      SET odps.sql.bigquery.compatible=true;
      -- Returns ["100-200","300-200"].
      SELECT REGEXP_EXTRACT_ALL('100-200, 300-200', '\\d+-\\d+'); 
    • Scenario 2: If pattern contains no capturing groups and group_id is set to 0, the function returns an array of substrings that match the entire pattern.

      -- Returns ["100-200","300-200"].
      SELECT REGEXP_EXTRACT_ALL('100-200, 300-200', '\\d+-\\d+',0);

Related functions

REGEXP_EXTRACT_ALL is a string function. For more information about string functions, see String functions.