All Products
Search
Document Center

MaxCompute:REGEXP_EXTRACT

Last Updated:Mar 26, 2026

REGEXP_EXTRACT searches a source string for a substring that matches a regular expression pattern and returns the content of the capturing group specified by group_id.

Syntax

STRING REGEXP_EXTRACT(STRING <source>, STRING <pattern>[, BIGINT <group_id>])

Overloads:

  • REGEXP_EXTRACT(source, pattern) — returns the first capturing group (equivalent to group_id = 1)

  • REGEXP_EXTRACT(source, pattern, group_id) — returns the capturing group at position group_id

Parameters

  • source: Required. STRING. The string to search.

  • pattern: Required. STRING constant or regular expression. The pattern to match against source. For syntax details, see Regular expression specifications.

  • group_id: Optional. BIGINT. The index of the capturing group to return. Must be ≥ 0.

    • If omitted, defaults to 1 and returns the first capturing group.

    • If 0, returns the substring matching the entire pattern.

    • If a positive integer n, returns the nth capturing group.

Data is stored in UTF-8. The hexadecimal range for Chinese characters is [\x{4e00},\x{9fa5}].

Return value

Returns a STRING value. The following conditions apply:

  • Returns NULL if source, pattern, or group_id is NULL.

  • Returns an error if pattern is an empty string.

  • Returns an error if pattern contains no capturing groups and group_id is not specified.

  • Returns an error if group_id is not a BIGINT or is less than 0.

To change the behavior when pattern has no capturing groups and group_id is not specified, run SET odps.sql.bigquery.compatible=true; to enable BigQuery compatible mode. In this mode, the function returns the substring matching the entire pattern instead of an error.

Usage notes

Regex specification varies by data type edition:

Data type edition Regex specification
Hive-compatible data type edition Java regular expression
1.0 data type edition MaxCompute
2.0 data type edition MaxCompute

For more information, see Hive-compatible data type edition, 1.0 data type edition, and 2.0 data type edition.

Related functions

REGEXP_EXTRACT is a string function. For other string functions, see String functions.