The REGEXP_SUBSTR function searches a string for a pattern specified by a POSIX compliant regular expression. This function returns the string that matches the pattern specified in the call to the function.

Syntax

TEXT REGEXP_SUBSTR
(
  srcstr        TEXT, 
  pattern       TEXT, 
  position      INT  DEFAULT 1, 
  occurrence    INT  DEFAULT 1,
  modifier      TEXT DEFAULT NULL,
  subexpression INT  DEFAULT 0 
)     

Parameters

Parameter Description
srcstr The string to search.
pattern The regular expression for which REGEXP_SUBSTR will search.
position An integer value that indicates the start position in a source string. The default value is 1.
occurrence Specifies which match is returned if more than one occurrence of the pattern occurs in the string that is searched. The default value is 1.
modifier The values that control the pattern matching behavior. The default value is NULL. For a complete list of the modifiers supported by PolarDB-O, see the PostgreSQL core documentation.
subexpression An integer value that identifies the portion of the pattern that will be returned by REGEXP_SUBSTR. The default value of subexpression is 0.

If you specify a value for subexpression, you must include one (or multiple) set of parentheses in the pattern to isolate a portion of the value being searched. The value specified by subexpression indicates which set of parentheses will be returned. For example, if the value of subexpression is 2, REGEXP_SUBSTR returns the value contained within the second set of parentheses.

Examples

In the following example, the REGEXP_SUBSTR searches a string that contains a phone number for the first set of three consecutive digits:

polardb=# SELECT REGEXP_SUBSTR('800-555-****', '[0-9][0-9][0-9]', 1, 1) FROM DUAL;
 regexp_substr 
---------------
 800
(1 row)        

The function locates the first occurrence of three digits and returns the string (8 0 0). If you want to search for the second occurrence of three consecutive digits, use the following command:

polardb=# SELECT REGEXP_SUBSTR('800-555-****', '[0-9][0-9][0-9]', 1, 2) FROM DUAL;
 regexp_substr 
---------------
 555
(1 row)        

REGEXP_SUBSTR returns 555, which is the content of the second substring.