Learn how SLS data transformation handles regular expression matching, character escaping, and grouping.
Full match
A full match requires the regular expression to match the entire string. For example, \d+ is a full match for the string 1234.
Some functions support partial matches. To enforce a full match, add ^ at the start and $ at the end. For example: ^regular expression$. Full regex syntax is covered in the Python Regular expression operations reference.
The following table shows each function's matching mode.
| Category | Function | Matching mode |
|---|---|---|
| Global operation functions | | Partial match |
| | | Full match |
| | | Full match |
| | | Full match |
| | | Partial match |
| Expression functions | | Parameter-controlled. Defaults to full match. |
| | | Partial match |
| | | Partial match |
| | | Partial match |
| | | Parameter-controlled. Defaults to partial match. |
| | | Partial match |
| | | Partial match |
Matching mode examples:
- regex_match("abc123", r"\d+"): Matches (partial match by default).
- regex_match("abc123", r"\d+", full=True): No match (full match enabled).
- regex_match("abc123", r"^\d+$"): No match. Equivalent to full match mode.
- e_search(r'status~="\d+"'): Matches if the value of the status field contains one or more digits. Equivalent to partial match mode.
- e_search(r'status~="^\d+$"'): Matches only if the entire value of the status field consists of digits. Equivalent to full match mode.
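The two modes can be reproduced outside SLS with Python's standard re module, which the syntax reference above points to. The following is a minimal sketch, not the SLS implementation: it assumes that partial match behaves like re.search and full match like re.fullmatch, and the helper names partial_match and full_match are hypothetical.

```python
import re

def partial_match(value, pattern):
    """Assumed stand-in for partial match: the pattern may match anywhere in value."""
    return re.search(pattern, value) is not None

def full_match(value, pattern):
    """Assumed stand-in for full match: the pattern must cover the whole value."""
    return re.fullmatch(pattern, value) is not None

print(partial_match("abc123", r"\d+"))    # True: "123" is found inside the string
print(full_match("abc123", r"\d+"))       # False: "abc" is not covered by \d+
print(partial_match("abc123", r"^\d+$"))  # False: the anchors force a whole-string match
print(full_match("1234", r"\d+"))         # True: the whole string is digits
```

Adding ^ and $ to a pattern used in partial match mode gives the same results here as switching to full match mode.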
Character escaping
Regular expressions contain special characters. To match them literally, escape them using one of these methods:
- Use a backslash (\) to escape. The Character escaping topic covers the full syntax.
- Use the str_regex_escape function.
  - For example, e_drop_fields(str_regex_escape("abc.test")) drops only the abc.test field.
  - By contrast, e_drop_fields("abc.test") drops every field that matches abc?test, where the question mark (?) represents any single character, because the unescaped dot (.) is a regex wildcard.
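The effect of escaping can be checked with Python's re module. This sketch uses re.escape as an assumed analogue of str_regex_escape and re.fullmatch to imitate full-match field selection; the field names in field_names are made up for illustration.

```python
import re

# Hypothetical field names, used only for this illustration.
field_names = ["abc.test", "abcXtest", "abc_test"]

unescaped = "abc.test"           # the dot matches any single character
escaped = re.escape("abc.test")  # the dot is escaped and matches only a literal "."

matched_unescaped = [n for n in field_names if re.fullmatch(unescaped, n)]
matched_escaped = [n for n in field_names if re.fullmatch(escaped, n)]

print(matched_unescaped)  # ['abc.test', 'abcXtest', 'abc_test']
print(matched_escaped)    # ['abc.test']
```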
Grouping
Parentheses () group part of an expression so that it can be repeated as a unit or referenced later. The following example extracts an IP address with and without a group:
"""
Log before processing:
SourceIP: 192.0.2.1
Log after processing:
SourceIP: 192.0.2.1
ip: 192.0.2.1
"""
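# No group: each octet is written out explicitly.
e_regex("SourceIP", r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}", "ip")
# With group: the ".octet" part is grouped and repeated three times.
# Use a raw string and escape the dot so that it matches only a literal period.
e_regex("SourceIP", r"\d{1,3}(\.\d{1,3}){3}", "ip")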
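The two forms can be compared with Python's re module. This is a sketch of the pattern behavior only and says nothing about how e_regex assigns values to fields. Both patterns below write the dot as \. so that only a literal period separates the octets; the last check shows what an unescaped dot would accept.

```python
import re

source_ip = "192.0.2.1"

no_group = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
with_group = r"\d{1,3}(\.\d{1,3}){3}"

# Both patterns match the same text; the group only shortens the expression.
print(re.fullmatch(no_group, source_ip).group(0))    # 192.0.2.1
print(re.fullmatch(with_group, source_ip).group(0))  # 192.0.2.1

# With an unescaped dot, any character is accepted between the octets.
loose = r"\d{1,3}(.\d{1,3}){3}"
print(re.fullmatch(loose, "192a0b2c1") is not None)       # True
print(re.fullmatch(with_group, "192a0b2c1") is not None)  # False
```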
Capturing groups
A capturing group stores the match for later backreference. Any group whose parentheses () do not begin with ?: is a capturing group.
Capturing groups are numbered from 1, left to right, by opening parenthesis position. For example, the following expression has three groups:
(\d{4})-(\d{2}-(\d{2}))
1     1 2      3     32
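The numbering can be verified with Python's re module. The date string below is an arbitrary sample chosen for this sketch.

```python
import re

m = re.fullmatch(r"(\d{4})-(\d{2}-(\d{2}))", "2024-05-17")

print(m.group(1))  # 2024   (first opening parenthesis)
print(m.group(2))  # 05-17  (second opening parenthesis, the outer group)
print(m.group(3))  # 17     (third opening parenthesis, nested inside group 2)
```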
When a regex contains both standard and named capturing groups, standard groups are numbered first, then named groups. SLS supports referencing named groups directly by name in expressions or programs.
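Referencing a group by name looks like this in Python's re module, where a named group is written (?P<name>...). This sketch covers only name-based access in plain Python; it does not show how SLS functions expose named groups, and the group names year and month are chosen for illustration.

```python
import re

m = re.fullmatch(r"(?P<year>\d{4})-(?P<month>\d{2})", "2024-05")

print(m.group("year"))   # 2024
print(m.group("month"))  # 05
print(m.groupdict())     # {'year': '2024', 'month': '05'}
```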
Non-capturing groups
A non-capturing group does not store the match. Groups whose parentheses () start with ?: are non-capturing.
To match program or project, use pro(gram|ject). If you do not need to store the matched alternative, use the non-capturing form pro(?:gram|ject).
(?:x) matches x without storing the result, so you can group a subexpression and apply regex operators such as | or a quantifier to it without creating a capturing group.
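The difference is visible in Python's re module: both forms match the same text, but only the capturing form stores the alternative that matched. A minimal sketch:

```python
import re

capturing = re.fullmatch(r"pro(gram|ject)", "project")
non_capturing = re.fullmatch(r"pro(?:gram|ject)", "project")

print(capturing.group(0))      # project
print(capturing.groups())      # ('ject',)  the matched alternative is stored
print(non_capturing.group(0))  # project
print(non_capturing.groups())  # ()         nothing is stored
```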