When you use Logtail to collect logs, you can add Logtail plugins to extract log fields. You can extract fields using regular expressions (regex), anchors, CSV, single-character delimiters, multi-character delimiters, key-value pairs, or Grok mode. This topic describes the parameters for these plugins and provides configuration examples.
Limits
Text logs and container standard output support only form-based configuration. Other input plugins support only JSON configuration.
When you use regex mode to extract fields, the regular expressions have the following limits:
The Go regular expression engine is based on RE2. It has the following limitations compared with the Perl Compatible Regular Expressions (PCRE) engine:
Differences in named group syntax
Go uses the (?P<name>...) syntax, not the (?<name>...) syntax used by PCRE.
Unsupported regex patterns
Assertions: (?=...), (?!...), (?<=...), and (?<!...).
Conditional expressions: (?(condition)true|false).
Recursive matching: (?R) and (?0).
Subroutine references: (?&name) and (?P>name).
Atomic groups: (?>...).
When you debug regular expressions using tools such as Regex101, avoid the unsupported patterns. Otherwise, the plugin will not be able to process the expressions.
Entry point
To use a Logtail plugin to process logs, you can add a plugin configuration when you create or modify a Logtail collection configuration. For more information, see Overview of Logtail plugins for data processing.
Regex mode
You can extract target fields using a regular expression.
Form-based configuration
Parameters
Set Processor Type to Extract Fields (Regex Mode). The following table describes the available parameters.
Parameter
Description
Original Field
The name of the original field.
Regular Expression
The regular expression. You must enclose the field to be extracted in parentheses ().
Result Fields
Specify the names for the extracted fields. You can add multiple field names.
Error on Missing Original Field
If you select this option, an error is reported if the raw log does not contain the specified original field.
Report an error if the regular expression does not match
If you select this option, an error is reported if the value of the original field does not match the specified regular expression.
Retain Original Field
If you select this option, the original field is retained in the parsed log.
Retain Original Field on Parsing Failure
If you select this option, the original field is retained in the parsed log if parsing fails.
Require Full Regex Match
If you select this option, values are extracted only if the regular expression matches all the fields specified in Result Fields within the original field.
Configuration example
This example shows how to use regex mode to extract the value of the content field. The result fields are named ip, time, method, url, request_time, request_length, status, length, ref_url, and browser.
Raw log
"content" : "10.200.**.** - - [10/Aug/2022:14:57:51 +0800] \"POST /PutData? Category=YunOsAccountOpLog&AccessKeyId=<yourAccessKeyId>&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=<yourSignature> HTTP/1.1\" 0.024 18204 200 37 \"-\" \"aliyun-sdk-java"
Logtail plugin configuration for data processing
Processing result
"ip" : "10.200.**.**" "time" : "10/Aug/2022:14:57:51" "method" : "POST" "url" : "/PutData?Category=YunOsAccountOpLog&AccessKeyId=<yourAccessKeyId>&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=<yourSignature>" "request_time" : "0.024" "request_length" : "18204" "status" : "200" "length" : "27" "ref_url" : "-" "browser" : "aliyun-sdk-java"
JSON configuration
Parameters
Set type to processor_regex. The following table describes the parameters in detail.
Parameter
Type
Required
Description
SourceKey
String
Yes
The name of the original field.
Regex
String
Yes
The regular expression. You must enclose the field to be extracted in parentheses ().
Keys
String array
Yes
The names for the extracted fields. Example: ["ip", "time", "method"].
NoKeyError
Boolean
No
Specifies whether to report an error if the raw log does not contain the specified original field.
true: An error is reported.
false (default): No error is reported.
NoMatchError
Boolean
No
Specifies whether to report an error if the value of the original field does not match the specified regular expression.
true (default): An error is reported.
false: No error is reported.
KeepSource
Boolean
No
Specifies whether to retain the original field in the parsed log.
true: The original field is retained.
false (default): The original field is not retained.
FullMatch
Boolean
No
Specifies whether to require a full match for extraction.
true (default): Field values are extracted only if all fields that are specified in the Keys parameter are successfully matched from the original field by the regular expression in the Regex parameter.
false: Field values are extracted even if only some fields match.
KeepSourceIfParseError
Boolean
No
Specifies whether to retain the original field in the parsed log if parsing fails.
true (default): The original field is retained.
false: The original field is not retained.
Configuration example
This example shows how to use regex mode to extract the value of the content field. The result fields are named ip, time, method, url, request_time, request_length, status, length, ref_url, and browser.
Raw log
"content" : "10.200.**.** - - [10/Aug/2022:14:57:51 +0800] \"POST /PutData? Category=YunOsAccountOpLog&AccessKeyId=<yourAccessKeyId>&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=<yourSignature> HTTP/1.1\" 0.024 18204 200 37 \"-\" \"aliyun-sdk-java"Logtail plugin configuration for data processing
{
  "type" : "processor_regex",
  "detail" : {
    "SourceKey" : "content",
    "Regex" : "([\\d\\.]+) \\S+ \\S+ \\[(\\S+) \\S+\\] \"(\\w+) ([^\\\"]*)\" ([\\d\\.]+) (\\d+) (\\d+) (\\d+|-) \"([^\\\"]*)\" \"([^\\\"]*)\" (\\d+)",
    "Keys" : ["ip", "time", "method", "url", "request_time", "request_length", "status", "length", "ref_url", "browser"],
    "NoKeyError" : true,
    "NoMatchError" : true,
    "KeepSource" : false
  }
}
Processing result
"ip" : "10.200.**.**" "time" : "10/Aug/2022:14:57:51" "method" : "POST" "url" : "/PutData?Category=YunOsAccountOpLog&AccessKeyId=<yourAccessKeyId>&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=<yourSignature>" "request_time" : "0.024" "request_length" : "18204" "status" : "200" "length" : "27" "ref_url" : "-" "browser" : "aliyun-sdk-java"
Anchor mode
You can extract fields by specifying start and end keywords. If a field is in JSON format, you can expand the JSON field.
Form-based configuration
Parameters
Set Processor Type to Extract Fields (Anchor Mode). For a description of the parameters, see the following table.
Parameter
Description
Original Field
The name of the original field.
Anchor Item List
The list of anchor items.
Start Keyword
The start keyword. If this parameter is empty, matching starts from the beginning of the string.
End Keyword
The end keyword. If this parameter is empty, matching ends at the end of the string.
Results
Specify the name for the extracted field.
Field Type
The type of the field. Valid values: string and json.
JSON Expansion
Specifies whether to expand the JSON field.
JSON Expansion Connector
The character that is used to connect expanded JSON keys. The default value is an underscore (_).
Maximum Depth of JSON Expansion
The maximum depth for JSON expansion. The default value is 0, which indicates no limit.
Report an error if the original field is missing
If you select this option, an error is reported if the raw log does not contain the specified original field.
Report Error on Missing Anchor Item
If you select this option, an error is reported if the raw log does not contain a matching anchor item.
Retain Original Field
If you select this option, the original field is retained in the parsed log.
Configuration example
This example shows how to use anchor mode to extract the value of the content field. The result fields are named time, val_key1, val_key2, val_key3, value_key4_inner1, and value_key4_inner2.
Raw log
"content" : "time:2022.09.12 20:55:36\t json:{\"key1\" : \"xx\", \"key2\": false, \"key3\":123.456, \"key4\" : { \"inner1\" : 1, \"inner2\" : false}}"Logtail plugin configuration for data processing

Processing result
"time" : "2022.09.12 20:55:36" "val_key1" : "xx" "val_key2" : "false" "val_key3" : "123.456" "value_key4_inner1" : "1" "value_key4_inner2" : "false"
JSON configuration
Parameters
Set type to processor_anchor. The following table describes the parameters in detail.
Parameter
Type
Required
Description
SourceKey
String
Yes
The name of the original field.
Anchors
Anchor array
Yes
The list of anchor items.
Start
String
Yes
The start keyword. If this parameter is empty, matching starts from the beginning of the string.
Stop
String
Yes
The end keyword. If this parameter is empty, matching ends at the end of the string.
FieldName
String
Yes
The name for the extracted field.
FieldType
String
Yes
The type of the field. Valid values: string and json.
ExpondJson
Boolean
No
Specifies whether to expand the JSON field.
true: The JSON field is expanded.
false (default): The JSON field is not expanded.
This parameter takes effect only when FieldType is set to json.
ExpondConnecter
String
No
The character that is used to connect expanded JSON keys. The default value is an underscore (_).
MaxExpondDepth
Int
No
The maximum depth for JSON expansion. The default value is 0, which indicates no limit.
NoAnchorError
Boolean
No
Specifies whether to report an error if an anchor item is not found.
true: An error is reported.
false (default): No error is reported.
NoKeyError
Boolean
No
Specifies whether to report an error if the raw log does not contain the specified original field.
true: An error is reported.
false (default): No error is reported.
KeepSource
Boolean
No
Specifies whether to retain the original field in the parsed log.
true: The original field is retained.
false (default): The original field is not retained.
Configuration example
This example shows how to use anchor mode to extract the value of the content field. The result fields are named time, val_key1, val_key2, val_key3, value_key4_inner1, and value_key4_inner2.
Raw log
"content" : "time:2022.09.12 20:55:36\t json:{\"key1\" : \"xx\", \"key2\": false, \"key3\":123.456, \"key4\" : { \"inner1\" : 1, \"inner2\" : false}}"Logtail plugin configuration for data processing
{
  "type" : "processor_anchor",
  "detail" : {
    "SourceKey" : "content",
    "Anchors" : [
      {
        "Start" : "time",
        "Stop" : "\t",
        "FieldName" : "time",
        "FieldType" : "string",
        "ExpondJson" : false
      },
      {
        "Start" : "json:",
        "Stop" : "",
        "FieldName" : "val",
        "FieldType" : "json",
        "ExpondJson" : true
      }
    ]
  }
}
Processing result
"time" : "2022.09.12 20:55:36" "val_key1" : "xx" "val_key2" : "false" "val_key3" : "123.456" "value_key4_inner1" : "1" "value_key4_inner2" : "false"
CSV mode
You can parse logs in CSV format.
Form-based configuration
Parameters
For Processor Type, select Extract Fields (CSV Mode). The following table describes the parameters.
Parameter
Description
Original Field
The name of the original field.
Result Fields
Specify the names for the extracted fields. You can add multiple field names.
Important: If the number of fields to be split is smaller than the number of fields specified for Result Fields, the extra fields in Result Fields are ignored.
Delimiter
The delimiter. The default value is a comma (,).
Retain Excess Content
If you select this option, excess content is retained if the number of fields to be split is greater than the number of fields specified for Result Fields.
Parse Excess Content
If you select this option, the excess content is parsed. You can use Prefix For Excess Field Names to specify a prefix for the names of the excess fields.
If you select Retain Excess Content but do not select Parse Excess Content, the excess content is stored in the _decode_preserve_ field.
Note: If the excess content is not in the standard CSV format, you must standardize the content before you store it.
Prefix For Excess Field Names
The prefix for the names of excess fields. For example, if you set this parameter to expand_, the field names are expand_1, expand_2, and so on.
Ignore Leading Spaces In Fields
If you select this option, leading spaces in field values are ignored.
Retain Original Field
If you select this option, the original field is retained in the parsed log.
Report Error If Original Field Is Missing
If you select this option, an error is reported if the log does not contain the specified original field.
Configuration example
This example shows how to extract the value of the csv field.
Raw log
{ "csv": "2022-06-09,192.0.2.0,\"{\"\"key1\"\":\"\"value\"\",\"\"key2\"\":{\"\"key3\"\":\"\"string\"\"}}\"", ...... }
Logtail plugin configuration for data processing

Processing result
{
  "date": "2022-06-09",
  "ip": "192.0.2.0",
  "content": "{\"key1\":\"value\",\"key2\":{\"key3\":\"string\"}}"
  ......
}
JSON configuration
Parameters
Set type to processor_csv. The following table describes the parameters in detail.
Parameter
Type
Required
Description
SourceKey
String
Yes
The name of the original field.
SplitKeys
String array
Yes
The names for the extracted fields. Example: ["date", "ip", "content"].
Important: If the number of fields to be split is smaller than the number of keys in the SplitKeys parameter, the extra keys in the SplitKeys parameter are ignored.
PreserveOthers
Boolean
No
Specifies whether to retain excess content if the number of split fields is greater than the number of keys in the SplitKeys parameter.
true: The excess content is retained.
false (default): The excess content is not retained.
ExpandOthers
Boolean
No
Specifies whether to parse the excess content.
true: The excess content is parsed. You can use the ExpandKeyPrefix parameter to specify a prefix for the names of the excess fields.
false (default): The excess content is not parsed.
If you set PreserveOthers to true and ExpandOthers to false, the excess content is stored in the _decode_preserve_ field.
Note: If the excess content is not in the standard CSV format, you must standardize the content before you store it.
ExpandKeyPrefix
String
No
The prefix for the names of excess fields. For example, if you set this parameter to expand_, the field names are expand_1, expand_2, and so on.
TrimLeadingSpace
Boolean
No
Specifies whether to ignore leading spaces in field values.
true: Leading spaces are ignored.
false (default): Leading spaces are not ignored.
SplitSep
String
No
The delimiter. The default value is a comma (,).
KeepSource
Boolean
No
Specifies whether to retain the original field in the parsed log.
true: The original field is retained.
false (default): The original field is not retained.
NoKeyError
Boolean
No
Specifies whether to report an error if the raw log does not contain the specified original field.
true: An error is reported.
false (default): No error is reported.
Configuration example
This example shows how to extract the value of the csv field.
Raw log
{ "csv": "2022-06-09,192.0.2.0,\"{\"\"key1\"\":\"\"value\"\",\"\"key2\"\":{\"\"key3\"\":\"\"string\"\"}}\"", ...... }
Logtail plugin configuration for data processing
{
  ......
  "type":"processor_csv",
  "detail":{
    "SourceKey":"csv",
    "SplitKeys":["date", "ip", "content"]
  }
  ......
}
Processing result
{
  "date": "2022-06-09",
  "ip": "192.0.2.0",
  "content": "{\"key1\":\"value\",\"key2\":{\"key3\":\"string\"}}"
  ......
}
Single-character delimiter mode
You can extract fields using a single-character delimiter. You can use a quote character to enclose field values that contain the delimiter.
Form-based configuration
Parameters
For Processor Type, select Extract Fields (Single-character Delimiter Mode). The following table describes the parameters.
Parameter
Description
Original Field
The name of the original field.
Delimiter
The delimiter. The delimiter must be a single character. You can specify a non-printable character, such as \u0001.
Result fields
Specify the names for the extracted fields.
Use Quotes
If you select this option, you can use a quote character.
Quote Character
The quote character. The quote must be a single character. You can specify a non-printable character, such as \u0001.
Error on Missing Original Field
If you select this option, an error is reported if the raw log does not contain the specified original field.
Report Error on Delimiter Mismatch
If you select this option, an error is reported if the delimiter that you specify does not match the delimiter in the raw log.
Keep Original Field
If you select this option, the original field is retained in the parsed log.
Example
This example shows how to use a vertical bar (|) as the delimiter to extract the value of the content field. The result fields are named ip, time, method, url, request_time, request_length, status, length, ref_url, and browser.
Raw log
"content" : "10.**.**.**|10/Aug/2022:14:57:51 +0800|POST|PutData? Category=YunOsAccountOpLog&AccessKeyId=<yourAccessKeyId>&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=<yourSignature>|0.024|18204|200|37|-| aliyun-sdk-java"Logtail plugin configuration for data processing

Processing result
"ip" : "10.**.**.**" "time" : "10/Aug/2022:14:57:51 +0800" "method" : "POST" "url" : "/PutData?Category=YunOsAccountOpLog&AccessKeyId=<yourAccessKeyId>&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=<yourSignature>" "request_time" : "0.024" "request_length" : "18204" "status" : "200" "length" : "27" "ref_url" : "-" "browser" : "aliyun-sdk-java"
JSON configuration
Parameters
Set type to processor_split_char. The following table describes the parameters in detail.
Parameter
Type
Required
Description
SourceKey
String
Yes
The name of the original field.
SplitSep
String
Yes
The delimiter. The delimiter must be a single character. You can specify a non-printable character, such as \u0001.
SplitKeys
String array
Yes
The names for the extracted fields. Example: ["ip", "time", "method"].
PreserveOthers
Boolean
No
Specifies whether to retain excess content if the number of split fields is greater than the number of keys in the SplitKeys parameter.
true: The excess content is retained.
false (default): The excess content is not retained.
QuoteFlag
Boolean
No
Specifies whether to use a quote character.
true: A quote character is used.
false (default): A quote character is not used.
Quote
String
No
The quote character. The quote must be a single character. You can specify a non-printable character, such as \u0001.
This parameter takes effect only when QuoteFlag is set to true.
NoKeyError
Boolean
No
Specifies whether to report an error if the raw log does not contain the specified original field.
true: An error is reported.
false (default): No error is reported.
NoMatchError
Boolean
No
Specifies whether to report an error if the delimiter that you specify does not match the delimiter in the log.
true: An error is reported.
false (default): No error is reported.
KeepSource
Boolean
No
Specifies whether to retain the original field in the parsed log.
true: The original field is retained.
false (default): The original field is not retained.
Example
This example shows how to use a vertical bar (|) as the delimiter to extract the value of the content field. The result fields are named ip, time, method, url, request_time, request_length, status, length, ref_url, and browser.
Raw log
"content" : "10.**.**.**|10/Aug/2022:14:57:51 +0800|POST|PutData? Category=YunOsAccountOpLog&AccessKeyId=<yourAccessKeyId>&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=<yourSignature>|0.024|18204|200|37|-| aliyun-sdk-java"Logtail plugin configuration for data processing
{
  "type" : "processor_split_char",
  "detail" : {
    "SourceKey" : "content",
    "SplitSep" : "|",
    "SplitKeys" : ["ip", "time", "method", "url", "request_time", "request_length", "status", "length", "ref_url", "browser"]
  }
}
Processing result
"ip" : "10.**.**.**" "time" : "10/Aug/2022:14:57:51 +0800" "method" : "POST" "url" : "/PutData?Category=YunOsAccountOpLog&AccessKeyId=<yourAccessKeyId>&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=<yourSignature>" "request_time" : "0.024" "request_length" : "18204" "status" : "200" "length" : "27" "ref_url" : "-" "browser" : "aliyun-sdk-java"
Multi-character delimiter mode
You can extract fields using a multi-character delimiter. Quote characters are not supported in this mode.
Form-based configuration
Parameters
Set Processor Type to Extract Fields (Multi-character Delimiter Mode). The parameters are described in the following table.
Parameter
Description
Original Field
The name of the original field.
Split String
The delimiter. You can specify non-printable characters, such as \u0001\u0002.
Result fields
Specify the names for the extracted fields.
Important: If there are fewer split fields than the number of fields specified in Result Fields, the excess field names in Result Fields are ignored.
Report an Error if the Original Field Is Missing
If you select this option, an error is reported if the log does not contain the specified original field.
Report Error on Delimiter Mismatch
If you select this option, an error is reported if the delimiter that you specify does not match the delimiter in the log.
Retain Original Field
If you select this option, the original field is retained in the parsed log.
Retain Excess Content
If you select this option, excess content is retained when the number of split fields exceeds the number specified for Result Fields.
Parse Excess Content
If you select this option, excess content is parsed when the number of fields to be split exceeds the number of fields specified for Result Fields. You can use Prefix For Excess Field Names to specify a prefix for the excess field names.
Prefix For Excess Field Names
The prefix for the names of excess fields. For example, if you set this parameter to expand_, the field names are expand_1, expand_2, and so on.
Configuration example
This example shows how to use the delimiter |#| to extract the value of the content field. The result fields are named ip, time, method, url, request_time, request_length, status, expand_1, expand_2, and expand_3.
Raw log
"content" : "10.**.**.**|#|10/Aug/2022:14:57:51 +0800|#|POST|#|PutData? Category=YunOsAccountOpLog&AccessKeyId=<yourAccessKeyId>&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=<yourSignature>|#|0.024|#|18204|#|200|#|27|#|-|#| aliyun-sdk-java"Logtail plugin configuration for data processing

Processing result
"ip" : "10.**.**.**" "time" : "10/Aug/2022:14:57:51 +0800" "method" : "POST" "url" : "/PutData?Category=YunOsAccountOpLog&AccessKeyId=<yourAccessKeyId>&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=<yourSignature>" "request_time" : "0.024" "request_length" : "18204" "status" : "200" "expand_1" : "27" "expand_2" : "-" "expand_3" : "aliyun-sdk-java"
JSON configuration
Parameters
Set type to processor_split_string. The following table describes the parameters in detail.
Parameter
Type
Required
Description
SourceKey
String
Yes
The name of the original field.
SplitSep
String
Yes
The delimiter. You can specify non-printable characters, such as \u0001\u0002.
SplitKeys
String array
Yes
The names for the extracted fields. Example: ["key1","key2"].
Note: If the number of fields to be split is smaller than the number of keys in the SplitKeys parameter, the extra keys in the SplitKeys parameter are ignored.
PreserveOthers
Boolean
No
Specifies whether to retain excess content if the number of split fields is greater than the number of keys in the SplitKeys parameter.
true: The excess content is retained.
false (default): The excess content is not retained.
ExpandOthers
Boolean
No
Specifies whether to parse the excess content if the number of split fields is greater than the number of keys in the SplitKeys parameter.
true: The excess content is parsed. You can use the ExpandKeyPrefix parameter to specify a prefix for the names of the excess fields.
false (default): The excess content is not parsed.
ExpandKeyPrefix
String
No
The prefix for the names of excess fields. For example, if you set this parameter to expand_, the field names are expand_1, expand_2, and so on.
NoKeyError
Boolean
No
Specifies whether to report an error if the raw log does not contain the specified original field.
true: An error is reported.
false (default): No error is reported.
NoMatchError
Boolean
No
Specifies whether to report an error if the delimiter that you specify does not match the delimiter in the log.
true: An error is reported.
false (default): No error is reported.
KeepSource
Boolean
No
Specifies whether to retain the original field in the parsed log.
true: The original field is retained.
false (default): The original field is not retained.
Configuration example
This example shows how to use the delimiter |#| to extract the value of the content field. The result fields are named ip, time, method, url, request_time, request_length, status, expand_1, expand_2, and expand_3.
Raw log
"content" : "10.**.**.**|#|10/Aug/2022:14:57:51 +0800|#|POST|#|PutData? Category=YunOsAccountOpLog&AccessKeyId=<yourAccessKeyId>&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=<yourSignature>|#|0.024|#|18204|#|200|#|27|#|-|#| aliyun-sdk-java"Logtail plugin configuration for data processing
{
  "type" : "processor_split_string",
  "detail" : {
    "SourceKey" : "content",
    "SplitSep" : "|#|",
    "SplitKeys" : ["ip", "time", "method", "url", "request_time", "request_length", "status"],
    "PreserveOthers" : true,
    "ExpandOthers" : true,
    "ExpandKeyPrefix" : "expand_"
  }
}
Processing result
"ip" : "10.**.**.**" "time" : "10/Aug/2022:14:57:51 +0800" "method" : "POST" "url" : "/PutData?Category=YunOsAccountOpLog&AccessKeyId=<yourAccessKeyId>&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=<yourSignature>" "request_time" : "0.024" "request_length" : "18204" "status" : "200" "expand_1" : "27" "expand_2" : "-" "expand_3" : "aliyun-sdk-java"
Key-value pair mode
You can extract fields by splitting key-value pairs.
The processor_split_key_value plugin is supported in Logtail 0.16.26 and later.
Form-based configuration
Parameters
Set Processor Type to Extract Fields (Key-value Pair Mode). The following table describes the parameters.
Parameter
Description
Original Field
The name of the original field.
Key-Value Pair Delimiter
The delimiter between key-value pairs. The default value is a tab character (\t).
Key-Value Delimiter
The delimiter between the key and value in a single key-value pair. The default value is a colon (:).
Retain Original Field
If you select this option, the original field is retained.
Report Error on Missing Original Field
If you select this option, an error is reported if the log does not contain the specified original field.
Discard Key-Value Pairs on Delimiter Mismatch
If you select this option, a key-value pair is discarded if the delimiter that you specify does not match the delimiter in the log.
Error on Missing Key-Value Delimiter
If you select this option, an error is reported if the log does not contain the specified delimiter.
Report Errors for Empty Keys
If you select this option, an error is reported if a key is empty after splitting.
Quote Character
If a value is enclosed in the quote character, the content within the quotes is extracted. Multi-character quotes are supported.
Important: If a value enclosed in quotes contains a backslash (\) that is adjacent to a quote character, the backslash (\) is also extracted as part of the value.
Configuration examples
Example 1: Split key-value pairs.
This example shows how to split the value of the content field into key-value pairs. The delimiter between key-value pairs is a tab character (\t). The delimiter between a key and a value is a colon (:).
Raw log
"content": "class:main\tuserid:123456\tmethod:get\tmessage:\"wrong user\""
Logtail plugin configuration for data processing

Processing result
"content": "class:main\tuserid:123456\tmethod:get\tmessage:\"wrong user\"" "class": "main" "userid": "123456" "method": "get" "message": "\"wrong user\""
Example 2: Split key-value pairs that contain quotes.
This example shows how to split the value of the content field into key-value pairs. The delimiter between key-value pairs is a space. The delimiter between a key and a value is a colon (:). The quote character is a double quotation mark (").
Raw log
"content": "class:main http_user_agent:\"User Agent\" \"Chinese\" \"hello\\t\\\"ilogtail\\\"\\tworld\""
Logtail plugin configuration for data processing

Processing result
"class": "main", "http_user_agent": "User Agent", "no_separator_key_0": "Chinese", "no_separator_key_1": "hello\t\"ilogtail\"\tworld",
Example 3: Split key-value pairs that contain multi-character quotes.
This example shows how to split the value of the content field into key-value pairs. The delimiter between key-value pairs is a space. The delimiter between a key and a value is a colon (:). The quote character is a triple double quotation mark (""").
Raw log
"content": "class:main http_user_agent:\"\"\"User Agent\"\"\" \"\"\"Chinese\"\"\""
Logtail plugin configuration for data processing

Processing result
"class": "main", "http_user_agent": "User Agent", "no_separator_key_0": "Chinese",
JSON configuration
Parameters
Set type to processor_split_key_value. The following table describes the parameters in detail.
Parameter
Type
Required
Description
SourceKey
string
Yes
The name of the original field.
Delimiter
string
No
The delimiter between key-value pairs. The default value is a tab character (\t).
Separator
string
No
The delimiter between the key and value in a single key-value pair. The default value is a colon (:).
KeepSource
Boolean
No
Specifies whether to retain the original field in the parsed log.
true: The original field is retained.
false (default): The original field is not retained.
ErrIfSourceKeyNotFound
Boolean
No
Specifies whether to report an error if the raw log does not contain the specified original field.
true (default): An error is reported.
false: No error is reported.
DiscardWhenSeparatorNotFound
Boolean
No
Specifies whether to discard a key-value pair if the key-value delimiter is not found.
true: The key-value pair is discarded.
false (default): The key-value pair is not discarded.
ErrIfSeparatorNotFound
Boolean
No
Specifies whether to report an error if the specified key-value delimiter is not found.
true (default): An error is reported.
false: No error is reported.
ErrIfKeyIsEmpty
Boolean
No
Specifies whether to report an error if a key is empty after splitting.
true (default): An error is reported.
false: No error is reported.
Quote
String
No
The quote character. If a value is enclosed in the quote character, the content within the quotes is extracted. Multi-character quotes are supported. By default, this feature is disabled.
Important: If the quote character is double quotation marks (""), you must add a backslash (\) as an escape character.
If a value enclosed in quotes contains a backslash (\) that is adjacent to a quote character, the backslash (\) is included as part of the extracted value.
Configuration examples
Example 1: Split key-value pairs.
This example shows how to split the value of the content field into key-value pairs. The delimiter between key-value pairs is a tab character (\t). The delimiter between a key and a value is a colon (:).
Raw log
"content": "class:main\tuserid:123456\tmethod:get\tmessage:\"wrong user\""
Logtail plugin configuration for data processing
{
  "processors":[
    {
      "type":"processor_split_key_value",
      "detail": {
        "SourceKey": "content",
        "Delimiter": "\t",
        "Separator": ":",
        "KeepSource": true
      }
    }
  ]
}
Processing result
"content": "class:main\tuserid:123456\tmethod:get\tmessage:\"wrong user\"" "class": "main" "userid": "123456" "method": "get" "message": "\"wrong user\""
Example 2: Split key-value pairs using a quote character.
This example shows how to split the value of the content field into key-value pairs. The delimiter between key-value pairs is a space. The delimiter between a key and a value is a colon (:). The quote character is a double quotation mark (").
Raw log
"content": "class:main http_user_agent:\"User Agent\" \"Chinese\" \"hello\\t\\\"ilogtail\\\"\\tworld\""
Logtail plugin configuration for data processing
{
    "processors": [
        {
            "type": "processor_split_key_value",
            "detail": {
                "SourceKey": "content",
                "Delimiter": " ",
                "Separator": ":",
                "Quote": "\""
            }
        }
    ]
}
Processing result
"class": "main"
"http_user_agent": "User Agent"
"no_separator_key_0": "Chinese"
"no_separator_key_1": "hello\t\"ilogtail\"\tworld"
Example 3: Split key-value pairs using a multi-character quote.
This example shows how to split the value of the content field into key-value pairs. The delimiter between key-value pairs is a space. The delimiter between a key and a value is a colon (:). The quote character is a triple double quotation mark (""").
Raw log
"content": "class:main http_user_agent:\"\"\"User Agent\"\"\" \"\"\"Chinese\"\"\""
Logtail plugin configuration for data processing
{
    "processors": [
        {
            "type": "processor_split_key_value",
            "detail": {
                "SourceKey": "content",
                "Delimiter": " ",
                "Separator": ":",
                "Quote": "\"\"\""
            }
        }
    ]
}
Processing result
"class": "main"
"http_user_agent": "User Agent"
"no_separator_key_0": "Chinese"
Grok mode
You can extract target fields using Grok expressions.
The processor_grok plugin is supported in Logtail 1.2.0 and later.
Form-based configuration
Parameters
Set Processor Type to Extract Fields (Grok Mode). The following table describes the parameters.
Parameter
Description
Original Field
The name of the original field.
Match Patterns
An array of Grok expressions. The processor_grok plugin matches the log against the expressions in the list from top to bottom and returns the result of the first successful match.
For more information about the default expressions, see processor_grok. If the default expressions do not meet your requirements, you can enter a custom Grok expression in the Custom Grok Patterns field.
Note: Configuring multiple Grok expressions may affect performance. We recommend that you configure no more than five expressions.
Custom Grok Patterns
Enter a custom rule name and Grok expression.
Custom Grok Pattern File Directory
The directory that contains custom Grok pattern files. The processor_grok plugin reads all files in the directory.
Important: After you update a custom Grok pattern file, you must restart Logtail for the update to take effect.
Maximum Match Time
The maximum time to try extracting fields using a Grok expression, in milliseconds. If you set this parameter to 0 or leave it empty, no timeout is set.
Retain logs that failed to parse
If you select this option, the log is retained if parsing fails.
Retain Original Field
If you select this option, the original field is retained in the parsed log.
Report Missing Original Field as Error
If you select this option, an error is reported if the raw log does not contain the specified original field.
Report an error if no patterns match
If you select this option, an error is reported when a log does not match any expression in Match Patterns.
Report Match Timeout Errors
If you select this option, an error is reported if the match times out.
Configuration example
This example shows how to use Grok mode to extract the value of the content field. The extracted fields are named year, month, and day.
Raw log
"content" : "2022 October 17"
Logtail plugin configuration for data processing

Processing result
"year": "2022"
"month": "October"
"day": "17"
JSON configuration
Parameters
Set type to processor_grok. The following table describes the parameters in detail.
Parameter
Type
Required
Description
CustomPatternDir
String array
No
The directory that contains custom Grok pattern files. The processor_grok plugin reads all files in the directory.
If you do not add this parameter, custom Grok pattern files are not imported.
Important: After you update a custom Grok pattern file, you must restart Logtail for the update to take effect.
CustomPatterns
Map
No
Custom Grok patterns. The key is the rule name and the value is the Grok expression.
For information about the default expressions, see processor_grok. If the default expressions do not meet your requirements, enter a custom Grok expression in the Match parameter.
If you do not add this parameter, custom Grok patterns are not used.
SourceKey
String
No
The name of the original field. The default value is the content field.
Match
String array
Yes
An array of Grok expressions. The processor_grok plugin matches the log against the expressions in the list from top to bottom and returns the result of the first successful match.
Note: Configuring multiple Grok expressions may affect performance. We recommend that you configure no more than five expressions.
TimeoutMilliSeconds
Long
No
The maximum time allowed to extract fields using a Grok expression, in milliseconds.
If you do not add this parameter or set it to 0, no timeout is set.
IgnoreParseFailure
Boolean
No
Specifies whether to retain the log if parsing fails.
true (default): The parse failure is ignored and the log is retained.
false: The log that fails to be parsed is discarded.
KeepSource
Boolean
No
Specifies whether to retain the original field after successful parsing.
true (default): The original field is retained.
false: The original field is discarded.
NoKeyError
Boolean
No
Specifies whether to report an error if the raw log does not contain the specified original field.
true: An error is reported.
false (default): No error is reported.
NoMatchError
Boolean
No
Specifies whether to report an error if the log does not match any expression in the Match parameter.
true (default): An error is reported.
false: No error is reported.
TimeoutError
Boolean
No
Specifies whether to report an error if the match times out.
true (default): An error is reported.
false: No error is reported.
Example 1
This example shows how to use Grok mode to extract the value of the content field. The extracted fields are named year, month, and day.
Raw log
"content" : "2022 October 17"
Logtail plugin configuration for data processing
{
    "type": "processor_grok",
    "detail": {
        "KeepSource": false,
        "Match": [
            "%{YEAR:year} %{MONTH:month} %{MONTHDAY:day}"
        ],
        "IgnoreParseFailure": false
    }
}
Processing result
"year": "2022"
"month": "October"
"day": "17"
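A Grok expression is shorthand for a regular expression with named capture groups. The following Python sketch expands the expression from this example using simplified stand-ins for the YEAR, MONTH, and MONTHDAY patterns; the real definitions used by processor_grok are more elaborate, and the GROK map and grok_to_regex function are hypothetical names for illustration.

```python
import re

# Simplified stand-ins for the default Grok patterns (not the real,
# more elaborate definitions shipped with processor_grok).
GROK = {
    "YEAR": r"\d{4}",
    "MONTH": r"[A-Z][a-z]+",
    "MONTHDAY": r"\d{1,2}",
}

def grok_to_regex(expr):
    # Replace each %{PATTERN:name} token with a named capture group.
    return re.sub(
        r"%\{(\w+):(\w+)\}",
        lambda m: f"(?P<{m.group(2)}>{GROK[m.group(1)]})",
        expr,
    )

pattern = grok_to_regex("%{YEAR:year} %{MONTH:month} %{MONTHDAY:day}")
match = re.match(pattern, "2022 October 17")
print(match.groupdict())
# {'year': '2022', 'month': 'October', 'day': '17'}
```

The capture-group names (year, month, day) become the extracted field names, which matches the processing result shown above.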
Example 2
This example shows how to use Grok mode to extract the value of the content field from multiple logs and parse the extracted values into different results based on different Grok expressions.
Raw log
{ "content" : "begin 123.456 end" }
{ "content" : "2019 June 24 \"I am iron man\"" }
{ "content" : "WRONG LOG" }
{ "content" : "10.0.0.0 GET /index.html 15824 0.043" }
Logtail plugin configuration for data processing
{
    "type": "processor_grok",
    "detail": {
        "CustomPatterns": {
            "HTTP": "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}"
        },
        "IgnoreParseFailure": false,
        "KeepSource": false,
        "Match": [
            "%{HTTP}",
            "%{WORD:word1} %{NUMBER:request_time} %{WORD:word2}",
            "%{YEAR:year} %{MONTH:month} %{MONTHDAY:day} %{QUOTEDSTRING:motto}"
        ],
        "SourceKey": "content"
    }
}
Processing result
For the first log, the processor_grok plugin fails to match the first expression %{HTTP} in the Match parameter, and then successfully matches the second expression %{WORD:word1} %{NUMBER:request_time} %{WORD:word2}. Therefore, the plugin returns the result extracted based on the second expression. Because the KeepSource parameter is set to false, the original content field is discarded.
For the second log, the plugin fails to match the first expression %{HTTP} and the second expression %{WORD:word1} %{NUMBER:request_time} %{WORD:word2}, and then successfully matches the third expression %{YEAR:year} %{MONTH:month} %{MONTHDAY:day} %{QUOTEDSTRING:motto}. Therefore, the plugin returns the result extracted based on the third expression.
For the third log, the plugin fails to match all three expressions in the Match parameter. Because the IgnoreParseFailure parameter is set to false, the third log is discarded.
For the fourth log, the plugin successfully matches the first expression %{HTTP} in the Match parameter. Therefore, the plugin returns the result extracted based on the first expression.
{
    "word1": "begin",
    "request_time": "123.456",
    "word2": "end"
}
{
    "year": "2019",
    "month": "June",
    "day": "24",
    "motto": "\"I am iron man\""
}
{
    "client": "10.0.0.0",
    "method": "GET",
    "request": "/index.html",
    "bytes": "15824",
    "duration": "0.043"
}
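The top-to-bottom, first-match-wins behavior of the Match list can be sketched in Python. The regular expressions below are simplified, hypothetical equivalents of the three Grok expressions in this example, not the plugin's real pattern definitions.

```python
import re

# Simplified regex equivalents of the three Match expressions above
# (hypothetical stand-ins; the real Grok definitions are more elaborate).
MATCH = [
    re.compile(r"(?P<client>[\d.]+) (?P<method>[A-Z]+) (?P<request>\S+)"
               r" (?P<bytes>\d+) (?P<duration>[\d.]+)$"),
    re.compile(r"(?P<word1>\w+) (?P<request_time>[\d.]+) (?P<word2>\w+)$"),
    re.compile(r'(?P<year>\d{4}) (?P<month>[A-Z][a-z]+)'
               r' (?P<day>\d{1,2}) (?P<motto>".*")$'),
]

def first_match(content):
    # Try the expressions top to bottom; return the first successful match.
    for regex in MATCH:
        m = regex.match(content)
        if m:
            return m.groupdict()
    # No expression matched: the real plugin keeps or drops the log
    # according to IgnoreParseFailure.
    return None

print(first_match("begin 123.456 end"))  # matched by the second expression
print(first_match("WRONG LOG"))          # matches nothing: None
```

Because every expression is tried in order, putting the most frequently matched expression first reduces wasted match attempts, which is also why the documentation recommends keeping the Match list short.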
References
Configure a Logtail pipeline using API operations:
GetLogtailPipelineConfig - Get a Logtail pipeline configuration
ListLogtailPipelineConfig - List Logtail pipeline configurations
CreateLogtailPipelineConfig - Create a Logtail pipeline configuration
DeleteLogtailPipelineConfig - Delete a Logtail pipeline configuration
UpdateLogtailPipelineConfig - Update a Logtail pipeline configuration
Configure a processor plugin in the console:
Collect container logs from a Kubernetes cluster using a CRD (stdout/file)