Parsing in delimiter mode

Last Updated: Dec 20, 2023

You can use a Logtail plug-in to parse logs into multiple key-value pairs based on a specific delimiter.

Entry point

To process logs with a Logtail plug-in, add a Logtail plug-in configuration when you create or modify a Logtail configuration. For more information, see Overview.

Configuration description

Parameter

Description

Original Field

The original field that is used to store the content of a log before the log is parsed. Default value: content.

Delimiter

The delimiter based on which you want to extract log fields. Select a delimiter based on the actual log content. For example, you can select Vertical Bar (|).

Note

If you set the Delimiter parameter to Non-printable Character, you must enter a character in the following format: 0x<Hexadecimal ASCII code of the non-printable character>. For example, if you want to use the non-printable character whose hexadecimal ASCII code is 01, you must enter 0x01.
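The 0x<hexadecimal ASCII code> notation can be illustrated with a short Python sketch. This is not Logtail code. The helper name decode_nonprintable is hypothetical, and the sketch only shows how a string such as 0x01 maps to the single character that would then act as the delimiter.

```python
def decode_nonprintable(spec: str) -> str:
    """Turn a string such as '0x01' into the single ASCII character it names.

    Hypothetical helper for illustration only; not part of Logtail.
    """
    if not spec.lower().startswith("0x"):
        raise ValueError("expected the form 0x<hexadecimal ASCII code>")
    code = int(spec, 16)  # int() accepts the 0x prefix when the base is 16
    if not 0 <= code <= 0x7F:
        raise ValueError("value is outside the ASCII range")
    return chr(code)


# Split a sample line whose fields are separated by the character 0x01.
delimiter = decode_nonprintable("0x01")
fields = "a\x01b\x01c".split(delimiter)
print(fields)  # ['a', 'b', 'c']
```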

Quote

The quote character. If a log field contains the delimiter, you must specify a quote to enclose the field. Simple Log Service parses the content that is enclosed in a pair of quotes as a single field. Select a quote based on the format of the logs that you want to collect.

Note

If you set the Quote parameter to Non-printable Character, you must enter a character in the following format: 0x<Hexadecimal ASCII code of the non-printable character>. For example, if you want to use the non-printable character whose hexadecimal ASCII code is 01, you must enter 0x01.

Extracted Field

  • If you specify a sample log, Simple Log Service can automatically extract log content based on the specified sample log and delimiter. Configure the Key parameter for each Value parameter. The Key parameter specifies the new field name. The Value parameter specifies the content that is extracted.

  • If you do not specify a sample log, the Value column is unavailable. You must specify keys based on the actual logs and delimiter.

A key can contain only letters, digits, and underscores (_) and must start with a letter or an underscore (_). A key can be up to 128 bytes in length.

Allow Missing Field

Specifies whether to upload a log to Simple Log Service if the number of extracted values is less than the number of specified keys. If you select the Allow Missing Field parameter, the log is uploaded, and the keys that have no extracted value are left empty.

In this example, a log is 11|22|33|44, the Delimiter parameter is set to Vertical Bar (|), and the keys are set to A, B, C, D, and E.

  • The value of the E field is empty. If you select the Allow Missing Field parameter, the log is uploaded to Simple Log Service.

  • If you do not select the Allow Missing Field parameter, the log is discarded.

    Note

    Linux Logtail V1.0.28 and later and Windows Logtail V1.0.28.0 and later support the Allow Missing Field parameter.
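The Allow Missing Field behavior described above can be sketched in Python. This is a simplified model and not the plug-in's implementation. The function name assign_keys and its parameters are hypothetical, and None stands in for a discarded log.

```python
from typing import Dict, List, Optional


def assign_keys(line: str, delimiter: str, keys: List[str],
                allow_missing: bool) -> Optional[Dict[str, str]]:
    """Map extracted values to keys; return None if the log is discarded.

    Hypothetical model of the Allow Missing Field option. Excess values
    (more values than keys) are ignored here for brevity.
    """
    values = line.split(delimiter)
    if len(values) < len(keys):
        if not allow_missing:
            return None  # fewer values than keys: the log is discarded
        # Pad the missing trailing fields with empty strings.
        values += [""] * (len(keys) - len(values))
    return dict(zip(keys, values))


keys = ["A", "B", "C", "D", "E"]
print(assign_keys("11|22|33|44", "|", keys, allow_missing=True))
# {'A': '11', 'B': '22', 'C': '33', 'D': '44', 'E': ''}
print(assign_keys("11|22|33|44", "|", keys, allow_missing=False))
# None
```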

Processing Method of Field to which Excess Part is Assigned

The method that is used to process excess values that are extracted if the number of extracted values is greater than the number of specified keys.

  • Expand: retains the excess values and stores each value in a separate field whose name is in the __column$i__ format. $i indicates the sequence number of the excess field and starts from 0. Examples: __column0__ and __column1__.

  • Retain: retains the excess values and adds the values to the __column0__ field.

  • Drop: discards the excess values.
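The three methods can be compared with a small Python sketch. It is a simplified model, not the plug-in's code. The function name handle_excess and the mode strings are hypothetical. For Retain, the sketch assumes that the excess values are joined with the delimiter before they are stored in __column0__; the text above does not state how the values are combined.

```python
from typing import Dict, List


def handle_excess(values: List[str], keys: List[str],
                  delimiter: str, mode: str) -> Dict[str, str]:
    """Model the Expand, Retain, and Drop methods for excess values.

    Hypothetical helper; mode is one of 'expand', 'retain', 'drop'.
    """
    if mode not in ("expand", "retain", "drop"):
        raise ValueError(f"unknown mode: {mode}")
    record = dict(zip(keys, values))   # values that have a matching key
    excess = values[len(keys):]        # values beyond the last key
    if not excess or mode == "drop":
        return record
    if mode == "expand":
        # One field per excess value: __column0__, __column1__, ...
        for i, value in enumerate(excess):
            record[f"__column{i}__"] = value
    else:
        # Retain: assumption, the excess values are joined with the delimiter.
        record["__column0__"] = delimiter.join(excess)
    return record


values = "11|22|33|44".split("|")
print(handle_excess(values, ["A", "B"], "|", "expand"))
# {'A': '11', 'B': '22', '__column0__': '33', '__column1__': '44'}
print(handle_excess(values, ["A", "B"], "|", "retain"))
# {'A': '11', 'B': '22', '__column0__': '33|44'}
print(handle_excess(values, ["A", "B"], "|", "drop"))
# {'A': '11', 'B': '22'}
```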

Retain Original Field if Parsing Fails

If you select the Retain Original Field if Parsing Fails parameter and parsing fails, the original field is retained.

Retain Original Field if Parsing Succeeds

If you select the Retain Original Field if Parsing Succeeds parameter and parsing is successful, the original field is retained.

New Name of Original Field

If you select the Retain Original Field if Parsing Fails or Retain Original Field if Parsing Succeeds parameter, you can rename the original field to store the original log content.

Appendix

The Logtail plug-in for parsing data in delimiter mode supports single-character delimiters and multi-character delimiters.

Single-character delimiters

The following examples show logs that use single-character delimiters:

05/May/2022:13:30:28,10.10.*.*,"POST /PutData?Category=YunOsAccountOpLog&AccessKeyId=****************&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=******************************** HTTP/1.1",200,18204,aliyun-sdk-java
05/May/2022:13:31:23,10.10.*.*,"POST /PutData?Category=YunOsAccountOpLog&AccessKeyId=****************&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=******************************** HTTP/1.1",401,23472,aliyun-sdk-java

If a log uses single-character delimiters, you must specify a delimiter. You can also specify a quote.

  • Delimiter: Available single-character delimiters include the tab character (\t), vertical bar (|), space, comma (,), semicolon (;), and non-printable characters. You cannot specify a double quotation mark (") as a delimiter.

    However, a double quotation mark (") can be used as a quote. A double quotation mark (") can appear at the border of a field or within the field. If a double quotation mark (") appears within a log field, it must be escaped as a pair of double quotation marks ("") when the log is processed. When Simple Log Service parses the log, each pair of double quotation marks ("") is restored to a single double quotation mark (").

    For example, you can specify a comma (,) as the delimiter and a double quotation mark (") as the quote. If a log field contains the specified delimiter and quote, the field is enclosed in a pair of quotes, and each double quotation mark (") in the field is escaped as a pair of double quotation marks (""). If a processed log is 1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00, the log is parsed into five fields: 1999, Chevy, Venture "Extended Edition, Very Large", an empty field, and 5000.00.

  • Quote: If a log field contains delimiters, you must specify a quote to enclose the field. Simple Log Service parses the content that is enclosed in a pair of quotes into a complete field.

    Available quotes include the double quotation mark ("), tab character (\t), vertical bar (|), space, comma (,), semicolon (;), and non-printable characters.

    For example, if you specify a comma (,) as a delimiter and a double quotation mark (") as a quote, the log 1997,Ford,E350,"ac, abs, moon",3000.00 is parsed into five fields: 1997, Ford, E350, ac, abs, moon (a single field), and 3000.00.
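The delimiter and quote rules above follow the same conventions as common CSV parsing, so the two sample logs can be checked with the csv module in the Python standard library. This is only an illustration of the expected field boundaries, not the parser that Logtail uses. The helper name parse_delimited is hypothetical.

```python
import csv
from typing import List


def parse_delimited(line: str, delimiter: str = ",", quote: str = '"') -> List[str]:
    """Split one log line on a single-character delimiter, honoring quotes.

    doublequote=True restores a pair of quote characters inside a quoted
    field to a single quote character.
    """
    reader = csv.reader([line], delimiter=delimiter, quotechar=quote,
                        doublequote=True)
    return next(reader)


print(parse_delimited('1997,Ford,E350,"ac, abs, moon",3000.00'))
# ['1997', 'Ford', 'E350', 'ac, abs, moon', '3000.00']

print(parse_delimited('1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00'))
# ['1999', 'Chevy', 'Venture "Extended Edition, Very Large"', '', '5000.00']
```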

Multi-character delimiters

The following examples show logs that use multi-character delimiters:

05/May/2022:13:30:28&&10.200.**.**&&POST /PutData?Category=YunOsAccountOpLog&AccessKeyId=****************&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=******************************** HTTP/1.1&&200&&18204&&aliyun-sdk-java
05/May/2022:13:31:23&&10.200.**.**&&POST /PutData?Category=YunOsAccountOpLog&AccessKeyId=****************&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=******************************** HTTP/1.1&&401&&23472&&aliyun-sdk-java

A multi-character delimiter can contain two or three characters, such as ||, &&&, and ^_^. Simple Log Service parses logs based on delimiters. You do not need to use quotes to enclose log fields.

Important

Make sure that no log field contains the exact delimiter. Otherwise, Simple Log Service cannot parse the logs as expected.

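For example, if you specify && as the delimiter, the log 1997&&Ford&&E350&&ac&abs&moon&&3000.00 is parsed into five fields: 1997, Ford, E350, ac&abs&moon, and 3000.00. The single ampersands (&) in ac&abs&moon do not match the && delimiter, so that field is kept intact.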
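Because quotes are not used with multi-character delimiters, the parsing can be modeled as a plain string split. The following Python sketch illustrates the expected result and is not Logtail's implementation. The second sample line is made up to show what happens when a field contains the exact delimiter.

```python
# Split on the exact multi-character delimiter; single '&' characters
# inside a field are left untouched.
log = "1997&&Ford&&E350&&ac&abs&moon&&3000.00"
print(log.split("&&"))
# ['1997', 'Ford', 'E350', 'ac&abs&moon', '3000.00']

# Hypothetical line whose third field is meant to be 'R&&D'. The field
# contains the exact delimiter, so it is split into two fields and the
# log no longer matches the intended four keys.
bad_log = "1997&&Ford&&R&&D&&3000.00"
print(bad_log.split("&&"))
# ['1997', 'Ford', 'R', 'D', '3000.00']
```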