This document describes how to use Structured Process Language (SPL) statements to extract key-value pairs from strings.
Keyword extraction
To extract dynamic key-value pairs from a field, use the parse-kv instruction. For example, extract the keywords and values from a log whose k1 field is in the format q=asd&a=1&b=2&__1__=3.
Raw log
k1: q=asd&a=1&b=2&__1__=3
SPL statement
* | parse-kv k1,'&','='
Result
k1: q=asd&a=1&b=2&__1__=3
q: asd
a: 1
b: 2
__1__: 3
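The splitting shown in the result above can be approximated outside SPL. The following Python sketch is not the parse-kv implementation; it assumes that the field is split on the pair delimiter first and that each pair is then split on the first key-value separator. The function name parse_kv is a hypothetical helper chosen for illustration.

```python
def parse_kv(text, pair_delim, kv_sep):
    """Emulate delimiter-based key-value extraction (illustrative only)."""
    fields = {}
    for pair in text.split(pair_delim):
        # partition() splits on the first separator only, so a value
        # may itself contain the separator character.
        key, sep, value = pair.partition(kv_sep)
        if sep:  # keep only fragments that actually contain the separator
            fields[key.strip()] = value.strip()
    return fields


result = parse_kv("q=asd&a=1&b=2&__1__=3", "&", "=")
print(result)
```

The resulting dictionary holds the same four keys and values as the SPL result: q, a, b, and __1__.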
Value extraction
Use the parse-kv instruction for logs that have clear identifiers, such as logs in the a=b or a="cxxx" format.
Raw log
content1: k1="helloworld",the change world, k2="good"
To extract the key-value pairs and exclude the fragment the change world, use the following SPL statement:
* | parse-kv content1,',','='
Result
content1: k1="helloworld",the change world, k2="good"
k1: "helloworld"
k2: "good"
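To illustrate why the fragment the change world does not appear in the result, the sketch below separates fragments that contain the key-value separator from those that do not. It is a Python approximation under two assumptions that the result above suggests but that are not confirmed here: fragments without a separator are skipped, and whitespace around keys is trimmed. The variable names are hypothetical.

```python
raw = 'k1="helloworld",the change world, k2="good"'

fields = {}
skipped = []
for fragment in raw.split(","):
    key, sep, value = fragment.partition("=")
    if sep:
        # Fragment has the form key=value: keep it, trimming whitespace.
        fields[key.strip()] = value.strip()
    else:
        # No "=" in this fragment, so there is no key to assign a value to.
        skipped.append(fragment.strip())

print(fields)
print(skipped)
```

Note that the quotation marks stay part of the extracted values, which matches the result shown above.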
Keyword transformation
To add a prefix to the extracted keywords, use the -prefix parameter of the parse-kv instruction.
Raw log
k1: q=asd&a=1&b=2
SPL statement
* | parse-kv -prefix='start_' k1,'&','='
Result
k1: q=asd&a=1&b=2
start_a: 1
start_b: 2
start_q: asd
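The effect of the prefix on the extracted keys can be pictured as a renaming step applied after extraction. The Python sketch below is an approximation only; add_prefix is a hypothetical helper, and treating the prefix as a pure rename of every extracted key is an assumption based on the result above.

```python
def add_prefix(fields, prefix):
    """Return a copy of fields with prefix prepended to every key."""
    return {prefix + key: value for key, value in fields.items()}


# Extract the pairs first, then rename the keys.
extracted = dict(pair.split("=", 1) for pair in "q=asd&a=1&b=2".split("&"))
renamed = add_prefix(extracted, "start_")

for key in sorted(renamed):
    print(f"{key}: {renamed[key]}")
```

The original field k1 is not renamed, because only the newly extracted keys receive the prefix.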
Value transformation
If a value contains quotation marks, such as in the log format k1:"v1\"abc", use the parse-kv instruction to extract the value.
Raw log
Note: The \ in this log is a regular character, not an escape character.
content2: k1:"v1\"abc", k2:"v2", k3:"v3"
SPL statement
* | parse-kv content2,',', ':'
Result
content2: k1:"v1\"abc", k2:"v2", k3:"v3"
k1: "v1\"abc"
k2: "v2"
k3: "v3"
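The point that the backslash is an ordinary character can be checked with a short Python sketch. A raw string literal is used so that Python also treats the backslash literally. This is an approximation of the splitting behavior, not the parse-kv implementation, and the variable names are hypothetical.

```python
# The r prefix keeps the backslash as a literal character in Python.
raw = r'k1:"v1\"abc", k2:"v2", k3:"v3"'

fields = {}
for fragment in raw.split(","):
    key, sep, value = fragment.partition(":")
    if sep:
        fields[key.strip()] = value.strip()

for key, value in fields.items():
    print(f"{key}: {value}")

# The value of k1 keeps both the backslash and the inner quotation mark.
print(len(fields["k1"]))
```

Because no unescaping takes place, the value of k1 is passed through exactly as it appears in the raw log.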
Customer use cases
Assume that a website log contains URL data that you need to extract. Design transformation rules as needed to process the log content.
Requirements
Requirement 1: Parse the log to extract fields such as proto, domain, and param.
Requirement 2: Expand the key-value pairs in param.
Raw log
__source__: 10.43.xx.xx
__tag__:__client_ip__: 12.120.xx.xx
__tag__:__receive_time__: 1563517113
__topic__:
request: https://yz.m.sm.cn/video/getlist/s?ver=3.2.3&app_type=supplier&os=Android8.1.0
SPL solution
* | parse-regexp request, '([^:]+)://([^/]+)(.+)' as uri_proto, uri_domain, uri_param
| parse-regexp uri_param, '([^?]*)\?(.*)' as uri_path, uri_query
| parse-kv uri_query,'&','='
Detailed rules and results
Use the parse-regexp instruction to parse the request field. Use unnamed capture groups.
* | parse-regexp request, '([^:]+)://([^/]+)(.+)' as uri_proto, uri_domain, uri_param
Result
uri_proto: https
uri_domain: yz.m.sm.cn
uri_param: /video/getlist/s?ver=3.2.3&app_type=supplier&os=Android8.1.0
Use the parse-regexp instruction to parse the uri_param field. Use unnamed capture groups.
* | parse-regexp uri_param, '([^?]*)\?(.*)' as uri_path, uri_query
Result
uri_path: /video/getlist/s
uri_query: ver=3.2.3&app_type=supplier&os=Android8.1.0
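The two regular expressions above can be tried outside SPL to see which text each unnamed capture group receives. The Python sketch below reuses the same patterns with the standard re module. It assumes that the patterns behave the same way in Python as in SPL, which holds for this simple syntax but is not guaranteed for every pattern.

```python
import re

request = ("https://yz.m.sm.cn/video/getlist/s"
           "?ver=3.2.3&app_type=supplier&os=Android8.1.0")

# Step 1: protocol, domain, and everything after the domain.
step1 = re.match(r"([^:]+)://([^/]+)(.+)", request)
uri_proto, uri_domain, uri_param = step1.groups()

# Step 2: split the remainder at the first "?" into path and query string.
step2 = re.match(r"([^?]*)\?(.*)", uri_param)
uri_path, uri_query = step2.groups()

print(uri_proto, uri_domain, uri_path, uri_query, sep="\n")
```

The groups are assigned to field names by position, in the same order as the names listed after as in the SPL statements.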
Use the parse-kv instruction to extract fields from the uri_query field.
* | parse-kv uri_query,'&','='
Result
ver: 3.2.3
app_type: supplier
os: Android8.1.0
Preview of the processed log
__source__: 10.43.xx.xx
__tag__:__client_ip__: 12.120.xx.xx
__tag__:__receive_time__: 1563517113
__topic__:
request: https://yz.m.sm.cn/video/getlist/s?ver=3.2.3&app_type=supplier&os=Android8.1.0
uri_domain: yz.m.sm.cn
uri_path: /video/getlist/s
uri_proto: https
uri_query: ver=3.2.3&app_type=supplier&os=Android8.1.0
app_type: supplier
os: Android8.1.0
ver: 3.2.3
If you only need to parse the request, directly use the parse-kv instruction on the request field. For example:
* | parse-kv uri_query,'&','='
Preview of the processed log:
__source__: 10.43.xx.xx
__tag__:__client_ip__: 12.120.xx.xx
__tag__:__receive_time__: 1563517113
__topic__:
request: https://yz.m.sm.cn/video/getlist/s?ver=3.2.3&app_type=supplier&os=Android8.1.0
app_type: supplier
os: Android8.1.0
ver: 3.2.3
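As a cross-check of the field values listed in the previews, the same URL can be decomposed with the Python standard library. This sketch is independent of SPL and swaps in a different technique: it relies on urllib.parse rather than regular expressions and delimiter splitting, so it only confirms the expected values. The variable names mirror the SPL field names for readability.

```python
from urllib.parse import parse_qsl, urlsplit

request = ("https://yz.m.sm.cn/video/getlist/s"
           "?ver=3.2.3&app_type=supplier&os=Android8.1.0")

parts = urlsplit(request)
uri_fields = {
    "uri_proto": parts.scheme,
    "uri_domain": parts.netloc,
    "uri_path": parts.path,
    "uri_query": parts.query,
}
# parse_qsl splits the query string on "&" and each pair on "=".
query_fields = dict(parse_qsl(parts.query))

for name, value in {**uri_fields, **query_fields}.items():
    print(f"{name}: {value}")
```

Unlike the delimiter-based splitting, parse_qsl also decodes percent-encoded characters, which makes no difference for this sample URL.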