By Abing
With the development of stream processing, an increasing number of tools and languages have been created to make data processing more efficient, flexible, and easy to use. Simple Log Service (SLS) provides the SLS Processing Language (SPL), which unifies the syntax of querying, data processing, and data manipulation. iLogtail V2.0 fully supports SPL for collecting logs and time series data, so you can process data generated on terminals by writing SPL statements instead of relying only on plug-ins. This article describes the iLogtail plug-ins for data processing and shows how to write equivalent SPL statements.
iLogtail provides the following methods for data processing:
• Use native plug-ins: Native plug-ins provide high performance and are developed in C++.
• Use extended plug-ins: Extended plug-ins provide a more diverse ecosystem of features and are more flexible. Extended plug-ins are developed in Go.
• Use SPL statements: iLogtail V2.0 provides a plug-in developed in C++ that supports SPL for data processing. This significantly improves data processing capabilities and optimizes both performance and flexibility. You only need to write SPL statements to use the computing capabilities of SPL to process data. For more information about the SPL syntax, visit https://www.alibabacloud.com/help/en/sls/user-guide/spl-syntax/
| Method | Benefits | Development threshold |
|---|---|---|
| Native plug-ins | • Developed in C++. • Provide the highest performance and relatively comprehensive operator capabilities with the lowest resource overheads. | • Medium: plug-ins must be developed in C++. • Plug-in configurations can be flexibly customized. |
| Extended plug-ins | • Developed in Go. • Provide high performance and relatively comprehensive operator capabilities with low resource overheads. | • Low: plug-ins must be developed in Go. • Plug-in configurations can be flexibly customized. |
| SPL | • Developed in C++. • Provides high performance and comprehensive operator capabilities with low resource overheads. • Functions and operators can be flexibly combined. | • SPL is not open source, but in most cases you can process data by writing SPL statements without writing any code. |
The preceding table compares the three methods. The following sections describe how SPL statements implement the same processing capabilities that native and extended plug-ins provide.
Extract log fields based on a regular expression. Sample NGINX log:
127.0.0.1 - - [07/Jul/2022:10:43:30 +0800] "POST /PutData?Category=YunOsAccountOpLog" 0.024 18204 200 37 "-" "aliyun-sdk-java"
Configurations that specify a native plug-in is used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_parse_regex_native
SourceKey: content
Regex: ([\d\.]+) \S+ \S+ \[(\S+) \S+\] \"(\w+) ([^\\"]*)\" ([\d\.]+) (\d+) (\d+) (\d+|-) \"([^\\"]*)\" \"([^\\"]*)\"
Keys:
- ip
- time
- method
- url
- request_time
- request_length
- status
- length
- ref_url
- browser
flushers:
- Type: flusher_stdout
OnlyStdout: true
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| parse-regexp content, '([\d\.]+) \S+ \S+ \[(\S+) \S+\] \"(\w+) ([^\\"]*)\" ([\d\.]+) (\d+) (\d+) (\d+|-) \"([^\\"]*)\" \"([^\\"]*)\"' as ip, time, method, url, request_time, request_length, status, length, ref_url, browser
| project-away content
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{
"ip": "127.0.0.1",
"time": "07/Jul/2022:10:43:30",
"method": "POST",
"url": "/PutData?Category=YunOsAccountOpLog",
"request_time": "0.024",
"request_length": "18204",
"status": "200",
"length": "37",
"ref_url": "-",
"browser": "aliyun-sdk-java",
"__time__": "1713184059"
}
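To see what the `parse-regexp` instruction computes, the same extraction can be sketched in plain Python. This is only an offline approximation of the capture-group logic, not how iLogtail executes it:

```python
import re

# Same pattern as the SPL parse-regexp example, translated to Python syntax.
NGINX_RE = re.compile(
    r'([\d.]+) \S+ \S+ \[(\S+) \S+\] "(\w+) ([^"]*)" '
    r'([\d.]+) (\d+) (\d+) (\d+|-) "([^"]*)" "([^"]*)"'
)
KEYS = ["ip", "time", "method", "url", "request_time",
        "request_length", "status", "length", "ref_url", "browser"]

line = ('127.0.0.1 - - [07/Jul/2022:10:43:30 +0800] '
        '"POST /PutData?Category=YunOsAccountOpLog" 0.024 18204 200 37 '
        '"-" "aliyun-sdk-java"')

# Pair each capture group with its output key, as the `as` clause does.
fields = dict(zip(KEYS, NGINX_RE.match(line).groups()))
```

Note that the time zone offset (`+0800`) is matched by an uncaptured `\S+`, which is why the `time` field in the output above has no offset.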
Extract log fields based on a specific delimiter. Sample log:
127.0.0.1,07/Jul/2022:10:43:30 +0800,POST,PutData Category=YunOsAccountOpLog,0.024,18204,200,37,-,aliyun-sdk-java
Configurations that specify a native plug-in is used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_parse_delimiter_native
SourceKey: content
Separator: ","
Quote: '"'
Keys:
- ip
- time
- method
- url
- request_time
- request_length
- status
- length
- ref_url
- browser
flushers:
- Type: flusher_stdout
OnlyStdout: true
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| parse-csv content as ip, time, method, url, request_time, request_length, status, length, ref_url, browser
| project-away content
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{
"ip": "127.0.0.1",
"time": "07/Jul/2022:10:43:30 +0800",
"method": "POST",
"url": "PutData?Category=YunOsAccountOpLog",
"request_time": "0.024",
"request_length": "18204",
"status": "200",
"length": "37",
"ref_url": "-",
"browser": "aliyun-sdk-java",
"__time__": "1713231487"
}
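The delimiter-based split that `parse-csv` (and `processor_parse_delimiter_native` with `Separator: ","` and `Quote: '"'`) performs can be approximated with Python's csv module. This is a rough offline sketch, not iLogtail's implementation:

```python
import csv

KEYS = ["ip", "time", "method", "url", "request_time",
        "request_length", "status", "length", "ref_url", "browser"]

line = ('127.0.0.1,07/Jul/2022:10:43:30 +0800,POST,'
        'PutData Category=YunOsAccountOpLog,0.024,18204,200,37,-,aliyun-sdk-java')

# csv.reader honors the '"' quote character, like the Quote option above.
fields = dict(zip(KEYS, next(csv.reader([line]))))
```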
Parse JSON logs. Sample log:
{"url": "POST /PutData?Category=YunOsAccountOpLog HTTP/1.1","ip": "10.200.98.220",
"user-agent": "aliyun-sdk-java",
"request": "{\"status\":\"200\",\"latency\":\"18204\"}",
"time": "07/Jul/2022:10:30:28",
"__time__": "1713237315"
}
Configurations that specify a native plug-in is used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_parse_json_native
SourceKey: content
flushers:
- Type: flusher_stdout
OnlyStdout: true
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| parse-json content
| project-away content
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{ "url": "POST /PutData?Category=YunOsAccountOpLog HTTP/1.1",
"ip": "10.200.98.220",
"user-agent": "aliyun-sdk-java",
"request": "{\"status\":\"200\",\"latency\":\"18204\"}",
"time": "07/Jul/2022:10:30:28",
"__time__": "1713237315"
}
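One detail worth noting in this sample: the `request` field is itself a JSON string, so a single `parse-json` pass leaves it as a string. A Python sketch (assumptions: standard `json` semantics, not iLogtail internals) makes this visible:

```python
import json

raw = ('{"url": "POST /PutData?Category=YunOsAccountOpLog HTTP/1.1",'
       '"ip": "10.200.98.220","user-agent": "aliyun-sdk-java",'
       '"request": "{\\"status\\":\\"200\\",\\"latency\\":\\"18204\\"}",'
       '"time": "07/Jul/2022:10:30:28"}')

event = json.loads(raw)
# "request" is still a JSON-encoded string after the first pass,
# so it needs a second parse to reach status and latency.
request = json.loads(event["request"])
```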
Parse log fields based on a regular expression and specify that one of the fields is in the time format. Sample log:
127.0.0.1 - - [07/Jul/2022:10:43:30 +0800] "POST /PutData?Category=YunOsAccountOpLog" 0.024 18204 200 37 "-" "aliyun-sdk-java"
Configurations that specify a native plug-in is used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_parse_regex_native
SourceKey: content
Regex: ([\d\.]+) \S+ \S+ \[(\S+) \S+\] \"(\w+) ([^\\"]*)\" ([\d\.]+) (\d+) (\d+) (\d+|-) \"([^\\"]*)\" \"([^\\"]*)\"
Keys:
- ip
- time
- method
- url
- request_time
- request_length
- status
- length
- ref_url
- browser
- Type: processor_parse_timestamp_native
SourceKey: time
SourceFormat: '%d/%b/%Y:%H:%M:%S'
flushers:
- Type: flusher_stdout
OnlyStdout: true
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| parse-regexp content, '([\d\.]+) \S+ \S+ \[(\S+) \S+\] \"(\w+) ([^\\"]*)\" ([\d\.]+) (\d+) (\d+) (\d+|-) \"([^\\"]*)\" \"([^\\"]*)\"' as ip, time, method, url, request_time, request_length, status, length, ref_url, browser
| extend ts=date_parse(time, '%d/%b/%Y:%H:%i:%S')
| extend __time__=cast(to_unixtime(ts) as INTEGER)
| project-away ts
| project-away content
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{
"ip": "127.0.0.1",
"time": "07/Jul/2022:10:43:30 +0800",
"method": "POST",
"url": "PutData?Category=YunOsAccountOpLog",
"request_time": "0.024",
"request_length": "18204",
"status": "200",
"length": "37",
"ref_url": "-",
"browser": "aliyun-sdk-java",
"__time__": "1713231487"
}
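The time-conversion step (parse the time string, then cast it to a Unix timestamp) can be checked offline with Python's datetime, using a strptime format for `07/Jul/2022:10:43:30 +0800`. This is only a sketch of the arithmetic, not the SPL `date_parse`/`to_unixtime` implementation:

```python
from datetime import datetime

raw_time = "07/Jul/2022:10:43:30 +0800"

# %d/%b/%Y matches "07/Jul/2022"; %z consumes the "+0800" offset,
# so timestamp() yields the correct UTC-based Unix time.
ts = datetime.strptime(raw_time, "%d/%b/%Y:%H:%M:%S %z")
unix_time = int(ts.timestamp())
print(unix_time)  # 1657161810
```

Note that the SPL `date_parse` function uses MySQL-style specifiers (`%i` for minutes), while strptime uses `%M`.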
Parse log fields based on a regular expression and filter parsed log fields. Sample log:
127.0.0.1 - - [07/Jul/2022:10:43:30 +0800] "POST /PutData?Category=YunOsAccountOpLog" 0.024 18204 200 37 "-" "aliyun-sdk-java"
127.0.0.1 - - [07/Jul/2022:10:44:30 +0800] "Get /PutData?Category=YunOsAccountOpLog" 0.024 18204 200 37 "-" "aliyun-sdk-java"
Configurations that specify a native plug-in is used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_parse_regex_native
SourceKey: content
Regex: ([\d\.]+) \S+ \S+ \[(\S+) \S+\] \"(\w+) ([^\\"]*)\" ([\d\.]+) (\d+) (\d+) (\d+|-) \"([^\\"]*)\" \"([^\\"]*)\"
Keys:
- ip
- time
- method
- url
- request_time
- request_length
- status
- length
- ref_url
- browser
- Type: processor_filter_regex_native
FilterKey:
- method
- status
FilterRegex:
- ^(POST|PUT)$
- ^200$
flushers:
- Type: flusher_stdout
OnlyStdout: true
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| parse-regexp content, '([\d\.]+) \S+ \S+ \[(\S+) \S+\] \"(\w+) ([^\\"]*)\" ([\d\.]+) (\d+) (\d+) (\d+|-) \"([^\\"]*)\" \"([^\\"]*)\"' as ip, time, method, url, request_time, request_length, status, length, ref_url, browser
| project-away content
| where regexp_like(method, '^(POST|PUT)$') and regexp_like(status, '^200$')
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{
"ip": "127.0.0.1",
"time": "07/Jul/2022:10:43:30",
"method": "POST",
"url": "/PutData?Category=YunOsAccountOpLog",
"request_time": "0.024",
"request_length": "18204",
"status": "200",
"length": "37",
"ref_url": "-",
"browser": "aliyun-sdk-java",
"__time__": "1713238839"
}
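The effect of the `where` clause (keep only events whose fields match every pattern) can be sketched in Python with `re.search`, which behaves like `regexp_like` in that it matches anywhere unless the pattern is anchored. A rough offline illustration:

```python
import re

records = [
    {"method": "POST", "status": "200"},
    {"method": "Get", "status": "200"},
]

# Equivalent of:
#   where regexp_like(method, '^(POST|PUT)$') and regexp_like(status, '^200$')
kept = [r for r in records
        if re.search(r"^(POST|PUT)$", r["method"])
        and re.search(r"^200$", r["status"])]
```

The second sample line (method `Get`) is dropped, which is why the output above contains only the POST event.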
Mask the sensitive content in logs. Sample log:
{"account":"1812213231432969","password":"04a23f38"}
Configurations that specify a native plug-in is used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_desensitize_native
SourceKey: content
Method: const
ReplacingString: "******"
ContentPatternBeforeReplacedString: 'password":"'
ReplacedContentPattern: '[^"]+'
flushers:
- Type: flusher_stdout
OnlyStdout: true
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| parse-regexp content, 'password":"(\S+)"' as password
| extend content=replace(content, password, '******')
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{
"content": "{\"account\":\"1812213231432969\",\"password\":\"******\"}",
"__time__": "1713239305"
}
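The masking logic (find the content after `password":"` and replace it with a constant) is a plain regex substitution. A minimal Python sketch of the same idea:

```python
import re

log = '{"account":"1812213231432969","password":"04a23f38"}'

# Keep the 'password":"' prefix (group 1) and replace only the value,
# mirroring ContentPatternBeforeReplacedString / ReplacedContentPattern.
masked = re.sub(r'(password":")[^"]+', r"\1******", log)
print(masked)  # {"account":"1812213231432969","password":"******"}
```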
Add fields to the output. Sample log:
this is a test log
Configurations that specify an extended plug-in is used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_add_fields
Fields:
service: A
IgnoreIfExist: false
flushers:
- Type: flusher_stdout
OnlyStdout: true
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| extend service='A'
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{
"content": "this is a test log",
"service": "A",
"__time__": "1713240293"
}
Parse JSON logs and drop specific parsed log fields. Sample log:
{"key1": 123456, "key2": "abcd"}
Configurations that specify an extended plug-in is used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_parse_json_native
SourceKey: content
- Type: processor_drop
DropKeys:
- key1
flushers:
- Type: flusher_stdout
OnlyStdout: true
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| parse-json content
| project-away content
| project-away key1
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{ "key2": "abcd",
"__time__": "1713245944"
}
Parse JSON logs and rename parsed log fields. Sample log:
{"key1": 123456, "key2": "abcd"}
Configurations that specify an extended plug-in is used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_parse_json_native
SourceKey: content
- Type: processor_rename
SourceKeys:
- key1
DestKeys:
- new_key1
flushers:
- Type: flusher_stdout
OnlyStdout: true
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| parse-json content
| project-away content
| project-rename new_key1=key1
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{
"new_key1": "123456",
"key2": "abcd",
"__time__": "1713249130"
}
Parse JSON logs and filter parsed logs. Sample log:
{"ip": "10.**.**.**", "method": "POST", "browser": "aliyun-sdk-java"}
{"ip": "10.**.**.**", "method": "POST", "browser": "chrome"}
{"ip": "192.168.**.**", "method": "POST", "browser": "aliyun-sls-ilogtail"}
Configurations that specify an extended plug-in is used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_parse_json_native
SourceKey: content
- Type: processor_filter_regex
Include:
ip: "10\\..*"
method: POST
Exclude:
browser: "aliyun.*"
flushers:
- Type: flusher_stdout
OnlyStdout: true
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| parse-json content
| project-away content
| where regexp_like(ip, '10\..*') and regexp_like(method, 'POST') and not regexp_like(browser, 'aliyun.*')
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{
"ip": "10.**.**.**",
"method": "POST",
"browser": "chrome",
"__time__": "1713246645"
}
Parse JSON logs and map field values. Sample log:
{"_ip_":"192.168.0.1","Index":"900000003"}
{"_ip_":"255.255.255.255","Index":"3"}
Configurations that specify an extended plug-in is used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_parse_json_native
SourceKey: content
- Type: processor_dict_map
MapDict:
"127.0.0.1": "LocalHost-LocalHost"
"192.168.0.1": "default login"
SourceKey: "_ip_"
DestKey: "_processed_ip_"
Mode: "overwrite"
HandleMissing": true
Missing: "Not Detected"
flushers:
- Type: flusher_stdout
OnlyStdout: true
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| parse-json content
| project-away content
| extend _processed_ip_=
CASE
WHEN _ip_ = '127.0.0.1' THEN 'LocalHost-LocalHost'
WHEN _ip_ = '192.168.0.1' THEN 'default login'
ELSE 'Not Detected'
END
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{
"_ip_": "192.168.0.1",
"Index": "900000003",
"_processed_ip_": "default login",
"__time__": "1713259557"
}
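The CASE expression is a value lookup with a fallback. In Python terms it is simply `dict.get` with a default, which is a useful mental model when translating `processor_dict_map` configurations to SPL:

```python
mapping = {
    "127.0.0.1": "LocalHost-LocalHost",
    "192.168.0.1": "default login",
}

def map_ip(ip: str) -> str:
    # dict.get with a default plays the role of the CASE ... ELSE branch.
    return mapping.get(ip, "Not Detected")
```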
Replace a specific string in logs. Sample log:
hello,how old are you? nice to meet you
Configurations that specify an extended plug-in is used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_string_replace
SourceKey: content
Method: const
Match: "how old are you?"
ReplaceString: ""
flushers:
- Type: flusher_stdout
OnlyStdout: true
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| extend content=replace(content, 'how old are you?', '')
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{ "content": "hello, nice to meet you",
"__time__": "1713260499"
}
Perform Base64 encoding on log fields. Sample log:
this is a test log
Configurations that specify an extended plug-in is used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_base64_encoding
SourceKey: content
NewKey: content1
flushers:
- Type: flusher_stdout
OnlyStdout: true
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| extend content1=to_base64(cast(content as varbinary))
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{ "content": "this is a test log",
"content1": "dGhpcyBpcyBhIHRlc3QgbG9n",
"__time__": "1713318724"
}
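The `to_base64(cast(content as varbinary))` chain works on bytes, which maps directly onto Python's base64 module. A quick sketch to reproduce the value above:

```python
import base64

content = "this is a test log"
# to_base64 operates on binary data, hence the encode() before encoding.
content1 = base64.b64encode(content.encode()).decode()
print(content1)  # dGhpcyBpcyBhIHRlc3QgbG9n
```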
Perform MD5 encoding on log fields. Sample log:
this is a test log
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| extend content1=lower(to_hex(md5(cast(content as varbinary))))
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{
"content": "this is a test log",
"content1": "4f3c93e010f366eca78e00dc1ed08984",
"__time__": "1713319673"
}
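The `lower(to_hex(md5(cast(content as varbinary))))` chain corresponds to a lowercase hex MD5 digest. In Python, hashlib's `hexdigest()` is already lowercase, so the whole chain collapses to one call (an offline sketch, not iLogtail code):

```python
import hashlib

content = "this is a test log"
# hexdigest() returns the 32-character lowercase hex form directly.
content1 = hashlib.md5(content.encode()).hexdigest()
```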
Perform mathematical calculations on log fields. Sample log:
4
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| extend val = cast(content as double)
| extend power_test = power(val, 2)
| extend round_test = round(val)
| extend sqrt_test = sqrt(val)
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{
"content": "4",
"power_test": 16.0,
"round_test": 4.0,
"sqrt_test": 2.0,
"val": 4.0,
"__time__": "1713319673"
}
Encode and decode URLs in logs. Sample log:
https://homenew.console.aliyun.com/home/dashboard/ProductAndService
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| extend encoded = url_encode(content)
| extend decoded = url_decode(encoded)
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{
"content": "https://homenew.console.aliyun.com/home/dashboard/ProductAndService",
"decoded": "https://homenew.console.aliyun.com/home/dashboard/ProductAndService",
"encoded": "https%3A%2F%2Fhomenew.console.aliyun.com%2Fhome%2Fdashboard%2FProductAndService",
"__time__": "1713319673"
}
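Note that `url_encode` escapes reserved characters such as `:` and `/`. Python's `urllib.parse.quote` only does the same when `safe=''` is passed, which reproduces the output above (a sketch for comparison, not the SPL implementation):

```python
from urllib.parse import quote, unquote

url = "https://homenew.console.aliyun.com/home/dashboard/ProductAndService"

# safe='' makes quote() escape ':' and '/' as well, matching url_encode.
encoded = quote(url, safe="")
decoded = unquote(encoded)
```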
Extract components from URLs in logs. Sample log:
https://sls.console.aliyun.com:443/lognext/project/dashboard-all/logsearch/nginx-demo?accounttraceid=d6241a173f88471c91d3405cda010ff5ghdw
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| extend host = url_extract_host(content)
| extend query = url_extract_query(content)
| extend path = url_extract_path(content)
| extend protocol = url_extract_protocol(content)
| extend port = url_extract_port(content)
| extend param = url_extract_parameter(content, 'accounttraceid')
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{
"content": "https://sls.console.aliyun.com:443/lognext/project/dashboard-all/logsearch/nginx-demo?accounttraceid=d6241a173f88471c91d3405cda010ff5ghdw",
"host": "sls.console.aliyun.com",
"param": "d6241a173f88471c91d3405cda010ff5ghdw",
"path": "/lognext/project/dashboard-all/logsearch/nginx-demo",
"port": "443",
"protocol": "https",
"query": "accounttraceid=d6241a173f88471c91d3405cda010ff5ghdw",
"__time__": "1713319673"
}
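The family of `url_extract_*` functions splits a URL into its standard components. Python's `urllib.parse` exposes the same decomposition, which is handy for checking expected values offline:

```python
from urllib.parse import urlparse, parse_qs

url = ("https://sls.console.aliyun.com:443/lognext/project/dashboard-all/"
       "logsearch/nginx-demo?accounttraceid=d6241a173f88471c91d3405cda010ff5ghdw")

parts = urlparse(url)
# parse_qs maps each query parameter to a list of values,
# analogous to url_extract_parameter(content, 'accounttraceid').
param = parse_qs(parts.query)["accounttraceid"][0]
```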
Compare numeric values of log fields. Sample log:
{"num1": 199, "num2": 10, "num3": 9}
Configurations that specify SPL statements are used:
enable: true
inputs:
- Type: input_file
FilePaths:
- /workspaces/ilogtail/debug/simple.log
processors:
- Type: processor_spl
Script: |
*
| parse-json content
| extend compare_result = cast(num1 as double) > cast(num2 as double) AND cast(num2 as double) > cast(num3 as double)
flushers:
- Type: flusher_stdout
OnlyStdout: true
Output:
{ "compare_result": "true",
"content": "{\"num1\": 199, \"num2\": 10, \"num3\": 9}",
"num1": "199",
"num2": "10",
"num3": "9",
"__time__": "1713319673"
}
For more information about the capabilities of SPL, visit https://www.alibabacloud.com/help/en/sls/user-guide/function-overview
You are welcome to provide more use cases of iLogtail SPL.
Alibaba Cloud Native Community - March 29, 2024