Simple Log Service provides a scan-based query feature that allows you to search logs using specified fields without the need for index configuration. It also supports Simple Log Service Processing Language (SPL) statements for filtering, converting, and parsing results. This topic covers the basic syntax for scan-based queries.
How it works
After receiving a scan-based analysis request, Simple Log Service executes the following steps:
Executes the search statement to retrieve logs.
Important: The search statement in the first-level pipeline relies on index-based queries. Use * if no index filtering is needed. For example, before executing status:200 | WHERE userId = '123' | extend host=upper(hostname), you must create an index for the status field, but not for the userId or hostname fields.
Executes SPL statements on the search results and returns the final outcome. For example, you can use an SPL statement to filter, convert, and parse data.
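The two steps can be pictured with a small local emulation in Python. This is only a sketch under stated assumptions: the three logs and the helper names (index_search, spl_where, spl_extend) are hypothetical, and nothing here calls the Simple Log Service API. It shows why only the status field needs an index: the first stage narrows the candidate set, and the SPL stages scan whatever fields they need.

```python
# Hypothetical local emulation of the two-step execution; not the service API.
logs = [
    {"status": "200", "userId": "123", "hostname": "web-1"},
    {"status": "200", "userId": "456", "hostname": "web-2"},
    {"status": "404", "userId": "123", "hostname": "web-3"},
]

def index_search(logs, field, value):
    """Step 1: stand-in for the index-based search statement (status:200)."""
    return [log for log in logs if log.get(field) == value]

def spl_where(logs, predicate):
    """Step 2a: stand-in for `| where ...`, evaluated by scanning each log."""
    return [log for log in logs if predicate(log)]

def spl_extend(logs, name, fn):
    """Step 2b: stand-in for `| extend name=...`, adding a computed field."""
    return [{**log, name: fn(log)} for log in logs]

# status:200 | WHERE userId = '123' | extend host=upper(hostname)
matched = index_search(logs, "status", "200")
filtered = spl_where(matched, lambda log: log["userId"] == "123")
result = spl_extend(filtered, "host", lambda log: log["hostname"].upper())
print(result)
```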
Basic syntax
The scan-based query supports SPL syntax. SPL statements let you extract structured information, operate on fields, and filter the raw data that is read. Multi-level pipeline cascades are also supported: the first-level pipeline is the index filter condition, and the subsequent levels are SPL instructions. The final result is the data as processed by SPL.
Index-based search statement | <spl-cmd> ... | <spl-cmd> ...
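To make the pipeline shape concrete, the following Python sketch splits a query string into its first-level search statement and the SPL stages that follow. The function split_pipeline is a hypothetical helper for illustration, not how the service parses queries; it assumes that a pipe inside single or double quotes is literal text and it does not handle escaped quotes.

```python
def split_pipeline(query):
    """Split a query into (search_statement, [spl_stage, ...]).

    Assumption: a `|` inside '...' or "..." is part of a string or field
    name, not a stage separator. Escaped quotes are not handled.
    """
    stages, current, quote = [], [], None
    for ch in query:
        if quote:
            current.append(ch)
            if ch == quote:      # closing quote ends the quoted run
                quote = None
        elif ch in ("'", '"'):
            quote = ch           # opening quote starts a quoted run
            current.append(ch)
        elif ch == "|":
            stages.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    stages.append("".join(current).strip())
    return stages[0], stages[1:]

search, spl_stages = split_pipeline(
    "Status:200 | where Uri like '%a|b%' | project Uri"
)
print(search)
print(spl_stages)
```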
Log sample
Raw fields: labeled [R], applicable to scan search.
Index fields: labeled [I], applicable to index search.
[I] __topic__: nginx-access-log
[I] Status: 200
[I] Host: api.abc.com
[R] Method: PUT
[R] ClientIp: 192.168.1.1
[R] Payload: {"Item": "1122", "UserId": "112233", "Operation": "AddCart"}
[R] BeginTime: 1705029260
[R] EndTime: 1705028561
[R] RT: 87
[R] Uri: /request/path-3/file-1
[R] UserAgent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; ar) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4
Examples
In SPL statements, constant strings are enclosed in single quotes ('), such as * | where ClientIp = '192.168.1.1'.
If a field name includes special characters, enclose it in double quotes ("), such as * | project-away "user-agent".
Filter by different conditions
Equality comparison
Status: 200 | where ClientIp = '192.168.1.1'
Case-insensitive search
__topic__: nginx-access-log | where lower(Method) != 'put'
Fuzzy match
Status: 200 | where UserAgent like '%Macintosh%'
Numerical comparison
Note that the default field type is varchar. Convert the type to bigint before performing numerical comparisons.
Status: 200 | where cast(RT as bigint) > 50
Regular expression match
# Find URIs that contain "path-number"
Status: 200 | where regexp_like(Uri, 'path-\d+')
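The filter conditions above can be checked locally against the log sample. The Python sketch below is an approximation, not the SPL engine: the like helper is a hypothetical emulation of SQL LIKE, the UserAgent value is shortened, and regexp_like is assumed to match anywhere in the string. It also shows why the cast matters: compared as text, '87' sorts after '100'.

```python
import re

# Subset of the log sample; every value is text, as in a scan-based query.
log = {
    "Status": "200",
    "Method": "PUT",
    "ClientIp": "192.168.1.1",
    "RT": "87",
    "Uri": "/request/path-3/file-1",
    "UserAgent": "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; ar)",  # shortened
}

def like(value, pattern):
    """Hypothetical emulation of SQL LIKE: % is any run, _ is one character."""
    regex = "".join(
        ".*" if c == "%" else "." if c == "_" else re.escape(c) for c in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None

checks = {
    "equality": log["ClientIp"] == "192.168.1.1",
    "case_insensitive": log["Method"].lower() != "put",
    "fuzzy": like(log["UserAgent"], "%Macintosh%"),
    "numeric": int(log["RT"]) > 50,
    "regexp": re.search(r"path-\d+", log["Uri"]) is not None,
}
print(checks)

# Text comparison is lexicographic, so it disagrees with numeric comparison.
lexicographic = "87" > "100"
numeric = int("87") > 100
print(lexicographic, numeric)
```

The case-insensitive check is False for the sample because its Method is PUT, so that query would filter the sample log out.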
Calculate new fields
Calculate new fields from existing ones using the extend instruction.
Extract fields using regular expressions
# Extract the file number from the Uri field
* not Status: 200 | extend fileNumber=regexp_extract(Uri, 'file-(\d+)', 1)
Extract fields from JSON
Status:200 | extend Item = json_extract_scalar(Payload, '$.Item')
Extract fields based on a separator
Status:200 | extend urlParam=split_part(Uri, '/', 3)
Calculate new fields from multiple field values
# Calculate the time difference based on BeginTime and EndTime
Status:200 | extend timeRange = cast(BeginTime as bigint) - cast(EndTime as bigint)
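As a rough cross-check of what the extend examples produce for the log sample, the following Python sketch recomputes each new field locally. The helpers regexp_extract and split_part are hypothetical stand-ins written for this illustration. split_part is assumed to use 1-based indexes, as in Presto-style SQL, so the empty text before the leading slash counts as part 1.

```python
import json
import re

# Subset of the log sample; every value is text.
log = {
    "Uri": "/request/path-3/file-1",
    "Payload": '{"Item": "1122", "UserId": "112233", "Operation": "AddCart"}',
    "BeginTime": "1705029260",
    "EndTime": "1705028561",
}

def regexp_extract(value, pattern, group):
    """Return the given capture group of the first match, or None."""
    match = re.search(pattern, value)
    return match.group(group) if match else None

def split_part(value, delimiter, index):
    """Return the index-th part (assumed 1-based), or None if out of range."""
    parts = value.split(delimiter)
    return parts[index - 1] if 1 <= index <= len(parts) else None

extended = {
    **log,
    "fileNumber": regexp_extract(log["Uri"], r"file-(\d+)", 1),
    "Item": json.loads(log["Payload"])["Item"],
    "urlParam": split_part(log["Uri"], "/", 3),
    "timeRange": int(log["BeginTime"]) - int(log["EndTime"]),
}
print(extended["fileNumber"], extended["Item"], extended["urlParam"], extended["timeRange"])
```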
Retain, remove, and rename fields
Keep only specific fields and remove the rest
Status:200 | project Status, Uri
Remove specific fields and keep the rest
Status:200 | project-away UserAgent
Rename fields
Status:200 | project-rename Latency=RT
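The three field operations behave like simple dictionary transformations. The Python sketch below is a local illustration only; the function names mirror the SPL instructions but are hypothetical helpers, and the UserAgent value is shortened.

```python
# Subset of the log sample.
log = {
    "Status": "200",
    "Uri": "/request/path-3/file-1",
    "RT": "87",
    "UserAgent": "Mozilla/5.0",  # shortened
}

def project(log, *fields):
    """Keep only the named fields (like `| project Status, Uri`)."""
    return {name: log[name] for name in fields if name in log}

def project_away(log, *fields):
    """Drop the named fields and keep the rest (like `| project-away UserAgent`)."""
    return {name: value for name, value in log.items() if name not in fields}

def project_rename(log, **renames):
    """Rename fields, written new=old (like `| project-rename Latency=RT`)."""
    old_to_new = {old: new for new, old in renames.items()}
    return {old_to_new.get(name, name): value for name, value in log.items()}

kept = project(log, "Status", "Uri")
trimmed = project_away(log, "UserAgent")
renamed = project_rename(log, Latency="RT")
print(kept)
print(trimmed)
print(renamed)
```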
Expand unstructured data
Expand all fields in JSON
# Filter non-empty Payloads and expand all JSON fields
__topic__: nginx-access-log | where Payload is not null | parse-json Payload
Expand JSON fields and discard the original ones
Status:200 | parse-json Payload | project-away Payload
Extract multiple fields using regular expressions
Status:200 | parse-regexp Uri, 'path-(\d+)/file-(\d+)' as pathIndex, fileIndex
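The expansion examples can be approximated locally as follows. This Python sketch makes two assumptions that the text above does not confirm: parse_json expands only the first-level keys and keeps every value as text, and parse_regexp leaves the log unchanged when the pattern does not match. Both functions are hypothetical helpers, not service calls.

```python
import json
import re

# Subset of the log sample.
log = {
    "Uri": "/request/path-3/file-1",
    "Payload": '{"Item": "1122", "UserId": "112233", "Operation": "AddCart"}',
}

def parse_json(log, field):
    """Expand the first-level keys of a JSON field into top-level fields."""
    expanded = dict(log)
    for key, value in json.loads(log[field]).items():
        # Assumption: non-string values stay as their JSON text.
        expanded[key] = value if isinstance(value, str) else json.dumps(value)
    return expanded

def parse_regexp(log, field, pattern, names):
    """Bind the capture groups of the first match to the given field names."""
    match = re.search(pattern, log[field])
    if match is None:
        return dict(log)
    return {**log, **dict(zip(names, match.groups()))}

with_json = parse_json(log, "Payload")
with_regexp = parse_regexp(log, "Uri", r"path-(\d+)/file-(\d+)", ["pathIndex", "fileIndex"])
print(with_json["Item"], with_json["UserId"], with_json["Operation"])
print(with_regexp["pathIndex"], with_regexp["fileIndex"])
```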
Multi-level pipeline cascade
You can perform all the operations mentioned above in the same search statement using multi-level pipeline cascades, executed in sequence.
Status:200
| where Payload is not null
| parse-json Payload
| project-away Payload
| where Host='api.qzzw.com' and cast(RT as bigint) > 80
| extend timeRange=cast(BeginTime as bigint) - cast(EndTime as bigint)
| where timeRange > 500
| project UserId, Uri
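The cascade can be traced stage by stage with a local Python emulation. This is a sketch, not the service's execution model: run_pipeline is a hypothetical function, and variant is a hypothetical copy of the log sample whose Host is changed to api.qzzw.com so that it passes the Host filter (the log sample itself uses api.abc.com and is filtered out).

```python
import json

# Subset of the log sample; every value is text.
sample = {
    "Status": "200",
    "Host": "api.abc.com",
    "RT": "87",
    "Payload": '{"Item": "1122", "UserId": "112233", "Operation": "AddCart"}',
    "BeginTime": "1705029260",
    "EndTime": "1705028561",
    "Uri": "/request/path-3/file-1",
}
variant = {**sample, "Host": "api.qzzw.com"}  # hypothetical log matching the Host filter

def run_pipeline(logs):
    results = []
    for log in logs:
        if log.get("Status") != "200":                    # Status:200
            continue
        if log.get("Payload") is None:                    # where Payload is not null
            continue
        row = {**log, **json.loads(log["Payload"])}       # parse-json Payload
        del row["Payload"]                                # project-away Payload
        if not (row["Host"] == "api.qzzw.com" and int(row["RT"]) > 80):
            continue                                      # where Host=... and cast(RT as bigint) > 80
        row["timeRange"] = int(row["BeginTime"]) - int(row["EndTime"])  # extend timeRange
        if row["timeRange"] <= 500:                       # where timeRange > 500
            continue
        results.append({"UserId": row["UserId"], "Uri": row["Uri"]})    # project UserId, Uri
    return results

result = run_pipeline([sample, variant])
print(result)
```

Each stage only sees what the previous stage produced, which is why UserId is available to the final project: it was created by parse-json before Payload was removed.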
Limits
Scan-based queries are subject to limits on SPL statement execution. For more information, see Limits.
Random pagination is not supported. Only continuous pagination (forward or backward) is allowed.
Comparison between index-based and scan-based queries
Item | Index-based query | Scan-based query |
Syntax | | |
Index configuration needed | Yes. | The index-based search statement requires indexes, while other statements do not. |
Analytic statements supported | Yes. | Yes. |
Random pagination supported | Yes. | No. Only continuous pagination (forward or backward) is allowed. |
Log histogram | Displayed based on the results of the search statement. | Displayed based on the results of the search statement and the scan progress. |
Operators and functions | Logical and mathematical calculation, and fuzzy search are supported. SQL functions are not. | For details, see SPL instructions and SPL-supported SQL functions. |
Field types | Determined by the data types that are specified in index configurations. | The system considers the types of fields in SPL statements as text regardless of whether indexes are configured for the fields. For more information, see Convert data types. |
Result size | The number of logs to return can be specified in the Simple Log Service console or by calling an SDK. The maximum number is 100. | If one of the following conditions is met, the system stops the current scan and returns results: |
Fees | You are charged for index traffic and index storage. For more information, see Billable items of pay-by-feature. | You are charged for scans based on the scan traffic, which is equivalent to the amount of data returned after scanning. The system identifies logs based on the results of index-based query. |