Scan-based query lets you search logs by field values without configuring indexes. Combined with SPL (Simple Log Service Processing Language) statements, it lets you filter, transform, and parse results directly from raw log data.
How it works
When you run a scan-based query, SLS:
- Executes the search statement to retrieve logs.
  Important: The search statement in the first-level pipeline relies on index-based queries. Use * if no index filtering is needed. For example, before executing status:200 | WHERE userId = '123' | extend host=upper(hostname), you must create an index for the status field, but not for the userId or hostname fields.
- Applies SPL statements to the search results (filtering, converting, and parsing data), then returns the final output.
Basic syntax
The scan-based query supports SPL syntax. SPL extracts structured information, performs field operations, and filters raw data. Multi-level pipeline cascades are supported: the first level is the index filter, followed by SPL instructions in subsequent levels.
Index-based search statement | <spl-cmd> ... | <spl-cmd> ...
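To make the split between the indexed first level and the scanned SPL levels concrete, the following Python sketch separates a statement into its stages. It is an illustration only, not how SLS parses queries: `split_pipeline` is a hypothetical helper, and it assumes that a pipe inside single or double quotes is not a stage separator and that quotes are never escaped.

```python
def split_pipeline(statement: str) -> tuple[str, list[str]]:
    """Split a scan-based query into (index search, [SPL commands]).

    Assumption: pipes inside quoted text do not separate stages.
    """
    stages, current, quote = [], [], None
    for ch in statement:
        if quote:                      # inside a quoted string or field name
            if ch == quote:
                quote = None
            current.append(ch)
        elif ch in ("'", '"'):         # opening quote
            quote = ch
            current.append(ch)
        elif ch == "|":                # stage boundary
            stages.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    stages.append("".join(current).strip())
    return stages[0], stages[1:]


search, spl = split_pipeline(
    "status:200 | WHERE userId = '123' | extend host=upper(hostname)"
)
print(search)  # first level: the only stage that relies on an index
print(spl)     # later levels: evaluated by scanning raw fields
```

In this example only `status` appears in the first stage, which matches the rule that `status` needs an index while `userId` and `hostname` do not.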
Log sample
- Raw fields: labeled [R], applicable to scan search.
- Index fields: labeled [I], applicable to index search.
[I] __topic__: nginx-access-log
[I] Status: 200
[I] Host: api.abc.com
[R] Method: PUT
[R] ClientIp: 192.168.1.1
[R] Payload: {"Item": "1122", "UserId": "112233", "Operation": "AddCart"}
[R] BeginTime: 1705029260
[R] EndTime: 1705028561
[R] RT: 87
[R] Uri: /request/path-3/file-1
[R] UserAgent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; ar) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4
Examples
- In SPL statements, constant strings are enclosed in single quotes ('), such as * | where ClientIp = '192.168.1.1'.
- If a field name includes special characters, enclose it in double quotes ("), such as * | project-away "user-agent".
Filter by different conditions
- Equality comparison
  Status: 200 | where ClientIp = '192.168.1.1'
- Case-insensitive search
  __topic__: nginx-access-log | where lower(Method) != 'put'
- Fuzzy match
  Status: 200 | where UserAgent like '%Macintosh%'
- Numerical comparison
  Fields default to varchar. Cast to bigint for numerical comparisons.
  Status: 200 | where cast(RT as bigint) > 50
- Regular expression match
  # Find URIs that contain "path-number"
  Status: 200 | where regexp_like(Uri, 'path-\d+')
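As a rough cross-check of what each condition above decides for the sample log, the following Python sketch evaluates equivalent predicates on a dictionary of text values. It is an emulation, not SLS code: the `in` test stands in for `like` only for the `%text%` pattern, and Python's `re` module is assumed to treat `path-\d+` the same way for this input.

```python
import re

# Fields copied from the "Log sample" section; every raw value is text.
log = {
    "Status": "200",
    "Method": "PUT",
    "ClientIp": "192.168.1.1",
    "RT": "87",
    "Uri": "/request/path-3/file-1",
    "UserAgent": "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; ar) "
                 "AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4",
}

# Each key is the SPL condition; each value is the Python emulation of it.
results = {
    "ClientIp = '192.168.1.1'": log["ClientIp"] == "192.168.1.1",
    "lower(Method) != 'put'": log["Method"].lower() != "put",
    "UserAgent like '%Macintosh%'": "Macintosh" in log["UserAgent"],
    "cast(RT as bigint) > 50": int(log["RT"]) > 50,   # convert text before comparing
    r"regexp_like(Uri, 'path-\d+')": re.search(r"path-\d+", log["Uri"]) is not None,
}

for condition, keeps_log in results.items():
    print(f"{condition:32} -> {'kept' if keeps_log else 'filtered out'}")
```

Only the case-insensitive condition filters the sample log out, because its Method is PUT.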
Calculate new fields
Calculate new fields from existing ones using the extend instruction.
- Extract fields using regular expressions
  # Extract the file number from the Uri field
  * not Status: 200 | extend fileNumber=regexp_extract(Uri, 'file-(\d+)', 1)
- Extract fields from JSON
  Status:200 | extend Item = json_extract_scalar(Payload, '$.Item')
- Extract fields based on a separator
  Status:200 | extend urlParam=split_part(Uri, '/', 3)
- Calculate new fields from multiple field values
  # Calculate the time difference based on BeginTime and EndTime
  Status:200 | extend timeRange = cast(BeginTime as bigint) - cast(EndTime as bigint)
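The values these extend statements would produce for the sample log can be worked out with a short Python emulation. This is a sketch under assumptions: `split_part` here is a hypothetical helper that assumes a 1-based index and counts the empty piece before the leading slash, as in Presto-style SQL, and the JSON lookup only handles a top-level key.

```python
import json
import re

# Fields copied from the "Log sample" section; every raw value is text.
log = {
    "Uri": "/request/path-3/file-1",
    "Payload": '{"Item": "1122", "UserId": "112233", "Operation": "AddCart"}',
    "BeginTime": "1705029260",
    "EndTime": "1705028561",
}


def split_part(text, delimiter, index):
    """Return the index-th piece (assumed 1-based), or None if out of range."""
    parts = text.split(delimiter)
    return parts[index - 1] if 0 < index <= len(parts) else None


match = re.search(r"file-(\d+)", log["Uri"])

extended = dict(log)                                        # extend keeps existing fields
extended["fileNumber"] = match.group(1) if match else None  # regexp_extract(..., 1)
extended["Item"] = json.loads(log["Payload"]).get("Item")   # json_extract_scalar '$.Item'
extended["urlParam"] = split_part(log["Uri"], "/", 3)       # third piece of the Uri
extended["timeRange"] = int(log["BeginTime"]) - int(log["EndTime"])

for name in ("fileNumber", "Item", "urlParam", "timeRange"):
    print(name, "=", extended[name])
```

For the sample log, timeRange is 699 because the sample's BeginTime is later than its EndTime.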
Retain, remove, and rename fields
- Keep only specific fields and remove the rest
  Status:200 | project Status, Uri
- Remove specific fields and keep the rest
  Status:200 | project-away UserAgent
- Rename fields
  Status:200 | project-rename Latency=RT
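The three field operations above behave like simple dictionary transformations. The Python sketch below emulates them on a shortened copy of the sample log; the function names are hypothetical, and how SLS treats fields that do not exist is not covered here (the sketch simply skips them).

```python
def project(log, *fields):
    """Keep only the named fields."""
    return {name: log[name] for name in fields if name in log}


def project_away(log, *fields):
    """Drop the named fields and keep the rest."""
    return {name: value for name, value in log.items() if name not in fields}


def project_rename(log, **renames):
    """Rename fields; keyword form is new_name=old_name, as in Latency=RT."""
    old_to_new = {old: new for new, old in renames.items()}
    return {old_to_new.get(name, name): value for name, value in log.items()}


# Shortened copy of the sample log (UserAgent abbreviated for readability).
log = {"Status": "200", "Uri": "/request/path-3/file-1", "RT": "87",
       "UserAgent": "Mozilla/5.0"}

print(project(log, "Status", "Uri"))
print(project_away(log, "UserAgent"))
print(project_rename(log, Latency="RT"))
```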
Expand unstructured data
- Expand all fields in JSON
  # Filter non-empty Payloads and expand all JSON fields
  __topic__: nginx-access-log | where Payload is not null | parse-json Payload
- Expand JSON fields and discard the original ones
  status:200 | parse-json body | project-away body
- Extract multiple fields using regular expressions
  Status:200 | parse-regexp Uri, 'path-(\d+)/file-(\d+)' as pathIndex, fileIndex
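The effect of parse-json and parse-regexp on the sample log can be approximated in Python as follows. This is a sketch, not the SLS implementation: `parse_json` and `parse_regexp` are hypothetical helpers, only top-level JSON keys are expanded, and the handling of invalid JSON or a non-matching pattern (leave the log unchanged) is an assumption.

```python
import json
import re


def parse_json(log, field):
    """Add each top-level key of the JSON text in `field` as a new field."""
    out = dict(log)
    try:
        value = json.loads(log.get(field) or "")
    except json.JSONDecodeError:
        return out                      # assumption: invalid JSON adds nothing
    if isinstance(value, dict):
        for key, item in value.items():
            # Keep values as text, like other raw fields.
            out[key] = item if isinstance(item, str) else json.dumps(item)
    return out


def parse_regexp(log, field, pattern, names):
    """Assign the capture groups of `pattern` to `names`, in order."""
    out = dict(log)
    match = re.search(pattern, log.get(field, ""))
    if match:
        out.update(zip(names, match.groups()))
    return out


log = {
    "Uri": "/request/path-3/file-1",
    "Payload": '{"Item": "1122", "UserId": "112233", "Operation": "AddCart"}',
}

expanded = parse_json(log, "Payload")
indexed = parse_regexp(log, "Uri", r"path-(\d+)/file-(\d+)", ["pathIndex", "fileIndex"])
print(expanded)
print(indexed)
```

Note that parse-json alone keeps the original field in this sketch, which is why the second example above follows it with project-away.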
Multi-level pipeline cascade
Combine the operations above in a single statement using multi-level pipeline cascades, executed in sequence.
Status:200
| where Payload is not null
| parse-json Payload
| project-away Payload
| where Host='api.qzzw.com' and cast(RT as bigint) > 80
| extend timeRange=cast(BeginTime as bigint) - cast(EndTime as bigint)
| where timeRange > 500
| project UserId, Uri
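The cascade above can be read as a chain of stages in which each stage sees only the output of the previous one. The Python generator below emulates that order for logs that already passed the first-level Status:200 filter. It is a sketch under assumptions: `run_pipeline` is a hypothetical name, and both input records are made-up variants of the sample log, the first with Host changed to api.qzzw.com so that it passes the Host condition.

```python
import json


def run_pipeline(logs):
    """Emulate the SPL stages after the index filter, in order."""
    for log in logs:
        if log.get("Payload") is None:              # where Payload is not null
            continue
        row = {**log, **json.loads(log["Payload"])}  # parse-json Payload
        del row["Payload"]                           # project-away Payload
        if not (row["Host"] == "api.qzzw.com" and int(row["RT"]) > 80):
            continue
        row["timeRange"] = int(row["BeginTime"]) - int(row["EndTime"])  # extend
        if row["timeRange"] <= 500:                  # where timeRange > 500
            continue
        yield {"UserId": row.get("UserId"), "Uri": row.get("Uri")}  # project


base = {
    "Status": "200", "RT": "87", "Uri": "/request/path-3/file-1",
    "BeginTime": "1705029260", "EndTime": "1705028561",
    "Payload": '{"Item": "1122", "UserId": "112233", "Operation": "AddCart"}',
}
# Hypothetical inputs: only the Host value differs between the two records.
logs = [{**base, "Host": "api.qzzw.com"}, {**base, "Host": "api.abc.com"}]

rows = list(run_pipeline(logs))
print(rows)
```

UserId is available to the final project stage only because parse-json ran earlier in the chain, which illustrates why the stage order matters.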
Limits
- SPL execution in scan-based query has limits. For more information, see Limitations.
- Random pagination is not supported.
Index-based vs. scan-based query
| Item | Index-based query | Scan-based query |
| --- | --- | --- |
| Syntax | | |
| Index configuration needed | Yes. | The index-based search statement requires indexes, while other statements do not. |
| Analytic statements supported | Yes. | Yes. |
| Random pagination supported | Yes. | No. Only continuous (forward/backward) pagination is supported. |
| Log histogram | Displayed based on the results of the search statement. | Displayed based on the results of the search statement and the scan progress. |
| Operators and functions | Logical and mathematical calculation, and fuzzy search are supported. SQL functions are not. | For details, see SPL instructions and functions and SPL-supported SQL functions. |
| Field types | Determined by the data types that are specified in index configurations. | SPL treats all fields as text regardless of index configuration. For more information, see Data type conversion. |
| Result size | Configurable in the SLS console or via SDK. Maximum: 100. | The scan stops and returns results when any of these conditions is met: |
| Fees | You are charged for index traffic and index storage. For more information, see Billable items for pay-by-feature. | Charged by scan traffic (the data volume returned from scanning). Log identification uses index-based query results. |