Simple Log Service: Scan query syntax

Last Updated: Jun 02, 2026

Scan-based query lets you search logs by field values without configuring indexes. Combined with SPL (Simple Log Service Processing Language) statements, it lets you filter, transform, and parse results directly from raw log data.

How it works

When you run a scan-based query, SLS:

  1. Executes the search statement to retrieve logs.

    Important

    The search statement in the first-level pipeline is an index-based query, so every field it references must be indexed. Use * if no index filtering is needed. For example, before you run status:200 | where userId = '123' | extend host=upper(hostname), you must create an index for the status field, but not for the userId or hostname fields.

  2. Applies SPL statements to the search results (filtering, converting, and parsing data), then returns the final output.
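The two steps can be modeled as a small pipeline. The Python sketch below only illustrates the execution order; it is not how SLS is implemented. The function name run_scan_query, the in-memory log list, and the lambda stages are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Log = Dict[str, str]
Stage = Callable[[Iterable[Log]], Iterable[Log]]


def run_scan_query(logs: Iterable[Log],
                   index_filter: Callable[[Log], bool],
                   stages: List[Stage]) -> List[Log]:
    """Hypothetical model: step 1 narrows logs via the index, step 2 runs SPL stages in order."""
    # Step 1: the search statement selects candidate logs (its fields must be indexed).
    results: Iterable[Log] = [log for log in logs if index_filter(log)]
    # Step 2: each SPL stage consumes the previous stage's output (no index needed).
    for stage in stages:
        results = stage(results)
    return list(results)


# Made-up logs for illustration only.
logs = [
    {"status": "200", "userId": "123", "hostname": "web-1"},
    {"status": "200", "userId": "456", "hostname": "web-2"},
    {"status": "500", "userId": "123", "hostname": "web-3"},
]

# Models: status:200 | where userId = '123' | extend host=upper(hostname)
output = run_scan_query(
    logs,
    index_filter=lambda log: log["status"] == "200",
    stages=[
        lambda rows: (r for r in rows if r["userId"] == "123"),
        lambda rows: ({**r, "host": r["hostname"].upper()} for r in rows),
    ],
)
print(output)
```

Only the first stage depends on an index; the later stages read whatever fields the surviving logs carry.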

Basic syntax

The scan-based query supports SPL syntax. SPL extracts structured information, performs field operations, and filters raw data. Multi-level pipeline cascades are supported: the first level is the index filter, followed by SPL instructions in subsequent levels.

Index-based search statement | <spl-cmd> ... | <spl-cmd> ...

Log sample

  • Raw fields: labeled [R], applicable to scan search.

  • Index fields: labeled [I], applicable to index search.

[I] __topic__: nginx-access-log
[I] Status: 200
[I] Host: api.abc.com
[R] Method: PUT
[R] ClientIp: 192.168.1.1
[R] Payload: {"Item": "1122", "UserId": "112233", "Operation": "AddCart"}
[R] BeginTime: 1705029260
[R] EndTime: 1705028561
[R] RT: 87
[R] Uri: /request/path-3/file-1
[R] UserAgent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; ar) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4

Examples

Note
  • In SPL statements, constant strings are enclosed in single quotes ('), such as * | where ClientIp = '192.168.1.1'.

  • If a field name includes special characters, enclose it in double quotes ("), such as * | project-away "user-agent".

Filter by different conditions

  • Equality comparison

    Status: 200 | where ClientIp = '192.168.1.1'
  • Case-insensitive search

    __topic__: nginx-access-log | where lower(Method) != 'put'
  • Fuzzy match

    Status: 200 | where UserAgent like '%Macintosh%'
  • Numerical comparison

    Fields default to varchar. Cast to bigint for numerical comparisons.

    Status: 200 | where cast(RT as bigint) > 50
  • Regular expression match

    # Find URIs that contain "path-number"
    Status: 200 | where regexp_like(Uri, 'path-\d+')
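Two of the filters above are easy to misread, so here is a rough Python analogy. It shows why text values need a cast before a numerical comparison, and what "contains" means for a regular expression match. The RT values other than 87 are made up, and the sketch assumes that regexp_like matches anywhere in the string, as the comment in the example suggests. How SPL itself treats an uncast comparison is not covered here.

```python
import re

# RT values as text, the way SPL sees every field; 9 and 100 are made-up values.
rt_values = ["87", "9", "100"]

# Text comparison is lexicographic: "9" sorts after "50", "100" sorts before it.
as_text = [rt for rt in rt_values if rt > "50"]

# Converting first, as cast(RT as bigint) does, compares magnitudes.
as_number = [rt for rt in rt_values if int(rt) > 50]

print(as_text)    # ['87', '9']
print(as_number)  # ['87', '100']

# Assumed regexp_like(Uri, 'path-\d+') semantics: the pattern may match anywhere.
uri = "/request/path-3/file-1"
contains_match = re.search(r"path-\d+", uri) is not None
whole_string_match = re.fullmatch(r"path-\d+", uri) is not None

print(contains_match)      # True
print(whole_string_match)  # False
```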

Calculate new fields

Calculate new fields from existing ones using the extend instruction.

  • Extract fields using regular expressions

    # Extract the file number from the Uri field
    * not Status: 200 | extend fileNumber=regexp_extract(Uri, 'file-(\d+)', 1)
  • Extract fields from JSON

    Status:200 | extend Item = json_extract_scalar(Payload, '$.Item')
  • Extract fields based on a separator

    Status:200 | extend urlParam=split_part(Uri, '/', 3)
  • Calculate new fields from multiple field values

    # Calculate the time difference based on BeginTime and EndTime
    Status:200 | extend timeRange = cast(BeginTime as bigint) - cast(EndTime as bigint)
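To make the results of these extend expressions concrete, the following Python sketch applies rough equivalents to the sample log. It is an approximation under stated assumptions: regexp_extract is modeled as the first match's capture group, and split_part is assumed to count parts from 1 and to treat the empty text before the leading slash as part 1.

```python
import json
import re

# Values copied from the sample log.
log = {
    "Uri": "/request/path-3/file-1",
    "Payload": '{"Item": "1122", "UserId": "112233", "Operation": "AddCart"}',
    "BeginTime": "1705029260",
    "EndTime": "1705028561",
}

# regexp_extract(Uri, 'file-(\d+)', 1): capture group 1 of the first match.
match = re.search(r"file-(\d+)", log["Uri"])
file_number = match.group(1) if match else None

# json_extract_scalar(Payload, '$.Item'): the scalar stored at $.Item.
item = json.loads(log["Payload"])["Item"]

# split_part(Uri, '/', 3): third part, counting from 1 (assumption).
url_param = log["Uri"].split("/")[3 - 1]

# cast(BeginTime as bigint) - cast(EndTime as bigint)
time_range = int(log["BeginTime"]) - int(log["EndTime"])

print(file_number, item, url_param, time_range)  # 1 1122 path-3 699
```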

Retain, remove, and rename fields

  • Keep only specific fields and remove the rest

    Status:200 | project Status, Uri
  • Remove specific fields and keep the rest

    Status:200 | project-away UserAgent
  • Rename fields

    Status:200 | project-rename Latency=RT

Expand unstructured data

  • Expand all fields in JSON

    # Filter non-empty Payloads and expand all JSON fields
    __topic__: nginx-access-log | where Payload is not null | parse-json Payload
  • Expand JSON fields and discard the original ones

    Status:200
    | parse-json Payload
    | project-away Payload
  • Extract multiple fields using regular expressions

    Status:200 | parse-regexp Uri, 'path-(\d+)/file-(\d+)' as pathIndex, fileIndex
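The effect of these instructions on a log's field set can be approximated in Python. This is a sketch rather than SPL's actual behavior: it assumes parse-json adds each top-level JSON key as a field and that parse-regexp assigns capture groups to the listed names in order. Nested JSON values and name collisions are not modeled.

```python
import json
import re

# Values copied from the sample log.
log = {
    "Status": "200",
    "Uri": "/request/path-3/file-1",
    "Payload": '{"Item": "1122", "UserId": "112233", "Operation": "AddCart"}',
}

# parse-json Payload: add each top-level JSON key as a field (assumption).
expanded = {**log, **json.loads(log["Payload"])}

# project-away Payload: drop the original field.
expanded.pop("Payload")

# parse-regexp Uri, 'path-(\d+)/file-(\d+)' as pathIndex, fileIndex
match = re.search(r"path-(\d+)/file-(\d+)", log["Uri"])
if match:
    expanded["pathIndex"], expanded["fileIndex"] = match.groups()

print(expanded)
```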

Multi-level pipeline cascade

Combine the operations above in a single statement using multi-level pipeline cascades, executed in sequence.

Status:200 
| where Payload is not null 
| parse-json Payload 
| project-away Payload 
| where Host='api.abc.com' and cast(RT as bigint) > 80
| extend timeRange=cast(BeginTime as bigint) - cast(EndTime as bigint)
| where timeRange > 500
| project UserId, Uri
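The order of the stages matters: UserId exists only after parse-json runs, and timeRange only after extend. The Python sketch below traces the sample log through the same sequence. The function trace_pipeline and its host parameter are hypothetical; the host is passed in so that the sample's Host value, api.abc.com, can be compared with a non-matching one.

```python
import json
from typing import Dict, Optional

# Fields copied from the sample log that the pipeline reads.
sample = {
    "Status": "200",
    "Host": "api.abc.com",
    "Payload": '{"Item": "1122", "UserId": "112233", "Operation": "AddCart"}',
    "BeginTime": "1705029260",
    "EndTime": "1705028561",
    "RT": "87",
    "Uri": "/request/path-3/file-1",
}


def trace_pipeline(log: Dict[str, str], host: str) -> Optional[Dict[str, str]]:
    """Hypothetical step-by-step model of the cascade; returns None if a stage drops the log."""
    # Status:200 (index-based search statement)
    if log["Status"] != "200":
        return None
    # where Payload is not null
    if log.get("Payload") is None:
        return None
    # parse-json Payload | project-away Payload
    row = {k: v for k, v in log.items() if k != "Payload"}
    row.update(json.loads(log["Payload"]))
    # where Host=<host> and cast(RT as bigint) > 80
    if row["Host"] != host or int(row["RT"]) <= 80:
        return None
    # extend timeRange=cast(BeginTime as bigint) - cast(EndTime as bigint)
    time_range = int(row["BeginTime"]) - int(row["EndTime"])
    # where timeRange > 500
    if time_range <= 500:
        return None
    # project UserId, Uri
    return {"UserId": row["UserId"], "Uri": row["Uri"]}


print(trace_pipeline(sample, "api.abc.com"))
print(trace_pipeline(sample, "other.example.com"))
```

With the sample's own Host value, the log survives every stage (RT is 87 and timeRange is 699) and only UserId and Uri remain. With any other host value, the Host filter drops it.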

Limits

  1. SPL execution in scan-based query has limits. For more information, see Limitations.

  2. Random pagination is not supported.

Index-based vs. scan-based query

Each item below compares the two query types.

Syntax

  • Index-based query: Search statement. For more information, see Search syntax and functions.

  • Scan-based query: Search statement | SPL instruction 1 | SPL instruction 2 | .... For more information, see SPL syntax.

Index configuration needed

  • Index-based query: Yes.

  • Scan-based query: The index-based search statement requires indexes, while other statements do not.

Analytic statements supported

  • Index-based query: Yes.

  • Scan-based query: Yes.

Random pagination supported

  • Index-based query: Yes.

  • Scan-based query: No. Only continuous (forward/backward) pagination is supported.

Log histogram

  • Index-based query: Displayed based on the results of the search statement.

  • Scan-based query: Displayed based on the results of the search statement and the scan progress.

Operators and functions

  • Index-based query: Logical and mathematical calculations and fuzzy search are supported. SQL functions are not.

  • Scan-based query: For details, see SPL instructions and functions and SPL-supported SQL functions.

Field types

  • Index-based query: Determined by the data types that are specified in index configurations.

  • Scan-based query: SPL treats all fields as text regardless of index configuration. For more information, see Data type conversion.

Result size

  • Index-based query: Configurable in the SLS console or via SDK. Maximum: 100.

  • Scan-based query: The scan stops and returns results when any of these conditions is met:

      • The identified log count reaches the specified return limit. You can set this limit in the SLS console or via SDK.

      • The scanned log count exceeds the auto-calculated upper limit (default: 100,000, derived from search statement results).

      • The scan duration exceeds 45 seconds.

Fees

  • Index-based query: You are charged for index traffic and index storage. For more information, see Billable items for pay-by-feature.

  • Scan-based query: Charged by scan traffic (the data volume returned from scanning). Log identification uses index-based query results.
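The three stop conditions for a scan-based query can be sketched as a loop with early exits. This is a hypothetical Python model, not the SLS implementation: the function scan, its parameters, and the reason strings are invented for illustration, and only the 100,000 and 45-second figures come from the comparison above.

```python
import time
from typing import Callable, Iterable, List, Tuple


def scan(logs: Iterable[int],
         predicate: Callable[[int], bool],
         return_limit: int,
         max_scanned: int = 100_000,
         max_seconds: float = 45.0,
         clock: Callable[[], float] = time.monotonic) -> Tuple[List[int], str]:
    """Hypothetical scan loop that stops on whichever condition is met first."""
    started = clock()
    matched: List[int] = []
    scanned = 0
    for log in logs:
        # Stop before the scanned count would exceed the upper limit.
        if scanned >= max_scanned:
            return matched, "scan limit reached"
        # Stop once the scan has run longer than the time limit.
        if clock() - started > max_seconds:
            return matched, "time limit reached"
        scanned += 1
        if predicate(log):
            matched.append(log)
            # Stop as soon as enough matching logs have been found.
            if len(matched) >= return_limit:
                return matched, "return limit reached"
    return matched, "all logs scanned"


# Integers stand in for logs; even numbers count as matches.
print(scan(range(1000), lambda n: n % 2 == 0, return_limit=5))
print(scan(range(1000), lambda n: False, return_limit=5, max_scanned=10))
```

A result set that is smaller than the requested limit therefore does not prove that no further matches exist; the scan may have hit the count or time limit first.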