
Simple Log Service:SPL instructions

Last Updated: Jan 18, 2025

This topic describes the Simple Log Service Processing Language (SPL) instructions.

Parameter types

The following list describes the data types of parameters used in SPL instructions.

• Bool: A Boolean value. Parameters of this type act as switches in SPL instructions.

• Char: An ASCII character, enclosed in single quotation marks (''). For example, 'a' indicates the character a, '\t' indicates the tab character, '\11' indicates the ASCII character whose code is the octal value 11, and '\x09' indicates the ASCII character whose code is the hexadecimal value 09.

• Integer: An integer value.

• String: A string, enclosed in single quotation marks (''). For example, 'this is a string'.

• RegExp: A regular expression in the RE2 syntax, enclosed in single quotation marks (''). For example, '([\d.]+)'. For more information, see Syntax.

• JSONPath: A JSON path, enclosed in single quotation marks (''). For example, '$.body.values[0]'. For more information, see JsonPath.

• Field: A field name. For example, | project level, content. If the field name contains characters other than letters, digits, and underscores, enclose it in double quotation marks (""). For example, | project "a:b:c".

  Note: For more information about case sensitivity of field names, see SPL function definitions in different scenarios.

• FieldPattern: A field name, or a field name combined with a wildcard character. An asterisk (*) matches zero or more characters. Enclose the value in double quotation marks (""). For example, | project "__tag__:*".

  Note: For more information about case sensitivity of field names, see SPL function definitions in different scenarios.

• SPLExp: An SPL expression.

• SQLExp: An SQL expression.

SPL instruction list

Control instructions

  • .let: Defines a named dataset. For more information about SPL datasets, see SPL datasets.

Field processing instructions

  • project: Retains the fields that match the specified pattern and renames the specified fields. During instruction execution, all retain-related expressions are executed before rename-related expressions.

  • project-away: Removes the fields that match the specified pattern and retains all other fields as they are.

  • project-rename: Renames the specified fields and retains all other fields as they are.

  • expand-values: Expands a first-level JSON object in the specified field and returns multiple result entries.

SQL calculation instructions on structured data

  • extend: Creates fields based on the results of SQL expression-based data calculations. For more information about the supported SQL functions, see SQL functions supported by SPL.

  • where: Filters data based on the results of SQL expression-based data calculations. Data that matches the specified SQL expression is retained. For more information about the SQL functions supported by the where instruction, see SQL functions supported by SPL.

Semi-structured data extraction instructions

  • parse-regexp: Extracts the information that matches capturing groups in the specified regular expression from the specified field.

  • parse-csv: Extracts CSV-formatted information from the specified field.

  • parse-json: Extracts first-layer JSON information from the specified field.

  • parse-kv: Extracts key-value pair information from the specified field.

Control instructions

.let

Defines a named dataset as the input for subsequent SPL expressions. For detailed information on SPL datasets, see SPL datasets.

Syntax

.let <dataset>=<spl-expr>

Parameter description

• dataset (String, required): The name of the dataset. The name can contain letters, digits, and underscores, must start with a letter, and is case-sensitive.

• spl-expr (SPLExp, required): The SPL expression that is used to generate the dataset.

Examples

  • Example 1: Filter and classify access logs by status codes before exporting them.

    • SPL statement

      -- Define the processing result of SPL as a named dataset src, which is used as the input of subsequent SPL expressions
      .let src = * 
      | where status=cast(status as BIGINT);
      
      -- Use the named dataset src as the input, filter the data whose status field is 5xx, and define the dataset err. The dataset is not exported
      .let err = $src
      | where status >= 500
      | extend msg='ERR';
      
      -- Use the named dataset src as the input, filter the data whose status field is 2xx, and define the dataset ok. The dataset is not exported
      .let ok = $src
      | where status >= 200 and status < 300
      | extend msg='OK';
      
      -- Export the named datasets err and ok
      $err;
      $ok;
    • Input data

      # Entry 1
      status: '200'
      body: 'this is a test'
      
      # Entry 2
      status: '500'
      body: 'internal error'
      
      # Entry 3
      status: '404'
      body: 'not found'
    • Output data

      # Entry 1: The dataset is err
      status: '500'
      body: 'internal error'
      msg: 'ERR'
      
      # Entry 2: The dataset is ok
      status: '200'
      body: 'this is a test'
      msg: 'OK'

Field processing instructions

project

Retains fields that match a specified pattern and renames specified fields. All expressions related to retaining fields are executed before those related to renaming during the execution of the instruction.

Important

By default, the fields __time__ and __time_ns_part__ are retained and cannot be renamed or overwritten. For more information, see Time fields.

Syntax

| project -wildcard <field-pattern>, <output>=<field>, ...

Parameter description

• wildcard (Bool, optional): Specifies whether to enable the wildcard match mode. By default, the exact match mode is used. To enable the wildcard match mode, configure this parameter.

• field-pattern (FieldPattern, required): The name of a field to retain, or a field name combined with a wildcard character. All matched fields are processed.

• output (Field, required): The new name of a field to rename. You cannot rename multiple fields to the same name.

  Important: If the new field name is the same as an existing field name in the input data, see Retention and overwrite of old and new values.

• field (Field, required): The original name of a field to rename.

  • If the field does not exist in the input data, the rename operation is not performed.

  • You cannot rename a field multiple times.

Examples

  • Example 1: Retain a field.

    * | project level, err_msg
  • Example 2: Rename a field.

    * | project log_level=level, err_msg
  • Example 3: Retain the field that exactly matches __tag__:*.

    * | project "__tag__:*"
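  • Example 4: Retain all fields that match a wildcard pattern (an illustrative sketch that uses the -wildcard switch described above; the pattern is hypothetical).

    * | project -wildcard "__tag__:*"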

project-away

Removes fields that match a specified pattern, retaining all other fields unchanged.

Important

By default, the fields __time__ and __time_ns_part__ are retained. For more information, see Time fields.

Syntax

| project-away -wildcard <field-pattern>, ...

Parameter description

• wildcard (Bool, optional): Specifies whether to enable the wildcard match mode. By default, the exact match mode is used. To enable the wildcard match mode, configure this parameter.

• field-pattern (FieldPattern, required): The name of a field to remove, or a field name combined with a wildcard character. All matched fields are processed.
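
Example

Remove fields whose names match a wildcard pattern (an illustrative sketch; the pattern and the temp_id field are hypothetical).

* | project-away -wildcard "__tag__:*", temp_id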

project-rename

Renames specified fields while retaining all other fields unchanged.

Important

By default, the fields __time__ and __time_ns_part__ are retained and cannot be renamed or overwritten. For more information, see Time fields.

Syntax

| project-rename <output>=<field>, ...

Parameter description

• output (Field, required): The new name of a field to rename. You cannot rename multiple fields to the same name.

  Important: If the new field name is the same as an existing field name in the input data, see Retention and overwrite of old and new values.

• field (Field, required): The original name of a field to rename.

  • If the field does not exist in the input data, the rename operation is not performed.

  • You cannot rename a field multiple times.

Example

Rename specified fields.

* | project-rename log_level=level, log_err_msg=err_msg

expand-values

Expands a first-level JSON object in a specified field, resulting in multiple output entries.

Syntax

| expand-values -path=<path> -limit=<limit> -keep <field> as <output>

Parameter description

• path (JSONPath, optional): The JSON path in the specified field, which identifies the information that you want to expand. This parameter is empty by default, which specifies that the complete data of the specified field is expanded.

• limit (Integer, optional): The maximum number of entries that can be obtained after the JSON object in the specified field is expanded. Valid values: 1 to 10. Default value: 10.

• keep (Bool, optional): Specifies whether to retain the original field after the expand operation is performed. By default, the original field is not retained. To retain the original field, configure this parameter.

• field (Field, required): The original name of the field to expand. The data type of the field must be VARCHAR. If the field does not exist, the expand operation is not performed.

• output (Field, optional): The name of the new field that is obtained after the expand operation is performed. If you do not configure this parameter, the output data is written to the original field.

A first-level JSON object in a field is expanded based on the following logic:

• JSON array: expands the array by element.

• JSON dictionary: expands the dictionary by key-value pair.

• Other JSON types: returns the original value.

• Invalid JSON: returns null.

Examples

  • Example 1: Expand an array to return multiple result entries.

    • SPL statement

      * | expand-values y
    • Input data

      x: 'abc'
      y: '[0,1,2]'
    • Output data, including three entries

      # Entry 1
      x: 'abc'
      y: '0'
      
      # Entry 2
      x: 'abc'
      y: '1'
      
      # Entry 3
      x: 'abc'
      y: '2'
  • Example 2: Expand a dictionary to return multiple result entries.

    • SPL statement

      * | expand-values y
    • Input data

      x: 'abc'
      y: '{"a": 1, "b": 2}'
    • Output data, including two entries

      # Entry 1
      x: 'abc'
      y: '{"a": 1}'
      
      # Entry 2
      x: 'abc'
      y: '{"b": 2}'
  • Example 3: Expand content that matches a specified JSONPath expression and export to a new field.

    • SPL statement

      * | expand-values -keep content -path='$.body' as body
    • Input data

      content: '{"body": [0, {"a": 1, "b": 2}]}'
    • Output data, including two entries

      # Entry 1
      content: '{"body": [0, {"a": 1, "b": 2}]}'
      body: '0'
      
      # Entry 2
      content: '{"body": [0, {"a": 1, "b": 2}]}'
      body: '{"a": 1, "b": 2}'
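  • Example 4: Limit the number of result entries (an illustrative sketch that uses the -limit parameter described above; the input values are hypothetical).

    • SPL statement

      * | expand-values -limit=2 y
    • Input data

      x: 'abc'
      y: '[0,1,2]'
    • Output data, including two entries

      # Entry 1
      x: 'abc'
      y: '0'
      
      # Entry 2
      x: 'abc'
      y: '1'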

SQL calculation instructions on structured data

extend

Generates fields based on SQL expression-based data calculations. For a list of supported SQL functions, see SQL functions supported by SPL.

Syntax

| extend <output>=<sql-expr>, ...

Parameter description

• output (Field, required): The name of the field to create. You cannot use the same field to store the results of multiple expressions.

  Important: If the new field name is the same as an existing field name in the input data, the existing field is overwritten with the data type and value of the calculation result.

• sql-expr (SQLExp, required): The data processing expression.

  Important: For more information about null value processing, see Null value processing in SPL expressions.

Examples

  • Example 1: Apply a computation expression.

    * | extend Duration = EndTime - StartTime
  • Example 2: Utilize a regular expression.

    * | extend server_protocol_version=regexp_extract(server_protocol, '\d+')
  • Example 3: Extract JSONPath content and convert a field's data type.

    • SPL statement

      *
      | extend a=json_extract(content, '$.body.a'), b=json_extract(content, '$.body.b')
      | extend b=cast(b as BIGINT)
    • Input data

      content: '{"body": {"a": 1, "b": 2}}'
    • Output data

      content: '{"body": {"a": 1, "b": 2}}'
      a: '1'
      b: 2

where

Filters data based on SQL expression-based calculations. Data matching the specified SQL expression is retained. For a list of SQL functions supported by the where instruction, see SQL functions supported by SPL.

Syntax

| where <sql-expr>

Parameter description

• sql-expr (SQLExp, required): The SQL expression. Data that matches this expression is retained.

  Important: For more information about null value processing in SQL expressions, see Null value processing in SPL expressions.

Examples

  • Example 1: Filter data based on field content.

    * | where userId='123'
  • Example 2: Filter data by matching the value of a field against a regular expression.

    * | where regexp_like(server_protocol, '\d+')
  • Example 3: Convert a field's data type to match all server error data.

    * | where cast(status as BIGINT) >= 500

Semi-structured data extraction instructions

parse-regexp

Extracts information matching groups in a specified regular expression from a field.

Syntax

| parse-regexp <field>, <pattern> as <output>, ...

Parameter description

• field (Field, required): The original name of the field from which you want to extract information. The input data must include this field, its type must be VARCHAR, and its value must be non-null. Otherwise, the extract operation is not performed.

• pattern (RegExp, required): The regular expression. The RE2 syntax is supported.

• output (Field, optional): The name of the output field that stores the result of the regular expression extraction.

Examples

  • Example 1: Use exploratory match mode.

    • SPL statement

      *
      | parse-regexp content, '(\S+)' as ip -- Generate the ip: 10.0.0.0 field.
      | parse-regexp content, '\S+\s+(\w+)' as method -- Generate the method: GET field.
    • Input data

      content: '10.0.0.0 GET /index.html 15824 0.043'
    • Output data

      content: '10.0.0.0 GET /index.html 15824 0.043'
      ip: '10.0.0.0'
      method: 'GET'
  • Example 2: Use full pattern match mode with unnamed capturing groups in the regular expression.

    • SPL statement

      * | parse-regexp content, '(\S+)\s+(\w+)' as ip, method
    • Input data

      content: '10.0.0.0 GET /index.html 15824 0.043'
    • Output data

      content: '10.0.0.0 GET /index.html 15824 0.043'
      ip: '10.0.0.0'
      method: 'GET'

parse-csv

Extracts CSV-formatted information from a specified field.

Syntax

| parse-csv -delim=<delim> -quote=<quote> -strict <field> as <output>, ...

Parameter description

• delim (String, optional): The delimiter of the input data. You can specify one to three valid ASCII characters. You can use escape characters to indicate special characters. For example, \t indicates the tab character, \11 indicates the ASCII character whose code is the octal value 11, and \x09 indicates the ASCII character whose code is the hexadecimal value 09. You can also use a combination of multiple characters as the delimiter, such as $$$ or ^_^. Default value: comma (,).

• quote (Char, optional): The quote character of the input data. You can specify a single valid ASCII character. If the input data contains delimiters inside values, you must specify a quote character. For example, you can specify a double quotation mark ("), a single quotation mark ('), or an unprintable character (0x01). By default, quotes are not used.

  Important: This parameter takes effect only if you set the delim parameter to a single character. The quote and delim parameters must have different values.

• strict (Bool, optional): Specifies whether to enable strict pairing when the number of values in the input data differs from the number of fields specified in output.

  • False: non-strict pairing. The maximum pairing policy is used.

    • If the number of values exceeds the number of fields, the extra values are not returned.

    • If the number of fields exceeds the number of values, the extra fields are returned as empty strings.

  • True: strict pairing. No fields are returned.

  Default value: False. To enable strict pairing, configure this parameter.

• field (Field, required): The name of the field to parse. The input data must include this field, its type must be VARCHAR, and its value must be non-null. Otherwise, the extract operation is not performed.

• output (Field, required): The name of a field that stores the parsing result of the input data.

Examples

  • Example 1: Match data in simple mode.

    • SPL statement

      * | parse-csv content as x, y, z
    • Input data

      content: 'a,b,c'
    • Output data

      content: 'a,b,c'
      x: 'a'
      y: 'b'
      z: 'c'
  • Example 2: Use double quotes as the quote character to match data containing special characters.

    • SPL statement

      * | parse-csv -quote='"' content as ip, time, host
    • Input data

      content: '192.168.0.100,"10/Jun/2019:11:32:16,127 +0800",example.aliyundoc.com'
    • Output data

      content: '192.168.0.100,"10/Jun/2019:11:32:16,127 +0800",example.aliyundoc.com'
      ip: '192.168.0.100'
      time: '10/Jun/2019:11:32:16,127 +0800'
      host: 'example.aliyundoc.com'
  • Example 3: Use a combination of multiple characters as separators.

    • SPL statement

      * | parse-csv -delim='||' content as time, ip, req
    • Input data

      content: '05/May/2022:13:30:28||127.0.0.1||POST /put?a=1&b=2'
    • Output data

      content: '05/May/2022:13:30:28||127.0.0.1||POST /put?a=1&b=2'
      time: '05/May/2022:13:30:28'
      ip: '127.0.0.1'
      req: 'POST /put?a=1&b=2'
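  • Example 4: Enable strict pairing (an illustrative sketch based on the -strict parameter described above; with strict pairing, no fields are returned when the number of values does not match the number of output fields).

    • SPL statement

      * | parse-csv -strict content as x, y, z
    • Input data

      content: 'a,b'
    • Output data

      content: 'a,b'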

parse-json

Extracts first-layer JSON information from a specified field.

Syntax

| parse-json -mode=<mode> -path=<path> -prefix=<prefix> <field>

Parameter description

• mode (String, optional): The mode that is used to extract information when the name of an output field is the same as an existing field name in the input data. The default value is overwrite.

• path (JSONPath, optional): The JSON path in the specified field, which locates the information that you want to extract. The default value is an empty string, which specifies that the complete data of the specified field is extracted.

• prefix (String, optional): The prefix of the fields that are generated by expanding the JSON structure. The default value is an empty string.

• field (Field, required): The name of the field that you want to parse. Make sure that the input data includes this field and that the field value is non-null and meets one of the following conditions. Otherwise, the extract operation is not performed.

  • The data type is JSON.

  • The data type is VARCHAR, and the field value is a valid JSON string.

Examples

  • Example 1: Extract all keys and values from the y field.

    • SPL statement

      * | parse-json y
    • Input data

      x: '0'
      y: '{"a": 1, "b": 2}'
    • Output data

      x: '0'
      y: '{"a": 1, "b": 2}'
      a: '1'
      b: '2'
  • Example 2: Extract the value of the body key from the content field as different fields.

    • SPL statement

      * | parse-json -path='$.body' content
    • Input data

      content: '{"body": {"a": 1, "b": 2}}'
    • Output data

      content: '{"body": {"a": 1, "b": 2}}'
      a: '1'
      b: '2'
  • Example 3: Extract information in preserve mode, retaining the original value for existing fields.

    • SPL statement

      * | parse-json -mode='preserve' y
    • Input data

      a: 'xyz'
      x: '0'
      y: '{"a": 1, "b": 2}'
    • Output data

      x: '0'
      y: '{"a": 1, "b": 2}'
      a: 'xyz'
      b: '2'
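  • Example 4: Add a prefix to the generated fields (an illustrative sketch that uses the -prefix parameter described above; the prefix value is hypothetical).

    • SPL statement

      * | parse-json -prefix='y_' y
    • Input data

      x: '0'
      y: '{"a": 1, "b": 2}'
    • Output data

      x: '0'
      y: '{"a": 1, "b": 2}'
      y_a: '1'
      y_b: '2'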

parse-kv

Extracts key-value pair information from a specified field.

Syntax

| parse-kv -mode=<mode> -prefix=<prefix> -regexp <field>, <pattern>

Parameter description

• mode (String, optional): The mode that is used to extract information when the name of an output field is the same as an existing field name in the input data. The default value is overwrite. For more information, see Field extraction check and overwrite mode.

• prefix (String, optional): The prefix of the output fields. The default value is an empty string.

• regexp (Bool, required): Specifies whether to enable the regular extraction mode.

• field (Field, required): The original name of the field from which you want to extract information. The input data must include this field, its type must be VARCHAR, and its value must be non-null. Otherwise, the extract operation is not performed.

• pattern (RegExp, required): The regular expression, which must contain two capturing groups: one extracts the field name and the other extracts the field value. The RE2 syntax is supported.

Examples

  • Example 1: Extract key-value pairs in regular extraction mode when the delimiters between key-value pairs and between keys and values vary.

    • SPL statement

      * | parse-kv -regexp content, '([^&?]+)(?:=|:)([^&?]+)'
    • Input data

      content: 'k1=v1&k2=v2?k3:v3'
      k1: 'xyz'
    • Output data

      content: 'k1=v1&k2=v2?k3:v3'
      k1: 'v1'
      k2: 'v2'
      k3: 'v3'
  • Example 2: Extract information in preserve mode, retaining the original value for existing fields.

    • SPL statement

      * | parse-kv -regexp -mode='preserve' content, '([^&?]+)(?:=|:)([^&?]+)'
    • Input data

      content: 'k1=v1&k2=v2?k3:v3'
      k1: 'xyz'
    • Output data

      content: 'k1=v1&k2=v2?k3:v3'
      k1: 'xyz'
      k2: 'v2'
      k3: 'v3'
  • Example 3: Extract information from complex unstructured data in regular extraction mode, where values are digits or strings enclosed in double quotes.

    • SPL statement

      * | parse-kv -regexp content, '([^&?]+)(?:=|:)([^&?]+)'
    • Input data

      content: 'verb="GET" URI="/healthz" latency="45.911µs" userAgent="kube-probe/1.30+" audit-ID="" srcIP="192.168.123.45:40092" contentType="text/plain; charset=utf-8" resp=200'
    • Output data

      content: 'verb="GET" URI="/healthz" latency="45.911µs" userAgent="kube-probe/1.30+" audit-ID="" srcIP="192.168.123.45:40092" contentType="text/plain; charset=utf-8" resp=200'
      verb: 'GET'
      URI: '/healthz'
      latency: '45.911µs'
      userAgent: 'kube-probe/1.30+'
      audit-ID: ''
      srcIP: '192.168.123.45:40092'
      contentType: 'text/plain; charset=utf-8'
      resp: '200'