Simple Log Service:SPL instructions

Last Updated: Jun 28, 2025

This topic describes Simple Log Service Processing Language (SPL) instructions.

Parameter data types

The following table describes the data types of parameters supported in SPL instructions.

Parameter data type

Description

Bool

The parameter specifies a Boolean value. This type of parameter is a switch in SPL instructions.

Char

The parameter specifies an ASCII character. You must enclose the parameter value in single quotation marks (''). For example, 'a' indicates the character a, '\t' indicates the tab character, '\11' indicates the ASCII character whose serial number corresponds to the octal number 11, and '\x09' indicates the ASCII character whose serial number corresponds to the hexadecimal number 09.

Integer

The parameter specifies an integer value.

String

The parameter specifies a string value. You must enclose the parameter value in single quotation marks (''). For example, 'this is a string'.

RegExp

The parameter specifies an RE2 regular expression. You must enclose the parameter value in single quotation marks (''). For example, '([\d.]+)'.

For more information about the syntax, see Syntax.

JSONPath

The parameter specifies a JSON path. You must enclose the parameter value in single quotation marks (''). For example, '$.body.values[0]'.

For more information about the syntax, see JsonPath.

Field

The parameter specifies a field name. For example, | project level, content.

If the field name contains special characters other than letters, digits, and underscores, you must enclose the field name in double quotation marks (""). For example, | project "a:b:c".

Note

For more information about case sensitivity of field names, see SPL features in different scenarios.

FieldPattern

The parameter specifies a field name or a combination of a field name and a wildcard character. An asterisk (*) can be used as a wildcard character, which matches zero or more characters. You must enclose the parameter value in double quotation marks (""). For example, | project "__tag__:*".

Note

For more information about case sensitivity of field names, see SPL features in different scenarios.

SPLExp

The parameter specifies an SPL expression.

SQLExp

The parameter specifies an SQL expression.

SPL instructions

Instruction category

Instruction name

Description

Control instruction

.let

This instruction defines named datasets. For more information about SPL datasets, see SPL datasets.

Field processing instructions

project

This instruction retains the fields that match the specified pattern and renames the specified fields. During instruction execution, all retain-related expressions are executed before rename-related expressions.

project-away

This instruction removes the fields that match the specified pattern and retains all other fields as they are.

project-rename

This instruction renames the specified fields and retains all other fields as they are.

expand-values

This instruction expands a first-level JSON object for the specified field and returns multiple result entries.

SQL computing instructions on structured data

extend

This instruction creates fields based on the result of SQL expression-based data calculation. For more information about the supported SQL functions, see SQL functions supported by SPL.

where

This instruction filters data based on the result of SQL expression-based data calculation and retains the data that matches the specified SQL expression. For more information about the SQL functions supported by the where instruction, see SQL functions supported by SPL.

Semi-structured data extraction instructions

parse-regexp

This instruction extracts the information that matches groups in the specified regular expression from the specified field.

parse-csv

This instruction extracts information in the CSV format from the specified field.

parse-json

This instruction extracts the first-layer JSON information from the specified field.

parse-kv

This instruction extracts the key-value pair information from the specified field.

Data transformation instructions (new version)

pack-fields

This instruction packs log fields and outputs the fields to a new field in JSON serialization format. This instruction is applicable to scenarios in which structured data transmission is required, such as API request body construction.

log-to-metric

This instruction converts logs to metrics that can be stored in a Metricstore.

metric-to-metric

This instruction further processes existing time series data, such as adding, modifying, or removing tags.

Aggregate instructions

stats

This instruction is used for statistical analysis of logs, similar to aggregate functions in SQL (such as COUNT, SUM, and AVG). It performs statistical, grouping, and aggregation operations on specific fields in log data.

sort

This instruction sorts query results. You can sort field values or statistical results in ascending (asc) or descending (desc) order. It is an important tool for quickly locating key data and generating ordered reports in log analysis.

limit

This instruction limits the number of log entries returned in query results. It is one of the core instructions for controlling data volume. You can use the limit instruction to effectively prevent performance issues or resource waste caused by excessively large query results. This instruction is applicable to various scenarios such as log analysis and real-time monitoring.

Control instruction

.let

This instruction defines named datasets as the input for subsequent SPL expressions. For more information about SPL datasets, see SPL datasets.

Syntax

.let <dataset>=<spl-expr>

Parameters

Parameter

Type

Required

Description

dataset

String

Yes

The name of the dataset. The name can contain letters, digits, and underscores, and must start with a letter. The name is case-sensitive.

spl-expr

SPLExp

Yes

The SPL expression used to generate a dataset.

Example

Filter and categorize access logs based on status codes, and then export the logs.

  • SPL statement

    -- Define the SPL processing result as a named dataset src, which is used as the input for subsequent SPL expressions
    .let src = * 
    | where status=cast(status as BIGINT);
    
    -- Use the named dataset src as the input for an SPL expression to obtain the data whose status field is 5xx, and define the SPL processing result as a dataset named err. The dataset is not exported.
    .let err = $src
    | where status >= 500
    | extend msg='ERR';
    
    -- Use the named dataset src as the input for an SPL expression to obtain the data whose status field is 2xx, and define the SPL processing result as a dataset named ok. The dataset is not exported.
    .let ok = $src
    | where status >= 200 and status < 300
    | extend msg='OK';
    
    -- Export the named datasets err and ok
    $err;
    $ok;
  • Input data

    # Entry 1
    status: '200'
    body: 'this is a test'
    
    # Entry 2
    status: '500'
    body: 'internal error'
    
    # Entry 3
    status: '404'
    body: 'not found'
  • Result

    # Entry 1: Dataset err
    status: '500'
    body: 'internal error'
    msg: 'ERR'
    
    # Entry 2: Dataset ok
    status: '200'
    body: 'this is a test'
    msg: 'OK'

Field processing instructions

project

This instruction retains the fields that match the specified pattern and renames the specified fields. During instruction execution, all retain-related expressions are executed before rename-related expressions.

Important

By default, the time fields __time__ and __time_ns_part__ are retained and cannot be renamed or overwritten. For more information, see Time fields.

Syntax

| project -wildcard <field-pattern>, <output>=<field>, ...

Parameters

Parameter

Type

Required

Description

wildcard

Bool

No

Specifies whether to enable the wildcard match mode. By default, the exact match mode is used. If you want to enable the wildcard match mode, you must configure this parameter.

field-pattern

FieldPattern

Yes

The name of the field to retain, or a combination of a field and a wildcard character. All matched fields are processed.

output

Field

Yes

The new name of the field to rename. You cannot rename multiple fields to the same name.

Important

If the new field name is the same as an existing field name in the input data, see Retention and overwriting of new and old values for the value retention policy.

field

Field

Yes

The original name of the field to rename.

  • If the field does not exist in the input data, the rename operation is not performed.

  • You cannot rename a field multiple times.

Examples

  • Example 1: Retain a field.

    * | project level, err_msg
  • Example 2: Rename a field.

    * | project log_level=level, err_msg
  • Example 3: Retain the field named "__tag__:*" by using the exact match mode.

    * | project "__tag__:*"

project-away

This instruction removes the fields that match the specified pattern and retains all other fields as they are.

Important

This instruction retains the time fields __time__ and __time_ns_part__ by default. For more information, see Time fields.

Syntax

| project-away -wildcard <field-pattern>, ...

Parameters

Parameter

Type

Required

Description

wildcard

Bool

No

Specifies whether to enable the wildcard match mode. By default, the exact match mode is used. If you want to enable the wildcard match mode, you must configure this parameter.

field-pattern

FieldPattern

Yes

The name of the field to remove, or a combination of a field and a wildcard character. All matched fields are processed.
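
Example

Remove fields in wildcard match mode. The following statement is a minimal sketch: it assumes that the input data contains temporary fields whose names start with tmp_ (a hypothetical naming convention) and removes all of them while keeping the remaining fields.

* | project-away -wildcard "tmp_*"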

project-rename

This instruction renames the specified fields and retains all other fields as they are.

Important

By default, the time fields __time__ and __time_ns_part__ are retained and cannot be renamed or overwritten. For more information, see Time fields.

Syntax

| project-rename <output>=<field>, ...

Parameters

Parameter

Type

Required

Description

output

Field

Yes

The new name of the field to rename. You cannot rename multiple fields to the same name.

Important

If the new field name is the same as an existing field name in the input data, see Retention and overwriting of new and old values for the value retention policy.

field

Field

Yes

The original name of the field to rename.

  • If the field does not exist in the input data, the rename operation is not performed.

  • You cannot rename a field multiple times.

Example

Rename the specified fields.

* | project-rename log_level=level, log_err_msg=err_msg

expand-values

This instruction expands a first-level JSON object for the specified field and returns multiple result entries.

Syntax

| expand-values -path=<path> -limit=<limit> -keep <field> as <output>

Parameters

Parameter

Type

Required

Description

path

JSONPath

No

The JSON path in the specified field. The JSON path is used to identify the information that you want to expand.

This parameter is empty by default, which specifies that the complete data of the specified field is expanded.

limit

Integer

No

The maximum number of entries that can be obtained after expanding a JSON object for the specified field. The value is an integer from 1 to 10. Default value: 10.

keep

Bool

No

Specifies whether to retain the original field after the expand operation is performed. By default, the original field is not retained. If you want to retain the original field, you must configure this parameter.

field

Field

Yes

The original name of the field to extract. The data type is VARCHAR. If the field does not exist, the expand operation is not performed.

output

Field

No

The name of the new field that is obtained after the expand operation is performed. If you do not configure this parameter, the output data is exported to the original field.

You can expand a first-level JSON object for a field based on the following logic:

  • JSON array: expands the array by element.

  • JSON dictionary: expands the dictionary by key-value pair.

  • Other JSON types: returns the original value.

  • Invalid JSON: returns null.

Examples

  • Example 1: Expand an array and return multiple result entries.

    • SPL statement

      * | expand-values y
    • Input data

      x: 'abc'
      y: '[0,1,2]'
    • Output data, including three entries

      # Entry 1
      x: 'abc'
      y: '0'
      
      # Entry 2
      x: 'abc'
      y: '1'
      
      # Entry 3
      x: 'abc'
      y: '2'
  • Example 2: Expand a dictionary and return multiple result entries.

    • SPL statement

      * | expand-values y
    • Input data

      x: 'abc'
      y: '{"a": 1, "b": 2}'
    • Output data, including two entries

      # Entry 1
      x: 'abc'
      y: '{"a": 1}'
      
      # Entry 2
      x: 'abc'
      y: '{"b": 2}'
  • Example 3: Expand the content that matches the specified JSONPath expression and export the output data to a new field.

    • SPL statement

      * | expand-values -path='$.body' -keep content as body
    • Input data

      content: '{"body": [0, {"a": 1, "b": 2}]}'
    • Output data, including two entries

      # Entry 1
      content: '{"body": [0, {"a": 1, "b": 2}]}'
      body: '0'
      
      # Entry 2
      content: '{"body": [0, {"a": 1, "b": 2}]}'
      body: '{"a": 1, "b": 2}'
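  • Example 4: Limit the number of entries that are returned after the expand operation. This is a minimal sketch based on the limit parameter described above; it assumes that the first elements of the array are the ones that are retained.

    • SPL statement

      * | expand-values -limit=2 y
    • Input data

      x: 'abc'
      y: '[0,1,2]'
    • Output data, including two entries

      # Entry 1
      x: 'abc'
      y: '0'
      
      # Entry 2
      x: 'abc'
      y: '1'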

SQL computing instructions on structured data

extend

This instruction creates fields based on the result of SQL expression-based data calculation. For more information about the supported SQL functions, see SQL functions supported by SPL.

Syntax

| extend <output>=<sql-expr>, ...

Parameters

Parameter

Type

Required

Description

output

Field

Yes

The name of the field to create. You cannot specify the same output field for multiple expressions.

Important

If the new field name is the same as an existing field name in the input data, the new field overwrites the data type and value of the existing field.

sql-expr

SQLExp

Yes

The data processing expression.

Important

For more information about how to process null values, see Processing of null values in SPL expressions.

Examples

  • Example 1: Use a computation expression.

    * | extend Duration = EndTime - StartTime
  • Example 2: Use a regular expression.

    * | extend server_protocol_version=regexp_extract(server_protocol, '\d+')
  • Example 3: Extract the JSONPath content and convert the data type of a field.

    • SPL statement

      *
      | extend a=json_extract(content, '$.body.a'), b=json_extract(content, '$.body.b')
      | extend b=cast(b as BIGINT)
    • Input data

      content: '{"body": {"a": 1, "b": 2}}'
    • Result

      content: '{"body": {"a": 1, "b": 2}}'
      a: '1'
      b: 2

where

This instruction filters data based on the result of SQL expression-based data calculation and retains the data that matches the specified SQL expression. For more information about the SQL functions supported by the where instruction, see SQL functions supported by SPL.

Syntax

| where <sql-expr>

Parameters

Parameter

Type

Required

Description

sql-expr

SQLExp

Yes

The SQL expression. Data that matches this expression is retained.

Important

For more information about how to process null values in SQL expressions, see Processing of null values in SPL expressions.

Examples

  • Example 1: Filter data based on the field content.

    * | where userId='123'
  • Example 2: Filter data by using a regular expression that matches data based on a field name.

    * | where regexp_like(server_protocol, '\d+')
  • Example 3: Convert the data type of a field to match all data of server errors.

    * | where cast(status as BIGINT) >= 500

Semi-structured data extraction instructions

parse-regexp

This instruction extracts the information that matches groups in the specified regular expression from the specified field.

Important
  • The data type of the output field is VARCHAR. If the output field name is the same as an existing field name in the input data, see Retention and overwriting of new and old values for the value retention policy.

  • You cannot operate on the time fields __time__ and __time_ns_part__. For more information, see Time fields.

Syntax

| parse-regexp <field>, <pattern> as <output>, ...

Parameters

Parameter

Type

Required

Description

field

Field

Yes

The original name of the field from which you want to extract information.

The input data must contain this field, its data type must be VARCHAR, and its value must be non-null. Otherwise, the extract operation is not performed.

pattern

RegExp

Yes

The regular expression. The RE2 syntax is supported.

output

Field

No

The name of the output field that stores the result extracted by the regular expression.

Examples

  • Example 1: Use the exploratory match mode.

    • SPL statement

      *
      | parse-regexp content, '(\S+)' as ip -- Generate the ip: 10.0.0.0 field.
      | parse-regexp content, '\S+\s+(\w+)' as method -- Generate the method: GET field.
    • Input data

      content: '10.0.0.0 GET /index.html 15824 0.043'
    • Result

      content: '10.0.0.0 GET /index.html 15824 0.043'
      ip: '10.0.0.0'
      method: 'GET'
  • Example 2: Use the full pattern match mode and use unnamed capturing groups in a regular expression.

    • SPL statement

      * | parse-regexp content, '(\S+)\s+(\w+)' as ip, method
    • Input data

      content: '10.0.0.0 GET /index.html 15824 0.043'
    • Result

      content: '10.0.0.0 GET /index.html 15824 0.043'
      ip: '10.0.0.0'
      method: 'GET'

parse-csv

This instruction extracts information in the CSV format from the specified field.

Important
  • The data type of the output field is VARCHAR. If the output field name is the same as an existing field name in the input data, see Retention and overwriting of new and old values for the value retention policy.

  • You cannot operate on the time fields __time__ and __time_ns_part__. For more information, see Time fields.

Syntax

| parse-csv -delim=<delim> -quote=<quote> -strict <field> as <output>, ...

Parameters

Parameter

Type

Required

Description

delim

String

No

The delimiter of the input data. You can specify one to three valid ASCII characters.

You can use escape characters to indicate special characters. For example, \t indicates the tab character, \11 indicates the ASCII character whose serial number corresponds to the octal number 11, and \x09 indicates the ASCII character whose serial number corresponds to the hexadecimal number 09.

You can also use a combination of multiple characters as the delimiter, such as $$$, ^_^.

Default value: comma (,).

quote

Char

No

The quote of the input data. You can specify a single valid ASCII character. If the input data contains delimiters, you must specify a quote.

For example, you can specify double quotation marks (""), single quotation marks (''), or an unprintable character (0x01).

By default, quotes are not used.

Important

This parameter takes effect only if you set the delim parameter to a single character. You must specify different values for the quote and delim parameters.

strict

Bool

No

Specifies whether to enable strict pairing when the number of values in the input data does not match the number of fields specified in the output parameter.

  • False: non-strict pairing. The maximum pairing policy is used.

    • If the number of values exceeds the number of fields, the extra values are not returned.

    • If the number of fields exceeds the number of values, the extra fields are returned as empty strings.

  • True: strict pairing. No fields are returned.

Default value: False. If you want to enable strict pairing, configure this parameter.

field

Field

Yes

The name of the field to parse.

The input data must contain this field, its data type must be VARCHAR, and its value must be non-null. Otherwise, the extract operation is not performed.

output

Field

Yes

The name of the field that you want to use to store the parsing result of the input data.

Examples

  • Example 1: Match data in simple mode.

    • SPL statement

      * | parse-csv content as x, y, z
    • Input data

      content: 'a,b,c'
    • Result

      content: 'a,b,c'
      x: 'a'
      y: 'b'
      z: 'c'
  • Example 2: Use double quotation marks as the quote to match data that contains special characters.

    • SPL statement

      * | parse-csv content as ip, time, host
    • Input data

      content: '192.168.0.100,"10/Jun/2019:11:32:16,127 +0800",example.aliyundoc.com'
    • Result

      content: '192.168.0.100,"10/Jun/2019:11:32:16,127 +0800",example.aliyundoc.com'
      ip: '192.168.0.100'
      time: '10/Jun/2019:11:32:16,127 +0800'
      host: 'example.aliyundoc.com'
  • Example 3: Use a combination of multiple characters as the delimiter.

    • SPL statement

      * | parse-csv -delim='||' content as time, ip, req
    • Input data

      content: '05/May/2022:13:30:28||127.0.0.1||POST /put?a=1&b=2'
    • Result

      content: '05/May/2022:13:30:28||127.0.0.1||POST /put?a=1&b=2'
      time: '05/May/2022:13:30:28'
      ip: '127.0.0.1'
      req: 'POST /put?a=1&b=2'
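  • Example 4: Handle a mismatch between the number of values and the number of output fields. This is a minimal sketch based on the strict parameter description above: strict pairing is not enabled, so the extra output field is returned as an empty string.

    • SPL statement

      * | parse-csv content as x, y, z
    • Input data

      content: 'a,b'
    • Result

      content: 'a,b'
      x: 'a'
      y: 'b'
      z: ''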

parse-json

This instruction extracts the first-layer JSON information from the specified field.

Important
  • The data type of the output field is VARCHAR. If the output field name is the same as an existing field name in the input data, see Retention and overwriting of new and old values for the value retention policy.

  • You cannot operate on the time fields __time__ and __time_ns_part__. For more information, see Time fields.

Syntax

| parse-json -mode=<mode> -path=<path> -prefix=<prefix> <field>

Parameters

Parameter

Type

Required

Description

mode

String

No

The mode that is used to extract information when the name of the output field is the same as an existing field name in the input data. The default value is overwrite.

path

JSONPath

No

The JSON path in the specified field. The JSON path is used to identify the information that you want to extract.

The default value is an empty string. If you use the default value, the complete data of the specified field is extracted.

prefix

String

No

The prefix of the fields that are generated by expanding a JSON structure. The default value is an empty string.

field

Field

Yes

The name of the field to parse.

Make sure that the input data contains this field and that the field value is non-null and meets one of the following conditions. Otherwise, the extract operation is not performed.

  • The data type is JSON.

  • The data type is VARCHAR, and the field value is a valid JSON string.

Examples

  • Example 1: Extract all keys and values from the y field.

    • SPL statement

      * | parse-json y
    • Input data

      x: '0'
      y: '{"a": 1, "b": 2}'
    • Result

      x: '0'
      y: '{"a": 1, "b": 2}'
      a: '1'
      b: '2'
  • Example 2: Extract the value of the body key from the content field as different fields.

    • SPL statement

      * | parse-json -path='$.body' content
    • Input data

      content: '{"body": {"a": 1, "b": 2}}'
    • Result

      content: '{"body": {"a": 1, "b": 2}}'
      a: '1'
      b: '2'
  • Example 3: Extract information in preserve mode. For an existing field, retain the original value.

    • SPL statement

      * | parse-json -mode='preserve' y
    • Input data

      a: 'xyz'
      x: '0'
      y: '{"a": 1, "b": 2}'
    • Result

      x: '0'
      y: '{"a": 1, "b": 2}'
      a: 'xyz'
      b: '2'
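  • Example 4: Add a prefix to the fields that are generated by the extract operation. This is a minimal sketch based on the prefix parameter described above.

    • SPL statement

      * | parse-json -prefix='body_' -path='$.body' content
    • Input data

      content: '{"body": {"a": 1, "b": 2}}'
    • Result

      content: '{"body": {"a": 1, "b": 2}}'
      body_a: '1'
      body_b: '2'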

parse-kv

This instruction extracts the key-value pair information from the specified field.

Important
  • The data type of the output field is VARCHAR. If the output field name is the same as an existing field name in the input data, see Retention and overwriting of new and old values for the value retention policy.

  • You cannot operate on the time fields __time__ and __time_ns_part__. For more information, see Time fields.

Syntax

Delimiter-based extraction

Extract key-value pairs based on the specified delimiter.

| parse-kv -mode=<mode> -prefix=<prefix> -greedy <field>, <delim>, <kv-sep>

Regular expression-based extraction

Extract key-value pairs based on the specified regular expression.

| parse-kv -regexp -mode=<mode> -prefix=<prefix> <field>, <pattern>

Parameters

Delimiter-based extraction

Parameter

Type

Required

Description

mode

String

No

If the output field name is the same as an existing field name in the input data, you can select an overwrite mode based on your business requirements.

The default value is overwrite. For more information, see Field extraction check and overwrite mode.

prefix

String

No

The prefix of the output fields. The default value is an empty string.

greedy

Bool

No

Specifies whether to enable greedy matching for field values.

  • Disable: The field value matching stops when a delim character is encountered.

  • Enable: The content before the next key-value pair is fully matched as the field value.

field

Field

Yes

The name of the field to parse.

  • If this field does not exist in the data entry or its value is null, no processing is performed on the entry.

  • If no key-value pairs are matched in the data content, no processing is performed on the entry.

delim

Char

Yes

The delimiter between different key-value pairs. You can specify one to five valid ASCII characters, such as ^_^.

The delimiter cannot be a substring of kv-sep.

kv-sep

Char

Yes

The character that connects a key and a value in a key-value pair. You can specify one to five valid ASCII characters, such as #$#.

The character cannot be a substring of delim.

Regular expression-based extraction

Parameter

Type

Required

Description

regexp

Bool

Yes

Specifies whether to enable the regular extraction mode.

mode

String

No

If the output field name is the same as an existing field name in the input data, you can select an overwrite mode based on your business requirements.

The default value is overwrite. For more information, see Field extraction check and overwrite mode.

prefix

String

No

The prefix of the output fields. The default value is an empty string.

field

Field

Yes

The original name of the field to extract.

The input data must contain this field, its data type must be VARCHAR, and its value must be non-null. Otherwise, the extract operation is not performed.

pattern

RegExp

Yes

The regular expression that contains two capturing groups. One capturing group extracts the field name and the other capturing group extracts the field value. RE2 regular expressions are supported.

Examples

  • Example 1: Extract multiple fields from the Labels of SLS time series data as data fields.

    • SPL statement

      * | parse-kv -prefix='__labels__.' __labels__, '|', '#$#'
    • Input data

      __name__: 'net_in'
      __value__: '231461.57374215033'
      __time_nano__: '1717378679274117026'
      __labels__: 'cluster#$#sls-etl|hostname#$#iZbp17raa25u0xi4wifopeZ|interface#$#veth02cc91d2|ip#$#192.168.22.238'
    • Output data

      __name__: 'net_in'
      __value__: '231461.57374215033'
      __time_nano__: '1717378679274117026'
      __labels__: 'cluster#$#sls-etl|hostname#$#iZbp17raa25u0xi4wifopeZ|interface#$#veth02cc91d2|ip#$#192.168.22.238'
      __labels__.cluster: 'sls-etl'
      __labels__.hostname: 'iZbp17raa25u0xi4wifopeZ'
      __labels__.interface: 'veth02cc91d2'
      __labels__.ip: '192.168.22.238'
  • Example 2: Enable greedy matching mode to extract key-value information from access logs.

    • SPL statement

      * | parse-kv -greedy content, ' ', '='
    • Input data

      content: 'src=127.0.0.1 dst=192.168.0.0 bytes=125 msg=connection refused body=this is test time=2024-05-21T00:00:00'
    • Output data

      content: 'src=127.0.0.1 dst=192.168.0.0 bytes=125 msg=connection refused body=this is test time=2024-05-21T00:00:00'
      src: '127.0.0.1'
      dst: '192.168.0.0'
      bytes: '125'
      msg: 'connection refused'
      body: 'this is test'
      time: '2024-05-21T00:00:00'
  • Example 3: Enable regular extraction mode to process complex delimiters between key-value pairs and separators between keys and values.

    • SPL statement

      * | parse-kv -regexp content, '([^&?]+)(?:=|:)([^&?]+)'
    • Input data

      content: 'k1=v1&k2=v2?k3:v3'
      k1: 'xyz'
    • Output data

      content: 'k1=v1&k2=v2?k3:v3'
      k1: 'v1'
      k2: 'v2'
      k3: 'v3'
  • Example 4: Extract information in preserve mode. For an existing field, retain the original value.

    • SPL statement

      * | parse-kv -regexp -mode='preserve' content, '([^&?]+)(?:=|:)([^&?]+)'
    • Input data

      content: 'k1=v1&k2=v2?k3:v3'
      k1: 'xyz'
    • Result

      content: 'k1=v1&k2=v2?k3:v3'
      k1: 'xyz'
      k2: 'v2'
      k3: 'v3'

Data transformation instructions (new version)

Supported regions

China (Shanghai), China (Heyuan)

pack-fields

This instruction packs multiple fields and outputs them to a new field in JSON serialization format. This instruction is applicable to scenarios in which structured data transmission is required, such as API request body construction.

Important
  • By default, non-VARCHAR fields (including __time__ and __time_ns_part__) are not processed.

  • By default, source data is not retained.

Syntax

| pack-fields -keep -ltrim -include=<include> -exclude=<exclude> as <output>

Parameters

Parameter

Type

Required

Description

output

String

Yes

The name of the output field that is obtained after the pack operation is performed. The field value is in the JSON format.

include

RegExp

No

The whitelist. Fields that match the regular expression specified in the whitelist are packed. Default value: ".*", which indicates that all fields in a log are matched and packed. For more information, see Regular expressions.

exclude

RegExp

No

The blacklist (takes precedence over the whitelist). Fields that match the regular expression specified in the blacklist are not packed. This parameter is empty by default, which indicates that no matching is performed. For more information, see Regular expressions.

ltrim

String

No

The prefix to remove from the output field names.

keep

Bool

No

Specifies whether to retain the source data after the data is packed.

True: The source data is retained in the output result.

False (default): The source data is not retained in the output result.

Examples

  • Example 1: Pack all fields in a log to the test field. By default, the original fields that are packed are deleted.

    • SPL statement

      * | pack-fields -include='\w+' as test
    • Input data

      test1:123
      test2:456
      test3:789
    • Result

      test:{"test1": "123", "test2": "456", "test3": "789"}
  • Example 2: Pack all fields in a log to the test field. The original fields that are packed are not deleted.

    • SPL statement

      * | pack-fields -keep -include='\w+' as test
    • Input data

      test1:123
      test2:456
      test3:789
    • Result

      test:{"test1": "123", "test2": "456", "test3": "789"}
      test1:123
      test2:456
      test3:789
  • Example 3: Pack the test and abcd fields to the content field. The original fields that are packed are not deleted.

    • SPL statement

      * | pack-fields -keep -include='\w+' as content
    • Input data

      abcd@#%:123
      test:456
      abcd:789
    • Result

      abcd:789
      abcd@#%:123
      content:{"test": "456", "abcd": "789"}
      test:456
  • Example 4: Do not pack the test and abcd fields. Pack the remaining fields to the content field. Delete the original fields that are packed.

    • SPL statement

      * | pack-fields -exclude='\w+' as content
    • Input data

      abcd@#%:123
      test:456
      abcd:789
    • Result

      abcd:789
      content:{"abcd@#%": "123"}
      test:456
  • Example 5: Extract all key-value pairs that match the regular expression from the field value, and pack the key-value pairs to the name field.

    • SPL statement

      * | parse-kv -prefix='k_' -regexp dict, '(\w+):(\w+)' | pack-fields -include='k_.*' -ltrim='k_' as name
    • Input data

      dict: x:123, y:456, z:789
    • Result

      dict:x:123, y:456, z:789
      name:{"x": "123", "y": "456", "z": "789"}

log-to-metric

This instruction converts logs to the time series (metric) storage format.

Important
  • By default, log data that does not meet the requirements of time series data (Metric) is ignored.

  • The time unit of the time field in the raw log data is automatically detected. The following time units are supported: seconds, milliseconds, microseconds, and nanoseconds.

  • Hash write is enabled by default.

Syntax

| log-to-metric -wildcard -format -names=<names> -labels=<labels> -time_field=<time_field>

Parameters

Parameter

Type

Required

Description

wildcard

Bool

No

Specifies whether to enable the wildcard match mode for the field names specified by the names and labels parameters.

By default, the exact match mode is used. If you want to enable the wildcard match mode, you must configure this parameter.

format

Bool

No

Specifies whether to enable automatic formatting.

By default, this feature is disabled. Invalid label data is skipped during data transformation.

After the feature is enabled, invalid label data is formatted. Label values cannot contain characters such as "|", "#", and "$". If a label value contains these characters, the characters are replaced with underscores (_).

names

FieldList

Yes

The list of log fields that are used to generate metric time series points.

If a field in the input data matches at least one specified field name or field name pattern, a metric time series point is generated for the field. The metric name is the field name, and the metric value is the field value.

For example, [mem, "mem:pct"] indicates that two time series points are generated. The names of the time series points are mem and mem:pct.

Important
  • The input field name must match the regular expression [a-zA-Z_:][a-zA-Z0-9_:]*. Otherwise, no time series point is generated for the field.

  • The input field value must meet one of the following requirements. Otherwise, no time series point is generated for the field:

    • The field is of a numeric data type, such as TINYINT, SMALLINT, INTEGER, BIGINT, HUGEINT, REAL, or DOUBLE.

    • The field is of the VARCHAR data type, and its value can be converted to a valid DOUBLE value.

For more information about the format of time series data, see Time series data (Metric).

labels

FieldList

No

The list of log fields that are used to construct time series label information.

If a field in the input data matches at least one specified field name or field name pattern, the field is added to the labels of the time series point. The label name is the field name, and the label value is the field value.

For example, [host, ip] indicates that two time series labels are added. The label names are host and ip, and the label values are the original field values.

Important
  • The input field name must match the regular expression [a-zA-Z_][a-zA-Z0-9_]*. Otherwise, the label does not contain the field in the generated time series point.

  • The input field value cannot contain the VERTICAL LINE character (|). Otherwise, the field is not included in the labels of the generated time series point.

  • The input field must be of the VARCHAR data type. Otherwise, no corresponding label is generated.

For more information about the format of time series data, see Time series data (Metric).

time_field

String

No

The time field of the metric. By default, the __time__ field in the log is used as the time field of the metric.

Important

  • The input field must be in the timestamp format. The data type of the field must be BIGINT or VARCHAR. If the field is of the VARCHAR data type, its value must be convertible to a valid BIGINT value.

Examples

  • Example 1: Convert the log that contains the rt field to the time series data format.

    • SPL statement

      * | log-to-metric -names='["rt"]'
    • Input data

      __time__: 1614739608
      rt: 123
    • Result

      __labels__:
      __name__:rt
      __time_nano__:1614739608
      __value__:123
  • Example 2: Convert the log that contains the rt field to the time series data format and add the host field as a label field.

    • SPL statement

      * | log-to-metric -names='["rt"]' -labels='["host"]'
    • Input data

      __time__: 1614739608
      rt: 123
      host: myhost
    • Result

      __labels__:host#$#myhost
      __name__:rt
      __time_nano__:1614739608
      __value__:123
  • Example 3: Convert the log that contains the rt and qps fields to the time series data format and add the host field as a label field.

    • SPL statement

      * | log-to-metric -names='["rt", "qps"]' -labels='["host"]'
    • Input data

      __time__: 1614739608
      rt: 123
      qps: 10
      host: myhost
    • Result

      __labels__:host#$#myhost
      __name__:rt
      __time_nano__:1614739608
      __value__:123
      
      __labels__:host#$#myhost
      __name__:qps
      __time_nano__:1614739608
      __value__:10

  • Example 4: Use fuzzy matching to convert the log that contains the rt1 and rt2 fields to the time series data format and add the host field as a label field.

    • SPL statement

      * | log-to-metric -wildcard -names='["rt*"]' -labels='["host"]'
    • Input data

      __time__: 1614739608
      rt1: 123
      rt2: 10
      host: myhost
    • Result

      __labels__:host#$#myhost
      __name__:rt1
      __time_nano__:1614739608
      __value__:123
      
      __labels__:host#$#myhost
      __name__:rt2
      __time_nano__:1614739608
      __value__:10
  • Example 5: Convert the log that contains the rt and qps fields to the time series data format, add the host field as a label field, and automatically format the label value.

    • SPL statement

      * | log-to-metric -format -names='["rt", "qps"]' -labels='["host"]'
    • Input data

      __time__: 1614739608
      rt: 123
      qps: 10
      host: myhost1|myhost2
    • Result

      __labels__:host#$#myhost1_myhost2
      __name__:rt
      __time_nano__:1614739608
      __value__:123
      
      __labels__:host#$#myhost1_myhost2
      __name__:qps
      __time_nano__:1614739608
      __value__:10
  • Example 6: Convert the log that contains the rt and qps fields to the time series data format, rename the fields to max_rt and total_qps, and add the host field as a label field.

    • SPL statement

      * | project-rename max_rt = rt, total_qps = qps| log-to-metric -names='["max_rt", "total_qps"]' -labels='["host"]'
    • Input data

      __time__: 1614739608
      rt: 123
      qps: 10
      host: myhost
    • Result

      __labels__:host#$#myhost
      __name__:max_rt
      __time_nano__:1614739608
      __value__:123
      
      __labels__:host#$#myhost
      __name__:total_qps
      __time_nano__:1614739608
      __value__:10
  • Example 7: Convert the log that contains the rt and qps fields to the time series data format, rename the fields to max_rt and total_qps, rename the host field to hostname, and add the hostname field as a label field.

    • SPL statement

      * | project-rename max_rt = rt, total_qps = qps, hostname=host| log-to-metric -names='["max_rt", "total_qps"]' -labels='["hostname"]'
    • Input data

      __time__: 1614739608
      rt: 123
      qps: 10
      host: myhost
    • Result

      __labels__:hostname#$#myhost
      __name__:max_rt
      __time_nano__:1614739608
      __value__:123
      
      __labels__:hostname#$#myhost
      __name__:total_qps
      __time_nano__:1614739608
      __value__:10
  • Example 8: Convert the log that contains the remote_user field to the time series data format, add the status field as a label field, and use the time field as the time field of the time series data.

    • SPL statement

      * | log-to-metric -names='["remote_user"]' -labels='["status"]' -time_field='time'
    • Input data

      time:1652943594
      remote_user:89
      request_length:4264
      request_method:GET
      status:200
    • Result

      __labels__:status#$#200
      __name__:remote_user
      __time_nano__:1652943594
      __value__:89

metric-to-metric

This instruction further processes existing time series data, such as adding, modifying, or removing tags.

Important
  • The input field name must match the regular expression [a-zA-Z_][a-zA-Z0-9_]*. Otherwise, the label does not contain the field in the generated time series point.

  • If the three option parameters contain the same field, the priority is: add_labels > del_labels > rename_labels.

For more information about the format of output time series data, see Time series data (Metric).

Syntax

| metric-to-metric -format -add_labels=<add_labels> -del_labels=<del_labels> -rename_labels=<rename_labels>

Parameters

Parameter

Type

Required

Description

add_labels

Array

No

The list of label fields to add. These fields are used to construct new time series label information.

The original data is added to the labels of the time series point. Only the VARCHAR data type is supported.

For example, if {"host":"http://www.xxx.com", "ip":"127.0.0.1"} is the original data and you specify ["host", "ip"], |host#$#http://www.xxx.com|ip#$#127.0.0.1 is added to the original labels. If the host field already exists in the original labels, the field value is overwritten.

del_labels

Array

No

The list of label fields to remove. These fields are used to construct new time series label information.

If a field in the input data matches a field name in the original labels, the field is removed from the original labels.

For example, if the value of the original labels is host#$#http://www.xxx.com|ip#$#127.0.0.1 and you specify ["ip"], one time series label is removed and the original labels are updated to host#$#http://www.xxx.com. If the ip field does not exist in the original labels, no processing is performed.

rename_labels

Map

No

The list of label fields to rename. These fields are used to construct new time series label information.

The labels of the original time series point are updated based on the mapping information. The key is the field name, and the value is the new field name.

For example, {"host":"host_new", "ip":"ip_new"} renames "host" to "host_new" and "ip" to "ip_new". If the corresponding field does not exist in the original labels, no processing is performed.

format

Bool

No

Specifies whether to enable automatic formatting. By default, this feature is disabled. Invalid data is skipped during data transformation.

After the feature is enabled:

  • The __labels__ field is sorted.

  • The LabelKey and LabelValue values are formatted.

    • LabelKey: The value must match the regular expression "[a-zA-Z_][a-zA-Z0-9_]*". Invalid characters are replaced with spaces.

    • LabelValue: The value cannot contain characters such as "|", "#", and "$". If the value contains these characters, the characters are replaced with underscores (_).

  • Labels whose LabelValue value is an empty string in the __labels__ field are dropped. However, the data entry is retained.

  • Duplicate labels in the __labels__ field are removed. The labels are retained based on the alphabetical order of the LabelValue values.

Examples

  • Example 1: Add a label

    • SPL statement

      * | extend qps = '10'|metric-to-metric -add_labels='["qps"]'
    • Input data

      __labels__:host#$#myhost
      __name__:rt
      __time_nano__:1614739608
      __value__:123
    • Result

      __labels__:host#$#myhost|qps#$#10
      __name__:rt
      __time_nano__:1614739608
      __value__:123
  • Example 2: Remove a label

    • SPL statement

      * | metric-to-metric -del_labels='["qps"]'
    • Input data

      __labels__:host#$#myhost|qps#$#10
      __name__:rt
      __time_nano__:1614739608
      __value__:123
    • Result

      __labels__:host#$#myhost
      __name__:rt
      __time_nano__:1614739608
      __value__:123
  • Example 3: Rename a label

    • SPL statement

      * | metric-to-metric -rename_labels='{"host":"etl_host"}'
    • Input data

      __labels__:host#$#myhost|qps#$#10
      __name__:rt
      __time_nano__:1614739608
      __value__:123
    • Result

      __labels__:etl_host#$#myhost|qps#$#10
      __name__:rt
      __time_nano__:1614739608
      __value__:123
  • Example 4: Format invalid data automatically

    • SPL statement

      * | metric-to-metric -format
    • Input data

      __labels__:host#$#myhost|qps#$#10|asda$cc#$#j|ob|schema#$#|#$#|#$#xxxx
      __name__:rt
      __time_nano__:1614739608
      __value__:123
    • Result

      __labels__:asda_cc#$#j|host#$#myhost|qps#$#10
      __name__:rt
      __time_nano__:1614739608
      __value__:123

Aggregate instructions

stats

This instruction is used for statistical analysis of logs, similar to aggregate functions in SQL (such as COUNT, SUM, and AVG). It performs statistical, grouping, and aggregation operations on specific fields in log data.

Important
  • This instruction is dedicated to Logstore query and analysis. It is not applicable to scenarios such as new version data transformation, SPL rule consumption, write processors, and Logtail configurations.

  • By default, the stats instruction returns the first 100 aggregation results. If you need to return more results, you can use the limit instruction.

Syntax

stats <output>=<aggOperator> [by <group>, <group>, ...]

Parameters

Parameter

Type

Required

Description

output

String

Yes

Specifies an alias for the statistical result field.

aggOperator

SQLExp

Yes

The following aggregate functions are supported:

  • count

  • count_if

  • min

  • max

  • sum

  • avg

  • skewness

  • kurtosis

  • approx_percentile

  • approx_distinct

  • bool_and

  • bool_or

  • every

  • arbitrary

  • array_agg

group

String

No

Specifies the dimension for aggregation, similar to the field in the GROUP BY clause in SQL.

Examples

  • Example 1: Count the page views (pv) for each ip in access logs

    • SPL statement

      * | stats pv=count(*) by ip
    • Input data

      ip: 192.168.1.1
      latencyMs: 10
      
      ip: 192.168.1.1
      latencyMs: 20
      
      ip: 192.168.1.2
      latencyMs: 10
    • Output data

      ip: 192.168.1.2
      pv: 1
      
      ip: 192.168.1.1
      pv: 2
  • Example 2: Calculate the minimum and maximum latency for each ip in access logs

    • SPL statement

      * 
      | extend latencyMs=cast(latencyMs as bigint)
      | stats minLatencyMs=min(latencyMs), maxLatencyMs=max(latencyMs) by ip
    • Input data

      ip: 192.168.1.1
      latencyMs: 10
      
      ip: 192.168.1.1
      latencyMs: 20
      
      ip: 192.168.1.2
      latencyMs: 10
    • Output data

      ip: 192.168.1.2
      minLatencyMs: 10
      maxLatencyMs: 10
      
      ip: 192.168.1.1
      minLatencyMs: 10
      maxLatencyMs: 20
  • Example 3: Count the total page views (pv) in access logs

    • SPL statement

      * | stats pv=count(*)
    • Input data

      ip: 192.168.1.1
      latencyMs: 10
      
      ip: 192.168.1.1
      latencyMs: 20
      
      ip: 192.168.1.2
      latencyMs: 10
    • Output data

      pv: 3

sort

This instruction sorts query results. You can sort field values or statistical results in ascending (asc) or descending (desc) order. It is an important tool for quickly locating key data and generating ordered reports in log analysis.

Important

This instruction is dedicated to Logstore query and analysis. It is not applicable to scenarios such as new version data transformation, SPL rule consumption, write processors, and Logtail configurations.

Syntax

sort <field> [asc|desc] [, <field> [asc|desc] ...]

Parameters

Parameter

Type

Required

Description

field

String

Yes

Specifies the field to sort. The following field types are supported:

  • Raw log fields (such as status and request_time).

  • Statistical fields (such as count(*) and avg(response_time)).

  • Time fields (such as @timestamp).

asc/desc

String

No

  • asc: ascending order (default).

  • desc: descending order (common scenario: sort statistical values from high to low).

Example

Sort access logs by latencyMs in descending order.

  • SPL statement

    * 
    | extend latencyMs=cast(latencyMs as bigint) 
    | sort latencyMs desc
  • Input data

    ip: 192.168.1.1
    latencyMs: 10
    
    ip: 192.168.1.1
    latencyMs: 20
    
    ip: 192.168.1.2
    latencyMs: 15
  • Output data

    ip: 192.168.1.1
    latencyMs: 20
    
    ip: 192.168.1.2
    latencyMs: 15
    
    ip: 192.168.1.1
    latencyMs: 10
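
The syntax also supports sorting by multiple fields. The following statement is a minimal sketch that assumes the logs contain a status field: entries are sorted by status in ascending order first, and entries with the same status are then sorted by latencyMs in descending order.

    * 
    | extend latencyMs=cast(latencyMs as bigint) 
    | sort status, latencyMs desc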

limit

This instruction limits the number of log entries returned in query results. It is one of the core instructions for controlling data volume. You can use the limit instruction to effectively prevent performance issues or resource waste caused by excessively large query results. This instruction is applicable to various scenarios such as log analysis and real-time monitoring.

Important
  • This instruction is dedicated to Logstore query and analysis. It is not applicable to scenarios such as new version data transformation, SPL rule consumption, write processors, and Logtail configurations.

  • If you do not use the sort instruction to specify a sorting rule, the order of the output results of the limit instruction is random (because the natural order is not guaranteed when logs are stored).

Syntax

limit [<offset>,] <size>

Parameters

Parameter

Type

Required

Description

offset

Integer

No

Skips the first offset rows.

size

Integer

Yes

The row limit.

Example

Sort access logs by latencyMs in ascending order and return the first entry.

  • SPL statement

    * 
    | extend latencyMs=cast(latencyMs as bigint) 
    | sort latencyMs
    | limit 1
  • Input data

    ip: 192.168.1.1
    latencyMs: 10
    
    ip: 192.168.1.1
    latencyMs: 20
    
    ip: 192.168.1.2
    latencyMs: 15
  • Output data

    ip: 192.168.1.1
    latencyMs: 10
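
To skip entries before returning results, specify an offset as shown in the syntax above. The following statement is a minimal sketch: it skips the entry with the smallest latencyMs and returns the next two entries.

    * 
    | extend latencyMs=cast(latencyMs as bigint) 
    | sort latencyMs
    | limit 1, 2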