This topic describes Simple Log Service Processing Language (SPL) instructions.
Parameter data types
The following table describes the data types of parameters supported in SPL instructions.
Parameter data type | Description |
Bool | The parameter specifies a Boolean value. This type of parameter is a switch in SPL instructions. |
Char | The parameter specifies an ASCII character. You must enclose the parameter value in single quotation marks (''). |
Integer | The parameter specifies an integer value. |
String | The parameter specifies a string value. You must enclose the parameter value in single quotation marks (''). |
RegExp | The parameter specifies an RE2 regular expression. You must enclose the parameter value in single quotation marks (''). For more information about the syntax, see Syntax. |
JSONPath | The parameter specifies a JSON path. You must enclose the parameter value in single quotation marks (''). For more information about the syntax, see JsonPath. |
Field | The parameter specifies a field name. If the field name contains characters other than letters, digits, and underscores, you must enclose the field name in double quotation marks (""). Note: For more information about the case sensitivity of field names, see SPL features in different scenarios. |
FieldPattern | The parameter specifies a field name or a combination of a field name and a wildcard character. An asterisk (*) can be used as a wildcard character and matches zero or more characters. You must enclose the parameter value in double quotation marks (""). Note: For more information about the case sensitivity of field names, see SPL features in different scenarios. |
SPLExp | The parameter specifies an SPL expression. |
SQLExp | The parameter specifies an SQL expression. |
SPL instructions
Instruction category | Instruction name | Description |
Control instruction | .let | This instruction defines named datasets. For more information about SPL datasets, see SPL datasets. |
Field processing instructions | project | This instruction retains the fields that match the specified pattern and renames the specified fields. During instruction execution, all retain-related expressions are executed before rename-related expressions. |
Field processing instructions | project-away | This instruction removes the fields that match the specified pattern and retains all other fields as they are. |
Field processing instructions | project-rename | This instruction renames the specified fields and retains all other fields as they are. |
Field processing instructions | expand-values | This instruction expands a first-level JSON object for the specified field and returns multiple result entries. |
SQL computing instructions on structured data | extend | This instruction creates fields based on the result of SQL expression-based data calculation. For more information about the supported SQL functions, see SQL functions supported by SPL. |
SQL computing instructions on structured data | where | This instruction filters data based on the result of SQL expression-based data calculation and retains the data that matches the specified SQL expression. For more information about the SQL functions supported by the where instruction, see SQL functions supported by SPL. |
Semi-structured data extraction instructions | parse-regexp | This instruction extracts the information that matches groups in the specified regular expression from the specified field. |
Semi-structured data extraction instructions | parse-csv | This instruction extracts information in the CSV format from the specified field. |
Semi-structured data extraction instructions | parse-json | This instruction extracts the first-layer JSON information from the specified field. |
Semi-structured data extraction instructions | parse-kv | This instruction extracts the key-value pair information from the specified field. |
Data transformation instructions (new version) | pack-fields | This instruction packs log fields and outputs the fields to a new field in JSON serialization format. This instruction is applicable to scenarios in which structured data transmission is required, such as API request body construction. |
Data transformation instructions (new version) | log-to-metric | This instruction converts logs to metrics that can be stored in a Metricstore. |
Data transformation instructions (new version) | metric-to-metric | This instruction further processes existing time series data, such as adding, modifying, or removing tags. |
Aggregate instructions | stats | This instruction is used for statistical analysis of logs, similar to aggregate functions in SQL (such as COUNT, SUM, and AVG). |
Aggregate instructions | sort | This instruction sorts query results. You can sort field values or statistical results in ascending (asc) or descending (desc) order. |
Aggregate instructions | limit | This instruction limits the number of log entries returned in query results. It is one of the core instructions for controlling data volume. |
Control instruction
.let
This instruction defines named datasets as the input for subsequent SPL expressions. For more information about SPL datasets, see SPL datasets.
Syntax
.let <dataset>=<spl-expr>
Parameters
Parameter | Type | Required | Description |
dataset | String | Yes | The name of the dataset. The name can contain letters, digits, and underscores, and must start with a letter. The name is case-sensitive. |
spl-expr | SPLExp | Yes | The SPL expression used to generate a dataset. |
Example
Filter and categorize access logs based on status codes, and then export the logs.
SPL statement
-- Define the SPL processing result as a named dataset src, which is used as the input for subsequent SPL expressions.
.let src = * | where status=cast(status as BIGINT);
-- Use the named dataset src as the input for an SPL expression to obtain the data whose status field is 5xx and define the SPL processing result as a dataset named err. Do not export the dataset.
.let err = $src | where status >= 500 | extend msg='ERR';
-- Use the named dataset src as the input for an SPL expression to obtain the data whose status field is 2xx and define the SPL processing result as a dataset named ok. Do not export the dataset.
.let ok = $src | where status >= 200 and status < 300 | extend msg='OK';
-- Export the named datasets err and ok.
$err;
$ok;
Input data
# Entry 1
status: '200'
body: 'this is a test'
# Entry 2
status: '500'
body: 'internal error'
# Entry 3
status: '404'
body: 'not found'
Result
# Entry 1: Dataset err
status: '500'
body: 'internal error'
msg: 'ERR'
# Entry 2: Dataset ok
status: '200'
body: 'this is a test'
msg: 'OK'
Field processing instructions
project
This instruction retains the fields that match the specified pattern and renames the specified fields. During instruction execution, all retain-related expressions are executed before rename-related expressions.
By default, the time fields __time__ and __time_ns_part__ are retained and cannot be renamed or overwritten. For more information, see Time fields.
Syntax
| project -wildcard <field-pattern>, <output>=<field>, ...
Parameters
Parameter | Type | Required | Description |
wildcard | Bool | No | Specifies whether to enable the wildcard match mode. By default, the exact match mode is used. If you want to enable the wildcard match mode, you must configure this parameter. |
field-pattern | FieldPattern | Yes | The name of the field to retain, or a combination of a field and a wildcard character. All matched fields are processed. |
output | Field | Yes | The new name of the field to rename. You cannot rename multiple fields to the same name. Important If the new field name is the same as an existing field name in the input data, see Retention and overwriting of new and old values for the value retention policy. |
field | Field | Yes | The original name of the field to rename. |
Examples
Example 1: Retain a field.
* | project level, err_msg
Example 2: Rename a field.
* | project log_level=level, err_msg
Example 3: Retain the field named __tag__:* in exact match mode.
* | project "__tag__:*"
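For contrast with Example 3, the following statement is a sketch that is not part of the original example set: it uses the wildcard match mode instead of the exact match mode, so it retains every field whose name starts with __tag__: rather than a single field literally named __tag__:*.
* | project -wildcard "__tag__:*"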
project-away
This instruction removes the fields that match the specified pattern and retains all other fields as they are.
This instruction retains the time fields __time__ and __time_ns_part__ by default. For more information, see Time fields.
Syntax
| project-away -wildcard <field-pattern>, ...
Parameters
Parameter | Type | Required | Description |
wildcard | Bool | No | Specifies whether to enable the wildcard match mode. By default, the exact match mode is used. If you want to enable the wildcard match mode, you must configure this parameter. |
field-pattern | FieldPattern | Yes | The name of the field to remove, or a combination of a field and a wildcard character. All matched fields are processed. |
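Example
The following statement is a minimal sketch that is not taken from the original reference: it removes every field whose name starts with the hypothetical prefix temp_ in wildcard match mode and keeps all other fields unchanged.
* | project-away -wildcard "temp_*"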
project-rename
This instruction renames the specified fields and retains all other fields as they are.
By default, the time fields __time__ and __time_ns_part__ are retained and cannot be renamed or overwritten. For more information, see Time fields.
Syntax
| project-rename <output>=<field>, ...
Parameters
Parameter | Type | Required | Description |
output | Field | Yes | The new name of the field to rename. You cannot rename multiple fields to the same name. Important If the new field name is the same as an existing field name in the input data, see Retention and overwriting of new and old values for the value retention policy. |
field | Field | Yes | The original name of the field to rename. |
Example
Rename the specified fields.
* | project-rename log_level=level, log_err_msg=err_msg
expand-values
This instruction expands a first-level JSON object for the specified field and returns multiple result entries.
The data type of the output fields is VARCHAR. If the output field name is the same as an existing field name in the input data, see Retention and overwriting of new and old values for the value retention policy.
You cannot operate on the time fields __time__ and __time_ns_part__. For more information, see Time fields.
Data transformation of the new version is supported. For more information about SPL usage scenarios, see SPL features in different scenarios.
Syntax
| expand-values -path=<path> -limit=<limit> -keep <field> as <output>
Parameters
Parameter | Type | Required | Description |
path | JSONPath | No | The JSON path in the specified field. The JSON path is used to identify the information that you want to expand. This parameter is empty by default, which specifies that the complete data of the specified field is expanded. |
limit | Integer | No | The maximum number of entries that can be obtained after expanding a JSON object for the specified field. The value is an integer from 1 to 10. Default value: 10. |
keep | Bool | No | Specifies whether to retain the original field after the expand operation is performed. By default, the original field is not retained. If you want to retain the original field, you must configure this parameter. |
field | Field | Yes | The original name of the field to extract. The data type is VARCHAR. |
output | Field | No | The name of the new field that is obtained after the expand operation is performed. If you do not configure this parameter, the output data is exported to the original field. A first-level JSON object is expanded for a field based on the following logic: JSON array: expanded by element. JSON dictionary: expanded by key-value pair. Other JSON types: the original value is returned. Invalid JSON: null is returned. |
Examples
Example 1: Expand an array and return multiple result entries.
SPL statement
* | expand-values y
Input data
x: 'abc' y: '[0,1,2]'
Output data, including three entries
# Entry 1
x: 'abc'
y: '0'
# Entry 2
x: 'abc'
y: '1'
# Entry 3
x: 'abc'
y: '2'
Example 2: Expand a dictionary and return multiple result entries.
SPL statement
* | expand-values y
Input data
x: 'abc' y: '{"a": 1, "b": 2}'
Output data, including two entries
# Entry 1
x: 'abc'
y: '{"a": 1}'
# Entry 2
x: 'abc'
y: '{"b": 2}'
Example 3: Expand the content that matches the specified JSONPath expression and export the output data to a new field.
SPL statement
* | expand-values -path='$.body' -keep content as body
Input data
content: '{"body": [0, {"a": 1, "b": 2}]}'
Output data, including two entries
# Entry 1
content: '{"body": [0, {"a": 1, "b": 2}]}'
body: '0'
# Entry 2
content: '{"body": [0, {"a": 1, "b": 2}]}'
body: '{"a": 1, "b": 2}'
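The -limit parameter caps the number of entries that a single expand operation can return. The following sketch is an illustration based on that description rather than an example from the original reference; it assumes that, with -limit=2, only the first two elements of the array are expanded.
SPL statement
* | expand-values -limit=2 y
Input data
x: 'abc'
y: '[0,1,2,3]'
Expected output data, including two entries
# Entry 1
x: 'abc'
y: '0'
# Entry 2
x: 'abc'
y: '1'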
SQL computing instructions on structured data
extend
This instruction creates fields based on the result of SQL expression-based data calculation. For more information about the supported SQL functions, see SQL functions supported by SPL.
Syntax
| extend <output>=<sql-expr>, ...
Parameters
Parameter | Type | Required | Description |
output | Field | Yes | The name of the field to create. You cannot use the same output field to store the results of multiple expressions. Important: If the new field name is the same as an existing field name in the input data, the new field overwrites the existing field based on the data type and value. |
sql-expr | SQLExp | Yes | The data processing expression. Important: For more information about how to process null values, see Processing of null values in SPL expressions. |
Examples
Example 1: Use a computation expression.
* | extend Duration = EndTime - StartTime
Example 2: Use a regular expression.
* | extend server_protocol_version=regexp_extract(server_protocol, '\d+')
Example 3: Extract the JSONPath content and convert the data type of a field.
SPL statement
* | extend a=json_extract(content, '$.body.a'), b=json_extract(content, '$.body.b') | extend b=cast(b as BIGINT)
Input data
content: '{"body": {"a": 1, "b": 2}}'
Result
content: '{"body": {"a": 1, "b": 2}}' a: '1' b: 2
where
This instruction filters data based on the result of SQL expression-based data calculation and retains the data that matches the specified SQL expression. For more information about the SQL functions supported by the where instruction, see SQL functions supported by SPL.
Syntax
| where <sql-expr>
Parameters
Parameter | Type | Required | Description |
sql-expr | SQLExp | Yes | The SQL expression. Data that matches this expression is retained. Important For more information about how to process null values in SQL expressions, see Processing of null values in SPL expressions. |
Examples
Example 1: Filter data based on the field content.
* | where userId='123'
Example 2: Filter data by using a regular expression that matches data based on a field name.
* | where regexp_like(server_protocol, '\d+')
Example 3: Convert the data type of a field to match all data of server errors.
* | where cast(status as BIGINT) >= 500
Semi-structured data extraction instructions
parse-regexp
This instruction extracts the information that matches groups in the specified regular expression from the specified field.
The data type of the output field is VARCHAR. If the output field name is the same as an existing field name in the input data, see Retention and overwriting of new and old values for the value retention policy.
You cannot operate on the time fields __time__ and __time_ns_part__. For more information, see Time fields.
Syntax
| parse-regexp <field>, <pattern> as <output>, ...
Parameters
Parameter | Type | Required | Description |
field | Field | Yes | The original name of the field from which you want to extract information. The input data must contain this field, the field type must be VARCHAR, and the field value cannot be null. Otherwise, the extraction is not performed. |
pattern | RegExp | Yes | The regular expression. The RE2 syntax is supported. |
output | Field | No | The name of the output field that stores the result of the regular expression extraction. |
Examples
Example 1: Use the exploratory match mode.
SPL statement
* | parse-regexp content, '(\S+)' as ip             -- Generate the ip: 10.0.0.0 field.
| parse-regexp content, '\S+\s+(\w+)' as method     -- Generate the method: GET field.
Input data
content: '10.0.0.0 GET /index.html 15824 0.043'
Result
content: '10.0.0.0 GET /index.html 15824 0.043' ip: '10.0.0.0' method: 'GET'
Example 2: Use the full pattern match mode and use unnamed capturing groups in a regular expression.
SPL statement
* | parse-regexp content, '(\S+)\s+(\w+)' as ip, method
Input data
content: '10.0.0.0 GET /index.html 15824 0.043'
Result
content: '10.0.0.0 GET /index.html 15824 0.043' ip: '10.0.0.0' method: 'GET'
parse-csv
This instruction extracts information in the CSV format from the specified field.
The data type of the output field is VARCHAR. If the output field name is the same as an existing field name in the input data, see Retention and overwriting of new and old values for the value retention policy.
You cannot operate on the time fields __time__ and __time_ns_part__. For more information, see Time fields.
Syntax
| parse-csv -delim=<delim> -quote=<quote> -strict <field> as <output>, ...
Parameters
Parameter | Type | Required | Description |
delim | String | No | The delimiter of the input data. You can specify one to three valid ASCII characters. You can use escape characters to indicate special characters. For example, \t indicates the tab character, \11 indicates the ASCII character whose code is 11 in octal, and \x09 indicates the ASCII character whose code is 09 in hexadecimal. You can also use a combination of multiple characters as the delimiter, such as ||. Default value: comma (,). |
quote | Char | No | The quote of the input data. You can specify a single valid ASCII character. If the input data contains delimiters, you must specify a quote. For example, you can specify double quotation marks (""), single quotation marks (''), or an unprintable character (0x01). By default, quotes are not used. Important This parameter takes effect only if you set the delim parameter to a single character. You must specify different values for the quote and delim parameters. |
strict | Bool | No | Specifies whether to enable strict pairing when the number of values in the input data does not match the number of fields specified in the output parameter. Default value: False. If you want to enable strict pairing, you must configure this parameter. |
field | Field | Yes | The name of the field to parse. The input data must contain this field, the field type must be VARCHAR, and the field value cannot be null. Otherwise, the parsing is not performed. |
output | Field | Yes | The name of the field that you want to use to store the parsing result of the input data. |
Examples
Example 1: Match data in simple mode.
SPL statement
* | parse-csv content as x, y, z
Input data
content: 'a,b,c'
Result
content: 'a,b,c' x: 'a' y: 'b' z: 'c'
Example 2: Use double quotation marks as the quote to match data that contains special characters.
SPL statement
* | parse-csv -quote='"' content as ip, time, host
Input data
content: '192.168.0.100,"10/Jun/2019:11:32:16,127 +0800",example.aliyundoc.com'
Result
content: '192.168.0.100,"10/Jun/2019:11:32:16,127 +0800",example.aliyundoc.com' ip: '192.168.0.100' time: '10/Jun/2019:11:32:16,127 +0800' host: 'example.aliyundoc.com'
Example 3: Use a combination of multiple characters as the delimiter.
SPL statement
* | parse-csv -delim='||' content as time, ip, req
Input data
content: '05/May/2022:13:30:28||127.0.0.1||POST /put?a=1&b=2'
Result
content: '05/May/2022:13:30:28||127.0.0.1||POST /put?a=1&b=2' time: '05/May/2022:13:30:28' ip: '127.0.0.1' req: 'POST /put?a=1&b=2'
parse-json
This instruction extracts the first-layer JSON information from the specified field.
The data type of the output field is VARCHAR. If the output field name is the same as an existing field name in the input data, see Retention and overwriting of new and old values for the value retention policy.
You cannot operate on the time fields __time__ and __time_ns_part__. For more information, see Time fields.
Syntax
| parse-json -mode=<mode> -path=<path> -prefix=<prefix> <field>
Parameters
Parameter | Type | Required | Description |
mode | String | No | The mode that is used to extract information when the name of the output field is the same as an existing field name in the input data. The default value is overwrite. |
path | JSONPath | No | The JSON path in the specified field. The JSON path is used to identify the information that you want to extract. The default value is an empty string. If you use the default value, the complete data of the specified field is extracted. |
prefix | String | No | The prefix of the fields that are generated by expanding a JSON structure. The default value is an empty string. |
field | Field | Yes | The name of the field to parse. Make sure that this field is included in the input data and that the field value is a non-null value in a valid JSON format. Otherwise, the extract operation is not performed. |
Examples
Example 1: Extract all keys and values from the y field.
SPL statement
* | parse-json y
Input data
x: '0' y: '{"a": 1, "b": 2}'
Result
x: '0' y: '{"a": 1, "b": 2}' a: '1' b: '2'
Example 2: Extract the value of the body key from the content field as different fields.
SPL statement
* | parse-json -path='$.body' content
Input data
content: '{"body": {"a": 1, "b": 2}}'
Result
content: '{"body": {"a": 1, "b": 2}}' a: '1' b: '2'
Example 3: Extract information in preserve mode. For an existing field, retain the original value.
SPL statement
* | parse-json -mode='preserve' y
Input data
a: 'xyz' x: '0' y: '{"a": 1, "b": 2}'
Result
x: '0' y: '{"a": 1, "b": 2}' a: 'xyz' b: '2'
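The -prefix parameter prepends a string to every field name that parse-json generates, which helps avoid collisions with existing fields. The following sketch illustrates the assumed behavior and is not part of the original example set.
SPL statement
* | parse-json -prefix='p_' y
Input data
x: '0'
y: '{"a": 1, "b": 2}'
Expected result
x: '0'
y: '{"a": 1, "b": 2}'
p_a: '1'
p_b: '2'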
parse-kv
This instruction extracts the key-value pair information from the specified field.
The data type of the output field is VARCHAR. If the output field name is the same as an existing field name in the input data, see Retention and overwriting of new and old values for the value retention policy.
You cannot operate on the time fields __time__ and __time_ns_part__. For more information, see Time fields.
Syntax
Delimiter-based extraction
Extract key-value pairs based on the specified delimiter.
| parse-kv -mode=<mode> -prefix=<prefix> -greedy <field>, <delim>, <kv-sep>
Regular expression-based extraction
Extract key-value pairs based on the specified regular expression.
| parse-kv -regexp -mode=<mode> -prefix=<prefix> <field>, <pattern>
Parameters
Delimiter-based extraction
Parameter | Type | Required | Description |
mode | String | No | If the output field name is the same as an existing field name in the input data, you can select an overwrite mode based on your business requirements. The default value is overwrite. For more information, see Field extraction check and overwrite mode. |
prefix | String | No | The prefix of the output fields. The default value is an empty string. |
greedy | Bool | No | Specifies whether to enable greedy matching for field values. By default, greedy matching is disabled. If you want to enable greedy matching, you must configure this parameter. |
field | Field | Yes | The name of the field to parse. The input data must contain this field, the field type must be VARCHAR, and the field value cannot be null. Otherwise, the extraction is not performed. |
delim | Char | Yes | The delimiter between different key-value pairs. You can specify one to five valid ASCII characters. The delimiter cannot be a substring of the kv-sep parameter value. |
kv-sep | Char | Yes | The character that connects a key and a value in a key-value pair. You can specify one to five valid ASCII characters. The value cannot be a substring of the delim parameter value. |
Regular expression-based extraction
Parameter | Type | Required | Description |
regexp | Bool | Yes | Specifies whether to enable the regular extraction mode. |
mode | String | No | If the output field name is the same as an existing field name in the input data, you can select an overwrite mode based on your business requirements. The default value is overwrite. For more information, see Field extraction check and overwrite mode. |
prefix | String | No | The prefix of the output fields. The default value is an empty string. |
field | Field | Yes | The original name of the field to extract. The input data must contain this field, the field type must be VARCHAR, and the field value cannot be null. Otherwise, the extraction is not performed. |
pattern | RegExp | Yes | The regular expression that contains two capturing groups. One capturing group extracts the field name and the other capturing group extracts the field value. RE2 regular expressions are supported. |
Examples
Example 1: Extract the labels in SLS time series data as multiple data fields.
SPL statement
* | parse-kv -prefix='__labels__.' __labels__, '|', '#$#'
Input data
__name__: 'net_in' __value__: '231461.57374215033' __time_nano__: '1717378679274117026' __labels__: 'cluster#$#sls-etl|hostname#$#iZbp17raa25u0xi4wifopeZ|interface#$#veth02cc91d2|ip#$#192.168.22.238'
Output data
__name__: 'net_in' __value__: '231461.57374215033' __time_nano__: '1717378679274117026' __labels__: 'cluster#$#sls-etl|hostname#$#iZbp17raa25u0xi4wifopeZ|interface#$#veth02cc91d2|ip#$#192.168.22.238' __labels__.cluster: 'sls-etl' __labels__.hostname: 'iZbp17raa25u0xi4wifopeZ' __labels__.interface: 'veth02cc91d2' __labels__.ip: '192.168.22.238'
Example 2: Enable greedy matching mode to extract key-value information from access logs.
SPL statement
* | parse-kv -greedy content, ' ', '='
Input data
content: 'src=127.0.0.1 dst=192.168.0.0 bytes=125 msg=connection refused body=this is test time=2024-05-21T00:00:00'
Output data
content: 'src=127.0.0.1 dst=192.168.0.0 bytes=125 msg=connection refused body=this is test time=2024-05-21T00:00:00' src: '127.0.0.1' dst: '192.168.0.0' bytes: '125' msg: 'connection refused' body: 'this is test' time: '2024-05-21T00:00:00'
Example 3: Enable regular extraction mode to process complex delimiters between key-value pairs and separators between keys and values.
SPL statement
* | parse-kv -regexp content, '([^&?]+)(?:=|:)([^&?]+)'
Input data
content: 'k1=v1&k2=v2?k3:v3' k1: 'xyz'
Output data
content: 'k1=v1&k2=v2?k3:v3' k1: 'v1' k2: 'v2' k3: 'v3'
Example 4: Extract information in preserve mode. For an existing field, retain the original value.
SPL statement
* | parse-kv -regexp -mode='preserve' content, '([^&?]+)(?:=|:)([^&?]+)'
Input data
content: 'k1=v1&k2=v2?k3:v3' k1: 'xyz'
Result
content: 'k1=v1&k2=v2?k3:v3' k1: 'xyz' k2: 'v2' k3: 'v3'
Data transformation instructions (new version)
Supported regions
China (Shanghai), China (Heyuan)
pack-fields
This instruction packs multiple fields and outputs them to a new field in JSON serialization format. This instruction is applicable to scenarios in which structured data transmission is required, such as API request body construction.
By default, non-VARCHAR fields (including __time__ and __time_ns_part__) are not processed.
By default, source data is not retained.
Syntax
| pack-fields -keep -ltrim -include=<include> -exclude=<exclude> as <output>
Parameters
Parameter | Type | Required | Description |
output | String | Yes | The name of the output field that is obtained after the pack operation is performed. The field value is in the JSON format. |
include | RegExp | No | The whitelist. Fields that match the regular expression specified in the whitelist are packed. Default value: ".*", which indicates that all fields in a log are matched and packed. For more information, see Regular expressions. |
exclude | RegExp | No | The blacklist (takes precedence over the whitelist). Fields that match the regular expression specified in the blacklist are not packed. This parameter is empty by default, which indicates that no matching is performed. For more information, see Regular expressions. |
ltrim | String | No | The prefix that is removed from the names of the packed fields. |
keep | Bool | No | Specifies whether to retain the source data after the data is packed. True: The source data is retained in the output result. False (default): The source data is not retained in the output result. |
Examples
Example 1: Pack all fields in a log to the test field. By default, the original fields that are packed are deleted.
SPL statement
* | pack-fields -include='\w+' as test
Input data
test1:123 test2:456 test3:789
Result
test:{"test1": "123", "test2": "456", "test3": "789"}
Example 2: Pack all fields in a log to the test field. The original fields that are packed are not deleted.
SPL statement
* | pack-fields -keep -include='\w+' as test
Input data
test1:123 test2:456 test3:789
Result
test:{"test1": "123", "test2": "456", "test3": "789"} test1:123 test2:456 test3:789
Example 3: Pack the test and abcd fields to the content field. The original fields that are packed are not deleted.
SPL statement
* | pack-fields -keep -include='\w+' as content
Input data
abcd@#%:123 test:456 abcd:789
Result
abcd:789 abcd@#%:123 content:{"test": "456", "abcd": "789"} test:456
Example 4: Do not pack the test and abcd fields. Pack the remaining fields to the content field. Delete the original fields that are packed.
SPL statement
* | pack-fields -exclude='\w+' as content
Input data
abcd@#%:123 test:456 abcd:789
Result
abcd:789 content:{"abcd@#%": "123"} test:456
Example 5: Extract all key-value pairs that match the regular expression from the field value, and pack the key-value pairs to the name field.
SPL statement
* | parse-kv -prefix='k_' -regexp dict, '(\w+):(\w+)' | pack-fields -include='k_.*' -ltrim='k_' as name
Input data
dict: x:123, y:456, z:789
Result
dict:x:123, y:456, z:789 name:{"x": "123", "y": "456", "z": "789"}
log-to-metric
This instruction converts logs to the time series storage format so that the data can be stored in a Metricstore.
By default, log data that does not meet the requirements of time series data (Metric) is ignored.
The time unit of the time field in the raw log data is automatically detected. The following time units are supported: seconds, milliseconds, microseconds, and nanoseconds.
Hash write is enabled by default.
Syntax
| log-to-metric -wildcard -format -names=<names> -labels=<labels> -time_field=<time_field>
Parameters
Parameter | Type | Required | Description |
wildcard | Bool | No | Specifies whether to enable the wildcard match mode for the field names specified by the names and labels parameters. By default, the exact match mode is used. If you want to enable the wildcard match mode, you must configure this parameter. |
format | Bool | No | Specifies whether to enable automatic formatting. By default, this feature is disabled, and invalid label data is skipped during data transformation. After the feature is enabled, invalid label data is formatted. Label values cannot contain characters such as vertical bars; invalid characters are replaced with underscores, as shown in Example 5. |
names | FieldList | Yes | The list of log fields that are used to generate metric time series points. If a field in the input data matches at least one specified field name or field name pattern, a metric time series point is generated for the field. The metric name is the field name, and the metric value is the field value. For more information about the format of time series data, see Time series data (Metric). |
labels | FieldList | No | The list of log fields that are used to construct time series label information. If a field in the input data matches at least one specified field name or field name pattern, the field is added to the labels of the time series point. The label name is the field name, and the label value is the field value. For more information about the format of time series data, see Time series data (Metric). |
time_field | String | No | The time field of the metric. By default, the __time__ field of the log entry is used. |
Examples
Example 1: Convert the log that contains the rt field to the time series data format.
SPL statement
* | log-to-metric -names='["rt"]'
Input data
__time__: 1614739608 rt: 123
Result
__labels__: __name__:rt __time_nano__:1614739608 __value__:123
Example 2: Convert the log that contains the rt field to the time series data format and add the host field as a label field.
SPL statement
* | log-to-metric -names='["rt"]' -labels='["host"]'
Input data
__time__: 1614739608 rt: 123 host: myhost
Result
__labels__:host#$#myhost __name__:rt __time_nano__:1614739608 __value__:123
Example 3: Convert the log that contains the rt and qps fields to the time series data format and add the host field as a label field.
SPL statement
* | log-to-metric -names='["rt", "qps"]' -labels='["host"]'
Input data
__time__: 1614739608 rt: 123 qps: 10 host: myhost
Result
__labels__:host#$#myhost __name__:rt __time_nano__:1614739608 __value__:123
__labels__:host#$#myhost __name__:qps __time_nano__:1614739608 __value__:10
Example 4: Use fuzzy matching to convert the log that contains the rt1 and rt2 fields to the time series data format and add the host field as a label field.
SPL statement
* | log-to-metric -wildcard -names='["rt*"]' -labels='["host"]'
Input data
__time__: 1614739608 rt1: 123 rt2: 10 host: myhost
Result
__labels__:host#$#myhost __name__:rt1 __time_nano__:1614739608 __value__:123
__labels__:host#$#myhost __name__:rt2 __time_nano__:1614739608 __value__:10
Example 5: Convert the log that contains the rt and qps fields to the time series data format, add the host field as a label field, and automatically format the label value.
SPL statement
* | log-to-metric -format -names='["rt", "qps"]' -labels='["host"]'
Input data
__time__: 1614739608 rt: 123 qps: 10 host: myhost1|myhost2
Result
__labels__:host#$#myhost1_myhost2 __name__:rt __time_nano__:1614739608 __value__:123
__labels__:host#$#myhost1_myhost2 __name__:qps __time_nano__:1614739608 __value__:10
Example 6: Convert the log that contains the rt and qps fields to the time series data format, rename the fields to max_rt and total_qps, and add the host field as a label field.
SPL statement
* | project-rename max_rt = rt, total_qps = qps | log-to-metric -names='["max_rt", "total_qps"]' -labels='["host"]'
Input data
__time__: 1614739608 rt: 123 qps: 10 host: myhost
Result
__labels__:host#$#myhost __name__:max_rt __time_nano__:1614739608 __value__:123
__labels__:host#$#myhost __name__:total_qps __time_nano__:1614739608 __value__:10
Example 7: Convert the log that contains the rt and qps fields to the time series data format, rename the fields to max_rt and total_qps, rename the host field to hostname, and add the hostname field as a label field.
SPL statement
* | project-rename max_rt = rt, total_qps = qps, hostname=host | log-to-metric -names='["max_rt", "total_qps"]' -labels='["hostname"]'
Input data
__time__: 1614739608 rt: 123 qps: 10 host: myhost
Result
__labels__:hostname#$#myhost __name__:max_rt __time_nano__:1614739608 __value__:123
__labels__:hostname#$#myhost __name__:total_qps __time_nano__:1614739608 __value__:10
Example 8: Convert the log that contains the remote_user field to the time series data format, add the status field as a label field, use the time field as the time field of the time series data, and specify the time unit of the original log data as nanoseconds.
SPL statement
* | log-to-metric -names='["remote_user"]' -labels='["status"]' -time_field='time'
Input data
time:1652943594 remote_user:89 request_length:4264 request_method:GET status:200
Result
__labels__:status#$#200 __name__:remote_user __time_nano__:1652943594 __value__:89
metric-to-metric
This instruction further processes existing time series data, such as adding, modifying, or removing tags.
The input field name must match the regular expression [a-zA-Z_][a-zA-Z0-9_]*. Otherwise, the generated time series point does not contain the field in its labels.
If the three option parameters contain the same field, the priority is: add_labels > del_labels > rename_labels.
For more information about the format of output time series data, see Time series data (Metric).
Syntax
| metric-to-metric -format -add_labels=<add_labels> -del_labels=<del_labels> -rename_labels=<rename_labels>
Parameters
Parameter | Type | Required | Description |
add_labels | Array | No | The list of label fields to add. These fields are used to construct new time series label information. The original data is added to the labels of the time series point. Only the VARCHAR data type is supported. |
del_labels | Array | No | The list of label fields to remove. If a field in the list matches a field name in the original labels, the field is removed from the original labels. |
rename_labels | Map | No | The list of label fields to rename. The labels of the original time series point are updated based on the mapping information. The key is the original field name, and the value is the new field name. |
format | Bool | No | Specifies whether to enable automatic formatting. By default, this feature is disabled, and invalid data is skipped during data transformation. After the feature is enabled, invalid label data is formatted: invalid characters in label names are replaced with underscores (_), and labels with an empty name or value are removed, as shown in Example 4. |
Examples
Example 1: Add a label
SPL statement
* | extend qps = '10' | metric-to-metric -add_labels='["qps"]'
Input data
__labels__:host#$#myhost __name__:rt __time_nano__:1614739608 __value__:123
Result
__labels__:host#$#myhost|qps#$#10 __name__:rt __time_nano__:1614739608 __value__:123
Example 2: Remove a label
SPL statement
* | metric-to-metric -del_labels='["qps"]'
Input data
__labels__:host#$#myhost|qps#$#10 __name__:rt __time_nano__:1614739608 __value__:123
Result
__labels__:host#$#myhost __name__:rt __time_nano__:1614739608 __value__:123
Example 3: Rename a label
SPL statement
* | metric-to-metric -rename_labels='{"host":"etl_host"}'
Input data
__labels__:host#$#myhost|qps#$#10 __name__:rt __time_nano__:1614739608 __value__:123
Result
__labels__:etl_host#$#myhost|qps#$#10 __name__:rt __time_nano__:1614739608 __value__:123
Example 4: Format invalid data with one click
SPL statement
* | metric-to-metric -format
Input data
__labels__:host#$#myhost|qps#$#10|asda$cc#$#j|ob|schema#$#|#$#|#$#xxxx __name__:rt __time_nano__:1614739608 __value__:123
Result
__labels__:asda_cc#$#j|host#$#myhost|qps#$#10 __name__:rt __time_nano__:1614739608 __value__:123
Aggregate instructions
stats
This instruction is used for statistical analysis of logs, similar to aggregate functions in SQL (such as COUNT, SUM, and AVG). It performs statistical, grouping, and aggregation operations on specific fields in log data.
This instruction is dedicated to Logstore query and analysis. It is not applicable to scenarios such as new version data transformation, SPL rule consumption, write processors, and Logtail configurations.
By default, the stats instruction returns the first 100 aggregation results. If you need to return more results, you can use the limit instruction.
Syntax
stats <output>=<aggOperator> by <group>,[<group>...]
Parameters
Parameter | Type | Required | Description |
output | String | Yes | Specifies an alias for the statistical result field. |
aggOperator | SQLExp | Yes | The aggregate function expression. Supported aggregate functions include count, sum, avg, min, and max. |
group | String | No | Specifies the dimension for aggregation, similar to the field in the GROUP BY clause in SQL. |
Examples
Example 1: Count pv by ip for access logs.
SPL statement
* | stats pv=count(*) by ip
Input data
ip: 192.168.1.1 latencyMs: 10
ip: 192.168.1.1 latencyMs: 20
ip: 192.168.1.2 latencyMs: 10
Output data
ip: 192.168.1.2 pv: 1
ip: 192.168.1.1 pv: 2
Example 2: Calculate the minimum and maximum latencyMs for each ip in access logs.
SPL statement
* | extend latencyMs=cast(latencyMs as bigint) | stats minLatencyMs=min(latencyMs), maxLatencyMs=max(latencyMs) by ip
Input data
ip: 192.168.1.1 latencyMs: 10
ip: 192.168.1.1 latencyMs: 20
ip: 192.168.1.2 latencyMs: 10
Output data
ip: 192.168.1.2 minLatencyMs: 10 maxLatencyMs: 10
ip: 192.168.1.1 minLatencyMs: 10 maxLatencyMs: 20
Example 3: Count all pv for access logs.
SPL statement
* | stats pv=count(*)
Input data
ip: 192.168.1.1 latencyMs: 10
ip: 192.168.1.1 latencyMs: 20
ip: 192.168.1.2 latencyMs: 10
Output data
pv: 3
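Because the stats instruction returns at most 100 aggregation results by default, a limit instruction can be appended to raise that cap. The following statement is a sketch for illustration only; the pv and ip names mirror the examples above.
* | stats pv=count(*) by ip | limit 1000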
sort
This instruction sorts query results. You can sort field values or statistical results in ascending (asc) or descending (desc) order. It is an important tool for quickly locating key data and generating ordered reports in log analysis.
This instruction is dedicated to Logstore query and analysis. It is not applicable to scenarios such as new version data transformation, SPL rule consumption, write processors, and Logtail configurations.
Syntax
sort <field> [asc/desc], (<field> [asc/desc])
Parameters
Parameter | Type | Required | Description |
field | String | Yes | Specifies the field to sort by. |
asc/desc | String | No | The sort order: asc specifies ascending order, and desc specifies descending order. |
Example
Sort access logs by latencyMs.
SPL statement
* | extend latencyMs=cast(latencyMs as bigint) | sort latencyMs desc
Input data
ip: 192.168.1.1 latencyMs: 10
ip: 192.168.1.1 latencyMs: 20
ip: 192.168.1.2 latencyMs: 15
Output data
ip: 192.168.1.1 latencyMs: 20
ip: 192.168.1.2 latencyMs: 15
ip: 192.168.1.1 latencyMs: 10
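The sort syntax also accepts multiple fields. The following statement is a sketch that is not part of the original example set: it sorts by ip in ascending order and, within each ip, by latencyMs in descending order.
* | extend latencyMs=cast(latencyMs as bigint) | sort ip asc, latencyMs desc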
limit
This instruction limits the number of log entries returned in query results. It is one of the core instructions for controlling data volume. You can use the limit instruction to effectively prevent performance issues or resource waste caused by excessively large query results. This instruction is applicable to various scenarios such as log analysis and real-time monitoring.
This instruction is dedicated to Logstore query and analysis. It is not applicable to scenarios such as new version data transformation, SPL rule consumption, write processors, and Logtail configurations.
If you do not use the sort instruction to specify a sorting rule, the order of the output results of the limit instruction is random (because the natural order is not guaranteed when logs are stored).
Syntax
limit (<offset>,) <size>
Parameters
Parameter | Type | Required | Description |
offset | Integer | No | The number of rows to skip before results are returned. |
size | Integer | Yes | The maximum number of rows to return. |
Example
Sort access logs by latencyMs and return the first entry.
SPL statement
* | extend latencyMs=cast(latencyMs as bigint) | sort latencyMs | limit 1
Input data
ip: 192.168.1.1 latencyMs: 10
ip: 192.168.1.1 latencyMs: 20
ip: 192.168.1.2 latencyMs: 15
Output data
ip: 192.168.1.1 latencyMs: 20
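The offset form of the syntax skips rows before returning results. The following statement is a sketch for illustration only: applied to the sorted results above, it is expected to skip the first entry and return the next two.
* | extend latencyMs=cast(latencyMs as bigint) | sort latencyMs | limit 1, 2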