This topic describes Simple Log Service Processing Language (SPL), including its implementation, syntax, and instruction expressions.
SPL overview
Simple Log Service provides SPL statements for extracting structured information from raw data, manipulating fields, and filtering data. SPL also supports multi-level pipeline cascading: the first-level pipeline is an index filter condition, which is followed by multiple levels of SPL instruction pipelines that output the processed result data. If you are familiar with SQL, you can refer to Comparison of SPL and SQL usage scenarios when you use SPL in different data processing scenarios.
How it works
Simple Log Service supports SPL in the following features: Logtail collection, ingest processor, rule-based data consumption, data transformation (new version), and scan-based query and analysis. The following figure shows how SPL works:
For more information about the capabilities that are supported by SPL in different scenarios, see SPL in different scenarios.
Limits
| Category | Item | Logtail collection | Ingest processor | Real-time consumption | Data transformation (new version) | Scan query |
| --- | --- | --- | --- | --- | --- | --- |
| SPL complexity | Number of script pipeline levels | 16 | 16 | 16 | 16 | 16 |
| SPL complexity | Script length | 64 KB | 64 KB | 10 KB | 10 KB | 64 KB |
| SPL runtime | Memory size. For more information, see Handle errors. | 50 MB | 1 GB | 1 GB | 1 GB | 2 GB |
| SPL runtime | Timeout period. For more information, see Handle errors. | 1 second | 5 seconds | 5 seconds | 5 seconds | 2 seconds |
SPL syntax
SPL statements
An SPL statement supports multi-level data processing. The levels are connected by the vertical bar (|) pipeline symbol, and the statement ends with a semicolon (;). The following section describes the SPL syntax:
Syntax
<data-source> | <spl-expr> | <spl-expr> ;
Parameters
| Parameter | Description | Example |
| --- | --- | --- |
| data-source | The data source, which can be a Logstore or an SPL-defined dataset. For more information about the data sources that are supported by SPL in different scenarios, see SPL in different scenarios. | `* \| project status, body;` |
| spl-expr | The data processing expression. For more information, see SPL instruction expressions. | |
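Example

The following statement is a minimal sketch that assumes the raw log contains a JSON-formatted content field. It cascades two pipeline levels: the first level expands the first-level JSON fields in content, and the second level retains only the fields of interest:

* | parse-json content | project status, body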
Syntax symbols
The following table describes the syntax symbols used in SPL.
| Symbol | Description | Example |
| --- | --- | --- |
| * | The placeholder that allows you to specify a Logstore as the data source of your SPL statement. | Filter and categorize access logs based on status codes, and then output the results. |
| . | The keyword prefix of the SPL syntax if an SPL statement starts with a period (.), such as the .let instruction. | |
| \| | The SPL pipeline symbol, which is used to introduce an SPL instruction expression. Format: `<data-source> \| <spl-expr>`. | |
| ; | The end identifier of an SPL statement. This symbol is optional in a single statement or in the last statement among multiple statements. | |
| '...' | The quotes that are used to enclose a string constant. | |
| "..." | The quotes that are used to enclose a field name or a field name pattern. | |
| -- | The comment symbol for single-line content. | |
| /*...*/ | The comment symbol for multi-line content. | |
| $ | The symbol for named dataset references. Format: `$<dataset-name>`. | |
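Example

The following statement shows several of these symbols together. It is a sketch that assumes an access log with a status field: double quotes enclose the field name, single quotes enclose the string constant, the double hyphen starts a comment, and the trailing semicolon ends the statement.

-- Keep only the logs whose status field equals the string '200'.
* | where "status" = '200';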
SPL data types
The following table describes the log field data types supported by SPL:
| Category | Data type | Description |
| --- | --- | --- |
| Basic data types | BOOLEAN | The Boolean type. |
| Basic data types | TINYINT | An integer with a width of 8 bits. |
| Basic data types | SMALLINT | An integer with a width of 16 bits. |
| Basic data types | INTEGER | An integer with a width of 32 bits. |
| Basic data types | BIGINT | An integer with a width of 64 bits. |
| Basic data types | HUGEINT | An integer with a width of 128 bits. |
| Basic data types | REAL | A variable-precision floating-point number with a width of 32 bits. |
| Basic data types | DOUBLE | A variable-precision floating-point number with a width of 64 bits. |
| Basic data types | TIMESTAMP | A UNIX timestamp that is accurate to the nanosecond. |
| Basic data types | DATE | The date data type. Format: YYYY-MM-DD. |
| Basic data types | VARCHAR | The variable-length character data type. |
| Basic data types | VARBINARY | The variable-length binary data type. |
| Structured data types | ARRAY | The array type. Use brackets ([]) to access elements. |
| Structured data types | MAP | The dictionary type. A dictionary key must be of a basic data type. A dictionary value can be of an arbitrary data type. Use brackets ([]) to access values. |
| JSON data type | JSON | The JSON type. |
For more information about how to convert data types during SPL-based data processing, see General references.
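For example, fields parsed from text logs are typically of the VARCHAR type and must be converted before numeric comparison. The following sketch assumes a status field that stores values such as 500 as strings, and uses the SQL cast function to convert the field to BIGINT:

* | extend status_code = cast(status as BIGINT) | where status_code >= 500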
SPL instruction expressions
Instruction expression syntax
cmd -option=<option value> -option ... <expression>, ... as <output>, ...
Parameters
| Parameter | Description |
| --- | --- |
| cmd | The name of the instruction. |
| option | Optional. An option of the instruction. The following parameter formats are supported: the key-value format (`-option=<option value>`) and the switch format (`-option`). |
| expression | Required. The processing logic on the data source. You do not need to specify a parameter name. The position of the `<expression>` parameter must comply with the definition of the instruction. Depending on the instruction, an expression can be, for example, a field name, a field name pattern, or an SQL expression. |
| output | The output field that contains the processing result. The field is specified by the as keyword. |
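Example

The following instruction is a sketch that assumes a content field whose value starts with an IP address and a request method, such as 127.0.0.1 GET. In this instruction, parse-regexp is the cmd part, the content field name and the regular expression constant are the expression parts, and ip and method are the output fields:

* | parse-regexp content, '(\S+)\s+(\S+)' as ip, method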
Instructions
SPL supports the following instructions. For more information, see SPL instructions.
| Category | Name | Description |
| --- | --- | --- |
| Control instruction | .let | Defines named datasets. |
| Field processing instructions | project | Retains the fields that match the specified pattern, and renames the specified fields. During instruction execution, all retain-related expressions are executed before rename-related ones. |
| Field processing instructions | project-away | Removes the fields that match the specified pattern, and retains all other fields as they are. |
| Field processing instructions | project-rename | Renames the specified fields, and retains all other fields as they are. |
| Field processing instructions | expand-values | Expands a first-level JSON object in the specified field, and returns multiple result entries. |
| SQL instructions for structured data computation | extend | Creates fields based on the results of SQL expression-based data calculation. For more information, see SPL-supported SQL functions. |
| SQL instructions for structured data computation | where | Filters data based on the results of SQL expression-based data calculation, and retains the data that matches the specified SQL expression. For more information, see SPL-supported SQL functions. |
| Instructions for extracting semi-structured data | parse-regexp | Extracts the information that matches groups in the specified regular expression from the specified field. |
| Instructions for extracting semi-structured data | parse-csv | Extracts CSV-formatted information from the specified field. |
| Instructions for extracting semi-structured data | parse-json | Extracts first-level JSON information from the specified field. |
| Instructions for extracting semi-structured data | parse-kv | Extracts key-value pair information from the specified field. |
| Instructions for data transformation (new version) | pack-fields | Encapsulates log fields, serializes them in the JSON format, and exports the result to a new field. This instruction is applicable to scenarios that require structured transmission, such as API request body construction. |
| Instructions for data transformation (new version) | log-to-metric | Converts logs to metrics that can be stored in a Metricstore. |
| Instructions for data transformation (new version) | metric-to-metric | Processes existing time series data, such as adding, modifying, or removing tags. |
| Aggregation instructions | stats | Performs statistical analysis on logs, similar to aggregate functions in SQL, such as COUNT, SUM, and AVG. |
| Aggregation instructions | sort | Sorts query results in ascending (asc) or descending (desc) order. |
| Aggregation instructions | limit | Limits the number of log rows that are returned in query results. It is one of the core instructions for controlling data volume. |
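Example

The following statement chains a filter with the limit instruction to cap the number of returned rows. It is a minimal sketch that assumes an access log with a status field:

-- Inspect at most 100 logs whose status field equals the string '500'.
* | where "status" = '500' | limit 100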