A data filtering plugin determines which log records to collect based on specified conditions.
Data filtering plugin overview
Log Service offers several data filtering plugins. Choose one based on your requirements.
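| Plugin name | Type | Description |
| --- | --- | --- |
| Filtering processor | built-in | Collects only logs where the field value exactly matches a specified allowlist pattern. |
| filter_regex | custom | Supports the following filtering modes: filtering by value (processor_filter_regex) and filtering by key (processor_filter_key_regex). |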
Entry point
To use a Logtail plugin for log processing, add it when you create or modify a Logtail configuration. For more information, see Overview.
Differences between built-in and custom plugins
Built-in plugins: Implemented in C++ for high performance.
Custom plugins: Implemented in Go to provide flexibility and a rich ecosystem. If your service logs are too complex for built-in plugins, consider using a custom plugin.
- Performance limitations of custom plugins
  - Using a custom plugin to process logs consumes more LoongCollector resources, primarily CPU. If needed, you can use configuration management to adjust the LoongCollector parameter settings.
  - If the raw data generation rate exceeds 5 MB/s, avoid using complex plugin combinations. Instead, use a custom plugin for simple processing, followed by data transformation for advanced processing.
- Log collection limitations
  - Custom plugins process text logs in line mode, storing file-level metadata, such as __tag__:__path__ and __topic__, in each log entry.
  - Adding a custom plugin affects tag-related features:
    - The context query and LiveTail features are unavailable. To use these features, you must add an aggregators configuration.
    - The __topic__ field is renamed to __log_topic__. If you add an aggregators configuration, both the __topic__ and __log_topic__ fields will be present in the logs. If you do not need the __log_topic__ field, you can use the processor_drop plugin to delete it.
    - Fields such as __tag__:__path__ no longer have a built-in field index. You must create a field index for them.
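As a rough illustration of removing the extra __log_topic__ field with processor_drop, the Go sketch below builds a JSON plugin configuration fragment. The "processors"/"type"/"detail" layout and the DropKeys parameter name are assumptions about the plugin's JSON configuration and are not stated on this page, so check them against the processor_drop reference before use. The helper name dropFieldsConfig is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// processor models one entry in a plugin configuration.
// Assumption: the "type"/"detail" layout and the DropKeys parameter are how
// processor_drop is configured in JSON; this page does not confirm them.
type processor struct {
	Type   string              `json:"type"`
	Detail map[string][]string `json:"detail"`
}

// pluginConfig is the (assumed) top-level wrapper holding the processor list.
type pluginConfig struct {
	Processors []processor `json:"processors"`
}

// dropFieldsConfig returns a compact JSON fragment that asks processor_drop
// to delete the given fields from every log entry.
func dropFieldsConfig(keys ...string) ([]byte, error) {
	cfg := pluginConfig{Processors: []processor{{
		Type:   "processor_drop",
		Detail: map[string][]string{"DropKeys": keys},
	}}}
	return json.Marshal(cfg)
}

func main() {
	out, err := dropFieldsConfig("__log_topic__")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The generated fragment would be pasted into the JSON-based plugin configuration, after any processors that still need to read the field.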
Native filter plugin
The native filter plugin filters logs by their field values.
Configuration
| Parameter | Description |
| --- | --- |
| Whitelist | Define an allowlist to collect only logs that meet specific conditions. You must specify the target field name and a regular expression for filtering. The regular expression must match the entire string. Partial matches are not supported. For details on writing a regular expression, see Regular Expression Tutorial. Multiple conditions in the allowlist are joined by a logical AND. The following are examples: |
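The allowlist semantics described above (full-string match, conditions joined by AND) can be sketched in Go as follows. This only models the behavior and is not the plugin's implementation, since the built-in plugin is written in C++. The field names, patterns, and helper names are hypothetical, and treating a record that lacks a listed field as a non-match is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
)

// allowlist maps a field name to the regular expression its value must match.
type allowlist map[string]*regexp.Regexp

// compileAllowlist anchors every pattern with ^(?:...)$ so that it has to
// match the entire field value; partial matches are rejected.
func compileAllowlist(rules map[string]string) (allowlist, error) {
	compiled := make(allowlist, len(rules))
	for field, pattern := range rules {
		re, err := regexp.Compile(`^(?:` + pattern + `)$`)
		if err != nil {
			return nil, fmt.Errorf("field %q: %w", field, err)
		}
		compiled[field] = re
	}
	return compiled, nil
}

// keep reports whether a record satisfies every condition (logical AND).
// Assumption: a record that lacks a listed field is treated as a non-match.
func (a allowlist) keep(record map[string]string) bool {
	for field, re := range a {
		value, ok := record[field]
		if !ok || !re.MatchString(value) {
			return false
		}
	}
	return true
}

func main() {
	rules, err := compileAllowlist(map[string]string{
		"level":  "WARNING|ERROR",
		"module": "pay.*",
	})
	if err != nil {
		panic(err)
	}
	records := []map[string]string{
		{"level": "ERROR", "module": "payment"},  // both conditions match
		{"level": "ERROR", "module": "login"},    // module fails
		{"level": "ERRORS", "module": "payment"}, // level matches only partially
	}
	for _, r := range records {
		fmt.Println(r, "keep:", rules.keep(r))
	}
}
```

Only the first record is kept: the second fails the module condition, and the third fails because "ERRORS" is not a full match for WARNING|ERROR.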
Log filtering plug-ins (advanced)
Use the processor_filter_regex plug-in or the processor_filter_key_regex plug-in to filter logs. This topic describes the parameters and provides configuration examples for each plug-in.
Limitations
- The form-based configuration is available only for text logs and container standard output. For all other sources, you must use the JSON-based configuration.
- The Go regular expression engine is based on RE2 and has the following limitations compared to the PCRE engine:
  - Syntax differences in named capturing groups
    Go uses the (?P<name>...) syntax, whereas PCRE uses (?<name>...).
  - Unsupported regular expression patterns
    - Lookaround: (?=...), (?!...), (?<=...), and (?<!...).
    - Conditional expressions: (?(condition)true|false).
    - Recursive matching: (?R) and (?0).
    - Subroutine references: (?&name) and (?P>name).
    - Atomic groups: (?>...).
  - When you debug regular expressions with tools like Regex101, avoid using the unsupported patterns listed above to prevent processing failures.
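One way to check a pattern before putting it into a plug-in configuration is to compile it with Go's standard regexp package, which implements the RE2 syntax. The sketch below is a local pre-check under that assumption, not part of Logtail. The helper names supported and namedGroups and the sample patterns are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// supported reports whether Go's regexp package (RE2 syntax) accepts pattern.
func supported(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

// namedGroups compiles pattern and returns the named capture groups of the
// first match in s. It returns a nil map when the pattern does not match.
func namedGroups(pattern, s string) (map[string]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	m := re.FindStringSubmatch(s)
	if m == nil {
		return nil, nil
	}
	groups := make(map[string]string)
	for i, name := range re.SubexpNames() {
		if name != "" { // skip the whole match and unnamed groups
			groups[name] = m[i]
		}
	}
	return groups, nil
}

func main() {
	// PCRE-only constructs from the list above; each fails to compile in Go.
	for _, p := range []string{
		`foo(?=bar)`,   // lookahead
		`foo(?!bar)`,   // negative lookahead
		`(?<=foo)bar`,  // lookbehind
		`(?<!foo)bar`,  // negative lookbehind
		`(a)(?(1)b|c)`, // conditional expression
		`a(?R)?b`,      // recursive matching
		`(?>a+)b`,      // atomic group
	} {
		fmt.Printf("%-14s supported=%v\n", p, supported(p))
	}

	// Named capturing groups written in the (?P<name>...) form.
	groups, err := namedGroups(`^(?P<ip>\d+\.\d+\.\d+\.\d+) (?P<method>[A-Z]+)$`, "10.0.0.1 GET")
	if err != nil {
		panic(err)
	}
	fmt.Println(groups)
}
```

Running a check like this locally catches unsupported constructs that a PCRE-flavored tester such as Regex101 would accept.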
processor_filter_regex (Filter by value)
processor_filter_key_regex (Filter by key)
References
- Configure a Logtail pipeline using the API:
- Configure a processor plug-in in the console:
- Collect container logs (stdout/file) from a cluster using a Kubernetes CRD