Simple Log Service: Logtail configurations

Last Updated: Nov 03, 2023

A Logtail configuration is a set of policies that are used by Logtail to collect logs. You can configure parameters such as a data source and a collection mode to customize a Logtail configuration. This topic describes how to configure parameters for a Logtail configuration when you use the Simple Log Service API to collect logs.

Basic parameters of a Logtail configuration

Parameter

Type

Required

Example

Description

configName

string

Yes

config-sample

The name of the Logtail configuration. The name must be unique in the project to which the Logtail configuration belongs. After the Logtail configuration is created, you cannot change the name of the Logtail configuration.

The name must meet the following requirements:

  • The name can contain only lowercase letters, digits, hyphens (-), and underscores (_).

  • The name must start and end with a lowercase letter or a digit.

  • The name must be 2 to 128 characters in length.

inputType

string

Yes

file

The type of the data source. Valid values:

  • plugin: Logs such as MySQL binary logs are collected by using Logtail plug-ins.

  • file: Text logs are collected in modes such as the full regex mode and the delimiter mode.

inputDetail

JSON object

Yes

None

The detailed configuration of the data source. For more information, see inputDetail.

outputType

string

Yes

LogService

The output type of collected logs. The value is fixed as LogService. Collected logs can be uploaded only to Simple Log Service.

outputDetail

JSON object

Yes

None

The detailed configuration of collected logs. For more information, see outputDetail.

logSample

string

No

None

The sample log.
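
The following snippet is a minimal sketch of how these basic parameters fit together for a text log scenario in full regex mode. The project, Logstore, and path values are placeholders, and the inputDetail fields are described in the sections that follow.

    {
        "configName": "config-sample",
        "inputType": "file",
        "inputDetail": {
            "logType": "common_reg_log",
            "logPath": "/var/log/http/",
            "filePattern": "access*.log",
            "topicFormat": "none",
            "key": ["content"],
            "logBeginRegex": ".*",
            "regex": "(.*)"
        },
        "outputType": "LogService",
        "outputDetail": {
            "projectName": "my-project",
            "logstoreName": "my-logstore"
        }
    }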

inputDetail

Basic parameters

Parameter

Type

Required

Example

Description

filterKey

array

No

["ip"]

The fields that are used to filter logs. A log is collected only when the values of the fields in the log match the regular expressions that are specified in the filterRegex parameter.

filterRegex

array

No

["^10.*"]

The regular expressions that are used to match the values of the fields. The fields are specified in the filterKey parameter. The number of elements in the value of the filterRegex parameter must be the same as the number of elements in the value of the filterKey parameter.

shardHashKey

array

No

["__source__"]

The mode that is used to write data. By default, data is written to Simple Log Service in load balancing mode. For more information, see Load balancing mode.

If you configure this parameter, data is written to Simple Log Service in shard mode. For more information, see Shard mode. The __source__ field is supported.

enableRawLog

boolean

No

false

Specifies whether to upload raw logs. Valid values:

  • true

  • false (default)

sensitive_keys

array

No

None

The configuration that is used to mask data. For more information, see sensitive_keys.

mergeType

string

No

topic

The method that is used to aggregate data. Valid values:

  • topic (default): Data is aggregated by topic.

  • logstore: Data is aggregated by Logstore.

delayAlarmBytes

int

No

209715200

The alert threshold of log collection progress. Default value: 209715200. This value specifies that an alert is triggered if 200 MB of data is not collected within a specified period of time.

adjustTimezone

boolean

No

false

Specifies whether to change the time zone of logs. This parameter is valid only if the time parsing feature is enabled. For example, if you configure the timeFormat parameter, you can also configure the adjustTimezone parameter.

logTimezone

string

No

GMT+08:00

The offset of the time zone. Format: GMT+HH:MM or GMT-HH:MM. For example, if you want to collect logs whose time zone is UTC+8, set the value to GMT+08:00.

advanced

JSON object

No

None

The advanced features. For more information, see advanced.
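
The following fragment is a hedged sketch of how several of these parameters can be combined inside inputDetail. Only the relevant fields are shown; in practice they are merged into a complete inputDetail such as the configuration examples later in this topic, and the values are illustrative only.

    {
        "filterKey": ["ip"],
        "filterRegex": ["^10.*"],
        "shardHashKey": ["__source__"],
        "enableRawLog": true,
        "mergeType": "topic",
        "logTimezone": "GMT+08:00"
    }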

sensitive_keys

  • Parameters

    Parameter

    Type

    Required

    Example

    Description

    key

    string

    Yes

    content

    The name of the log field.

    type

    string

    Yes

    const

    The method that is used to mask the content of the log field. Valid values:

    • const: The sensitive content is replaced by the value of the const field.

    • md5: The sensitive content is replaced by the MD5 hash value that is generated for the content.

    regex_begin

    string

    Yes

    'password':'

    The regular expression that is used to match the prefix of sensitive content. The regular expression is used to find sensitive content. You must use the RE2 syntax to specify the regular expression. For more information, visit RE2 syntax.

    regex_content

    string

    Yes

    [^']*

    The regular expression that is used to match sensitive content. You must use the RE2 syntax to specify the regular expression. For more information, visit RE2 syntax.

    all

    boolean

    Yes

    true

    Specifies whether to replace all sensitive content in the log field. Valid values:

    • true: replaces all sensitive content in the log field. This is the recommended value.

    • false: replaces only the sensitive content that the specified regular expressions match for the first time in the log field.

    const

    string

    No

    "********"

    The constant string that is used to replace sensitive content. If you set the type parameter to const, you must configure this parameter.

  • Configuration example

    If you want to mask the password values in the content field whose value is [{'account':'1812213231432969','password':'04a23f38'}, {'account':'1812213685634','password':'123a'}] and replace the values with ********, use the following settings for the sensitive_keys parameter:

    sensitive_keys = [
        {
            "all": true,
            "const": "********",
            "regex_content": "[^']*",
            "regex_begin": "'password':'",
            "type": "const",
            "key": "content"
        }
    ]
  • Sample log

    [{'account':'1812213231432969','password':'********'}, {'account':'1812213685634','password':'********'}]

advanced

Parameter

Type

Required

Example

Description

enable_root_path_collection

boolean

No

false

Specifies whether to allow data collection from Windows root directories, such as D:\log*. Valid values:

  • true

  • false (default)

Important
  • This parameter is a global parameter. If you set it to true in one Logtail configuration, collection from root directories is allowed for all the Logtail configurations that are applied to the same server. The setting remains valid until Logtail on the server is restarted.

  • This parameter is available only if you install Logtail V1.0.0.22 or later on a Windows server.

exactly_once_concurrency

int

No

1

Specifies whether to enable the ExactlyOnce write feature. This feature allows you to specify the maximum number of log groups that can be concurrently sent when Logtail collects data from a file. For more information, see Additional information: ExactlyOnce write feature. Valid values: 0 to 512.

  • 0: The ExactlyOnce write feature is disabled.

  • Other values: The ExactlyOnce write feature is enabled. The value specifies the maximum number of log groups that can be concurrently sent when Logtail collects data from a file.

Important
  • If you set this parameter to a larger value, more memory and disk overheads are generated. We recommend that you configure this parameter based on the local write traffic.

  • If the value of this parameter is less than the number of shards in the Logstore that is used, Logtail randomly processes data to ensure that the data is evenly written to each shard. The value that you specify for this parameter can be different from the number of shards.

  • The setting takes effect only for the files that are generated after you configure this parameter.

  • Only Logtail V1.0.21 and later support this parameter.

enable_log_position_meta

boolean

No

true

Specifies whether to add the metadata information of a source log file to each log that is collected from the file. The metadata information includes the __tag__:__inode__ and __file_offset__ fields. Valid values:

  • true

  • false

Note

Only Logtail V1.0.21 and later support this parameter.

specified_year

uint

No

0

The year that is used to supplement the log time if the time of a raw log does not contain the year information. You can specify the current year or a different year. Valid values:

  • 0: The current year is used.

  • Specific year: A year other than the current year is used. Example: 2020.

Note

Only Logtail V1.0.21 and later support this parameter.

force_multiconfig

boolean

No

false

Specifies whether Logtail can use the Logtail configuration to collect data from the files that are matched based on other Logtail configurations. Default value: false.

If you want Logtail to collect data from a file by using different Logtail configurations, you can configure this parameter. For example, you can configure this parameter to collect data from a file to two Logstores by using two Logtail configurations.

raw_log_tag

string

No

__raw__

The field that is used to store raw logs that are uploaded. Default value: __raw__.

blacklist

object

No

None

The blacklist configuration. For more information, see Parameters of blacklist.

tail_size_kb

int

No

1024

The size of data that Logtail reads from a file the first time Logtail reads the file. The value determines the position from which Logtail starts to collect data. By default, Logtail reads up to 1,024 KB of data the first time it reads a file.

  • If the file size is less than 1,024 KB, Logtail collects data from the beginning of the file.

  • If the file size is greater than 1,024 KB, Logtail collects only the last 1,024 KB of data in the file.

Valid values: 0 to 10485760. Unit: KB.

batch_send_interval

int

No

3

The interval at which aggregated data is sent. Default value: 3. Unit: seconds.

max_rotate_queue_size

int

No

20

The maximum length of the queue in which a file is rotated. Default value: 20.

enable_precise_timestamp

boolean

No

false

Specifies whether to extract time values with high precision. Default value: false, which specifies that time values with high precision are not extracted.

If you set this parameter to true, Logtail automatically parses the specified time values into timestamps with millisecond precision and stores the timestamps in the field specified by the precise_timestamp_key parameter.

Note
  • Make sure that Use System Time is disabled in the Logtail configuration.

  • Only Logtail V1.0.32 and later support this parameter.

precise_timestamp_key

string

No

"precise_timestamp"

The field that stores timestamps with high precision. If you do not include this parameter in the configuration, the system uses the precise_timestamp field by default.

precise_timestamp_unit

string

No

"ms"

The unit of the timestamps with high precision. If you do not include this parameter in the configuration, the system uses ms by default. Valid values: ms, us, and ns.

The following table describes the parameters of blacklist.

Parameter

Type

Required

Example

Description

dir_blacklist

array

No

["/home/admin/dir1", "/home/admin/dir2*"]

The blacklist of directories, which must be absolute paths. You can use asterisks (*) as wildcard characters to match multiple directories.

For example, if you specify /home/admin/dir1, all files in the /home/admin/dir1 directory are skipped during log collection.

filename_blacklist

array

No

["app*.log", "password"]

The blacklist of file names. The files that match a name specified in this parameter are skipped during log collection regardless of the directories to which the files belong. You can use asterisks (*) as wildcard characters to match multiple file names.

filepath_blacklist

array

No

["/home/admin/private*.log"]

The blacklist of file paths, which must be absolute paths. You can use asterisks (*) as wildcard characters to match multiple files.

For example, if you specify /home/admin/private*.log, all files whose name starts with private and ends with .log in the /home/admin/ directory are skipped during log collection.
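
The following fragment is a sketch of an advanced object that combines a blacklist with some of the advanced parameters described above. The fragment is placed inside inputDetail, and the values are illustrative only.

    "advanced": {
        "exactly_once_concurrency": 1,
        "enable_log_position_meta": true,
        "tail_size_kb": 1024,
        "blacklist": {
            "dir_blacklist": ["/home/admin/dir1", "/home/admin/dir2*"],
            "filename_blacklist": ["app*.log", "password"],
            "filepath_blacklist": ["/home/admin/private*.log"]
        }
    }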

Configurations that are specific to Logtail for text log collection

Basic parameters

Parameter

Type

Required

Example

Description

logType

string

Yes

common_reg_log

The mode in which logs are collected. Valid values:

  • json_log: collects logs in JSON mode.

  • common_reg_log: collects logs in full regex mode.

  • plugin: collects logs in plug-in mode.

  • delimiter_log: collects logs in delimiter mode.

logPath

string

Yes

/var/log/http/

The directory in which log files are stored.

filePattern

string

Yes

access*.log

The log file name.

topicFormat

string

Yes

none

The method that is used to generate a topic. Valid values:

  • none: No log topics are generated.

  • default: The log file path is used as the topic of collected logs.

  • group_topic: The topic of the machine group to which the Logtail configuration is applied is used as the topic of collected logs.

  • Regular expression that is used to match a log file path: A part of the log file path is used as the topic of collected logs. Example: /var/log/(.*).log.

For more information, see Log topics.

timeFormat

string

No

%Y/%m/%d %H:%M:%S

The format of the log time. For more information, see Time formats.

preserve

boolean

No

true

Specifies whether to use the timeout mechanism on log files. If a log file is not updated within a specified period of time, Logtail considers the file to be timed out. Valid values:

  • true (default): Log files never time out.

  • false: If a log file is not updated within 30 minutes, Logtail considers the file to be timed out and no longer monitors the file.

preserveDepth

integer

No

1

The maximum levels of directories whose files are monitored by using the timeout mechanism. If you set the preserve parameter to false, you must configure this parameter. Valid values: 1 to 3.

fileEncoding

string

No

utf8

The encoding format of log files. Valid values: utf8 and gbk.

discardUnmatch

boolean

No

true

Specifies whether to discard the logs that fail to be matched. Valid values:

  • true

  • false

maxDepth

int

No

100

The maximum levels of directories that are monitored. Valid values: 0 to 1000. The value 0 specifies that only the log file directory that you specify is monitored.

delaySkipBytes

int

No

0

The threshold that is used to determine whether to discard data if the data is not collected within a specified period of time. Valid values:

  • 0 (default): Data is not discarded.

  • Other values: Data is discarded when the amount of data that is not collected within the specified period of time exceeds the threshold. For example, if you set the threshold to 1024 KB and the amount of uncollected data exceeds 1024 KB, the data is discarded.

dockerFile

boolean

No

false

Specifies whether to collect logs from container files. Default value: false.

dockerIncludeLabel

JSON object

No

None

The container label whitelist. The whitelist specifies the containers whose data you want to collect. By default, this parameter is empty, which indicates that you want to collect logs or stdout and stderr from all containers. When you configure the container label whitelist, the LabelKey parameter is required, and the LabelValue parameter is optional.

  • If the LabelValue parameter is empty, containers whose container labels contain the keys specified by LabelKey are matched.

  • If the LabelValue parameter is not empty, containers whose container labels contain the key-value pairs specified by LabelKey and LabelValue are matched.

    By default, string matching is performed for the values of the LabelValue parameter. Containers are matched only if the values of the container labels are the same as the values of the LabelValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($) for the LabelValue parameter, regular expression matching is performed. For example, if you set the LabelKey parameter to io.kubernetes.container.name and set the LabelValue parameter to ^(nginx|cube)$, a container named nginx and a container named cube are matched.

Note
  • Do not specify duplicate values for the LabelKey parameter. If you specify duplicate values for the LabelKey parameter, only one of the values takes effect.

  • Key-value pairs are in the logical OR relation. If the labels of a container match one of the specified key-value pairs, the container is matched.

dockerExcludeLabel

JSON object

No

None

The container label blacklist. The blacklist specifies the containers whose data you want to exclude. By default, this parameter is empty, which indicates that you want to collect data from all containers. When you configure the container label blacklist, the LabelKey parameter is required, and the LabelValue parameter is optional.

  • If the LabelValue parameter is empty, containers whose container labels contain the keys specified by LabelKey are filtered out.

  • If the LabelValue parameter is not empty, containers whose container labels contain the key-value pairs specified by LabelKey and LabelValue are filtered out.

    By default, string matching is performed for the values of the LabelValue parameter. Containers are matched only if the values of the container labels are the same as the values of the LabelValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($) for the LabelValue parameter, regular expression matching is performed. For example, if you set the LabelKey parameter to io.kubernetes.container.name and set the LabelValue parameter to ^(nginx|cube)$, a container named nginx and a container named cube are matched and excluded from collection.

Note
  • Do not specify duplicate values for the LabelKey parameter. If you specify duplicate values for the LabelKey parameter, only one of the values takes effect.

  • Key-value pairs are in the logical OR relation. If the labels of a container match one of the specified key-value pairs, the container is filtered out.

dockerIncludeEnv

JSON object

No

None

The environment variable whitelist. The whitelist specifies the containers whose data you want to collect. By default, this parameter is empty, which indicates that you want to collect logs or stdout and stderr from all containers. When you configure the environment variable whitelist, the EnvKey parameter is required, and the EnvValue parameter is optional.

  • If the EnvValue parameter is empty, containers whose environment variables contain the keys specified by EnvKey are matched.

  • If the EnvValue parameter is not empty, containers whose environment variables contain the key-value pairs specified by EnvKey and EnvValue are matched.

    By default, string matching is performed for the values of the EnvValue parameter. Containers are matched only if the values of the environment variables are the same as the values of the EnvValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($) for the EnvValue parameter, regular expression matching is performed. For example, if you set the EnvKey parameter to NGINX_SERVICE_PORT and set the EnvValue parameter to ^(80|6379)$, containers whose port number is 80 and containers whose port number is 6379 are matched.

Note

Key-value pairs are in the logical OR relation. If the environment variables of a container match one of the specified key-value pairs, the container is matched.

dockerExcludeEnv

JSON object

No

None

The environment variable blacklist. The blacklist specifies the containers whose data you want to exclude. By default, this parameter is empty, which indicates that you want to collect data from all containers. When you configure the environment variable blacklist, the EnvKey parameter is required, and the EnvValue parameter is optional.

  • If the EnvValue parameter is empty, containers whose environment variables contain the keys specified by EnvKey are filtered out.

  • If the EnvValue parameter is not empty, containers whose environment variables contain the key-value pairs specified by EnvKey and EnvValue are filtered out.

    By default, string matching is performed for the values of the EnvValue parameter. Containers are matched only if the values of the environment variables are the same as the values of the EnvValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($) for the EnvValue parameter, regular expression matching is performed. For example, if you set the EnvKey parameter to NGINX_SERVICE_PORT and set the EnvValue parameter to ^(80|6379)$, containers whose port number is 80 and containers whose port number is 6379 are matched and excluded from collection.

Note

Key-value pairs are in the logical OR relation. If the environment variables of a container match one of the specified key-value pairs, the container is filtered out.
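
The following fragment is a sketch of how the container-related parameters can be combined inside inputDetail to collect logs only from containers named nginx or cube that expose port 80 or 6379. It assumes that each whitelist or blacklist is expressed as a JSON object that maps a label or environment variable key to its value, where a value that starts with a caret (^) and ends with a dollar sign ($) is treated as a regular expression. The values are illustrative only.

    {
        "dockerFile": true,
        "dockerIncludeLabel": {
            "io.kubernetes.container.name": "^(nginx|cube)$"
        },
        "dockerIncludeEnv": {
            "NGINX_SERVICE_PORT": "^(80|6379)$"
        }
    }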

Configurations that are specific to log collection in full regex mode and simple mode

  • Parameters

    Parameter

    Type

    Required

    Example

    Description

    key

    array

    Yes

    ["content"]

    The fields that are specified for raw logs.

    logBeginRegex

    string

    No

    .*

    The regular expression that is used to match the beginning of the first line of a log.

    regex

    string

    No

    (.*)

    The regular expression that is used to extract the value of a field.

  • Configuration example

    {
        "configName": "logConfigName", 
        "outputType": "LogService", 
        "inputType": "file", 
        "inputDetail": {
            "logPath": "/logPath", 
            "filePattern": "*", 
            "logType": "common_reg_log", 
            "topicFormat": "default", 
            "discardUnmatch": false, 
            "enableRawLog": true, 
            "fileEncoding": "utf8", 
            "maxDepth": 10, 
            "key": [
                "content"
            ], 
            "logBeginRegex": ".*", 
            "regex": "(.*)"
        }, 
        "outputDetail": {
            "projectName": "test-project", 
            "logstoreName": "test-logstore"
        }
    }

Configurations that are specific to log collection in JSON mode

Parameter

Type

Required

Example

Description

timeKey

string

No

time

The key that is used to specify the time field.
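
The following is a sketch of a JSON mode configuration, modeled on the full regex and delimiter examples in this topic; logType is set to json_log and timeKey identifies the time field. The project, Logstore, and path names are placeholders.

    {
        "configName": "logConfigName",
        "outputType": "LogService",
        "inputType": "file",
        "inputDetail": {
            "logPath": "/logPath",
            "filePattern": "*.log",
            "logType": "json_log",
            "topicFormat": "default",
            "discardUnmatch": false,
            "enableRawLog": false,
            "fileEncoding": "utf8",
            "maxDepth": 10,
            "timeKey": "time",
            "timeFormat": "%Y/%m/%d %H:%M:%S"
        },
        "outputDetail": {
            "projectName": "test-project",
            "logstoreName": "test-logstore"
        }
    }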

Configurations that are specific to log collection in delimiter mode

  • Parameters

    Parameter

    Type

    Required

    Example

    Description

    separator

    string

    No

    ,

    The delimiter. You must select a delimiter based on the format of logs that you want to collect. For more information, see Collect logs in delimiter mode.

    quote

    string

    Yes

    \

    The quote. If a log field contains delimiters, you must specify a quote to enclose the field. Simple Log Service parses the content that is enclosed in a pair of quotes into a complete field. You must select a quote based on the format of logs that you want to collect. For more information, see Collect logs in delimiter mode.

    key

    array

    Yes

    [ "ip", "time"]

    The fields that are specified for raw logs.

    timeKey

    string

    Yes

    time

    The time field. You must specify a field in the value of key as the time field.

    autoExtend

    boolean

    No

    true

    Specifies whether to upload a log if the number of fields parsed from the log is less than the number of specified keys.

    For example, if you specify a vertical bar (|) as the delimiter, the log 11|22|33|44|55 is parsed into the following fields: 11, 22, 33, 44, and 55. You can set the keys to A, B, C, D, and E.

    • true: The log 11|22|33|55 is uploaded to Simple Log Service, and 55 is uploaded as the value of the D key.

    • false: The log 11|22|33|55 is discarded because the number of fields parsed from the log does not match the number of specified keys.

  • Configuration example

    {
        "configName": "logConfigName", 
        "logSample": "testlog", 
        "inputType": "file", 
        "outputType": "LogService", 
        "inputDetail": {
            "logPath": "/logPath", 
            "filePattern": "*", 
            "logType": "delimiter_log", 
            "topicFormat": "default", 
            "discardUnmatch": true, 
            "enableRawLog": true, 
            "fileEncoding": "utf8", 
            "maxDepth": 999, 
            "separator": ",", 
            "quote": "\"", 
            "key": [
                "ip", 
                "time"
            ], 
            "autoExtend": true
        }, 
        "outputDetail": {
            "projectName": "test-project", 
            "logstoreName": "test-logstore"
        }
    }

Configurations that are specific to Logtail plug-ins

  • Parameters

    The following table describes the configurations that are specific to log collection by using plug-ins.

    Parameter

    Type

    Required

    Example

    Description

    plugin

    JSON object

    Yes

    None

    If you use a Logtail plug-in to collect logs, you must configure this parameter. For more information, see Use Logtail plug-ins to collect data.

  • Configuration example

    {
        "configName": "logConfigName", 
        "outputType": "LogService", 
        "inputType": "plugin",
        "inputDetail": {
            "plugin": {
                "inputs": [
                    {
                        "detail": {
                            "ExcludeEnv": null, 
                            "ExcludeLabel": null, 
                            "IncludeEnv": null, 
                            "IncludeLabel": null, 
                            "Stderr": true, 
                            "Stdout": true
                        }, 
                        "type": "service_docker_stdout"
                    }
                ]
            }
        }, 
        "outputDetail": {
            "projectName": "test-project", 
            "logstoreName": "test-logstore"
        }
    }

outputDetail

outputDetail is used to specify the project and Logstore that store the collected logs.

Parameter

Type

Required

Example

Description

projectName

string

Yes

my-project

The name of the project. The name must be the same as the name of the project that you specify in the API request.

logstoreName

string

Yes

my-logstore

The name of the Logstore.

Additional information: ExactlyOnce write feature

After you enable the ExactlyOnce write feature, Logtail records fine-grained checkpoints by file to the disk of the server on which Logtail is installed. If exceptions such as a process error or a server restart occur during log collection, Logtail uses the checkpoints to determine the scope of data that must be processed in each file when log collection is resumed, and then uses the incremental sequence numbers that are provided by Simple Log Service to prevent duplicate data from being sent. However, the ExactlyOnce write feature consumes disk write resources. Limits:

  • Checkpoints are stored in the disk of the server on which Logtail is installed. If checkpoints are lost because the disk has no available space or becomes faulty, the checkpoints cannot be recovered.

  • Checkpoints record only the metadata information of a file. Checkpoints do not record the data of a file. If the file is deleted or modified, the checkpoints may not be recovered.

  • The ExactlyOnce write feature is based on the current write sequence numbers that are recorded by Simple Log Service. Each shard supports only 10,000 records. If the limit is exceeded, the earlier records are overwritten. To ensure reliability, make sure that the value that is calculated by using the following formula does not exceed 9500, and preferably leave a margin below that limit (see the example after this list): Value = Number of active files that are written to the same Logstore × Number of Logtail instances.

    • Number of active files: the number of files that are being read and sent. Files that are generated during log file rotation and have the same logical file name are sent in serial mode. These files are considered as one active file.

    • Number of Logtail instances: the number of Logtail processes. By default, each server hosts one Logtail instance. The number of Logtail instances is the same as the number of servers.
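
    For example, under these assumptions, if 20 Logtail instances (one per server) each write 100 active files to the same Logstore, the calculated value is 100 × 20 = 2,000, which is well below 9500. If each instance instead wrote 500 active files, the value would reach 10,000 and exceed the limit, so some sequence number records would be overwritten.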

By default, the sync command is not run when Logtail writes checkpoints to disk. This helps ensure performance. However, if buffered data fails to be written to disk when the server restarts, the checkpoints may be lost. To enable the sync-based write feature, you can add "enable_checkpoint_sync_write": true, to the startup configuration file /usr/local/ilogtail/ilogtail_config.json of Logtail. For more information, see Configure the startup parameters of Logtail.
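
For reference, the following is a sketch of that setting as it might appear in /usr/local/ilogtail/ilogtail_config.json. Other startup parameters that already exist in the file are omitted here and must be kept.

    {
        "enable_checkpoint_sync_write": true
    }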