A Logtail configuration is a set of policies that Logtail uses to collect logs. You can customize a Logtail configuration by setting parameters such as the data source and the collection mode. This topic describes the parameters of a Logtail configuration when you use the Log Service API to collect logs.

Basic parameters of a Logtail configuration

Parameter | Type | Required | Example | Description
configName | string | Yes | config-sample | The name of the Logtail configuration. The name must be unique in the project to which the Logtail configuration belongs. You cannot change the name after the Logtail configuration is created.

The name must meet the following requirements:

  • The name can contain only lowercase letters, digits, hyphens (-), and underscores (_).
  • The name must start and end with a lowercase letter or a digit.
  • The name must be 2 to 128 characters in length.
inputType | string | Yes | file | The type of the data source. Valid values:
  • plugin: Logs such as MySQL binary logs are collected by using Logtail plug-ins.
  • file: Text logs are collected by using existing modes, which include the full regex mode and delimiter mode.
inputDetail | JSON object | Yes | None | The detailed configuration of the data source. For more information, see inputDetail.
outputType | string | Yes | LogService | The output type of collected logs. The value is fixed as LogService. Collected logs can be uploaded only to Log Service.
outputDetail | JSON object | Yes | None | The detailed output configuration of collected logs. For more information, see outputDetail.
logSample | string | No | None | The sample log.
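
The naming rules for configName can be checked on the client before a request is sent. The following is a minimal sketch in Python. The function name is_valid_config_name is hypothetical and is not part of any Log Service SDK.

```python
import re

# Lowercase letters, digits, hyphens, and underscores only; the first and last
# characters must be a lowercase letter or a digit; 2 to 128 characters in total.
_CONFIG_NAME_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,126}[a-z0-9]")


def is_valid_config_name(name: str) -> bool:
    """Return True if name satisfies the three naming rules for configName."""
    return _CONFIG_NAME_RE.fullmatch(name) is not None
```

The check covers only the format of the name. Uniqueness within the project can be verified only by Log Service.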

inputDetail

Basic parameters

Parameter | Type | Required | Example | Description
filterKey | array | No | ["ip"] | The fields that are used to filter logs. A log is collected only when the values of these fields match the regular expressions that are specified in the filterRegex parameter.
filterRegex | array | No | ["^10.*"] | The regular expressions that are used to match the values of the fields specified in the filterKey parameter. The filterRegex array must contain the same number of elements as the filterKey array.
shardHashKey | array | No | ["__source__"] | The mode that is used to write data. By default, data is written to Log Service in load balancing mode. For more information, see Load balancing mode.

If you configure this parameter, data is written to Log Service in shard mode. For more information, see Shard mode. The __source__ field is supported.

enableRawLog | boolean | No | false | Specifies whether to upload raw logs. Valid values:
  • true
  • false
sensitive_keys | array | No | None | The configuration that is used to mask data. For more information, see sensitive_keys.
mergeType | string | No | topic | The method that is used to aggregate data. Valid values:
  • topic: Data is aggregated by topic. This is the default value.
  • logstore: Data is aggregated by Logstore.
delayAlarmBytes | int | No | 209715200 | The alert threshold of log collection progress. Default value: 209715200 (200 MB). The default value specifies that an alert is triggered if 200 MB of data is not collected within a specified period of time.
adjustTimezone | boolean | No | false | Specifies whether to change the time zone of logs. This parameter is valid only if the time parsing feature is enabled. For example, if you configure the timeFormat parameter, you can also configure the adjustTimezone parameter.
logTimezone | string | No | GMT+08:00 | The offset of the time zone. Format: GMT+HH:MM or GMT-HH:MM. For example, if you want to collect logs whose time zone is UTC+8, set the value to GMT+08:00.
advanced | JSON object | No | None | The advanced features. For more information, see advanced.
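
The relationship between filterKey and filterRegex can be illustrated with a short Python sketch. The function log_passes_filter is hypothetical. It assumes that each regular expression must match the whole field value and that a log without a filtered field is not collected. Neither assumption is stated in this topic, and Python's re module stands in for the regular expression engine that Logtail uses.

```python
import re


def log_passes_filter(log: dict, filter_key: list, filter_regex: list) -> bool:
    """Return True if every filtered field of the log matches its regular expression."""
    if len(filter_key) != len(filter_regex):
        raise ValueError("filterKey and filterRegex must have the same number of elements")
    for key, pattern in zip(filter_key, filter_regex):
        value = log.get(key)
        # Assumption: a missing field fails the filter, and the pattern must
        # match the whole value rather than a substring.
        if value is None or re.fullmatch(pattern, value) is None:
            return False
    return True
```

With the example values in the table, a log whose ip field is 10.1.2.3 passes the filter, and a log whose ip field is 192.168.0.1 does not.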

sensitive_keys

  • Parameters
    Parameter | Type | Required | Example | Description
    key | string | Yes | content | The name of the log field.
    type | string | Yes | const | The method that is used to mask the content of the log field. Valid values:
    • const: The sensitive content is replaced by the value of the const field.
    • md5: The sensitive content is replaced by the MD5 hash value that is generated for the content.
    regex_begin | string | Yes | 'password':' | The regular expression that is used to match the prefix of sensitive content. The prefix is used to locate the sensitive content. You must use the RE2 syntax to specify the regular expression. For more information, see RE2 syntax.
    regex_content | string | Yes | [^']* | The regular expression that is used to match sensitive content. You must use the RE2 syntax to specify the regular expression. For more information, see RE2 syntax.
    all | boolean | Yes | true | Specifies whether to replace all sensitive content in the log field. Valid values:
    • true: replaces all sensitive content in the log field. This is the recommended value.
    • false: replaces only the sensitive content that the specified regular expressions match for the first time in the log field.
    const | string | No | "********" | The string that replaces the sensitive content. If you set the type parameter to const, you must configure this parameter.
  • Configuration example
    For example, the value of the content field in a log is [{'account':'1812213231432969','password':'04a23f38'}, {'account':'1812213685634','password':'123a'}]. To replace the values of password with ********, use the following settings for the sensitive_keys parameter:
    sensitive_keys = [{
        "all": true,
        "const": "********",
        "regex_content": "[^']*",
        "regex_begin": "'password':'",
        "type": "const",
        "key": "content"
    }]
  • Sample log after masking
    [{'account':'1812213231432969','password':'********'}, {'account':'1812213685634','password':'********'}]
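
The masking behavior described above can be approximated in Python. This is a sketch, not the Logtail implementation. The function mask_field is hypothetical, Python's re module stands in for RE2, and the md5 branch assumes that the hash is written as a hexadecimal string.

```python
import hashlib
import re


def mask_field(value: str, rule: dict) -> str:
    """Apply one sensitive_keys rule to the value of a log field."""
    # The prefix locates the sensitive content; only the content part is replaced.
    pattern = re.compile(
        "(?P<begin>" + rule["regex_begin"] + ")(?P<content>" + rule["regex_content"] + ")"
    )

    def replace(match):
        if rule["type"] == "const":
            masked = rule["const"]
        else:  # "md5": assumed to be the hexadecimal digest of the content
            masked = hashlib.md5(match.group("content").encode("utf-8")).hexdigest()
        return match.group("begin") + masked

    # count=0 replaces every match; count=1 replaces only the first match.
    return pattern.sub(replace, value, count=0 if rule["all"] else 1)
```

If all is set to false in the rule from the configuration example, only the first password value in the field is replaced.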

advanced

Parameter | Type | Required | Example | Description
enable_root_path_collection | boolean | No | false | Specifies whether to allow data collection from Windows root directories, such as D:\log*. Valid values:
  • true
  • false (default)
Important
  • This parameter is a global parameter. If you set this parameter to true for a specific Logtail configuration, Logtail is allowed to collect data from root directories based on all the Logtail configurations that are applied to the same server as the specific Logtail configuration. The setting of this parameter is valid until Logtail on the server is restarted.
  • This parameter is available only if you install Logtail V1.0.0.22 or later on a Windows server.
exactly_once_concurrency | int | No | 1 | Specifies whether to enable the ExactlyOnce write feature. If the feature is enabled, the value also specifies the maximum number of log groups that can be concurrently sent when Logtail collects data from a file. For more information, see Additional information: ExactlyOnce write feature. Valid values: 0 to 512.
  • 0: The ExactlyOnce write feature is disabled.
  • Other values: The ExactlyOnce write feature is enabled. The value specifies the maximum number of log groups that can be concurrently sent when Logtail collects data from a file.
Important
  • If you set this parameter to a larger value, more memory and disk overheads are generated. We recommend that you configure this parameter based on the local write traffic.
  • If the value of this parameter is less than the number of shards in the Logstore that is used, Logtail randomly processes data to ensure that the data is evenly written to each shard. The value that you specify for this parameter can be different from the number of shards.
  • The setting takes effect only for the files that are generated after you configure this parameter.
  • Only Logtail V1.0.21 and later support this parameter.
enable_log_position_meta | boolean | No | true | Specifies whether to add the metadata information of a source log file to each log that is collected from the file. The metadata information includes the __tag__:__inode__ and __file_offset__ fields. Valid values:
  • true
  • false
Note Only Logtail V1.0.21 and later support this parameter.
specified_year | uint | No | 0 | The year that is used to supplement the log time if the time of a raw log does not contain the year information. You can specify the current year or a different year. Valid values:
  • 0: The current year is used.
  • Specific year: A year other than the current year is used. Example: 2020.
Note Only Logtail V1.0.21 and later support this parameter.
force_multiconfig | boolean | No | false | Specifies whether Logtail can use the Logtail configuration to collect data from the files that are matched based on other Logtail configurations. Default value: false.

If you want Logtail to collect data from a file by using different Logtail configurations, you can configure this parameter. For example, you can configure this parameter to collect data from a file to two Logstores by using two Logtail configurations.

raw_log_tag | string | No | __raw__ | The field that is used to store raw logs that are uploaded. Default value: __raw__.
blacklist | object | No | None | The blacklist configuration. For more information, see Parameters of blacklist.
tail_size_kb | int | No | 1024 | The size of data that is collected from a file the first time Logtail reads the file. The value determines the start position from which Logtail collects data. The first time Logtail reads a file, Logtail can read up to 1,024 KB of data in the file by default.
  • If the file size is less than 1,024 KB, Logtail collects data from the beginning of the file.
  • If the file size is greater than 1,024 KB, Logtail collects the last 1,024 KB of data in the file.

Valid values: 0 to 10485760. Unit: KB. You can change the value.

batch_send_interval | int | No | 3 | The interval at which aggregated data is sent. Default value: 3. Unit: seconds.
max_rotate_queue_size | int | No | 20 | The maximum length of the queue in which a file is rotated. Default value: 20.
enable_precise_timestamp | boolean | No | false | Specifies whether to extract time values with high precision. Default value: false, which specifies that time values with high precision are not extracted.
If you set this parameter to true, Logtail automatically parses the specified time values into timestamps with millisecond precision and stores the timestamps in the field specified by the precise_timestamp_key parameter.
Note
  • Make sure that Use System Time is disabled in the Logtail configuration.
  • Only Logtail V1.0.32 and later support this parameter.
precise_timestamp_key | string | No | "precise_timestamp" | The field that stores timestamps with high precision. Default value: precise_timestamp.
precise_timestamp_unit | string | No | "ms" | The unit of the timestamps with high precision. Default value: ms. Valid values: ms, us, and ns.
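
The effect of tail_size_kb on the start position of the first read can be expressed as a small calculation. The following Python sketch uses a hypothetical function name and ignores details that this topic does not describe, such as alignment to line boundaries.

```python
def initial_read_offset(file_size_bytes: int, tail_size_kb: int = 1024) -> int:
    """Return the byte offset at which the first read of a file would start."""
    tail_bytes = tail_size_kb * 1024
    # A file smaller than the tail size is read from the beginning;
    # otherwise only the last tail_size_kb KB are read.
    return max(0, file_size_bytes - tail_bytes)
```

With the default value, the first read of a 3 MB file would start 2 MB into the file, and a 500 KB file would be read from the beginning.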
The following table describes the parameters of blacklist.
Parameter | Type | Required | Example | Description
dir_blacklist | array | No | ["/home/admin/dir1", "/home/admin/dir2*"] | The blacklist of directories, which must be absolute paths. You can use asterisks (*) as wildcard characters to match multiple directories.

For example, if you specify /home/admin/dir1, all files in the /home/admin/dir1 directory are skipped during log collection.

filename_blacklist | array | No | ["app*.log", "password"] | The blacklist of file names. The files that match a name specified in this parameter are skipped during log collection regardless of the directories to which the files belong. You can use asterisks (*) as wildcard characters to match multiple file names.
filepath_blacklist | array | No | ["/home/admin/private*.log"] | The blacklist of file paths, which must be absolute paths. You can use asterisks (*) as wildcard characters to match multiple files.

For example, if you specify /home/admin/private*.log, all files whose name starts with private and ends with .log in the /home/admin/ directory are skipped during log collection.
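
The three blacklist parameters can be pictured with a Python sketch that uses fnmatch-style wildcards. The function is_blacklisted is hypothetical, and the exact matching rules of Logtail may differ. In particular, the sketch assumes that files in subdirectories of a blacklisted directory are also skipped, and the asterisk in fnmatch also matches the path separator.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_blacklisted(file_path: str, blacklist: dict) -> bool:
    """Return True if file_path would be skipped under the given blacklist."""
    path = PurePosixPath(file_path)

    # filepath_blacklist: the whole absolute path is compared.
    if any(fnmatchcase(str(path), p) for p in blacklist.get("filepath_blacklist", [])):
        return True

    # filename_blacklist: only the file name is compared, whatever the directory.
    if any(fnmatchcase(path.name, p) for p in blacklist.get("filename_blacklist", [])):
        return True

    # dir_blacklist: assumption that subdirectories are skipped as well,
    # so every ancestor directory of the file is compared.
    for pattern in blacklist.get("dir_blacklist", []):
        if any(fnmatchcase(str(parent), pattern) for parent in path.parents):
            return True
    return False
```

With the example values in the table, /home/admin/dir1/a.log, /var/log/app1.log, and /home/admin/private1.log are skipped, and /home/admin/public.log is collected.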

Configurations that are specific to Logtail for text log collection

Basic parameters

Parameter | Type | Required | Example | Description
logType | string | Yes | common_reg_log | The mode in which logs are collected. Valid values:
  • json_log: collects logs in JSON mode.
  • common_reg_log: collects logs in full regex mode.
  • delimiter_log: collects logs in delimiter mode.
logPath | string | Yes | /var/log/http/ | The log file path.
filePattern | string | Yes | access*.log | The log file name.
topicFormat | string | Yes | none | The method that is used to generate a topic. Valid values:
  • none: No log topics are generated.
  • default: The log file path is used as the topic of collected logs.
  • group_topic: The topic of the machine group to which the Logtail configuration is applied is used as the topic of collected logs.
  • Regular expression that is used to match a log file path: A part of the log file path is used as the topic of collected logs. Example: /var/log/(.*).log.

For more information, see Log topics.

timeFormat | string | No | %Y/%m/%d %H:%M:%S | The format of the log time. For more information, see Time formats.
preserve | boolean | No | true | Specifies whether to use the timeout mechanism on log files. If a log file is not updated within a specified period of time, Logtail considers the file to be timed out. Valid values:
  • true: Log files never time out. This is the default value.
  • false: If a log file is not updated within 30 minutes, Logtail considers the file to be timed out and no longer monitors the file.
preserveDepth | integer | No | 1 | The maximum levels of directories whose files are monitored by using the timeout mechanism. If you set the preserve parameter to false, you must configure this parameter. Valid values: 1 to 3.
fileEncoding | string | No | utf8 | The encoding format of log files. Valid values: utf8 and gbk.
discardUnmatch | boolean | No | true | Specifies whether to discard the logs that fail to be matched. Valid values:
  • true
  • false
maxDepth | int | No | 100 | The maximum levels of directories that are monitored. Valid values: 0 to 1000. The value 0 indicates that only the specified log file directory is monitored.
delaySkipBytes | int | No | 0 | The threshold that is used to determine whether to discard data if the data is not collected within a specified period of time. Valid values:
  • 0: Data is not discarded. This is the default value.
  • Other values: Data is discarded. For example, if the size of data that is not collected within a specified period of time exceeds the specified threshold, such as 1024 KB, the data is discarded.
dockerFile | boolean | No | false | Specifies whether to collect logs from container files. Default value: false.
dockerIncludeLabel | JSON object | No | None | The container label whitelist. The whitelist specifies the containers from which logs or stdout and stderr are collected. This parameter is empty by default, which indicates that logs or stdout and stderr are collected from all containers. When you configure the container label whitelist, the LabelKey parameter is required, and the LabelValue parameter is optional.
  • If the LabelValue parameter is empty, containers whose container labels contain the keys specified by LabelKey are matched.
  • If the LabelValue parameter is not empty, containers whose container labels consist of the key-value pairs specified by LabelKey and LabelValue are matched.

    By default, string matching is performed for the values of the LabelValue parameter. Containers are matched only if the values of the container labels are the same as the values of the LabelValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($) for the LabelValue parameter, regular expression matching is performed. For example, if you set the LabelKey parameter to io.kubernetes.container.name and set the LabelValue parameter to ^(nginx|cube)$, a container named nginx and a container named cube are matched.

Note
  • Do not specify duplicate values for the LabelKey parameter. If you specify duplicate values for the LabelKey parameter, only one of the values takes effect.
  • Multiple key-value pairs are evaluated by using a logical OR. If the labels of a container match one of the specified key-value pairs, the container is matched.
dockerExcludeLabel | JSON object | No | None | The container label blacklist. The blacklist specifies the containers from which logs or stdout and stderr are not collected. This parameter is empty by default, which indicates that logs or stdout and stderr are collected from all containers. When you configure the container label blacklist, the LabelKey parameter is required, and the LabelValue parameter is optional.
  • If the LabelValue parameter is empty, containers whose container labels contain the keys specified by LabelKey are filtered out.
  • If the LabelValue parameter is not empty, containers whose container labels consist of the key-value pairs specified by LabelKey and LabelValue are filtered out.

    By default, string matching is performed for the values of the LabelValue parameter. Containers are matched only if the values of the container labels are the same as the values of the LabelValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($) for the LabelValue parameter, regular expression matching is performed. For example, if you set the LabelKey parameter to io.kubernetes.container.name and set the LabelValue parameter to ^(nginx|cube)$, a container named nginx and a container named cube are matched and therefore filtered out.

Note
  • Do not specify duplicate values for the LabelKey parameter. If you specify duplicate values for the LabelKey parameter, only one of the values takes effect.
  • Multiple key-value pairs are evaluated by using a logical OR. If the labels of a container match one of the specified key-value pairs, the container is filtered out.
dockerIncludeEnv | JSON object | No | None | The environment variable whitelist. The whitelist specifies the containers from which logs or stdout and stderr are collected. This parameter is empty by default, which indicates that logs or stdout and stderr are collected from all containers. When you configure the environment variable whitelist, the EnvKey parameter is required, and the EnvValue parameter is optional.
  • If the EnvValue parameter is empty, containers whose environment variables contain the keys specified by EnvKey are matched.
  • If the EnvValue parameter is not empty, containers whose environment variables consist of the key-value pairs specified by EnvKey and EnvValue are matched.

    By default, string matching is performed for the values of the EnvValue parameter. Containers are matched only if the values of the environment variables are the same as the values of the EnvValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($) for the EnvValue parameter, regular expression matching is performed. For example, if you set the EnvKey parameter to NGINX_SERVICE_PORT and set the EnvValue parameter to ^(80|6379)$, containers whose port number is 80 and containers whose port number is 6379 are matched.

Note Multiple key-value pairs are evaluated by using a logical OR. If the environment variables of a container match one of the specified key-value pairs, the container is matched.
dockerExcludeEnv | JSON object | No | None | The environment variable blacklist. The blacklist specifies the containers from which logs or stdout and stderr are not collected. This parameter is empty by default, which indicates that logs or stdout and stderr are collected from all containers. When you configure the environment variable blacklist, the EnvKey parameter is required, and the EnvValue parameter is optional.
  • If the EnvValue parameter is empty, containers whose environment variables contain the keys specified by EnvKey are filtered out.
  • If the EnvValue parameter is not empty, containers whose environment variables consist of the key-value pairs specified by EnvKey and EnvValue are filtered out.

    By default, string matching is performed for the values of the EnvValue parameter. Containers are matched only if the values of the environment variables are the same as the values of the EnvValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($) for the EnvValue parameter, regular expression matching is performed. For example, if you set the EnvKey parameter to NGINX_SERVICE_PORT and set the EnvValue parameter to ^(80|6379)$, containers whose port number is 80 and containers whose port number is 6379 are matched and therefore filtered out.

Note Multiple key-value pairs are evaluated by using a logical OR. If the environment variables of a container match one of the specified key-value pairs, the container is filtered out.
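
The matching rules shared by the label and environment variable lists (key only, exact string, or regular expression, combined with a logical OR) can be sketched in Python as follows. All function names are hypothetical, and Python's re module stands in for the regular expression engine that Logtail uses. How whitelists and blacklists interact when both are configured is not described in this topic, so the sketch evaluates each list separately.

```python
import re


def value_matches(rule_value: str, actual: str) -> bool:
    """Compare one rule value with an actual label or environment variable value."""
    if rule_value.startswith("^") and rule_value.endswith("$"):
        return re.search(rule_value, actual) is not None  # regular expression matching
    return rule_value == actual  # plain string matching


def rules_match(rules: dict, actual: dict) -> bool:
    """Return True if at least one key-value rule matches (logical OR)."""
    for key, rule_value in rules.items():
        if key not in actual:
            continue
        # An empty rule value matches every container that has the key.
        if rule_value == "" or value_matches(rule_value, actual[key]):
            return True
    return False


def allowed_by_whitelist(rules: dict, actual: dict) -> bool:
    """An empty whitelist allows all containers."""
    return not rules or rules_match(rules, actual)


def blocked_by_blacklist(rules: dict, actual: dict) -> bool:
    """An empty blacklist filters out no containers."""
    return bool(rules) and rules_match(rules, actual)
```

For example, the rule {"io.kubernetes.container.name": "^(nginx|cube)$"} matches containers named nginx or cube and does not match a container named redis.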

Configurations that are specific to log collection in full regex mode and simple mode

  • Parameters
    Parameter | Type | Required | Example | Description
    key | array | Yes | ["content"] | The fields that are specified for raw logs.
    logBeginRegex | string | No | .* | The regular expression that is used to match the beginning of the first line of a log.
    regex | string | No | (.*) | The regular expression that is used to extract the value of a field.
  • Configuration example
    {
        "configName": "logConfigName", 
        "outputType": "LogService", 
        "inputType": "file", 
        "inputDetail": {
            "logPath": "/logPath", 
            "filePattern": "*", 
            "logType": "common_reg_log", 
            "topicFormat": "default", 
            "discardUnmatch": false, 
            "enableRawLog": true, 
            "fileEncoding": "utf8", 
            "maxDepth": 10, 
            "key": [
                "content"
            ], 
            "logBeginRegex": ".*", 
            "regex": "(.*)"
        }, 
        "outputDetail": {
            "projectName": "test-project", 
            "logstoreName": "test-logstore"
        }
    }
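
Before a full regex mode configuration is submitted, the regex and key values can be tested locally against a sample log. The following Python sketch assumes that each capture group in regex maps, in order, to one element of key and that the expression must match the whole log. The function extract_fields is hypothetical, and Python's re module stands in for the regular expression engine that Logtail uses.

```python
import re


def extract_fields(line: str, regex: str, keys: list):
    """Map the capture groups of regex onto keys for one log line."""
    pattern = re.compile(regex)
    if pattern.groups != len(keys):
        raise ValueError("the number of capture groups must equal the number of keys")
    match = pattern.fullmatch(line)
    if match is None:
        return None  # unmatched log; the discardUnmatch parameter decides what happens next
    return dict(zip(keys, match.groups()))
```

With the values in the configuration example, the whole log line is stored in the content field.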

Configurations that are specific to log collection in JSON mode

Parameter | Type | Required | Example | Description
timeKey | string | No | time | The key that is used to specify the time field.

Configurations that are specific to log collection in delimiter mode

  • Parameters
    Parameter | Type | Required | Example | Description
    separator | string | No | , | The delimiter. You must select a delimiter based on the format of logs that you want to collect. For more information, see Collect logs in delimiter mode.
    quote | string | Yes | \ | The quote. If a log field contains delimiters, you must specify a quote to enclose the field. Log Service parses the content that is enclosed in a pair of quotes into a complete field. You must select a quote based on the format of logs that you want to collect. For more information, see Collect logs in delimiter mode.
    key | array | Yes | ["ip", "time"] | The fields that are specified for raw logs.
    timeKey | string | Yes | time | The time field. You must specify a field in the value of key as the time field.
    autoExtend | boolean | No | true | Specifies whether to upload a log if the number of fields parsed from the log is less than the number of specified keys.
    For example, if you specify a vertical bar (|) as the delimiter, the log 11|22|33|44|55 is parsed into the following fields: 11, 22, 33, 44, and 55. You can set the keys to A, B, C, D, and E.
    • true: The log 11|22|33|55 is uploaded to Log Service, and 55 is uploaded as the value of the D key.
    • false: The log 11|22|33|55 is discarded because the number of fields parsed from the log does not match the number of specified keys.
  • Configuration example
    {
        "configName": "logConfigName", 
        "logSample": "testlog", 
        "inputType": "file", 
        "outputType": "LogService", 
        "inputDetail": {
            "logPath": "/logPath", 
            "filePattern": "*", 
            "logType": "delimiter_log", 
            "topicFormat": "default", 
            "discardUnmatch": true, 
            "enableRawLog": true, 
            "fileEncoding": "utf8", 
            "maxDepth": 999, 
            "separator": ",", 
            "quote": "\"", 
            "key": [
                "ip", 
                "time"
            ], 
            "autoExtend": true
        }, 
        "outputDetail": {
            "projectName": "test-project", 
            "logstoreName": "test-logstore"
        }
    }
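
The effect of the autoExtend parameter can be reproduced with a short Python sketch. The function parse_delimited is hypothetical, and the csv module is used as a stand-in for the delimiter parser of Logtail. The case in which a log contains more fields than keys is not described in this topic. The sketch discards such logs, which is an assumption.

```python
import csv


def parse_delimited(line: str, separator: str, keys: list, auto_extend: bool, quote: str = '"'):
    """Split one log line and pair the parsed fields with the keys."""
    fields = next(csv.reader([line], delimiter=separator, quotechar=quote))
    if len(fields) == len(keys):
        return dict(zip(keys, fields))
    if len(fields) < len(keys) and auto_extend:
        # Fewer fields than keys: the parsed fields fill the leading keys.
        return dict(zip(keys, fields))
    return None  # the log is discarded (assumption when there are more fields than keys)
```

With the keys A, B, C, D, and E and a vertical bar as the delimiter, the log 11|22|33|55 is parsed into four fields, and 55 becomes the value of the D key when auto_extend is true. When auto_extend is false, the function returns None.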

Configurations that are specific to Logtail plug-ins

  • Parameters

    The following table describes the configurations that are specific to log collection by using plug-ins.

    Parameter | Type | Required | Example | Description
    plugin | JSON object | Yes | None | If you use a Logtail plug-in to collect logs, you must configure this parameter. For more information, see Use Logtail plug-ins to collect data.
  • Configuration example
    {
        "configName": "logConfigName", 
        "outputType": "LogService", 
        "inputType": "plugin",
        "inputDetail": {
            "plugin": {
                "inputs": [
                    {
                        "detail": {
                            "ExcludeEnv": null, 
                            "ExcludeLabel": null, 
                            "IncludeEnv": null, 
                            "IncludeLabel": null, 
                            "Stderr": true, 
                            "Stdout": true
                        }, 
                        "type": "service_docker_stdout"
                    }
                ]
            }
        }, 
        "outputDetail": {
            "projectName": "test-project", 
            "logstoreName": "test-logstore"
        }
    }

outputDetail

outputDetail is used to specify the project and Logstore that store the collected logs.
Parameter | Type | Required | Example | Description
projectName | string | Yes | my-project | The name of the project. The name must be the same as the name of the project that you specify in the API request.
logstoreName | string | Yes | my-logstore | The name of the Logstore.

Additional information: ExactlyOnce write feature

After you enable the ExactlyOnce write feature, Logtail records fine-grained checkpoints by file to the disk of the server on which Logtail is installed. If exceptions such as a process error or a server restart occur during log collection, Logtail uses the checkpoints to determine the scope of data that must be processed in each file when log collection is resumed, and then uses the incremental sequence numbers that are provided by Log Service to prevent duplicate data from being sent. However, the ExactlyOnce write feature consumes disk write resources.

The feature has the following limits:
  • Checkpoints are stored in the disk of the server on which Logtail is installed. If checkpoints are lost because the disk has no available space or becomes faulty, the checkpoints cannot be recovered.
  • Checkpoints record only the metadata information of a file. Checkpoints do not record the data of a file. If the file is deleted or modified, the checkpoints may not be recovered.
  • The ExactlyOnce write feature is based on the current write sequence numbers that are recorded by Log Service. Each shard supports only 10,000 records. If the limit is exceeded, the previous records are replaced. To ensure reliability, make sure that the value that is calculated by using the following formula does not exceed 9500: Value = Number of active files that are written to the same Logstore × Number of Logtail instances. We recommend that you reserve a gap between the value and 9500.
    • Number of active files: the number of files that are being read and sent. Files that are generated during log file rotation and have the same logical file name are sent in serial mode. These files are considered as one active file.
    • Number of Logtail instances: the number of Logtail processes. By default, each server hosts one Logtail instance. The number of Logtail instances is the same as the number of servers.
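
The capacity formula above can be checked with a few lines of Python. The function names are hypothetical. The threshold of 9500 is taken from the limit described above.

```python
RECOMMENDED_LIMIT = 9500  # stay below the 10,000 records that each shard supports


def exactly_once_load(active_files: int, logtail_instances: int) -> int:
    """Value = number of active files written to the same Logstore x number of Logtail instances."""
    return active_files * logtail_instances


def within_recommended_limit(active_files: int, logtail_instances: int) -> bool:
    """Return True if the calculated value does not exceed 9500."""
    return exactly_once_load(active_files, logtail_instances) <= RECOMMENDED_LIMIT
```

For example, 50 active files on each of 100 servers give a value of 5,000, which is within the limit. 100 active files on each of 100 servers give 10,000, which exceeds the limit.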

By default, the sync command is not run when Logtail writes checkpoints to disk. This helps ensure performance. However, if buffered data fails to be written to disk when the server restarts, the checkpoints may be lost. To enable the sync-based write feature, add the "enable_checkpoint_sync_write": true setting to the startup configuration file of Logtail, which is /usr/local/ilogtail/ilogtail_config.json. For more information, see Configure the startup parameters of Logtail.
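
The setting can also be added with a script. The following Python sketch assumes that the startup configuration file is a plain JSON object and that the script has permission to write to the file. The function name is hypothetical. Because the file holds startup parameters, the change presumably takes effect only after Logtail is restarted.

```python
import json


def enable_checkpoint_sync_write(config_path: str) -> dict:
    """Set enable_checkpoint_sync_write to true in a Logtail startup configuration file."""
    with open(config_path, "r", encoding="utf-8") as f:
        config = json.load(f)

    config["enable_checkpoint_sync_write"] = True

    # Rewrite the whole file; all other startup parameters are kept unchanged.
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=4)
    return config
```

On a server that uses the default installation path, the function would be called with /usr/local/ilogtail/ilogtail_config.json as the argument.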