A Logtail configuration is a set of policies that Logtail uses to collect logs. You can customize a Logtail configuration by setting parameters such as the data source and the collection mode. This topic describes the parameters of a Logtail configuration that you set when you use the Log Service API to collect logs.

Basic parameters

The following table describes the basic parameters of a Logtail configuration.
Table 1. Basic parameters of a Logtail configuration
Parameter Type Required Example Description
configName string Yes config-sample The name of the Logtail configuration. The name must be unique in the project to which the Logtail configuration belongs. After you create a Logtail configuration, you cannot change the name of the Logtail configuration.

The name must meet the following requirements:

  • The name can contain only lowercase letters, digits, hyphens (-), and underscores (_).
  • The name must start and end with a lowercase letter or a digit.
  • The name must be 2 to 128 characters in length.
inputType string Yes file The type of the data source. Valid values:
  • plugin: Logs such as MySQL binary logs are collected by using Logtail plug-ins.
  • file: Text logs are collected by using existing modes, which include the full regex mode and delimiter mode.
inputDetail JSON object Yes None The detailed configuration of the data source. For more information, see inputDetail.
outputType string Yes LogService The output type of collected logs. The value is fixed as LogService. Collected logs can be uploaded only to Log Service.
outputDetail JSON object Yes None The detailed configuration of collected logs. For more information, see outputDetail.
logSample string No None The sample log.
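
The naming rules for configName listed in Table 1 can be checked before an API request is sent. The following is a minimal sketch, not part of Log Service; the function name is_valid_config_name is hypothetical.

```python
import re

# Lowercase letters, digits, hyphens, and underscores only.
# First and last characters must be a lowercase letter or a digit.
# Total length: 2 to 128 characters (1 + up to 126 + 1).
_CONFIG_NAME_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,126}[a-z0-9]")


def is_valid_config_name(name: str) -> bool:
    """Return True if name satisfies the configName rules in Table 1."""
    return _CONFIG_NAME_RE.fullmatch(name) is not None


print(is_valid_config_name("config-sample"))
```

The check covers only the format of the name. Uniqueness within the project can be verified only by Log Service.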

inputDetail

The inputDetail parameter is used to configure the details about the data source.

  • Basic parameters of inputDetail
    Table 2. Basic parameters of inputDetail
    Parameter Type Required Example Description
    filterKey array No ["ip"] The fields that are used to filter logs. A log is collected only when the values of the fields in the log match the regular expressions that are specified in the filterRegex parameter.
    filterRegex array No ["^10.*"] The regular expressions that are used to match the values of the fields. The fields are specified in the filterKey parameter. The number of elements in the value of the filterRegex parameter must be the same as the number of elements in the value of the filterKey parameter.
    shardHashKey array No ["__topic__"] The mode that is used to write data. By default, data is written to Log Service in load balancing mode. For more information, see Load balancing mode.

    If you configure this parameter, data is written to Log Service in shard mode. For more information, see Shard mode. Valid values: __topic__, __hostname__, and __source__.

    enableRawLog boolean No false Specifies whether to upload raw logs. Valid values:
    • true
    • false
    sensitive_keys array No None The configuration that is used to mask data. For more information, see Table 3.
    mergeType string No topic The method that is used to aggregate data. Valid values:
    • topic: Data is aggregated by topic. This is the default value.
    • logstore: Data is aggregated by Logstore.
    delayAlarmBytes int No 209715200 The alert threshold of log collection progress. Default value: 209715200. This value specifies that an alert is triggered if 200 MB of data is not collected within a specified period of time.
    adjustTimezone boolean No false Specifies whether to change the time zone of logs. This parameter is valid only if the time parsing feature is enabled. For example, if you configure the timeFormat parameter, you can also configure the adjustTimezone parameter.
    logTimezone string No GMT+08:00 The offset of the time zone. Format: GMT+HH:MM or GMT-HH:MM. For example, if you want to collect logs whose time zone is UTC+8, set the value to GMT+08:00.
    advanced JSON object No None The advanced features. For more information, see Table 4.
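
The relationship between filterKey and filterRegex in Table 2 can be illustrated with a short sketch. It assumes that each regular expression must match the entire field value and that a log without one of the listed fields is not collected; neither detail is stated in this topic. The function name passes_filter is hypothetical, and Python's re module is used only for illustration.

```python
import re


def passes_filter(log: dict, filter_key: list, filter_regex: list) -> bool:
    """Return True if every field in filter_key matches its paired regex."""
    # The two arrays must contain the same number of elements.
    if len(filter_key) != len(filter_regex):
        raise ValueError("filterKey and filterRegex must have the same length")
    for key, pattern in zip(filter_key, filter_regex):
        value = log.get(key)
        # Assumption: a missing field, or a value that is not fully matched,
        # causes the log to be skipped.
        if value is None or re.fullmatch(pattern, value) is None:
            return False
    return True


print(passes_filter({"ip": "10.0.0.1"}, ["ip"], ["^10.*"]))
```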
    The following table describes the parameters of the sensitive_keys parameter.
    • Parameter settings
      Table 3. Parameters of sensitive_keys
      Parameter Type Required Example Description
      key string Yes content The name of the log field.
      type string Yes const The method that is used to mask the content of the log field. Valid values:
      • const: The sensitive content is replaced by the value of the const field.
      • md5: The sensitive content is replaced by the MD5 value that is generated for the content.
      regex_begin string Yes 'password':' The regular expression that is used to match the prefix of sensitive content. The regular expression is used to find sensitive content. You must use the RE2 syntax to specify the regular expression. For more information, see RE2 syntax.
      regex_content string Yes [^']* The regular expression that is used to match sensitive content. You must use the RE2 syntax to specify the regular expression. For more information, see RE2 syntax.
      all boolean Yes true Specifies whether to replace all sensitive content in the log field. Valid values:
      • true: replaces all sensitive content in the log field. This is the recommended value.
      • false: replaces only the sensitive content that the specified regular expressions match for the first time in the log field.
      const string No "********" If you set the type parameter to const, you must configure this parameter.
    • Configuration example
      To mask the password values in the content field whose value is [{'account':'1812213231432969','password':'04a23f38'}, {'account':'1812213685634','password':'123a'}] and replace each of them with ********, use the following settings for the sensitive_keys parameter:
      {
          "key": "content",
          "type": "const",
          "regex_begin": "'password':'",
          "regex_content": "[^']*",
          "all": true,
          "const": "********"
      }
    • Sample log after masking
      [{'account':'1812213231432969','password':'********'}, {'account':'1812213685634','password':'********'}]
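
The masking behavior described in Table 3 can be reproduced with the following sketch. It is an illustration rather than the Logtail implementation: it uses Python's re module instead of RE2, and the function name mask_field and the lowercase hexadecimal MD5 output are assumptions.

```python
import hashlib
import re


def mask_field(value, regex_begin, regex_content,
               mask_type="const", const="", replace_all=True):
    """Replace the content that follows regex_begin and matches regex_content."""
    # Named groups keep the prefix and the sensitive part apart even if the
    # supplied expressions contain their own unnamed groups.
    pattern = re.compile(
        "(?P<begin>" + regex_begin + ")(?P<content>" + regex_content + ")"
    )

    def _replace(match):
        if mask_type == "md5":
            masked = hashlib.md5(match.group("content").encode("utf-8")).hexdigest()
        else:
            masked = const
        # The prefix is kept; only the sensitive content is replaced.
        return match.group("begin") + masked

    # count=0 replaces every match; count=1 replaces only the first one.
    return pattern.sub(_replace, value, count=0 if replace_all else 1)


raw = ("[{'account':'1812213231432969','password':'04a23f38'}, "
       "{'account':'1812213685634','password':'123a'}]")
print(mask_field(raw, "'password':'", "[^']*", const="********"))
```

With replace_all set to False, which corresponds to "all": false, only the first password in the field is replaced.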
    The following table describes the parameters of the advanced parameter.
    Table 4. Parameters of advanced
    Parameter Type Required Example Description
    enable_root_path_collection boolean No false Specifies whether to allow data collection from Windows root directories, such as D:\log*. Valid values:
    • true
    • false (default)
    Notice
    • This parameter is a global parameter. If you set this parameter to true for a specific Logtail configuration, Logtail is allowed to collect data from root directories based on all the Logtail configurations that are applied on the same server as the specific Logtail configuration. The setting of this parameter is valid until Logtail on the server restarts.
    • This parameter is available only if you install Logtail V1.0.0.22 or later on a Windows server.
    exactly_once_concurrency int No 1 Specifies whether to enable the ExactlyOnce write feature. If the feature is enabled, the value also sets the maximum number of log groups that can be concurrently sent when Logtail collects data from a file. For more information, see Additional information: ExactlyOnce write feature. Valid values: 0 to 512.
    • 0: The ExactlyOnce write feature is disabled.
    • Other values: The ExactlyOnce write feature is enabled. The value specifies the maximum number of log groups that can be concurrently sent when Logtail collects data from a file.
    Notice
    • If you set this parameter to a large value, more memory and disk overheads are generated. We recommend that you set this parameter based on the local write traffic.
    • If the value of this parameter is less than the number of shards in the Logstore that is used, Logtail randomly processes data to ensure that the data is evenly written to each shard. The value that you specify for this parameter can be different from the number of shards.
    • The setting takes effect only for the files that are generated after you configure this parameter.
    • Only Logtail V1.0.21 and later support this parameter.
    enable_log_position_meta boolean No true Specifies whether to add the metadata information of a source log file to each log that is collected from the file. The metadata information includes the __tag__:__inode__ and __file_offset__ fields. Valid values:
    • true
    • false
    Note Only Logtail V1.0.21 and later support this parameter.
    specified_year uint No 0 The year that is used to supplement the log time if the time of a raw log does not contain the year information. You can specify the current year or a different year. Valid values:
    • 0: The current year is used.
    • Specific year: A year other than the current year is used. Example: 2020.
    Note Only Logtail V1.0.21 and later support this parameter.
    force_multiconfig boolean No false Specifies whether Logtail can use the Logtail configuration to collect data from the files that are matched based on other Logtail configurations. Default value: false.

    If you want Logtail to collect data from a file by using different Logtail configurations, you can configure this parameter. For example, you can configure this parameter to collect data from a file to two Logstores by using two Logtail configurations.

    raw_log_tag string No __raw__ The field that is used to store raw logs that are uploaded. Default value: __raw__.
    blacklist object No None The blacklist configuration. For more information, see Table 5.
    tail_size_kb int No 1024 The size of data that is collected from a file the first time Logtail reads the file. The value determines the start position from which Logtail collects data. The first time Logtail reads a file, Logtail can read up to 1,024 KB of data in the file by default.
    • If the file size is less than 1,024 KB, Logtail collects data from the beginning of the file.
    • If the file size is greater than 1,024 KB, Logtail collects the last 1,024 KB of data in the file.

    Valid values: 0 to 10485760. Unit: KB. You can change the value.

    batch_send_interval int No 3 The interval at which aggregated data is sent. Default value: 3. Unit: seconds.
    max_rotate_queue_size int No 20 The maximum length of the queue in which a file is rotated. Default value: 20.
    enable_precise_timestamp boolean No false Specifies whether to extract time values with high precision. Default value: false, which specifies that Logtail does not extract time values with high precision.
    If you set this parameter to true, Logtail automatically parses the specified time values into timestamps with millisecond precision and stores the timestamps in the field specified by the precise_timestamp_key parameter.
    Note
    • Make sure that you disable Use System Time in the Logtail configuration.
    • Only Logtail V1.0.32 and later support this parameter.
    precise_timestamp_key string No "precise_timestamp" The field that stores timestamps with high precision. If you do not include this parameter in the configuration, the system uses the precise_timestamp field by default.
    precise_timestamp_unit string No "ms" The unit of the timestamp with high precision. If you do not include this parameter in the configuration, the system uses ms by default. Valid values: ms, us, and ns.
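
The start position that tail_size_kb implies in Table 4 can be expressed as a small calculation. The following sketch follows the two cases described for that parameter; the function name initial_read_offset is hypothetical.

```python
def initial_read_offset(file_size_bytes: int, tail_size_kb: int = 1024) -> int:
    """Return the byte offset from which the first read of a file starts."""
    # Valid values for tail_size_kb: 0 to 10485760, in KB.
    if not 0 <= tail_size_kb <= 10485760:
        raise ValueError("tail_size_kb must be in the range 0 to 10485760")
    tail_bytes = tail_size_kb * 1024
    # Files smaller than the tail size are read from the beginning;
    # larger files are read from the last tail_bytes bytes.
    return max(0, file_size_bytes - tail_bytes)


print(initial_read_offset(4096 * 1024))
```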
    The following table describes the parameters of the blacklist parameter.
    Table 5. Parameters of blacklist
    Parameter Type Required Example Description
    dir_blacklist array No ["/home/admin/dir1", "/home/admin/dir2*"] The blacklist of directories, which must be absolute paths. You can use asterisks (*) as wildcard characters to match multiple directories.

    For example, if you specify /home/admin/dir1, all files in the /home/admin/dir1 directory are ignored during log collection.

    filename_blacklist array No ["app*.log", "password"] The blacklist of file names. The files whose name matches a value of this parameter are ignored during log collection regardless of the directories to which the files belong. You can use asterisks (*) as wildcard characters to match multiple file names.
    filepath_blacklist array No ["/home/admin/private*.log"] The blacklist of file paths, which must be absolute paths. You can use asterisks (*) as wildcard characters to match multiple files.

    For example, if you specify /home/admin/private*.log, all files whose name starts with private and ends with .log in the /home/admin/ directory are ignored during log collection.
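
The three blacklist types in Table 5 can be sketched as follows. The sketch uses Python's fnmatch module, whose asterisk also matches path separators, and it assumes that a directory blacklist entry also covers subdirectories. Logtail may behave differently in both respects. The function name is_blacklisted is hypothetical.

```python
import posixpath
from fnmatch import fnmatchcase


def _dir_matches(directory: str, pattern: str) -> bool:
    """Check the directory and each of its ancestors against the pattern."""
    current = directory
    while True:
        if fnmatchcase(current, pattern):
            return True
        parent = posixpath.dirname(current)
        if parent == current:  # reached the root or an empty path
            return False
        current = parent


def is_blacklisted(path: str, blacklist: dict) -> bool:
    """Return True if an absolute file path is excluded by the blacklist."""
    directory, filename = posixpath.split(path)
    if any(fnmatchcase(path, p) for p in blacklist.get("filepath_blacklist", [])):
        return True
    if any(fnmatchcase(filename, p) for p in blacklist.get("filename_blacklist", [])):
        return True
    return any(_dir_matches(directory, p) for p in blacklist.get("dir_blacklist", []))


sample_blacklist = {
    "dir_blacklist": ["/home/admin/dir1", "/home/admin/dir2*"],
    "filename_blacklist": ["app*.log", "password"],
    "filepath_blacklist": ["/home/admin/private*.log"],
}
print(is_blacklisted("/home/admin/dir1/a.log", sample_blacklist))
```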

  • Logtail configurations used to collect text logs
    • Basic configurations
      Parameter Type Required Example Description
      logType string Yes common_reg_log The mode in which logs are collected. Valid values:
      • json_log: collects logs in JSON mode.
      • common_reg_log: collects logs in full regex mode.
      • delimiter_log: collects logs in delimiter mode.
      logPath string Yes /var/log/http/ The log file path.
      filePattern string Yes access*.log The log file name.
      topicFormat string Yes none The method that is used to generate a topic. Valid values:
      • none: No log topics are generated.
      • default: The log file path is used as the topic of collected logs.
      • group_topic: The topic of the machine group to which the Logtail configuration applies is used as the topic of collected logs.
      • Regular expression that is used to match a log file path: A part of the log file path is used as the topic of collected logs. Example: /var/log/(.*).log.

      For more information, see Log topics.

      timeFormat string No %Y/%m/%d %H:%M:%S The format of the log time. For more information, see Time formats.
      preserve boolean No true Specifies whether to use the timeout mechanism on log files. If a log file is not updated within a specified period of time, Logtail considers the file to be timed out. Valid values:
      • true: Log files never time out. This is the default value.
      • false: If a log file is not updated within 30 minutes, Logtail considers the file to be timed out and no longer monitors the file.
      preserveDepth integer No 1 The maximum levels of directories whose files are monitored by using the timeout mechanism. If you set the preserve parameter to false, you must configure this parameter. Valid values: 1 to 3.
      fileEncoding string No utf8 The encoding format of log files. Valid values: utf8 and gbk.
      discardUnmatch boolean No true Specifies whether to discard the logs that fail to be matched. Valid values:
      • true
      • false
      maxDepth int No 100 The maximum levels of directories that are monitored. Valid values: 0 to 1000. The value 0 indicates that only the specified log file directory is monitored.
      delaySkipBytes int No 0 The threshold that is used to determine whether to discard data if the data is not collected within a specified period of time. Valid values:
      • 0: Data is not discarded. This is the default value.
      • Other values: Data is discarded. For example, if the size of data that is not collected within a specified period of time exceeds the specified threshold, such as 1024 KB, the data is discarded.
      dockerFile boolean No false Specifies whether to collect logs from container files. Default value: false.
      dockerIncludeLabel JSON object No None The container label whitelist. The whitelist specifies the containers from which logs or stdout and stderr are collected. This parameter is empty by default, which indicates that logs or stdout and stderr are collected from all containers. When you configure the container label whitelist, the LabelKey parameter is required, and the LabelValue parameter is optional.
      • If the LabelValue parameter is empty, containers whose container labels contain the keys specified by LabelKey are matched.
      • If the LabelValue parameter is not empty, containers whose container labels consist of the key-value pairs specified by LabelKey and LabelValue are matched.

        By default, string matching is performed for the values of the LabelValue parameter. Containers are matched only if the values of the container labels are the same as the values of the LabelValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($) for the LabelValue parameter, regular expression matching is performed. For example, if you set the LabelKey parameter to io.kubernetes.container.name and set the LabelValue parameter to ^(nginx|cube)$, a container named nginx and a container named cube are matched.

      Note
      • Do not specify duplicate values for the LabelKey parameter. If you specify duplicate values for the LabelKey parameter, only one of the values takes effect.
      • Multiple key-value pairs are evaluated by using a logical OR. If the labels of a container match one of the specified key-value pairs, the container is matched.
      dockerExcludeLabel JSON object No None The container label blacklist. The blacklist specifies the containers from which logs or stdout and stderr are not collected. This parameter is empty by default, which indicates that logs or stdout and stderr are collected from all containers. When you configure the container label blacklist, the LabelKey parameter is required, and the LabelValue parameter is optional.
      • If the LabelValue parameter is empty, containers whose container labels contain the keys specified by LabelKey are filtered out.
      • If the LabelValue parameter is not empty, containers whose container labels consist of the key-value pairs specified by LabelKey and LabelValue are filtered out.

        By default, string matching is performed for the values of the LabelValue parameter. Containers are matched only if the values of the container labels are the same as the values of the LabelValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($) for the LabelValue parameter, regular expression matching is performed. For example, if you set the LabelKey parameter to io.kubernetes.container.name and set the LabelValue parameter to ^(nginx|cube)$, a container named nginx and a container named cube are matched and therefore filtered out.

      Note
      • Do not specify duplicate values for the LabelKey parameter. If you specify duplicate values for the LabelKey parameter, only one of the values takes effect.
      • Multiple key-value pairs are evaluated by using a logical OR. If the labels of a container match one of the specified key-value pairs, the container is filtered out.
      dockerIncludeEnv JSON object No None The environment variable whitelist. The whitelist specifies the containers from which logs or stdout and stderr are collected. This parameter is empty by default, which indicates that logs or stdout and stderr are collected from all containers. When you configure the environment variable whitelist, the EnvKey parameter is required, and the EnvValue parameter is optional.
      • If the EnvValue parameter is empty, containers whose environment variables contain the keys specified by EnvKey are matched.
      • If the EnvValue parameter is not empty, containers whose environment variables consist of the key-value pairs specified by EnvKey and EnvValue are matched.

        By default, string matching is performed for the values of the EnvValue parameter. Containers are matched only if the values of environment variables are the same as the values of the EnvValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($) for the EnvValue parameter, regular expression matching is performed. For example, if you set the EnvKey parameter to NGINX_SERVICE_PORT and set the EnvValue parameter to ^(80|6379)$, containers whose port number is 80 and containers whose port number is 6379 are matched.

      Note Multiple key-value pairs are evaluated by using a logical OR. If the environment variables of a container match one of the specified key-value pairs, the container is matched.
      dockerExcludeEnv JSON object No None The environment variable blacklist. The blacklist specifies the containers from which logs or stdout and stderr are not collected. This parameter is empty by default, which indicates that logs or stdout and stderr are collected from all containers. When you configure the environment variable blacklist, the EnvKey parameter is required, and the EnvValue parameter is optional.
      • If the EnvValue parameter is empty, containers whose environment variables contain the keys specified by EnvKey are filtered out.
      • If the EnvValue parameter is not empty, containers whose environment variables consist of the key-value pairs specified by EnvKey and EnvValue are filtered out.

        By default, string matching is performed for the values of the EnvValue parameter. Containers are matched only if the values of environment variables are the same as the values of the EnvValue parameter. If you specify a value that starts with a caret (^) and ends with a dollar sign ($) for the EnvValue parameter, regular expression matching is performed. For example, if you set the EnvKey parameter to NGINX_SERVICE_PORT and set the EnvValue parameter to ^(80|6379)$, containers whose port number is 80 and containers whose port number is 6379 are matched and therefore filtered out.

      Note Multiple key-value pairs are evaluated by using a logical OR. If the environment variables of a container match one of the specified key-value pairs, the container is filtered out.
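
The whitelist and blacklist rules for container labels described above can be sketched as follows; the same logic applies to environment variables. The sketch assumes that each whitelist or blacklist is a JSON object that maps a key to a value, and that a container must pass the whitelist and must not hit the blacklist when both are configured. This topic does not describe how the two lists interact, so that part is an assumption. The function name container_selected is hypothetical.

```python
import re


def _value_matches(expected: str, actual: str) -> bool:
    # A value that starts with ^ and ends with $ is treated as a regular
    # expression; any other value is compared as a plain string.
    if expected.startswith("^") and expected.endswith("$"):
        return re.search(expected, actual) is not None
    return expected == actual


def _any_pair_matches(rules: dict, actual: dict) -> bool:
    # Multiple key-value pairs are combined with a logical OR.
    for key, expected in rules.items():
        if key not in actual:
            continue
        # An empty value means that the presence of the key is sufficient.
        if expected == "" or _value_matches(expected, actual[key]):
            return True
    return False


def container_selected(labels: dict, include=None, exclude=None) -> bool:
    """Return True if logs are collected from a container with these labels."""
    if include and not _any_pair_matches(include, labels):
        return False
    if exclude and _any_pair_matches(exclude, labels):
        return False
    return True


nginx = {"io.kubernetes.container.name": "nginx"}
print(container_selected(nginx, include={"io.kubernetes.container.name": "^(nginx|cube)$"}))
```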
    • Configurations that are specific to log collection in full regex mode and simple mode
      Table 6. Parameters in full regex mode and simple mode
      Parameter Type Required Example Description
      key array Yes ["content"] The fields that are specified for raw logs.
      logBeginRegex string No .* The regular expression that is used to match the beginning of the first line of a log.
      regex string No (.*) The regular expression that is used to extract the value of a field.

      The following sample code provides an example of a Logtail configuration that is used to collect logs in full regex mode:

      {
          "configName": "logConfigName", 
          "outputType": "LogService", 
          "inputType": "file", 
          "inputDetail": {
              "logPath": "/logPath", 
              "filePattern": "*", 
              "logType": "common_reg_log", 
              "topicFormat": "default", 
              "discardUnmatch": false, 
              "enableRawLog": true, 
              "fileEncoding": "utf8", 
              "maxDepth": 10, 
              "key": [
                  "content"
              ], 
              "logBeginRegex": ".*", 
              "regex": "(.*)"
          }, 
          "outputDetail": {
              "projectName": "test-project", 
              "logstoreName": "test-logstore"
          }
      }
    • Configurations that are specific to log collection in JSON mode
      Parameter Type Required Example Description
      timeKey string No time The key that is used to specify the time field.
    • Configurations that are specific to log collection in delimiter mode
      Parameter Type Required Example Description
      separator string No , The delimiter. You must select a delimiter based on the format of logs that you want to collect. For more information, see Additional information: Delimiters and sample logs.
      quote string Yes \ The quote. If a log field contains delimiters, you must specify a quote to enclose the field. Log Service parses the content that is enclosed in a pair of quotes into a complete field. You must select a quote based on the format of logs that you want to collect. For more information, see Additional information: Delimiters and sample logs.
      key array Yes [ "ip", "time"] The fields that are specified for raw logs.
      timeKey string Yes time The time field. You must specify a field in the value of key as the time field.
      autoExtend boolean No true Specifies whether to upload a log if the number of fields parsed from the log is less than the number of specified keys.
      For example, if you specify a vertical bar (|) as the delimiter, the log 11|22|33|44|55 is parsed into the following fields: 11, 22, 33, 44, and 55. You can set the keys to A, B, C, D, and E.
      • true: The log 11|22|33|55 is uploaded to Log Service, and 55 is uploaded as the value of the D key.
      • false: The log 11|22|33|55 is discarded because the number of fields parsed from the log does not match the number of specified keys.
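
The effect of autoExtend can be illustrated with a short parsing sketch. It uses Python's csv module, which supports only single-character delimiters and quotes, and it drops logs that contain more fields than keys because this topic does not describe that case. The function name parse_delimited is hypothetical.

```python
import csv


def parse_delimited(line, keys, separator=",", quote='"', auto_extend=True):
    """Parse one delimiter-mode log line into a dict, or return None if discarded."""
    fields = next(csv.reader([line], delimiter=separator, quotechar=quote), [])
    if len(fields) == len(keys):
        return dict(zip(keys, fields))
    if len(fields) < len(keys) and auto_extend:
        # Fields are assigned to keys in order; the remaining keys are left out.
        return dict(zip(keys, fields))
    # Assumption: every other mismatch is treated as a discarded log.
    return None


keys = ["A", "B", "C", "D", "E"]
print(parse_delimited("11|22|33|55", keys, separator="|", auto_extend=True))
print(parse_delimited("11|22|33|55", keys, separator="|", auto_extend=False))
```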

      The following sample code provides an example of a Logtail configuration that is used to collect logs in delimiter mode:

      {
          "configName": "logConfigName", 
          "logSample": "testlog", 
          "inputType": "file", 
          "outputType": "LogService", 
          "inputDetail": {
              "logPath": "/logPath", 
              "filePattern": "*", 
              "logType": "delimiter_log", 
              "topicFormat": "default", 
              "discardUnmatch": true, 
              "enableRawLog": true, 
              "fileEncoding": "utf8", 
              "maxDepth": 999, 
              "separator": ",", 
              "quote": "\"", 
              "key": [
                  "ip", 
                  "time"
              ], 
              "autoExtend": true
          }, 
          "outputDetail": {
              "projectName": "test-project", 
              "logstoreName": "test-logstore"
          }
      }
  • Plug-in configurations
    Parameters specific to log collection that is based on plug-ins
    Parameter Type Required Example Description
    plugin JSON object Yes None If you use a Logtail plug-in to collect logs, you must configure this parameter. For more information, see Use Logtail plug-ins to process data.
    The following sample code provides an example of a Logtail configuration that uses a plug-in:
    {
        "configName": "logConfigName", 
        "outputType": "LogService", 
        "inputType": "plugin",
        "inputDetail": {
            "plugin": {
                "inputs": [
                    {
                        "detail": {
                            "ExcludeEnv": null, 
                            "ExcludeLabel": null, 
                            "IncludeEnv": null, 
                            "IncludeLabel": null, 
                            "Stderr": true, 
                            "Stdout": true
                        }, 
                        "type": "service_docker_stdout"
                    }
                ]
            }
        }, 
        "outputDetail": {
            "projectName": "test-project", 
            "logstoreName": "test-logstore"
        }
    }

outputDetail

The following table describes how to configure the project and the Logstore for collected logs.
Parameter Type Required Example Description
projectName string Yes my-project The name of the project. The name must be the same as the name of the project that you specify in the API request.
logstoreName string Yes my-logstore The name of the Logstore.
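
The required top-level parameters and the outputDetail structure can be assembled programmatically before the configuration is submitted. The following sketch only builds the request body and does not call the Log Service API. The function name build_logtail_config and the sample values are hypothetical.

```python
import json


def build_logtail_config(project, logstore, config_name, input_type,
                         input_detail, log_sample=None):
    """Assemble a Logtail configuration body from the parameters in this topic."""
    if input_type not in ("file", "plugin"):
        raise ValueError("inputType must be 'file' or 'plugin'")
    config = {
        "configName": config_name,
        "inputType": input_type,
        "inputDetail": input_detail,
        # The output type is fixed as LogService.
        "outputType": "LogService",
        # projectName must be the project that is named in the API request.
        "outputDetail": {"projectName": project, "logstoreName": logstore},
    }
    if log_sample is not None:
        config["logSample"] = log_sample
    return config


body = build_logtail_config(
    "my-project", "my-logstore", "config-sample", "file",
    {"logType": "json_log", "logPath": "/var/log/http/",
     "filePattern": "access*.log", "topicFormat": "none"},
)
print(json.dumps(body, indent=4))
```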

Additional information: ExactlyOnce write feature

After you enable the ExactlyOnce write feature, Logtail records fine-grained checkpoints by file to the disk of the server on which Logtail is installed. If exceptions such as a process error or a server restart occur during log collection, Logtail uses the checkpoints to determine the scope of data that must be processed in each file when log collection is resumed, and then uses the incremental sequence numbers that are provided by Log Service to prevent duplicate data from being sent. However, the ExactlyOnce write feature consumes disk write resources.

The ExactlyOnce write feature has the following limits:
  • Checkpoints are stored in the disk of the server on which Logtail is installed. If checkpoints are lost because the disk has no available space or becomes faulty, the checkpoints cannot be recovered.
  • Checkpoints record only the metadata information of a file. Checkpoints do not record the data of a file. If the file is deleted or modified, the checkpoints may not be recovered.
  • The ExactlyOnce write feature is based on the current write sequence numbers that are recorded by Log Service. Each shard supports only 10,000 records. If the limit is exceeded, the previous records are replaced. To ensure reliability, make sure that the value that is calculated by using the following formula does not exceed 9500: Value = Number of active files that are written to the same Logstore × Number of Logtail instances. We recommend that you reserve a gap between the value and 9500.
    • Number of active files: the number of files that are being read and sent. Files that are generated during log file rotation and have the same logical file name are sent in serial mode. These files are considered as one active file.
    • Number of Logtail instances: the number of Logtail processes. By default, each server hosts a Logtail instance. The number of Logtail instances is the same as the number of servers.
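
The capacity formula in the preceding limits can be checked with a few lines of code. This is a minimal sketch of the arithmetic only; the function names are hypothetical.

```python
RECORDS_PER_SHARD = 10000   # records that each shard supports
RECOMMENDED_CEILING = 9500  # value that the formula should not exceed


def exactly_once_load(active_files: int, logtail_instances: int) -> int:
    """Value = active files written to the same Logstore x Logtail instances."""
    return active_files * logtail_instances


def within_recommended_limit(active_files: int, logtail_instances: int) -> bool:
    return exactly_once_load(active_files, logtail_instances) <= RECOMMENDED_CEILING


# 50 active files on each of 100 servers, one Logtail instance per server.
print(exactly_once_load(50, 100), within_recommended_limit(50, 100))
```

Because a gap below 9500 is recommended, a result close to the ceiling is worth treating as a warning even if the check passes.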

By default, the sync command is not run when Logtail writes checkpoints to disk. This helps ensure performance. However, if buffered data fails to be written to disk when the server restarts, the checkpoints may be lost. To enable the sync-based write feature, you can add "enable_checkpoint_sync_write": true, to the startup configuration file /usr/local/ilogtail/ilogtail_config.json of Logtail. For more information, see Configure the startup parameters of Logtail.
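
The change to the startup configuration file can also be scripted. The following sketch works on the JSON text of the file and returns the updated text, so that it can be reviewed before it is written back to /usr/local/ilogtail/ilogtail_config.json. The function name with_checkpoint_sync_write is hypothetical, and the sample input is not a real Logtail startup configuration.

```python
import json


def with_checkpoint_sync_write(config_text: str) -> str:
    """Return the startup configuration with sync-based checkpoint writes enabled."""
    config = json.loads(config_text)
    config["enable_checkpoint_sync_write"] = True
    # Existing keys are preserved; only the one setting is added or overwritten.
    return json.dumps(config, indent=4)


# Placeholder input for illustration only.
print(with_checkpoint_sync_write('{"example_existing_key": 1}'))
```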