Log Service can collect logs in delimiter mode. After the logs are collected, you can analyze them in multiple dimensions, transform them, and ship them. To collect logs, you must create a Logtail configuration. This topic describes how to create a Logtail configuration in delimiter mode by using the Log Service console.

Prerequisites

  • A project and a Logstore are created. For more information, see Create a project and Create a Logstore.
  • The server on which Logtail is installed can connect to port 80 and port 443 of remote servers.

Procedure

  1. Log on to the Log Service console.
  2. In the Import Data section, click Delimiter Mode - Text Log.
  3. Select the project and Logstore. Then, click Next.
  4. Create a machine group.
    • If a machine group is available, click Use Existing Machine Groups.
    • If no machine groups are available, perform the following steps to create a machine group. In this example, an Elastic Compute Service (ECS) instance is used.
      1. On the ECS Instances tab, select Manually Select Instances. Then, select the ECS instance that you want to use and click Execute Now.

        For more information, see Install Logtail on ECS instances.

        Note If you want to collect logs from an ECS instance that belongs to a different Alibaba Cloud account, a server in an on-premises data center, or a server of a third-party cloud service provider, you must manually install Logtail. For more information, see Install Logtail on a Linux server or Install Logtail on a Windows server. After you manually install Logtail, you must configure a user identifier on the server. For more information, see Configure a user identifier.
      2. After Logtail is installed, click Complete Installation.
      3. In the Create Machine Group step, configure Name and click Next.

        Log Service allows you to create IP address-based machine groups and custom identifier-based machine groups. For more information, see Create an IP address-based machine group and Create a custom ID-based machine group.

  5. Select the new machine group from the Source Server Groups section and move the machine group to the Applied Server Groups section. Then, click Next.
    Notice If you apply a machine group immediately after you create it, the heartbeat status of the machine group may be FAIL. This issue occurs because the machine group has not yet connected to Log Service. To resolve this issue, click Automatic Retry. If the issue persists, see What do I do if a Logtail machine group has no heartbeats?
  6. Create a Logtail configuration and click Next.
    Parameter Description
    Config Name Enter a name for the Logtail configuration. The name must be unique in a project. After you create the Logtail configuration, you cannot change the name of the Logtail configuration.

    You can click Import Other Configuration to import an existing Logtail configuration.

    Log Path Specify the directory and name of log files.
    You can specify an exact directory and an exact name. You can also use wildcard characters to specify the directory and name. For more information, see Wildcard matching. Log Service scans all levels of the specified directory for the log files that match specified conditions. Examples:
    • If you specify /apsara/nuwa/**/*.log, Log Service collects logs from the log files whose names are suffixed by .log in the /apsara/nuwa directory and the recursive subdirectories of the directory.
    • If you specify /var/logs/app_*/*.log, Log Service collects logs from the log files that meet the following conditions: The file name is suffixed by .log. The file is stored in a subdirectory under the /var/logs directory or in a recursive subdirectory of the subdirectory. The name of the subdirectory matches the app_* pattern.
    Docker File If you want to collect logs from Docker containers, you must turn on Docker File and specify the directories and tags of the containers. Logtail monitors containers to check whether containers are created or destroyed, filters containers by tag, and collects logs from the containers in the filtering result. For more information about how to collect the text logs of containers, see Use the Log Service console to collect container text logs in DaemonSet mode.
    Blacklist If you turn on Blacklist, you must configure a blacklist to specify the directories or files that you want Log Service to skip when it collects logs. You can specify exact directories and file names. You can also use wildcard characters to specify directories and file names. Examples:
    • If you select Filter by Directory from a drop-down list in the Filter Type column and enter /home/admin/dir1 for Content, all files in the /home/admin/dir1 directory are skipped.
    • If you select Filter by Directory from a drop-down list in the Filter Type column and enter /home/admin/dir* for Content, the files in all subdirectories whose names are prefixed by dir in the /home/admin/ directory are skipped.
    • If you select Filter by Directory from a drop-down list in the Filter Type column and enter /home/admin/*/dir for Content, all files in dir directories in each subdirectory of the /home/admin/ directory are skipped.

      For example, the files in the /home/admin/a/dir directory are skipped, but the files in the /home/admin/a/b/dir directory are not skipped.

    • If you select Filter by File from a drop-down list in the Filter Type column and enter /home/admin/private*.log for Content, all files whose names are prefixed by private and suffixed by .log in the /home/admin/ directory are skipped.
    • If you select Filter by File from a drop-down list in the Filter Type column and enter /home/admin/private*/*_inner.log for Content, all files whose names are suffixed by _inner.log in the subdirectories whose names are prefixed by private in the /home/admin/ directory are skipped.

      For example, the /home/admin/private/app_inner.log file is skipped, but the /home/admin/private/app.log file is not skipped.

    Note
    • When you configure this parameter, you can use only asterisks (*) or question marks (?) as wildcard characters.
    • If you use wildcard characters to configure Log Path and you want to skip some directories in the specified directory, you must configure the blacklist and enter a complete directory.

      For example, if you set Log Path to /home/admin/app*/log/*.log and you want to skip all subdirectories in the /home/admin/app1* directory, you must select Filter by Directory and enter /home/admin/app1*/** to configure the blacklist. If you enter /home/admin/app1*, the blacklist does not take effect.

    • When a blacklist is in use, computational overhead is generated. We recommend that you add up to 10 entries to the blacklist.
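
The blacklist examples above can be sketched in Python. This is only an illustration of the documented examples, not the Logtail implementation. It assumes that an asterisk (*) and a question mark (?) never match a forward slash (/), and that a directory filter is compared with the directory that directly contains the file. The names wildcard_to_regex and is_skipped are hypothetical, and the ** pattern is not modeled.

```python
import re
from pathlib import PurePosixPath


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a pattern that uses only * and ? into a compiled regex.

    Assumption: neither wildcard matches the path separator (/).
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[^/]*")
        elif ch == "?":
            parts.append("[^/]")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def is_skipped(file_path: str, filter_type: str, pattern: str) -> bool:
    """Return True if file_path is matched by one blacklist entry.

    filter_type is "file" (compare the full file path) or
    "directory" (compare the directory that contains the file).
    """
    regex = wildcard_to_regex(pattern)
    if filter_type == "file":
        target = file_path
    else:
        target = str(PurePosixPath(file_path).parent)
    return regex.fullmatch(target) is not None


# Examples taken from the descriptions above.
print(is_skipped("/home/admin/a/dir/x.log", "directory", "/home/admin/*/dir"))
print(is_skipped("/home/admin/a/b/dir/x.log", "directory", "/home/admin/*/dir"))
print(is_skipped("/home/admin/private/app_inner.log", "file",
                 "/home/admin/private*/*_inner.log"))
```

Under these assumptions, the first and third calls report that the file is skipped, and the second call reports that it is not.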
    Mode Select the log collection mode. By default, Delimiter Mode is displayed. You can change the mode.
    Log Sample Enter a sample log that is obtained from an actual scenario. Example:
    127.0.0.1|#|-|#|13/Apr/2020:09:44:41 +0800|#|GET /1 HTTP/1.1|#|0.000|#|74|#|404|#|3650|#|-|#|curl/7.29.0
    Note In delimiter mode, you can collect only single-line logs. If you want to collect multi-line logs, we recommend that you select Simple Mode - Multi-line or Full Regex Mode. For more information, see Collect logs in simple mode and Collect logs in full regex mode.
    Delimiter Select a delimiter based on the log format. For example, you can select a vertical bar (|) as a delimiter. For more information, see Additional information: Delimiters and sample logs.
    Note If you select Hidden Characters for Delimiter, you must enter a character in the following format: 0xHexadecimal ASCII code of the non-printable character. For example, if you want to use the non-printable character whose hexadecimal ASCII code is 01, you must enter 0x01.
    Quote Select a quote to enclose the log fields that contain delimiters. Log Service parses the content that is enclosed in a pair of quotes into a complete field. You can select the quote based on the log format.
    Note If you select Hidden Characters for Quote, you must enter a character in the following format: 0xHexadecimal ASCII code of the non-printable character. For example, if you want to use the non-printable character whose hexadecimal ASCII code is 01, you must enter 0x01.
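
The 0x notation for hidden characters can be illustrated with a short Python sketch. The helper name hidden_char is hypothetical, and the sketch only shows how a value such as 0x01 maps to a non-printable character that can then act as a delimiter.

```python
def hidden_char(spec: str) -> str:
    """Convert a string such as '0x01' to the character with that ASCII code."""
    code = int(spec, 16)  # int() accepts the 0x prefix when the base is 16
    if not 0 <= code <= 0x7F:
        raise ValueError(f"{spec} is not an ASCII code")
    return chr(code)


# Use the non-printable character 0x01 as the delimiter of a sample line.
delimiter = hidden_char("0x01")
fields = "11\x0122\x0133".split(delimiter)
print(fields)
```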
    Extracted Content Specify the log content that can be extracted. Log Service extracts log content based on the sample log that you enter, and then delimits the log content into values by using the specified delimiter. You must specify a key for each value.
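
As a rough sketch of what the extraction produces, the following Python snippet splits the sample log above into values and pairs each value with a key. It assumes that the delimiter of the sample log is the three-character string |#|. The key names field_1 to field_10 are placeholders; in the console you choose your own keys.

```python
SAMPLE_LOG = (
    "127.0.0.1|#|-|#|13/Apr/2020:09:44:41 +0800|#|GET /1 HTTP/1.1"
    "|#|0.000|#|74|#|404|#|3650|#|-|#|curl/7.29.0"
)
DELIMITER = "|#|"  # assumption: the delimiter used by the sample log

# Split the log into values, then assign one placeholder key per value.
values = SAMPLE_LOG.split(DELIMITER)
keys = [f"field_{i}" for i in range(1, len(values) + 1)]
record = dict(zip(keys, values))

for key, value in record.items():
    print(f"{key}: {value}")
```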
    Incomplete Entry Upload Specify whether to upload a log if the number of fields parsed from the log is less than the number of specified keys. If you turn on this switch, the log is uploaded. Otherwise, the log is discarded.
    For example, if you specify a vertical bar (|) as the delimiter, the log 11|22|33|44|55 is parsed into the following fields: 11, 22, 33, 44, and 55. You can set the keys to A, B, C, D, and E.
    • If you turn on Incomplete Entry Upload, the log 11|22|33|55 is uploaded, and 55 is uploaded as the value of the D key.
    • If you turn off Incomplete Entry Upload, the log 11|22|33|55 is discarded because the number of fields parsed from the log does not match the number of specified keys.
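
The behavior of Incomplete Entry Upload described above can be sketched as follows. The function parse_log is hypothetical and is not part of Logtail. It only models the case in which a log has fewer fields than keys; logs that have more fields than keys are not modeled.

```python
from typing import Dict, List, Optional


def parse_log(line: str, keys: List[str], delimiter: str = "|",
              upload_incomplete: bool = False) -> Optional[Dict[str, str]]:
    """Return the key-value pairs of a log, or None if the log is discarded."""
    values = line.split(delimiter)
    if len(values) < len(keys) and not upload_incomplete:
        return None  # fewer fields than keys and the switch is off
    # zip() stops at the shorter sequence, so missing trailing keys are omitted.
    return dict(zip(keys, values))


KEYS = ["A", "B", "C", "D", "E"]
print(parse_log("11|22|33|44|55", KEYS))
print(parse_log("11|22|33|55", KEYS, upload_incomplete=True))
print(parse_log("11|22|33|55", KEYS, upload_incomplete=False))
```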
    Use System Time Specify whether to use the system time.
    • If you turn on Use System Time, the timestamp of a log indicates the system time when the log is collected. The system time refers to the time of the server or container on which Logtail runs.
    • If you turn off Use System Time, you must configure Specify Time Key and Time Format based on the value of the time field in logs. For more information about the time format, see Time formats.

      For example, if the time format in logs is "time": "05/May/2016:13:30:28", you can set the Specify Time Key field to time and the Time Format field to %d/%b/%Y:%H:%M:%S.
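
To check a time format before you save the configuration, you can try it locally. The following sketch uses Python's datetime.strptime, which accepts the same strftime-style directives as the example above. Note that %b depends on the locale; the sketch assumes an English or C locale.

```python
from datetime import datetime

TIME_FORMAT = "%d/%b/%Y:%H:%M:%S"
time_value = "05/May/2016:13:30:28"

# strptime raises ValueError if the value does not match the format.
parsed = datetime.strptime(time_value, TIME_FORMAT)
print(parsed.isoformat())
```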

    Notice
    • The time zone of the Logtail container is UTC. If you want to collect container logs in DaemonSet mode and the time zone of the container from which you want to collect logs is not UTC, you must set Timezone to Custom in the Advanced Options section of your Logtail configuration and use the time zone of the container from which you want to collect logs. Otherwise, the log time is incorrectly offset. For example, if you select Synchronize Timezone from Node to Container when you create a container from which you want to collect logs, the time zone of the container may not be UTC.
    • The timestamp of a log in Log Service is accurate to the second by default. If the value of the time field in raw logs has a higher time precision, such as the millisecond, microsecond, or nanosecond, and you want to retain the time precision for the logs in Log Service, you can add the enable_precise_timestamp parameter in the extended settings for your Logtail and set the parameter value to true.
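
The time zone offset mentioned in the first item above can be illustrated with a short sketch. It interprets the same time value once as UTC and once as UTC+8. UTC+8 is only an example time zone chosen for this illustration.

```python
from datetime import datetime, timedelta, timezone

naive = datetime.strptime("05/May/2016:13:30:28", "%d/%b/%Y:%H:%M:%S")

# The same wall-clock time, interpreted in two different time zones.
as_utc = naive.replace(tzinfo=timezone.utc)
as_utc8 = naive.replace(tzinfo=timezone(timedelta(hours=8)))

# Difference between the two resulting UNIX timestamps, in seconds.
offset_seconds = int(as_utc.timestamp() - as_utc8.timestamp())
print(offset_seconds)
```

If the container writes local time in UTC+8 but the value is interpreted as UTC, the log time is off by this number of seconds.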
    Drop Failed to Parse Logs Specify whether to drop the logs that fail to be parsed.
    • If you turn on Drop Failed to Parse Logs, the logs that fail to be parsed are not uploaded to Log Service.
    • If you turn off Drop Failed to Parse Logs, the logs that fail to be parsed are still uploaded to Log Service as the value of the __raw_log__ field.
    Maximum Directory Monitoring Depth Specify the maximum number of levels of subdirectories that you want to monitor. The subdirectories are in the log file directory that you specify. Valid values: 0 to 1000. A value of 0 specifies that only the log file directory that you specify is monitored.
    The advanced settings are optional. We recommend that you keep the default values unless your business requires a change. The following table describes the parameters in the advanced settings.
    Parameter Description
    Enable Plug-in Processing If you turn on Enable Plug-in Processing, you can configure Logtail plug-ins to process logs. For more information, see Overview.
    Note If you turn on Enable Plug-in Processing, the parameters such as Upload Raw Log, Timezone, Drop Failed to Parse Logs, Filter Configuration, and Incomplete Entry Upload (Delimiter mode) become unavailable.
    Upload Raw Log If you turn on Upload Raw Log, each raw log is uploaded to Log Service as the value of the __raw__ field together with the log parsed from the raw log.
    Topic Generation Mode Select the topic generation mode. For more information, see Log topics.
    • Null - Do not generate topic: In this mode, the topic field is set to an empty string. When you query logs, you do not need to specify a topic. This is the default value.
    • Machine Group Topic Attributes: In this mode, topics are configured at the machine group level. If you want to distinguish the logs that are generated by different servers, select this mode.
    • File Path RegEx: In this mode, you must specify a regular expression in the Custom RegEx field. The part of a log path that matches the regular expression is used as the topic. If you want to distinguish the logs that are generated by different users or instances, select this mode.
    Log File Encoding Select the encoding format of log files. Valid values: utf8 and gbk.
    Timezone Select the time zone in which logs are collected. Valid values:
    • System Timezone: If you select this value, the time zone of the server or the container on which Logtail is installed is used.
    • Custom: If you select this value, you must select a time zone based on your business requirements.
    Timeout Select a timeout period of log files. If a log file is not updated within the specified period, Logtail considers the file to be timed out. Valid values:
    • Never: All log files are continuously monitored and never time out.
    • 30 Minute Timeout: If a log file is not updated within 30 minutes, Logtail considers the file to be timed out and stops monitoring the file.

      If you select 30 Minute Timeout, you must configure the Maximum Timeout Directory Depth parameter. Valid values: 1 to 3.

    Filter Configuration Specify the filter conditions that you want to use to collect logs. Only the logs that match the specified filter conditions are collected. Examples:
    • Collect the logs that match the specified filter conditions: If you set Key to level and RegEx to WARNING|ERROR, only the logs whose level is WARNING or ERROR are collected.
    • Filter out the logs that match specified conditions. To do this, use a regular expression that contains a negative lookahead. For more information, see Regular-Expressions.info.
      • If you set Key to level and RegEx to ^(?!.*(INFO|DEBUG)).*, the logs whose level contains INFO or DEBUG are not collected.
      • If you set Key to level and RegEx to ^(?!(INFO|DEBUG)$).*, the logs whose level is INFO or DEBUG are not collected.
      • If you set Key to url and RegEx to .*^(?!.*(healthcheck)).*, the logs whose url contains healthcheck are not collected. For example, if a log has the Key field of url and the Value field of /inner/healthcheck/jiankong.html, the log is not collected.

    For more information, see regex-exclude-word and regex-exclude-pattern.
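
You can test the filter expressions above locally before you use them. The following sketch uses Python's re module and assumes that a log is collected only if the entire value of the key matches the regular expression. The function is_collected is hypothetical and only mimics that assumed rule.

```python
import re


def is_collected(value: str, pattern: str) -> bool:
    """Return True if the whole value matches the pattern (assumed rule)."""
    return re.fullmatch(pattern, value) is not None


# Collect only WARNING or ERROR.
print(is_collected("ERROR", r"WARNING|ERROR"))
print(is_collected("INFO", r"WARNING|ERROR"))

# Exclude values that contain INFO or DEBUG.
print(is_collected("INFO_EXTRA", r"^(?!.*(INFO|DEBUG)).*"))

# Exclude values that are exactly INFO or DEBUG.
print(is_collected("INFO_EXTRA", r"^(?!(INFO|DEBUG)$).*"))

# Exclude URLs that contain healthcheck.
print(is_collected("/inner/healthcheck/jiankong.html",
                   r".*^(?!.*(healthcheck)).*"))
```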

    First Collection Size Specify the size of data that Logtail can collect from a log file the first time Logtail collects logs from the file. The default value of First Collection Size is 1024. Unit: KB.
    • If the file size is less than 1,024 KB, Logtail collects data from the beginning of the file.
    • If the file size is greater than 1,024 KB, Logtail collects the last 1,024 KB of data in the file.

    You can specify First Collection Size based on your business requirements. Valid values: 0 to 10485760. Unit: KB.
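
The effect of First Collection Size can be sketched as a calculation of the position at which the first read starts. The function first_read_offset is hypothetical, and the sketch assumes that 1 KB equals 1,024 bytes.

```python
def first_read_offset(file_size_bytes: int, first_collection_kb: int = 1024) -> int:
    """Return the byte offset at which the first collection starts."""
    limit_bytes = first_collection_kb * 1024  # assumption: 1 KB = 1,024 bytes
    # Small files are read from the beginning; large files from the tail.
    return max(0, file_size_bytes - limit_bytes)


print(first_read_offset(500 * 1024))   # file smaller than the limit
print(first_read_offset(4096 * 1024))  # file larger than the limit
```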

    More Configurations Specify extended settings for Logtail. For more information, see advanced.

    For example, if you want to use the current Logtail configuration to collect logs from log files that match a different Logtail configuration and specify the interval at which logs are aggregated and sent to Log Service, you can specify extended settings for the current Logtail.

    {
      "force_multiconfig": true,
      "batch_send_interval": 3
    }
    Click Next to complete the Logtail configuration creation. Then, Log Service starts to collect logs.
  7. Preview data, configure indexes, and then click Next.
    By default, full-text indexing is enabled for Log Service. You can also configure field indexes based on collected logs in manual or automatic mode. For more information, see Configure indexes.
    Note If you want to query and analyze logs, you must enable full-text indexing or field indexing. If you enable both full-text indexing and field indexing, the system uses only field indexes.

Additional information: Delimiters and sample logs

Logs that are in the delimiter-separated values (DSV) format use line feeds as boundaries. Each log is placed in a separate line. Each log is parsed into multiple fields by using delimiters. Both single-character and multi-character delimiters are supported. If a field contains delimiters, you can enclose the field in a pair of quotes.

  • Single-character delimiter

    The following examples show logs that use single-character delimiters:

    05/May/2016:13:30:28,10.10.*.*,"POST /PutData?Category=YunOsAccountOpLog&AccessKeyId=****************&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=******************************** HTTP/1.1",200,18204,aliyun-sdk-java
    05/May/2016:13:31:23,10.10.*.*,"POST /PutData?Category=YunOsAccountOpLog&AccessKeyId=****************&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=******************************** HTTP/1.1",401,23472,aliyun-sdk-java
    If a log uses single-character delimiters, you must specify the delimiter. You can also specify a quote.
    • Delimiter: Available single-character delimiters include the tab character (\t), vertical bar (|), space, comma (,), semicolon (;), and non-printable characters. You cannot specify a double quotation mark (") as the delimiter.

      However, a double quotation mark (") can be used as a quote. A double quotation mark (") can appear at the border of a field or inside the field. If a double quotation mark (") is included in a log field, it must be escaped as a pair of double quotation marks ("") when the log is processed. When the log is parsed, each pair of double quotation marks ("") is restored to a single double quotation mark ("). For example, you can specify a comma (,) as the delimiter and a double quotation mark (") as the quote. If a log field contains the specified delimiter and quote, the field is enclosed in a pair of quotes, and each double quotation mark (") in the field is escaped as a pair of double quotation marks (""). The processed log 1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00 is parsed into the following five fields: 1999; Chevy; Venture "Extended Edition, Very Large"; an empty field; and 5000.00.

    • Quote: If a log field contains delimiters, you must specify a quote to enclose the field. Log Service parses the content that is enclosed in a pair of quotes into a complete field.

      Available quotes include the tab character (\t), vertical bar (|), space, comma (,), semicolon (;), and non-printable characters.

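      For example, if you specify a comma (,) as the delimiter and a double quotation mark (") as the quote, the log 1997,Ford,E350,"ac, abs, moon",3000.00 is parsed into the following five fields: 1997; Ford; E350; ac, abs, moon; and 3000.00. The fourth field contains commas, but it is parsed as one field because it is enclosed in quotes.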
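
The quoting and escaping rules for single-character delimiters follow the same convention as Python's csv module, so the two examples above can be reproduced locally. This is only a sketch for checking how a log line would be split; the function parse_dsv is hypothetical and is not the parser that Logtail uses.

```python
import csv
from typing import List


def parse_dsv(line: str, delimiter: str = ",", quote: str = '"') -> List[str]:
    """Split one single-line log into fields, honoring quotes.

    doublequote=True restores a pair of quotes ("") inside a quoted
    field to one quote character.
    """
    reader = csv.reader([line], delimiter=delimiter, quotechar=quote,
                        doublequote=True)
    return next(reader)


print(parse_dsv('1997,Ford,E350,"ac, abs, moon",3000.00'))
print(parse_dsv('1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00'))
```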

  • Multi-character delimiter
    The following examples show logs that use multi-character delimiters:
    05/May/2016:13:30:28&&10.200.**.**&&POST /PutData?Category=YunOsAccountOpLog&AccessKeyId=****************&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=pD12XYLmGxKQ%2Bmkd6x7hAgQ7b1c%3D HTTP/1.1&&200&&18204&&aliyun-sdk-java
    05/May/2016:13:31:23&&10.200.**.**&&POST /PutData?Category=YunOsAccountOpLog&AccessKeyId=****************&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=******************************** HTTP/1.1&&401&&23472&&aliyun-sdk-java

    A multi-character delimiter can contain two or three characters, such as ||, &&&, and ^_^. Log Service parses logs based on delimiters. You do not need to use quotes to enclose log fields.

    Note Make sure that each log field does not contain the exact delimiter. Otherwise, Log Service cannot parse the logs as expected.

    For example, if you specify && as the delimiter, the log 1997&&Ford&&E350&&ac&abs&moon&&3000.00 is parsed into five fields: 1997, Ford, E350, ac&abs&moon, and 3000.00.
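
Splitting by a multi-character delimiter can be sketched with a plain string split, because quotes are not used in this case. The function split_multichar is hypothetical. It enforces the two-to-three-character length that is described above.

```python
from typing import List


def split_multichar(line: str, delimiter: str) -> List[str]:
    """Split a log by a delimiter that contains two or three characters."""
    if not 2 <= len(delimiter) <= 3:
        raise ValueError("a multi-character delimiter has two or three characters")
    # A single & inside a field is kept because only the exact delimiter splits.
    return line.split(delimiter)


print(split_multichar("1997&&Ford&&E350&&ac&abs&moon&&3000.00", "&&"))
```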