This topic describes how to use Logtail to collect logs from Elastic Compute Service (ECS) instances that run Linux. It also describes how to search and analyze the collected log data in the Log Service console.

Prerequisites

  • An Alibaba Cloud account is created and has passed real-name verification. For more information, see Create an Alibaba Cloud account.
  • Log Service is activated.

    When you log on to the Log Service console for the first time, activate Log Service as prompted.

  • An ECS instance that runs Linux is available in the region where you want to create a project. For more information, see Create an ECS instance.

Step 1: Create a project and a Logstore

You must create a project and a Logstore before you collect log data.

  1. Log on to the Log Service console.
  2. Create a project.
    1. Click Create Project.
    2. In the Create Project dialog box, set the following parameters.
      Parameter Description
      Project Name The unique name of the project in the selected region. The name cannot be modified after the project is created.
      Description The description of the project.
      Region The region to which the project belongs. We recommend that you select a region based on the log source.

      For example, to collect logs of ECS instances, you can select the region where the ECS instances reside. Then, you can use the internal network of Alibaba Cloud to accelerate log data collection.

      After you create a project, you cannot change its region or migrate the project to another region. For information about the endpoints of different regions, see Service endpoint.

      Service Logs If you enable the service log feature for a project, all the logs generated under the project are stored in the project that you specify for Log Storage Location.
      • If you select Detailed Logs, operation logs are stored in the specified project. The log data is billed on a pay-as-you-go basis.
      • If you select Important Logs, the consumption delay logs of consumer groups, metering logs, and Logtail status logs are stored in the specified project. The logging feature is free of charge.
      Log Storage Location After you enable the service log feature, you must specify the log storage location. Valid values:
      • Automatic creation (recommended)
      • Current Project
      • Other projects in the same region that you select for the current project
    3. Click OK.
  3. Create a Logstore.
    After you create a project, the system prompts you to create a Logstore. The following table describes the parameters that you must set when you create a Logstore.
    Parameter Description
    Logstore Name The name of the Logstore. The name must be unique in the project to which the Logstore belongs.
    WebTracking If you turn on the WebTracking switch, Log Service can collect log data from web browsers and from mobile apps that run on iOS or Android. By default, the switch is turned off.
    Permanent Storage If you turn on this switch, Log Service permanently stores collected logs. By default, the switch is turned off.
    Data Retention Period You must set Data Retention Period if you do not turn on the Permanent Storage switch.

    The retention period of collected log data. Unit: days. Valid values: 1 to 3000. Log data is automatically deleted after the specified retention period elapses.

    Shards The number of shards in the Logstore. You can create up to 10 shards for each Logstore. You can create a maximum of 200 shards for each project.
    Automatic Sharding If you turn on this switch, Log Service automatically increases the number of shards when read and write requests exceed the capacity of the existing shards. The switch is turned on by default. For more information, see Manage a shard.
    Maximum Shards The maximum number of shards after automatic sharding. This parameter is required if you turn on the Automatic Sharding switch. Maximum value: 64.
    Log Public IP If you turn on this switch, Log Service adds the following information to the tag field of the collected logs:
    • __client_ip__: the public IP address of the log source.
    • __receive_time__: the time when Log Service receives log data. The time is formatted as a Unix timestamp.
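
    For reference, the following minimal Python sketch shows how a collected entry might look with the Log Public IP switch turned on. The field values and the exact tag key layout are illustrative assumptions, not output captured from Log Service.

    ```python
    # Hypothetical shape of a collected log entry when Log Public IP is on:
    # Log Service adds the two tag fields described above. Values are made up.
    collected_entry = {
        "content": "11|22|33|44|55",               # the raw log line
        "__tag__:__client_ip__": "203.0.113.10",   # public IP address of the log source
        "__tag__:__receive_time__": "1609459200",  # Unix timestamp of receipt by Log Service
    }
    print(collected_entry["__tag__:__client_ip__"])
    ```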

Step 2: Collect log data

This section describes how to collect logs in the delimiter-separated values (DSV) format.

  1. In the Import Data section, select Delimiter Mode - Text Log.
  2. In the Specify Logstore step, select the destination project and Logstore, and then click Next.
    You can also click Create Now to create a project and a Logstore.
  3. In the Create Machine Group step, create a machine group.
    • If a machine group is available, click Using Existing Machine Groups.
    • If no machine group is available, perform the following steps to create one. ECS instances are used as an example:
      1. Click the ECS Instances tab. On the tab, select the ECS instances that you want to add to a machine group and then click Install.

        If Logtail is already installed on the ECS instances, click Complete Installation.

        Note
        • If the ECS instances run Linux, Logtail is automatically installed when you click Install.
        • If the ECS instances run Windows, you must manually install Logtail on the ECS instances. For more information, see Install Logtail in Windows.
        • If you want to collect logs from a user-created cluster, you must manually install Logtail on the servers in the cluster. For more information, see Install Logtail in Linux or Install Logtail in Windows.
      2. After the installation is completed, click Complete Installation.
      3. On the page that appears, set relevant parameters for the machine group. For more information, see Create an IP address-based machine group or Create a custom ID-based machine group.
  4. In the Machine Group Settings step, apply Logtail configurations to the machine group.
    Select the created machine group and move the group from Source Machine Groups to Applied Machine Groups.
  5. In the Logtail Config step, create a Logtail configuration file.
    Parameter Description
    Config Name The name of the Logtail configuration file. The name cannot be modified after the Logtail configuration file is created.

    You can also click Import Other Configuration to import a Logtail configuration file from another project.

    Log Path The directory and names of log files.
    The specified log file names can be complete file names or file names that contain wildcards. For more information about the wildcards that can be used in directory and file name patterns, see Wildcard matching. Log files that match the specified pattern in all levels of subdirectories under the specified directory are monitored. Examples (see the wildcard-matching sketch after this table):
    • /apsara/nuwa/ ... /*.log indicates that the files whose extension is .log in the /apsara/nuwa directory and its subdirectories are monitored.
    • /var/logs/app_* ... /*.log* indicates that each file that meets the following conditions is monitored: the file name contains .log, the file is stored in a subdirectory (at any level) of the /var/logs directory, and the name of the subdirectory matches the app_* pattern.
    Note
    • A log file can be collected by using only one Logtail configuration file.
    • You can include only asterisks (*) and question marks (?) as wildcard characters in the log path.
    Docker File If you collect log data from Docker containers, you can specify the internal paths and labels of containers. Then Logtail monitors the creation and destruction of the containers, filters logs of the containers based on labels, and collects the filtered logs. For more information, see Use the console to collect Kubernetes text logs in the DaemonSet mode.
    Blacklist If you turn on this switch, you can configure a blacklist in the Add Blacklist section. You can configure a blacklist to skip the specified directories or files during log data collection. The blacklist supports exact match and wildcard match for directory names and file names. Examples:
    • If you select Filter by Directory from the Filter Type drop-down list and enter /tmp/mydir in the Content column, all files in the directory are skipped.
    • If you select Filter by File from the Filter Type drop-down list and enter /tmp/mydir/file in the Content column, only the specified file in the directory is skipped.
    Mode By default, Delimiter Mode is selected. For information about other modes, see Overview.
    Log Sample Enter a sample log entry. Delimiter mode applies only to the collection of single-line logs.
    Delimiter Select the appropriate delimiter based on the log format. Otherwise, Log Service may fail to parse logs.
    Note If you select Hidden Characters as the delimiter, you must enter the character in the following format: 0x followed by the hexadecimal ASCII code of the non-printable character. For example, to use the non-printable character whose hexadecimal ASCII code is 01, enter 0x01.
    Quote Select a quote based on the log format. Otherwise, Log Service may fail to parse logs.
    Note If you select Hidden Characters as the quote, you must enter the character in the following format: 0x followed by the hexadecimal ASCII code of the non-printable character. For example, to use the non-printable character whose hexadecimal ASCII code is 01, enter 0x01.
    Extracted Content Log Service extracts the log content based on the specified sample log entry and delimiter. The extracted log content is delimited into values. You must specify a key for each value.
    Incomplete Entry Upload Specifies whether to upload a log entry whose number of parsed fields is less than the number of specified keys. If you turn on this switch, such a log entry is uploaded. If you turn off this switch, the log entry is dropped.
    For example, if you set the delimiter to the vertical bar (|), the log entry 11|22|33|44|55 is parsed into the fields 11, 22, 33, 44, and 55, and you can set the keys to A, B, C, D, and E, respectively. The behavior is illustrated in the parsing sketch after this table.
    • If you turn on the Incomplete Entry Upload switch, 55 is uploaded as the value of the D key when Log Service collects the log entry 11|22|33|55.
    • If you turn off the Incomplete Entry Upload switch, Log Service drops the log entry because the fields and keys do not match.
    Use System Time
    • If you turn on the Use System Time switch, the timestamp of a log entry is the system time of the server when the log entry is collected.
    • If you turn off the Use System Time switch, you must find the value that indicates the time in the Extracted Content section and set the key of that value to time. Then, click Auto Generate in the Time Conversion Format field to automatically generate the format that is used to parse the time. For more information, see Time formats.
    Drop Failed to Parse Logs
    • If you turn on the Drop Failed to Parse Logs switch, logs that fail to be parsed are not uploaded to Log Service.
    • If you turn off the Drop Failed to Parse Logs switch, raw logs are uploaded when the raw logs fail to be parsed.
    Maximum Directory Monitoring Depth The maximum depth at which the log directory is monitored. Valid values: 0 to 1000. The value 0 indicates that only the directory specified in the log path is monitored.
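
    To make the Log Path wildcard rules concrete, the following Python sketch approximates the documented matching semantics with fnmatch, which supports the same asterisk (*) and question mark (?) wildcards. It is an illustration of the rules above, not Logtail's actual matcher.

    ```python
    import fnmatch
    import os

    def is_monitored(path, dir_pattern, file_pattern):
        """Return True if a file would be monitored: its name matches
        file_pattern and it sits in, or anywhere below, a directory
        that matches dir_pattern."""
        directory, filename = os.path.split(path)
        if not fnmatch.fnmatch(filename, file_pattern):
            return False
        while directory:
            if fnmatch.fnmatch(directory, dir_pattern):
                return True
            parent = os.path.dirname(directory)
            if parent == directory:  # reached the filesystem root
                return False
            directory = parent
        return False

    # /apsara/nuwa/ ... /*.log: .log files anywhere under /apsara/nuwa
    print(is_monitored("/apsara/nuwa/sub1/sub2/service.log", "/apsara/nuwa", "*.log"))      # True
    # /var/logs/app_* ... /*.log*: names containing .log under app_* subdirectories
    print(is_monitored("/var/logs/app_01/2024/access.log.1", "/var/logs/app_*", "*.log*"))  # True
    print(is_monitored("/var/logs/other/access.log", "/var/logs/app_*", "*.log*"))          # False
    ```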
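
    The following Python sketch illustrates how the Delimiter Mode settings above fit together: splitting an entry by the delimiter, mapping the values to the keys from Extracted Content, applying the Incomplete Entry Upload switch, and parsing a time field when Use System Time is turned off. The keys, delimiter, and time format are the examples from this table, not a fixed part of Logtail.

    ```python
    from datetime import datetime

    KEYS = ["A", "B", "C", "D", "E"]  # keys specified in Extracted Content
    DELIMITER = "|"                   # a hidden character such as 0x01 would be "\x01" here

    def parse_entry(line, upload_incomplete=True):
        """Approximate the documented Delimiter Mode behavior for one entry."""
        values = line.split(DELIMITER)
        if len(values) < len(KEYS) and not upload_incomplete:
            return None               # Incomplete Entry Upload is off: drop the entry
        return dict(zip(KEYS, values))

    print(parse_entry("11|22|33|44|55"))  # {'A': '11', 'B': '22', 'C': '33', 'D': '44', 'E': '55'}
    print(parse_entry("11|22|33|55"))     # {'A': '11', 'B': '22', 'C': '33', 'D': '55'}
    print(parse_entry("11|22|33|55", upload_incomplete=False))  # None: dropped

    # With Use System Time turned off, the value of the key named "time" is
    # parsed with the configured time conversion format, for example:
    log_time = datetime.strptime("2021-01-01 12:00:00", "%Y-%m-%d %H:%M:%S")
    print(log_time)
    ```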
    Specify Advanced Options based on your business requirements. We recommend that you do not modify the default settings unless otherwise required.
    Parameter Description
    Enable Plug-in Processing Specifies whether to use Logtail plug-ins to process logs. If you turn on this switch, the configured plug-ins are used to process logs.
    Upload Raw Log If you turn on this switch, raw logs are written to the __raw__ field and uploaded with the parsed logs.
    Topic Generation Mode
    • Null - Do not generate topic: This mode is selected by default. In this mode, the topic field is set to an empty string and you can query logs without the need to enter a topic.
    • Machine Group Topic Attributes: This mode is used to differentiate log data that is generated by different servers.
    • File Path Regex: If you select File Path Regex for Topic Generation Mode, you must configure a regular expression in the Custom RegEx field. The part of a log path that matches the regular expression is used as the topic name. This mode is used to differentiate log data that is generated by different users or ECS instances. A topic-extraction sketch follows this table.
    Log File Encoding
    • utf8: indicates UTF-8 encoding.
    • gbk: indicates GBK encoding.
    Timezone The time zone where logs are collected.
    • System Timezone: This option is selected by default. It indicates that the time zone where logs are collected is the same as the time zone to which the server belongs.
    • Custom: Select a time zone.
    Timeout If a log file is not updated within the specified period of time, Logtail considers the file to be timed out.
    • Never: All log files are continuously monitored and never time out.
    • 30 Minute Timeout: If a log file is not updated within 30 minutes, Logtail considers the log file to be timed out and no longer monitors the file.

      If you select 30 Minute Timeout, you must set Maximum Timeout Directory Depth. Valid values: 1 to 3.

    Filter Configuration Specifies the filter conditions. Only logs that match the filter conditions are collected. Examples (a runnable sketch follows this table):
    • Collect logs that meet a condition: to collect only the logs with the severity level of WARNING or ERROR, set the condition Key:level Regex:WARNING|ERROR.
    • Filter out logs that contain specified keywords:
      • Set the condition to Key:level Regex:^(?!.*(INFO|DEBUG)).* to filter out the logs with the severity level of INFO or DEBUG.
      • Set the condition to Key:url Regex:^(?!.*(healthcheck)).* to filter out the logs whose URL contains the keyword healthcheck. For example, logs in which the value of the url key is /inner/healthcheck/jiankong.html are not collected.

    For more examples, see regex-exclude-word and regex-exclude-pattern.
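
    As a sketch of File Path Regex topic generation, the following Python snippet extracts a topic from a log path with a capture group. The regular expression and path are hypothetical examples; configure your own pattern in the Custom RegEx field.

    ```python
    import re

    # Hypothetical pattern: use the per-user directory name under /logs as the topic.
    TOPIC_REGEX = r"/logs/(\w+)/app\.log"

    match = re.search(TOPIC_REGEX, "/logs/userA/app.log")
    if match:
        print(match.group(1))  # "userA" would serve as the topic for this file's logs
    ```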
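
    The filter conditions above can be checked locally with Python's re module. The sketch below assumes, as the examples suggest, that a log entry is collected only when the value of the specified key matches the regular expression in full.

    ```python
    import re

    # Key:level Regex:WARNING|ERROR -- collect only WARNING or ERROR logs.
    collect_level = re.compile(r"WARNING|ERROR")
    print(bool(collect_level.fullmatch("ERROR")))  # True: collected
    print(bool(collect_level.fullmatch("INFO")))   # False: dropped

    # Key:url Regex:^(?!.*(healthcheck)).* -- the negative lookahead drops any
    # URL that contains the keyword healthcheck.
    exclude_health = re.compile(r"^(?!.*(healthcheck)).*")
    print(bool(exclude_health.fullmatch("/inner/healthcheck/jiankong.html")))  # False: dropped
    print(bool(exclude_health.fullmatch("/index.html")))                       # True: collected
    ```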

  6. In the Configure Query and Analysis step, set indexes.
    Indexes are configured by default. You can reconfigure the indexes as needed. For more information, see Enable and configure the index feature for a Logstore.
    Note
    • You must set Full Text Index or Field Search. If you set both of them, the settings of Field Search prevail.
    • If the data type of an index is Long or Double, the Case Sensitive and Delimiter settings are unavailable.
After you complete the preceding steps, you can use Log Service to collect logs of the ECS instances.
Note
  • You may need to wait about three minutes for the Logtail configurations to take effect.
  • For information about how to troubleshoot Logtail errors, see Diagnose collection errors.

Step 3: Search and analyze log data

You can use query statements to search and analyze logs in real time after they are collected to Log Service.

  1. In the Projects section, click the project.
  2. Choose Log Management > Logstores, find the target Logstore, click the management icon, and then select Search & Analysis.
  3. Enter a query statement in the search box, select a time range, and then click Search & Analyze.
    • Log Service provides contextual query, saved search, indexing, and other features. For more information, see Query syntax.
    • Log Service provides multiple types of charts for you to visualize the results of query statements. For more information, see Analysis chart.
    • Log Service allows you to create dashboards that display data analysis results. For more information, see Dashboard.
    For example, you can query the page views (PVs) that occurred within a specified time range and display the query results in a table, as shown in the sketch below.
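
    The following Python sketch runs such a PV query with the aliyun-log-python-sdk. The endpoint, credentials, project, and Logstore names are placeholders, and the SDK call names should be checked against the SDK reference for your version.

    ```python
    import time
    from aliyun.log import LogClient, GetLogsRequest  # pip install aliyun-log-python-sdk

    # Placeholder endpoint and credentials -- replace with your own.
    client = LogClient("cn-hangzhou.log.aliyuncs.com", "<accessKeyId>", "<accessKeySecret>")

    now = int(time.time())
    request = GetLogsRequest(
        "my-project", "my-logstore",        # placeholder project and Logstore
        now - 900, now,                     # query the last 15 minutes
        query="* | SELECT COUNT(*) AS pv",  # search everything, count PVs
    )
    response = client.get_logs(request)
    for log in response.get_logs():
        print(log.get_contents())           # e.g. {'pv': '1024'}
    ```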

What to do next

  • Data shipping: You can ship collected data to OSS, MaxCompute, EMR, and other storage or computing services. For more information, see Data shipping.
  • Data consumption: You can consume collected data. For more information, see Data consumption.