Simple Log Service: Diagnose and monitor LoongCollector

Last Updated: Mar 25, 2026

When you use LoongCollector to collect logs, you might encounter issues such as regular expression parsing failures, incorrect file paths, or traffic that exceeds the processing capacity of a shard. Simple Log Service (SLS) provides a diagnostic feature to help you locate collection errors. To monitor LoongCollector in real time, you can use built-in alert monitoring rules to receive alert notifications through channels such as DingTalk.

Prerequisites

  • You have collected logs by using LoongCollector. For more information, see Continuously collect text logs from a host.

  • You have enabled important logs for the destination Project.

    The following steps describe how to enable this feature. For more information about service logs, see Enable service logs.

    1. Log on to the Simple Log Service console. In the Project list, click the destination Project. On the details page of the destination Project, on the Service Log tab, click Enable Service Logs.

    2. In the Enable Service Log panel, select Important Log and Job Operational Log, and then click OK.

      • This operation automatically creates a Project named log-service-{user-ID}-{region} in the destination region.

      • The ingestion, storage, query, and analysis of important logs and job operational logs are free of charge. You are charged on a pay-as-you-go basis for operations such as data transformation and data shipping.
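The naming pattern of the automatically created Project can be expressed as a small helper. This is only a sketch: the function name and the sample account ID and region are hypothetical and are not part of any SLS SDK.

```python
def service_log_project_name(user_id: str, region: str) -> str:
    """Build a name that follows the pattern log-service-{user-ID}-{region}."""
    return f"log-service-{user_id}-{region}"


# Hypothetical account ID and region, for illustration only.
print(service_log_project_name("1234567890", "cn-hangzhou"))
```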

Runtime diagnostics

Diagnostics are available in two editions: Advanced Diagnostics and Basic Diagnostics.

  • Advanced Diagnostics (Recommended): Provides a diagnostic dashboard that clearly displays LoongCollector-related exceptions and lets you query for exception information over an extended period.

  • Basic Diagnostics: Provides information about collection exceptions that occurred within the last hour.

Scenarios

  • Abnormal LoongCollector status: heartbeat failures, inactive processes, or SSL certificate exceptions.

  • Log collection exceptions: uncollected logs, high collection latency, or parsing failures such as regular expression matching errors.

  • Configuration errors: Incorrect file paths, mismatched machine group IP addresses, or cross-account permission issues.

  • Performance bottlenecks: The collection rate approaches or exceeds the default limit, such as 20 MB/s, which causes logs to be dropped.

  • Container log collection issues: Frequent pod restarts or rapid log rotation that leads to incomplete collection.

  • Plugin and custom collection issues: Failures in custom plugins, such as a Grok parsing plugin, or failures in HTTP data source collection.

  • Data reliability issues: Log loss that occurs when LoongCollector is not running or log rotation is too fast.
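As a rough illustration of the performance bottleneck scenario, the following sketch classifies a measured collection rate against a limit. The 20 MB/s value is the example limit mentioned above. The 80% early-warning ratio and the function name are assumptions made for this example, not SLS settings.

```python
LIMIT_MB_PER_S = 20.0  # example default limit from the scenario list
WARN_RATIO = 0.8       # assumed early-warning ratio, not an SLS setting


def classify_rate(rate_mb_per_s: float, limit: float = LIMIT_MB_PER_S) -> str:
    """Return 'exceeded', 'approaching', or 'ok' for a collection rate in MB/s."""
    if rate_mb_per_s > limit:
        return "exceeded"
    if rate_mb_per_s >= WARN_RATIO * limit:
        return "approaching"
    return "ok"
```

A rate classified as "approaching" is a signal to check the diagnostics before logs start to be dropped.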

Procedure

  1. Log on to the Simple Log Service console. In the Project list, click the destination Project.

  2. Click Log Storage. In the list of Logstores, hover the pointer over the destination Logstore, and then click the Logtail Configuration Management icon.

  3. Click Advanced Diagnostics or Basic Diagnostics.

  4. View the diagnostic information.

    Basic diagnostics

    The Log Collection Errors panel displays a list of all LoongCollector collection errors for the Logstore. You can click an error code to view its details. For more information, see Common data collection errors in Simple Log Service.

    Advanced diagnostics

    On the LoongCollector/Logtail Exception Monitoring page, view information such as Active Clients and All Error Information. For more information about the Collection Exception Monitoring dashboard, see View data reports. For more information about error codes, see Common data collection errors in Simple Log Service.

  5. After you resolve the issues, verify that no new errors occur. LoongCollector reports errors at 10-minute intervals, so wait at least one interval before you check. Historical errors continue to appear until they expire, and you can ignore them.

    To view the full logs that were dropped due to parsing failures, you can check the LoongCollector operational logs. The paths are as follows:
    Host scenario: In the /usr/local/ilogtail/loongcollector.LOG file on the server.
    Container scenario: In the container's /usr/local/ilogtail/loongcollector.LOG file.
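    To narrow down the operational log on a host, a small script can filter lines by keyword. This is a minimal sketch: the log line format is not described here, so the keyword "parse" is only an assumed starting point, and the function name is hypothetical.

```python
from pathlib import Path
from typing import Iterator

# Path taken from the host scenario above.
LOG_PATH = Path("/usr/local/ilogtail/loongcollector.LOG")


def find_lines(path: Path, keyword: str) -> Iterator[str]:
    """Yield lines of the file that contain the keyword (case-insensitive)."""
    needle = keyword.lower()
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if needle in line.lower():
                yield line.rstrip("\n")


if __name__ == "__main__":
    # "parse" is an assumed keyword; adjust it to the wording in your own logs.
    if LOG_PATH.exists():
        for hit in find_lines(LOG_PATH, "parse"):
            print(hit)
```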

Runtime monitoring

Simple Log Service provides built-in alert policies to monitor LoongCollector in real time. You can configure these policies for the following monitoring purposes:

  • Monitor LoongCollector for heartbeat anomalies

    Query logs in the internal-diagnostic_log Logstore with the search condition __topic__:logtail_status to count the number of machines that have normal LoongCollector heartbeats. Then, configure an Alert Rule to trigger an alert if the heartbeat count falls below the expected value. This helps you troubleshoot machines that are down or have network issues. 

  • Create alerts for LoongCollector collection exceptions

    Run the __topic__: logtail_alarm query to analyze the number of exceptions of different types that occurred in the last 15 minutes. These exceptions can include unreadable files, insufficient permissions, and parsing failures. This helps you promptly identify and resolve configuration issues to prevent log loss. 

  • Receive early warnings for performance bottlenecks

    Use the Logtail Exception Monitoring dashboard to monitor the runtime status and resource usage of LoongCollector, such as CPU and memory. The dashboard displays the number of active LoongCollector instances, a list of restarts, and complete error information. This helps you identify performance bottlenecks or abnormal restarts.

  • Monitor centralized log collection

    Use the LoongCollector File Collection Monitoring dashboard to centrally manage the log collection status across multiple accounts or regions. The dashboard displays metrics such as the number of collected files, average latency, and parsing failure rate. This helps ensure collection continuity. 
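The logic behind the heartbeat and collection exception checks can be sketched offline. The example assumes that the matching records have already been fetched from the internal-diagnostic_log Logstore as dictionaries. The field names `ip` and `alarm_type`, the sample alarm types, and both function names are assumptions made for illustration. The built-in alert rules remain the supported way to run these checks.

```python
from collections import Counter
from typing import Iterable, Mapping

HEARTBEAT_QUERY = "__topic__:logtail_status"  # search condition from the text
ALARM_QUERY = "__topic__: logtail_alarm"      # search condition from the text
WINDOW_SECONDS = 15 * 60                      # the last 15 minutes


def heartbeat_alert(records: Iterable[Mapping[str, str]], expected: int) -> bool:
    """Return True when fewer distinct machines reported than expected."""
    # "ip" is an assumed field name that identifies a machine.
    alive = {r["ip"] for r in records if "ip" in r}
    return len(alive) < expected


def alarms_by_type(records: Iterable[Mapping[str, str]], now: int) -> Counter:
    """Count alarms per type among records from the last 15 minutes."""
    # "__time__" is read as Unix seconds; "alarm_type" is an assumed field name.
    recent = (r for r in records if now - int(r["__time__"]) <= WINDOW_SECONDS)
    return Counter(r["alarm_type"] for r in recent)


if __name__ == "__main__":
    # Sample records, for illustration only.
    status = [{"ip": "10.0.0.1"}, {"ip": "10.0.0.1"}, {"ip": "10.0.0.2"}]
    print(heartbeat_alert(status, expected=3))
    alarms = [
        {"__time__": "1000", "alarm_type": "REGEX_MATCH_ALARM"},
        {"__time__": "1800", "alarm_type": "REGEX_MATCH_ALARM"},
        {"__time__": "1850", "alarm_type": "PARSE_LOG_FAIL_ALARM"},
    ]
    print(alarms_by_type(alarms, now=2000))
```

Counting distinct machines rather than raw records matters because each machine reports its status repeatedly within any query window.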

Procedure

  1. Configure an Action Policy to define how notifications are sent when an alert changes status.

    1. Log on to the Simple Log Service console.

    2. In the Project list, find the Project for which you enabled important logs and click the Project name.

    3. In the left-side navigation pane, click Alerts. On the Alert Center page, choose Alert Management > Action Policy.

    4. In the list of action policies, find the sls.app.logtail.builtin Action Policy, and click Modify in the Actions column.

    5. In the Edit Action Policy dialog box, select and configure a notification channel based on your business requirements. For more information, see Notification methods. Then, click OK.

  2. Create an Alert Rule that defines the conditions under which the LoongCollector runtime status triggers an alert.

    1. On the Alert Center page, click Alert Rules, and then click the icon next to Create Alert Rule.

    2. Click Create from Template. In the Create from Template panel, under All Templates, click Logtail Error Monitoring. Then, in the panel that appears on the right, click the card for the rule that you want to create.

    3. In the Create Alert Rule panel, review the configuration. The built-in alert monitoring rule has preset parameters. Click OK. For more information about the configuration parameters, see Create an alert rule.