Logtail Heartbeat Troubleshooting (Host Scenario)

This article describes how to systematically troubleshoot the machine group heartbeat issue in the host scenario.

By Dumin, from Alibaba Cloud Storage Team

Preface

The machine group heartbeat is the foundation for Logtail to work properly, yet a missing heartbeat is one of the most common problems that Logtail users run into. Most heartbeat problems can be resolved by following a fixed sequence of checks. This article walks through that sequence for the host scenario.

Procedure

Step 1: Check Whether Logtail Runs as Expected

Log on to the machine where Logtail is installed and perform the check for your operating system.

  • Linux

Run the following command on the command line.

ps -ef | grep ilogtail

In normal cases, the returned result contains the following two entries, which indicates that Logtail is running. One entry is the Logtail daemon process and the other is the Logtail worker process.

UID          PID    PPID  C STIME TTY          TIME CMD
...
root          12       1  0 Nov10 ?       00:00:00 /usr/local/ilogtail/ilogtail
root          14      12  0 Nov10 ?       03:07:43 /usr/local/ilogtail/ilogtail
...

Note: If three or more ilogtail processes are returned (not counting the grep ilogtail process itself), multiple Logtail instances are running in the current environment, which may cause logs to be collected more than once. Check whether this is intended.

  • Windows
  1. Open the Run command window and enter services.msc to open the Services window.
  2. View the status of the LogtailDaemon and LogtailWorker services. If both services are in the Running state, Logtail is running.
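
On Linux, the process check above can be scripted. The following is a minimal sketch, assuming the default installation path shown in the sample output and the standard ps -ef column layout. The function names count_ilogtail and classify_ilogtail are made up for this example and are not part of Logtail.

```shell
# Count ilogtail processes in `ps -ef`-style text read from stdin.
# Assumes the command is the 8th column and Logtail is installed in
# the default location /usr/local/ilogtail.
count_ilogtail() {
  # Comparing the command column exactly keeps the `grep ilogtail`
  # process (and any other match on the name) out of the count.
  awk '$8 == "/usr/local/ilogtail/ilogtail" { n++ } END { print n + 0 }'
}

# Map the count to a verdict: 2 is the expected daemon + worker pair.
classify_ilogtail() {
  case "$1" in
    0) echo "not running" ;;
    1) echo "unexpected: only one process" ;;
    2) echo "ok" ;;
    *) echo "multiple instances" ;;
  esac
}

# Usage:
#   classify_ilogtail "$(ps -ef | count_ilogtail)"
```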

Check results

  • If Logtail is not running, see Install Logtail (Linux) or Install Logtail (Windows). When you install Logtail, make sure that you choose the installation that matches the region and network type of your Log Service project.
  • If Logtail is running, go to the next step.

Step 2: Confirm That the IP Addresses in the Machine Group Are the IP Addresses Obtained by Logtail

Logtail obtains the IP address of a server as follows.

  • If the hostname of the server is not bound to an IP address, Logtail uses the IP address of the first network interface controller (NIC) of the server.
  • If the hostname of the server is bound to an IP address, Logtail uses that IP address. You can view the binding between the hostname and the IP address in the /etc/hosts file.
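
To see which of the two cases applies on a Linux server, you can look up the hostname in /etc/hosts yourself. The sketch below only approximates the rule described above and is not Logtail's actual implementation. The function name hosts_ip_for is made up for this example.

```shell
# Print the first address that a hosts-format file binds to a name.
# Prints nothing if the name is not bound in the file.
hosts_ip_for() {
  local name="$1" file="${2:-/etc/hosts}"
  awk -v name="$name" '
    { sub(/#.*/, "") }   # strip comments before looking at the fields
    { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
  ' "$file"
}

# Usage:
#   hosts_ip_for "$(hostname)"
# An empty result suggests that the hostname is not bound, in which case
# the address of the first NIC is the one to expect in app_info.json.
```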

Step 2.1: Find the app_info.json File

Logtail records the IP address that it obtains in the ip field of the app_info.json file. The default paths of the file in different systems are as follows.

  • Linux: /usr/local/ilogtail/app_info.json
  • 64-bit Windows: C:\Program Files (x86)\Alibaba\Logtail\app_info.json
  • 32-bit Windows: C:\Program Files\Alibaba\Logtail\app_info.json

The following is sample content of the file.

{
 "UUID" : "",
 "hostname" : "iZ8vbdlzf******azuhZ",
 "instance_id" : "E9633380-***********-00163E1AA597_172.16.2.200_166****11",
 "ip" : "172.16.2.200",
 "logtail_version" : "1.3.1",
 "os" : "Linux; 4.19.91-26.1.al7.x86_64; #1 SMP Tue Jul 26 17:52:28 CST 2022; x86_64",
 "update_time" : "2022-12-27 05:38:33"
}
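
On Linux, the ip field can be pulled out of the file with a one-line filter. This is a minimal sketch that assumes the flat, one-field-per-line layout of the sample above. It uses sed rather than a JSON parser so that it has no extra dependencies, and the function name logtail_ip is made up for this example.

```shell
# Print the value of the "ip" field from an app_info.json file.
# Defaults to the Linux path listed above.
logtail_ip() {
  local file="${1:-/usr/local/ilogtail/app_info.json}"
  # Match only a line whose key is exactly "ip", so that other fields
  # containing an address (such as instance_id) are ignored.
  sed -n 's/^[[:space:]]*"ip"[[:space:]]*:[[:space:]]*"\([^"]*\)".*$/\1/p' "$file"
}

# Usage:
#   logtail_ip
```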

Step 2.2: Confirm That the IP Address Obtained by Logtail Is Used in the Machine Group

The IP addresses used in the machine group must be the IP addresses obtained by Logtail. Check the following:

  • If the machine group identifier is an IP address, check whether the IP address obtained by the target Logtail is included in the IP Address field of the machine group.
  • If the machine group identifier is a custom identifier, use the IP address obtained by Logtail to search for the status information of the machine group.

Check results

  • IP address-based machine group: If the IP Address field of the machine group contains a different address of the server (such as a public address), change it to the IP address obtained by Logtail and check whether the heartbeat of the machine is normal. If it is, the troubleshooting ends here.
  • Custom identifier-based machine group: If you searched for the machine group status with a different IP address, search again with the IP address obtained by Logtail. If the search succeeds, the troubleshooting ends here.
  • In other cases, go to the next step.

Step 3: Check Whether the Logtail Startup Parameters Are Correct

Step 3.1: Find the Logtail Configuration File

The ilogtail_config.json file records the startup parameters of Logtail. To locate the file, first check whether its path is specified in the ALIYUN_LOGTAIL_CONFIG environment variable.

echo $ALIYUN_LOGTAIL_CONFIG

If the result is not empty, the file is at the path given by the environment variable, which is generally /etc/ilogtail/conf//ilogtail_config.json. If the result is empty, the file is at the default path for your system.

  • Linux: /usr/local/ilogtail/ilogtail_config.json
  • 64-bit Windows: C:\Program Files (x86)\Alibaba\Logtail\ilogtail_config.json
  • 32-bit Windows: C:\Program Files\Alibaba\Logtail\ilogtail_config.json
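
On Linux, the lookup order described above can be written as a single parameter expansion. This is a minimal sketch, and the function name logtail_config_path is made up for this example.

```shell
# Print the path of ilogtail_config.json on Linux: the value of
# ALIYUN_LOGTAIL_CONFIG if it is set and non-empty, otherwise the default.
logtail_config_path() {
  echo "${ALIYUN_LOGTAIL_CONFIG:-/usr/local/ilogtail/ilogtail_config.json}"
}

# Usage:
#   cat "$(logtail_config_path)"
```

Keep in mind that the variable is read here from your current shell. If Logtail is started in a different environment, the value that Logtail sees may differ.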

Step 3.2: Confirm Whether the Configuration File Parameters Are Correct

The relevant part of the Logtail configuration file is as follows.

{
 "config_server_address" : "http://logtail.<config_region>.log.aliyuncs.com",
 "data_server_list" :
 [
   {
     "cluster" : "<project region>",
     "endpoint" : "<endpoint>"
   }
 ],
 ...
}

Items in angle brackets (<...>) are placeholders. The valid values are listed in the following table.

[Table: valid values of <config_region>, <project region>, and <endpoint>]

Step 4: Check Whether the Network Is Unobstructed

For Logtail to upload data, it must be able to connect to the following addresses.

  1. The address specified in the config_server_address field of the ilogtail_config.json file, and the HTTPS version of that address.
  2. http://<project name>.<endpoint>, where <endpoint> is the address specified in the data_server_list.endpoint field of the ilogtail_config.json file.
  3. http://ali-<project region>-sls-admin.<endpoint>, where <endpoint> is the address specified in the data_server_list.endpoint field of the ilogtail_config.json file.

(If Logtail connects over the internal network, the addresses include the -intranet suffix. Otherwise, they do not.)
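
The list of addresses can be assembled from the configuration values before testing. The following is a minimal sketch. The function name logtail_addresses is made up for this example, and the values passed to it are read by hand from ilogtail_config.json and the console.

```shell
# Print the addresses that Logtail must reach, one per line.
# Arguments: config_server_address, project name, project region, endpoint.
logtail_addresses() {
  local host="${1#*://}" project="$2" region="$3" endpoint="$4"
  echo "http://${host}"      # config server address
  echo "https://${host}"     # HTTPS version of the config server address
  echo "http://${project}.${endpoint}"
  echo "http://ali-${region}-sls-admin.${endpoint}"
}

# Usage (illustrative values only, replace them with your own):
#   logtail_addresses "http://logtail.cn-hangzhou.log.aliyuncs.com" \
#     "my-project" "cn-hangzhou" "cn-hangzhou.log.aliyuncs.com"
```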

The network debugging method is as follows.

  • Linux

On the command line, run curl against each of the preceding addresses in turn.

curl xxx

If every address returns content similar to the following, the network connection is working.

{"Error":{"Code":"OLSInvalidMethod","Message":"The script name is invalid : /","RequestId":"5D****09"}}

  • Windows

On the command line, use telnet to connect to each of the preceding addresses in turn. Use port 80 for an http address and port 443 for an https address.

telnet xxx 80

If every address returns content similar to the following, the network connection is working:

Trying 100*0*7*5...
Connected to xxx.
Escape character is '^]'.
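
On Linux, the curl check can be looped over all addresses. The following is a minimal sketch. The function name check_addresses and the 5-second timeout are choices made for this example. Any HTTP response counts as reachable, including an error body like the one shown above, because the goal is only to prove that the connection works.

```shell
# Report whether each address given as an argument answers over HTTP(S).
# Returns non-zero if at least one address cannot be reached.
check_addresses() {
  local url failed=0
  for url in "$@"; do
    # -s silences progress output, -o /dev/null discards the body,
    # --max-time bounds the whole request so that a blocked port
    # does not hang the loop.
    if curl -s -o /dev/null --max-time 5 "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
      failed=1
    fi
  done
  return "$failed"
}

# Usage:
#   check_addresses "http://<project name>.<endpoint>" "http://ali-<project region>-sls-admin.<endpoint>"
```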

Check results

  • If the network is not connected, check whether ports 80 and 443 are open in the network environment, whether the destination addresses are blocked, and whether other network settings (such as DNS configuration and security groups) are correct.
  • If the network connection is working, go to the next step.

Step 5: Check Whether the System Time of the Logtail Environment Is Correct

  • Linux

Run the date command on the command line to view the system time.

Wed Dec 28 06:59:26 UTC 2022

  • Windows

View the time information in the taskbar in the lower-right corner of the desktop.
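
If no trusted clock is at hand on Linux, one rough way to measure the drift is to compare the system time with the Date header of an HTTP response. The sketch below only computes the difference. It does not decide how large a drift is a problem, because this article does not state a threshold. The function name clock_skew is made up for this example, and the usage lines assume GNU date and curl.

```shell
# Print the absolute difference in seconds between two Unix timestamps.
clock_skew() {
  local diff=$(( $1 - $2 ))
  echo "${diff#-}"   # drop a leading minus sign, if any
}

# Usage (replace the URL with any HTTP server that you trust):
#   remote=$(curl -sI http://example.com | sed -n 's/^[Dd]ate: //p' | tr -d '\r')
#   clock_skew "$(date +%s)" "$(date -d "$remote" +%s)"
```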

Check results

  • If the system time is significantly ahead of or behind the actual time, set the system time to the actual time and restart Logtail. Alternatively, add the key-value pair "enable_log_time_auto_adjust": true to the ilogtail_config.json configuration file and restart Logtail. For more information about the path of the configuration file, see Step 3.1. For more information about how to restart Logtail, see Appendix.
  • If the system time matches the actual time, go to the next step.

Step 6: Check Whether the User ID Is Configured

If the server where Logtail runs is an ECS instance that belongs to a different account than the Log Service project, a server of another cloud service provider, or a server in a self-managed data center, you must configure the ID of the Alibaba Cloud account to which the project belongs as the user identifier on the server. The user identifier indicates that the account has permission to collect logs from the server through Logtail.

To check whether the user identifier is configured correctly for Logtail, first check whether it is specified in the ALIYUN_LOGTAIL_USER_ID environment variable:

echo $ALIYUN_LOGTAIL_USER_ID

If the result is not empty, check whether the value contains the ID of the Alibaba Cloud account to which the project belongs. If the result is empty, check the user identifier files. The default directories of the files in different systems are as follows.

  • Linux: /etc/ilogtail/users/
  • Windows: C:\LogtailData\users
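
On Linux, both places can be checked in one function. The following is a minimal sketch. The function names list_contains and user_id_configured are made up for this example, and the comparison assumes that the environment variable holds IDs separated by commas without surrounding spaces.

```shell
# Succeed if a comma-separated list contains the given value exactly.
list_contains() {
  case ",$1," in
    *",$2,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeed if the account ID is configured as a user identifier, either in
# ALIYUN_LOGTAIL_USER_ID or as a file in the users directory.
user_id_configured() {
  local uid="$1" dir="${2:-/etc/ilogtail/users}"
  if [ -n "${ALIYUN_LOGTAIL_USER_ID:-}" ]; then
    list_contains "$ALIYUN_LOGTAIL_USER_ID" "$uid"
  else
    [ -e "$dir/$uid" ]
  fi
}

# Usage:
#   user_id_configured "<uid>" && echo "configured" || echo "missing"
```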

Check results

  • If no user identifier is configured for Logtail, or the user identifier is configured incorrectly:
    • If the ALIYUN_LOGTAIL_USER_ID environment variable is configured, change its value to the ID of the Alibaba Cloud account to which the project belongs, or append that ID to the existing value. Separate multiple IDs with commas (,).
    • Otherwise, add or modify a user identifier file.
      • Linux: Run cd /etc/ilogtail/users/ && touch <uid> on the command line, where <uid> is the ID of the Alibaba Cloud account to which the project belongs.
      • Windows: Go to the C:\LogtailData\users directory and create an empty file named <uid>.

    After the modification, you must restart Logtail. For more information about how to restart Logtail, see Appendix.

  • If the user identifier is configured correctly, go to the next step.

Step 7: If the Machine Group Uses a Custom Identifier, Check Whether the Custom Identifier Is Configured

To check whether the custom identifier is configured correctly for Logtail, first check whether it is specified in the ALIYUN_LOGTAIL_USER_DEFINED_ID environment variable.

echo $ALIYUN_LOGTAIL_USER_DEFINED_ID

If the result is not empty, check whether the value contains the custom identifier configured in the machine group. If the result is empty, check the custom identifier file. The default paths of the file in different systems are as follows.

  • Linux: /etc/ilogtail/user_defined_id
  • Windows: C:\LogtailData\user_defined_id

Check results

  • If no custom identifier is configured for Logtail, or the custom identifier is configured incorrectly:
    • If the ALIYUN_LOGTAIL_USER_DEFINED_ID environment variable is configured, change its value to the custom identifier of the machine group, or append the custom identifier of the machine group to the existing value, separated by a comma.
    • Otherwise, edit the custom identifier file. If the file does not exist, create a file named user_defined_id and write the custom identifier of the machine group in it. If the file exists, add a new line that contains the custom identifier of the machine group.

After the modification, you must restart Logtail. For more information about how to restart Logtail, see Appendix.
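
On Linux, the file-based fix can be made idempotent so that running it twice does not add the identifier twice. The following is a minimal sketch that assumes one identifier per line. The function name ensure_custom_id is made up for this example, and writing to the default path typically requires root privileges.

```shell
# Add a custom identifier to the user_defined_id file unless it is already
# listed on a line of its own. Creates the file if it does not exist.
ensure_custom_id() {
  local id="$1" file="${2:-/etc/ilogtail/user_defined_id}"
  if [ -f "$file" ] && grep -Fxq -- "$id" "$file"; then
    echo "already present"
    return 0
  fi
  # If the file is non-empty and does not end with a newline, add one
  # first so that the new identifier starts on its own line.
  if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
    echo >> "$file"
  fi
  printf '%s\n' "$id" >> "$file"
  echo "added"
}

# Usage:
#   ensure_custom_id "<custom identifier of the machine group>"
```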

Next Steps

Most machine group heartbeat issues can be resolved through the preceding troubleshooting process. If the issue persists, submit a ticket.

Appendix

How to Restart Logtail

  • Linux: Run sudo /etc/init.d/ilogtaild restart on the command line.
  • Windows:
  1. Open the Run command window and enter services.msc to open the Services window.
  2. Restart the LogtailWorker service.

Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.
