This topic describes how to use Logtail to collect Python logs.
The logging module of Python is a logging system that is compatible with third-party modules or applications. The logging module defines multiple log severity levels and logging methods. The logging module consists of four components: loggers, handlers, filters, and formatters.
```python
import logging
import logging.handlers

LOG_FILE = 'tst.log'

handler = logging.handlers.RotatingFileHandler(LOG_FILE, maxBytes=1024*1024, backupCount=5)  # Create a handler object.
fmt = '%(asctime)s - %(filename)s:%(lineno)s - %(levelno)s %(levelname)s %(pathname)s %(module)s %(funcName)s %(created)f %(thread)d %(threadName)s %(process)d %(name)s - %(message)s'  # Define the output format of logs.

formatter = logging.Formatter(fmt)  # Create a formatter object.
handler.setFormatter(formatter)  # Add the formatter to the handler.

logger = logging.getLogger('tst')  # Retrieve a logger that is named tst.
logger.addHandler(handler)  # Add the handler to the logger.
logger.setLevel(logging.DEBUG)

logger.info('first info message')
logger.debug('first debug message')
```
|Format|Description|
|---|---|
|%(name)s|The name of the logger that generates a log.|
|%(levelno)s|The severity level of a log in the numeric format. Valid values: 10, 20, 30, 40, and 50.|
|%(levelname)s|The severity level of a log in the text format. Valid values: DEBUG, INFO, WARNING, ERROR, and CRITICAL.|
|%(pathname)s|The full path name of the source file where the logging call is initiated.|
|%(filename)s|The name of the source file.|
|%(module)s|The name of the module where the logging call is initiated.|
|%(funcName)s|The name of the function from which the logging call is initiated.|
|%(lineno)d|The line number in the source file where the logging call is initiated.|
|%(created)f|The time when a log is created. The value is a Unix timestamp. It represents the number of seconds that have elapsed since January 1, 1970, 00:00:00 (UTC).|
|%(relativeCreated)d|The difference between the time when a log is created and the time when the logging module is loaded. Unit: milliseconds.|
|%(asctime)s|The time when a log is created. Example: 2003-07-08 16:49:45,896. The digits after the comma (,) indicate the millisecond portion of the time.|
|%(msecs)d|The millisecond portion of the time when a log is created.|
|%(thread)d|The ID of the thread.|
|%(threadName)s|The name of the thread.|
|%(process)d|The ID of the process.|
|%(message)s|The log content.|
```
2015-03-04 23:21:59,682 - log_test.py:16 - tst - first info message
2015-03-04 23:21:59,682 - log_test.py:17 - tst - first debug message
```
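The sample above wires together three of the four components: a logger, a handler, and a formatter. The fourth component, filters, is not used in the sample. The following is a minimal sketch of a filter; the `KeywordFilter` class name is illustrative and not part of the original sample.

```python
import logging

class KeywordFilter(logging.Filter):
    """Drop log records whose message contains the word 'secret'."""
    def filter(self, record):
        # Return True to keep the record, False to drop it.
        return 'secret' not in record.getMessage()

logger = logging.getLogger('tst_filter')
handler = logging.StreamHandler()
handler.addFilter(KeywordFilter())  # Attach the filter to the handler.
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info('visible message')        # emitted
logger.info('contains secret token')  # dropped by the filter
```

Filters can be attached to either a handler or a logger; attaching the filter to the handler, as shown here, affects only that handler's output.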
- Log on to the Log Service console.
- On the page that appears, click RegEx - Text Log in the Import Data section.
- In the Specify Logstore step, select the project and Logstore, and then click Next. You can also click Create Now to create a project and a Logstore.
- In the Create Machine Group step, create a machine group.
- If a machine group is available, click Using Existing Machine Groups.
- This section uses ECS instances as an example to describe how to create a machine group. To create a machine group, perform the following steps:
- Install Logtail on ECS instances. For more information, see Install Logtail on ECS instances. If Logtail is already installed on the ECS instances, click Complete Installation.
- After the installation is complete, click Complete Installation.
- On the page that appears, specify the parameters for the machine group. For more information, see Create an IP address-based machine group or Create a custom ID-based machine group.
- In the Machine Group Settings step, apply the configurations to the machine group. Select the created machine group and move the group from Source Server Groups to Applied Server Groups.
- In the Logtail Config step, create a Logtail configuration file.
The following parameters are available:
- Config Name: The name of the Logtail configuration file. The name cannot be modified after the Logtail configuration file is created. You can also click Import Other Configuration to import a Logtail configuration file from another project.
- Log Path: The directories and files from which log data is collected. The file names can be complete names or names that contain wildcard characters. For more information, visit Wildcard matching. The log files in all levels of subdirectories under a specified directory are monitored if the log files match the specified pattern. Examples:
- /apsara/nuwa/ … /*.log indicates that the files whose extension is .log in the /apsara/nuwa directory and its subdirectories are monitored.
- /var/logs/app_* … /*.log* indicates that each file that meets the following conditions is monitored: The file name contains .log. The file is stored in a subdirectory (at all levels) of the /var/logs directory. The name of the subdirectory matches the app_* pattern.
- Each log file can be collected by using only one Logtail configuration file.
- You can include only asterisks (*) and question marks (?) as wildcard characters in the log path.
- Blacklist: If you turn on this switch, you can configure a blacklist in the Add Blacklist field. A blacklist specifies the directories or files to skip during log data collection. You can use exact match or wildcard match to specify directories and files. Examples:
- If you select Filter by Directory from the Filter Type drop-down list and enter /tmp/mydir in the Content column, all files in the directory are skipped.
- If you select Filter by File from the Filter Type drop-down list and enter /tmp/mydir/file in the Content column, only the specified file is skipped.
- Docker File: If you collect logs from Docker containers, you can configure the paths and tags of the containers. Logtail monitors the creation and destruction of the containers, filters the logs of the containers by tag, and collects the filtered logs. For more information, see Use the console to collect Kubernetes text logs in the DaemonSet mode.
- Mode: Set the value to Full Regex Mode.
- Singleline: Turn on the Singleline switch. The single-line mode indicates that each line contains one log entry.
- Log Sample: Enter the following sample log entry in the Log Sample field:
2016-02-19 11:06:52,514 - test.py:19 - 10 DEBUG test.py test <module> 1455851212.514271 139865996687072 MainThread 20193 tst - first debug message
- Extract Field: If you turn on the Extract Field switch, you can use a regular expression to extract field values from logs.
- RegEx: Set the value to (\d+-\d+-\d+\s\S+)\s-\s([^:]+):(\d+)\s+-\s+(\d+)\s+(\w+)\s+(\S+)\s+(\w+)\s+(\S+)\s+(\S+)\s+(\d+)\s+(\w+)\s+(\d+)\s+(\w+)\s+-\s+(.*). You can configure a regular expression by using one of the following methods:
- Automatically generate a regular expression
In the Log Sample field, select the field values to be extracted, and click Generate Regular Expression. A regular expression is automatically generated.
- Manually enter a regular expression
Click Manual. In the RegEx field, enter a regular expression. After you enter a regular expression in the field, click Validate to check whether the regular expression can parse the log content. For more information, see How do I modify a regular expression?.
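Before you save the configuration, you can sanity-check the extraction regular expression locally. The following Python snippet, which is an illustration and not part of the Logtail configuration, verifies that the regular expression splits the sample log entry into the 14 expected field values.

```python
import re

# The extraction regular expression from the RegEx setting above.
pattern = re.compile(
    r'(\d+-\d+-\d+\s\S+)\s-\s([^:]+):(\d+)\s+-\s+(\d+)\s+(\w+)\s+(\S+)'
    r'\s+(\w+)\s+(\S+)\s+(\S+)\s+(\d+)\s+(\w+)\s+(\d+)\s+(\w+)\s+-\s+(.*)'
)

# The sample log entry from the Log Sample field.
sample = ('2016-02-19 11:06:52,514 - test.py:19 - 10 DEBUG test.py test '
          '<module> 1455851212.514271 139865996687072 MainThread 20193 '
          'tst - first debug message')

match = pattern.match(sample)
assert match is not None
assert match.group(1) == '2016-02-19 11:06:52,514'  # asctime
assert match.group(5) == 'DEBUG'                    # levelname
assert match.group(14) == 'first debug message'     # message
```

If the regular expression fails to match, the Validate button in the console reports the same problem.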
- Extracted Content: This field is available only after you turn on the Extract Field switch. After you use a regular expression to extract field values, you must specify a key for each value.
- Use System Time: This field is available only after you turn on the Extract Field switch. Specifies whether to use the system time. If you enable the Use System Time feature, the timestamp of a log entry is the system time of the server when the log entry is collected.
- If you disable the Use System Time feature, you must find the value that indicates time information in the Extracted Content and configure a key named time for the value. Specify the value and then click Auto Generate in the Time Conversion Format field to automatically parse the time. For more information, see Time formats.
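As an illustration of the time-conversion idea, the time value extracted from the sample log entry corresponds to the conversion format %Y-%m-%d %H:%M:%S. The following snippet uses Python's strptime only to demonstrate the mapping; Python's %f directive is used here to consume the millisecond part, and Logtail's own format strings follow strptime conventions.

```python
from datetime import datetime

# Time value as it appears in the extracted content of the sample log entry.
value = '2016-02-19 11:06:52,514'

# Parse it with the matching conversion format.
parsed = datetime.strptime(value, '%Y-%m-%d %H:%M:%S,%f')
assert parsed.year == 2016 and parsed.second == 52
```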
- Drop Failed to Parse Logs: Specifies whether to drop logs that fail to be parsed. If you enable the Drop Failed to Parse Logs feature, logs that fail to be parsed are not uploaded to Log Service.
- If you disable the Drop Failed to Parse Logs feature, raw logs are uploaded to Log Service when the raw logs fail to be parsed.
- Maximum Directory Monitoring Depth: The maximum depth at which the specified log directory is monitored. Valid values: 0 to 1000. The value 0 indicates that only the directory that is specified in the log path is monitored.

You can configure advanced options based on your business requirements. We recommend that you do not modify the settings. The following parameters are included in the advanced options:
- Enable Plug-in Processing: Specifies whether to enable the plug-in processing feature. If you enable this feature, plug-ins are used to process logs. For more information, see Process data.
- Upload Raw Log: Specifies whether to upload raw logs. If you enable this feature, raw logs are written to the __raw__ field and uploaded together with the parsed logs.
- Topic Generation Mode: The mode in which topics are generated. Valid values:
- Null - Do not generate topic: This mode is selected by default. In this mode, the topic field is set to an empty string. You can query logs without the need to enter a topic.
- Machine Group Topic Attributes: This mode is used to differentiate logs that are generated by different servers.
- File Path Regex: In this mode, you must configure a regular expression in the Custom RegEx field. The part of a log path that matches the regular expression is used as the topic name. This mode is used to differentiate logs that are generated by different users or instances.
- Log File Encoding: The encoding format of log files. Valid values:
- utf8: indicates that UTF-8 encoding is used.
- gbk: indicates that GBK encoding is used.
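The encoding setting matters because the same characters are stored as different byte sequences in UTF-8 and GBK, so Logtail must know which encoding a log file uses to decode it correctly. A quick Python illustration:

```python
# The same Chinese text encodes to different bytes in UTF-8 and GBK.
text = '日志'  # "log" in Chinese

utf8_bytes = text.encode('utf-8')  # 3 bytes per character
gbk_bytes = text.encode('gbk')     # 2 bytes per character

assert utf8_bytes != gbk_bytes
# Decoding with the correct codec recovers the original text.
assert utf8_bytes.decode('utf-8') == text
assert gbk_bytes.decode('gbk') == text
```

If the wrong encoding is selected, the collected log content appears garbled.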
- Timezone: The time zone where logs are collected. Valid values:
- System Timezone: This option is selected by default. It indicates that the time zone where logs are collected is the same as the time zone to which the server belongs.
- Custom: Select a time zone.
- Timeout: The timeout period of log files. If a log file is not updated within the specified period, Logtail considers the file to be timed out. Valid values:
- Never: All log files are continuously monitored and never time out.
- 30 Minute Timeout: If a log file is not updated within 30 minutes, Logtail considers the file to be timed out and no longer monitors the file.
If you select 30 Minute Timeout, you must specify the Maximum Timeout Directory Depth parameter. Valid values: 1 to 3.
- Filter Configuration: The filter conditions that are used to collect logs. Only logs that match the specified filter conditions are collected. Examples:
- Collect logs that meet a condition: Set the filter condition to Key:level Regex:WARNING|ERROR if you need to collect only logs of the WARNING or ERROR severity level.
- Filter out logs that do not meet a condition:
- Set the filter condition to Key:level Regex:^(?!.*(INFO|DEBUG)).* if you need to filter out logs of the INFO or DEBUG severity level.
- Set the filter condition to Key:url Regex:.*^(?!.*(healthcheck)).* if you need to filter out logs whose URL contains the keyword healthcheck. For example, logs in which the value of the url key is /inner/healthcheck/jiankong.html are not collected.
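The filter-condition regular expressions above can be checked locally against sample values of the level key. The following snippet is an illustration only; the spacing of the regular expressions as printed in some renderings of this page is normalized here.

```python
import re

# Collect only logs whose level matches WARNING or ERROR.
collect = re.compile(r'WARNING|ERROR')
# Filter out logs whose level contains INFO or DEBUG
# (negative lookahead: match only values without those keywords).
drop = re.compile(r'^(?!.*(INFO|DEBUG)).*')

assert collect.search('ERROR')       # collected
assert not collect.search('INFO')    # not collected
assert drop.match('WARNING')         # kept by the lookahead filter
assert not drop.match('DEBUG')       # dropped by the lookahead filter
```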
After you complete the Logtail configurations, Log Service starts to collect Python logs.
- In the Configure Query and Analysis step, configure the indexes. Indexes are configured by default. You can reconfigure the indexes based on your business requirements. For more information, see Enable and configure the index feature for a Logstore. Note:
- You must configure Full Text Index or Field Search. If you configure both of them, the settings of Field Search are applied.
- If the data type of an index is long or double, the Case Sensitive and Delimiter settings are unavailable.