This topic describes how to use Logtail to collect WordPress logs.

Background information

WordPress is a blogging platform that is written in PHP and uses a MySQL database. It has evolved into a general-purpose content management system. The following example shows a sample log entry:
10.10.10.10 - - [07/Jan/2016:21:06:39 +0800] "GET /wp-admin/js/password-strength-meter.min.js?ver=4.4 HTTP/1.0" 200 776 "http://wordpress.c4a1a0aecdb1943169555231dcc4adfb7.cn-hangzhou.alicontainer.com/wp-admin/install.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/10.10.10.10 Safari/537.36"

Procedure

  1. Log on to the Log Service console.
  2. On the page that appears, click RegEx - Text Log in the Import Data section.
  3. Select the project and Logstore. Then, click Next.
  4. Create a machine group.
    • If a machine group is available, click Using Existing Machine Groups.
    • If no machine groups are available, perform the following steps to create a machine group. In this example, an Elastic Compute Service (ECS) instance is used.
      1. On the ECS Instances tab, select Manually Select Instances. Then, select the ECS instance that you want to use and click Execute Now.

        For more information, see Install Logtail on ECS instances.

        Note If you want to collect logs from self-managed clusters or servers from third-party cloud service providers, you must manually install Logtail. For more information, see Install Logtail on a Linux server or Install Logtail on a Windows server.
      2. After Logtail is installed, click Complete Installation.
      3. In the Create Machine Group step, configure Name and click Next.

        Log Service allows you to create IP address-based machine groups and custom identifier-based machine groups. For more information, see Create an IP address-based machine group and Create a custom ID-based machine group.

  5. Select the newly created machine group and move it from the Source Server Groups section to the Applied Server Groups section. Then, click Next.
    Notice If you apply a machine group immediately after it is created, the heartbeat status of the machine group may be FAIL. This issue occurs because the machine group is not connected to Log Service. In this case, you can click Automatic Retry. If the issue persists, see What do I do if no heartbeat connections are detected on Logtail?
  6. In the Logtail Config step, create a Logtail configuration file.
    Parameter Description
    Config Name The name of the Logtail configuration file. The name cannot be modified after the Logtail configuration file is created.

    You can also click Import Other Configuration to import a Logtail configuration file from another project.

    Log Path The directories and files from which log data is collected.
    The file name can be a complete name or a name that contains wildcards. For more information, see Wildcard matching. Log Service scans all levels of the specified directory to match log files. Examples:
    • If you specify /apsara/nuwa/…/*.log, Log Service matches the files whose name is suffixed by .log in the /apsara/nuwa directory and its recursive subdirectories.
    • If you specify /var/logs/app_*/*.log, Log Service matches the files that meet the following conditions: The file name ends with .log. The file is stored in a subdirectory under /var/logs or in a recursive subdirectory of the subdirectory. The name of the subdirectory matches the app_* pattern.
    Note
    • By default, logs in each log file can be collected by using only one Logtail configuration.
    • You can use only asterisks (*) or question marks (?) as wildcards in the log path.
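The matching rule described for /var/logs/app_*/*.log can be approximated with the following Python sketch. It is an illustration only, not Logtail's implementation: `matches_log_path` is a hypothetical helper, the sample paths are made up, and the sketch ignores the maximum directory monitoring depth. Python's `fnmatchcase` also accepts bracket patterns, which the log path does not support.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def matches_log_path(file_path: str, log_path: str) -> bool:
    """Approximate the log path rule: the directory pattern must match the
    leading directories of the file, and the file name must match the
    file name pattern. Deeper subdirectories are accepted (recursive scan)."""
    pattern = PurePosixPath(log_path)
    candidate = PurePosixPath(file_path)
    dir_patterns = pattern.parent.parts
    dir_parts = candidate.parent.parts
    if len(dir_parts) < len(dir_patterns):
        return False
    # zip() stops at the end of the pattern, so extra nested directories pass.
    prefix_ok = all(fnmatchcase(part, pat) for part, pat in zip(dir_parts, dir_patterns))
    return prefix_ok and fnmatchcase(candidate.name, pattern.name)


log_path = "/var/logs/app_*/*.log"
print(matches_log_path("/var/logs/app_web/error.log", log_path))
print(matches_log_path("/var/logs/app_web/2016/01/access.log", log_path))
print(matches_log_path("/var/logs/nginx/access.log", log_path))
```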
    Blacklist If you turn on this switch, you can configure a blacklist in the Add Blacklist field. You can configure a blacklist to skip the specified directories or files during log data collection. You can use exact match or wildcard match to specify directories and files. Example:
    • If you select Filter by Directory from a drop-down list in the Filter Type column and enter /home/admin/dir1 for Content, all files in the /home/admin/dir1 directory are skipped.
    • If you select Filter by Directory from a drop-down list in the Filter Type column and enter /home/admin/dir* for Content, the files in all subdirectories whose names are prefixed by dir in the /home/admin/ directory are skipped.
    • If you select Filter by Directory from a drop-down list in the Filter Type column and enter /home/admin/*/dir for Content, all files in dir directories in each subdirectory of the /home/admin/ directory are skipped.

      For example, the files in the /home/admin/a/dir directory are skipped, but the files in the /home/admin/a/b/dir directory are not skipped.

    • If you select Filter by File from a drop-down list in the Filter Type column and enter /home/admin/private*.log for Content, all files whose names are prefixed by private and suffixed by .log in the /home/admin/ directory are skipped.
    • If you select Filter by File from a drop-down list in the Filter Type column and enter /home/admin/private*/*_inner.log for Content, all files whose names are suffixed by _inner.log in the subdirectories whose names are prefixed by private in the /home/admin/ directory are skipped.

      For example, the /home/admin/private/app_inner.log file is skipped, but the /home/admin/private/app.log file is not skipped.
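The blacklist examples above can be checked with a small Python sketch. This is an approximation under stated assumptions, not Logtail's own logic: `is_blacklisted` and the `"directory"`/`"file"` filter type strings are hypothetical, wildcards are assumed to match within a single path component, and a directory filter is assumed to cover only files stored directly in a matching directory.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def _parts_match(path: PurePosixPath, pattern: str) -> bool:
    """Compare a path with a pattern one component at a time."""
    pattern_parts = PurePosixPath(pattern).parts
    return len(path.parts) == len(pattern_parts) and all(
        fnmatchcase(part, pat) for part, pat in zip(path.parts, pattern_parts)
    )


def is_blacklisted(file_path: str, filter_type: str, content: str) -> bool:
    """Return True if the file would be skipped by the blacklist entry."""
    path = PurePosixPath(file_path)
    if filter_type == "directory":   # Filter by Directory
        return _parts_match(path.parent, content)
    if filter_type == "file":        # Filter by File
        return _parts_match(path, content)
    raise ValueError(f"unknown filter type: {filter_type}")


print(is_blacklisted("/home/admin/a/dir/x.log", "directory", "/home/admin/*/dir"))
print(is_blacklisted("/home/admin/a/b/dir/x.log", "directory", "/home/admin/*/dir"))
print(is_blacklisted("/home/admin/private/app_inner.log", "file", "/home/admin/private*/*_inner.log"))
```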

    Docker File If you collect logs from Docker containers, you can configure the paths and tags of the containers. Logtail monitors the creation and destruction of the containers, filters the logs of the containers by tag, and collects the filtered logs. For more information, see Use the Log Service console to collect container text logs in DaemonSet mode.
    Mode Set the value to Full Regex Mode.
    Singleline Turn off the Singleline switch.
    Log Sample Enter the following sample log entry in the Log Sample field:
    10.10.10.10 - - [07/Jan/2016:21:06:39 +0800] "GET /wp-admin/js/password-strength-meter.min.js?ver=4.4 HTTP/1.0" 200 776 "http://wordpress.c4a1a0aecdb1943169555231dcc4adfb7.cn-hangzhou.alicontainer.com/wp-admin/install.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/10.10.10.10 Safari/537.36"
    Regex to Match First Line After you enter the sample log entry, click Auto Generate. A regular expression is generated to match the first line of a log entry. The sample log entry starts with an IP address. Therefore, the generated regular expression is \d+\.\d+\.\d+\.\d+\s-\s.*.
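To see how a first-line expression separates log entries, you can try it outside the console. The following Python sketch assumes that whole-line matching with `re.fullmatch` behaves like the console's matching; `starts_new_entry` is a hypothetical helper, and the log line is a shortened variant of the sample entry.

```python
import re

# First-line expression for entries that start with an IP address.
FIRST_LINE = re.compile(r"\d+\.\d+\.\d+\.\d+\s-\s.*")


def starts_new_entry(line: str) -> bool:
    """Return True if the line looks like the first line of a log entry."""
    return FIRST_LINE.fullmatch(line) is not None


first = '10.10.10.10 - - [07/Jan/2016:21:06:39 +0800] "GET /wp-admin/install.php HTTP/1.0" 200 776'
continuation = "    continuation text that belongs to the previous entry"

print(starts_new_entry(first))
print(starts_new_entry(continuation))
```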
    Extract Field If you turn on the Extract Field switch, you can use a regular expression to extract field values from logs.
    RegEx Set the value to (\S+) - - \[([^\]]*)] "(\S+) ([^"]+)" (\S+) (\S+) "([^"]+)" "([^"]+)". You can configure a regular expression based on one of the following methods:
    • Automatically generate a regular expression

      In the Log Sample field, select the field values to be extracted, and click Generate Regular Expression. A regular expression is automatically generated.

    • Manually enter a regular expression

      Click Manual. In the RegEx field, enter a regular expression. After you enter a regular expression in the field, click Validate to check whether the regular expression can parse the sample log entry. For more information, see How do I modify a regular expression?.
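Before you click Validate, you can also check the expression locally. The following Python sketch applies the regular expression from this step to the sample log entry. The key names other than time_local are hypothetical placeholders for the keys that you specify in Extracted Content, and Python's `re` module is assumed to treat this expression the same way as Logtail.

```python
import re

# Regular expression from the RegEx parameter.
LOG_PATTERN = re.compile(
    r'(\S+) - - \[([^\]]*)] "(\S+) ([^"]+)" (\S+) (\S+) "([^"]+)" "([^"]+)"'
)

# Hypothetical key names, one per capturing group.
KEYS = [
    "remote_addr", "time_local", "method", "request",
    "status", "body_bytes_sent", "http_referer", "http_user_agent",
]

sample = (
    '10.10.10.10 - - [07/Jan/2016:21:06:39 +0800] '
    '"GET /wp-admin/js/password-strength-meter.min.js?ver=4.4 HTTP/1.0" 200 776 '
    '"http://wordpress.c4a1a0aecdb1943169555231dcc4adfb7.cn-hangzhou.alicontainer.com/wp-admin/install.php" '
    '"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/10.10.10.10 Safari/537.36"'
)


def parse(line: str):
    """Return a dict of extracted fields, or None if the line does not match."""
    match = LOG_PATTERN.fullmatch(line)
    return dict(zip(KEYS, match.groups())) if match else None


fields = parse(sample)
for key, value in fields.items():
    print(f"{key}: {value}")
```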

    Extracted Content The field is available only after you turn on the Extract Field switch.

    After you use a regular expression to extract field values, you must specify a key for each value.

    Use System Time Turn off the Use System Time switch. Configure the time field in the %d/%b/%Y:%H:%M:%S format. You can use one of the following methods to configure the time field:
    • If you turn on Use System Time, the timestamp of a log indicates the system time when the log is collected. The system time refers to the time of the server on which Logtail runs.
    • If you turn off Use System Time, you must configure Specified Time Key and Time Format based on the value of the time field specified in Extracted Content. For more information about the time format, see Time formats.

      For example, if you set Specify Time Key to time_local and Time Format to %d/%b/%Y:%H:%M:%S, the timestamp of a log is the value of the time_local field.
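To confirm that the time format fits the value of the time field, you can parse the value locally. The following Python sketch uses `datetime.strptime`, which accepts the same format directives. Dropping the trailing +0800 offset before parsing is an assumption made for this sketch, because the format string does not cover it; `parse_time_local` is a hypothetical helper.

```python
from datetime import datetime

TIME_FORMAT = "%d/%b/%Y:%H:%M:%S"


def parse_time_local(value: str) -> datetime:
    """Parse the part of the value that the format covers.
    The trailing UTC offset, if any, is ignored in this sketch."""
    return datetime.strptime(value.split()[0], TIME_FORMAT)


timestamp = parse_time_local("07/Jan/2016:21:06:39 +0800")
print(timestamp.isoformat())  # 2016-01-07T21:06:39
```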

    Drop Failed to Parse Logs
    • If you turn on Drop Failed to Parse Logs, the logs that fail to be parsed are not uploaded to Log Service.
    • If you turn off Drop Failed to Parse Logs, the logs that fail to be parsed are still uploaded to Log Service as the value of the __raw_log__ field.
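The two behaviors can be summarized with a short Python sketch. It only models the outcome described above and is not Logtail code: `process_line` and the `field_N` key names are hypothetical.

```python
import re
from typing import Optional

LOG_PATTERN = re.compile(
    r'(\S+) - - \[([^\]]*)] "(\S+) ([^"]+)" (\S+) (\S+) "([^"]+)" "([^"]+)"'
)


def process_line(line: str, drop_failed: bool) -> Optional[dict]:
    """Return the record to upload, or None if nothing is uploaded."""
    match = LOG_PATTERN.fullmatch(line)
    if match:
        # Hypothetical key names; real keys come from Extracted Content.
        return {f"field_{i}": value for i, value in enumerate(match.groups(), start=1)}
    if drop_failed:
        return None                      # the log is not uploaded
    return {"__raw_log__": line}         # the raw line is uploaded as is


print(process_line("not an access log line", drop_failed=True))
print(process_line("not an access log line", drop_failed=False))
```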
    Maximum Directory Monitoring Depth The maximum depth at which the specified log directory is monitored. Valid values: 0 to 1000. The value 0 indicates that only the directory that is specified in the log path is monitored.
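As a rough illustration of how the depth is counted, the following Python sketch treats depth 0 as the log directory itself and adds 1 for each nested subdirectory. `within_depth` and the /var/log/wordpress paths are hypothetical and serve only as an example.

```python
from pathlib import PurePosixPath


def within_depth(file_path: str, log_dir: str, max_depth: int) -> bool:
    """Return True if the file sits at most max_depth levels below log_dir."""
    try:
        relative = PurePosixPath(file_path).parent.relative_to(PurePosixPath(log_dir))
    except ValueError:
        return False  # the file is not under log_dir at all
    return len(relative.parts) <= max_depth


print(within_depth("/var/log/wordpress/access.log", "/var/log/wordpress", 0))
print(within_depth("/var/log/wordpress/a/b/access.log", "/var/log/wordpress", 0))
print(within_depth("/var/log/wordpress/a/b/access.log", "/var/log/wordpress", 2))
```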
    You can configure advanced settings based on your business requirements. We recommend that you do not modify the advanced settings. The following table describes the parameters in the advanced settings.
    Parameter Description
    Enable Plug-in Processing If you turn on Enable Plug-in Processing, you can configure Logtail plug-ins to process logs. For more information, see Overview.
    Note If you turn on Enable Plug-in Processing, the parameters such as Upload Raw Log, Timezone, Drop Failed to Parse Logs, Filter Configuration, and Incomplete Entry Upload (Delimiter mode) become unavailable.
    Upload Raw Log If you turn on Upload Raw Log, each raw log is uploaded to Log Service as the value of the __raw__ field together with the log parsed from the raw log.
    Topic Generation Mode Select the topic generation mode. For more information, see Log topics.
    • Null - Do not generate topic: In this mode, the topic field is set to an empty string. When you query logs, you do not need to specify a topic. This is the default value.
    • Machine Group Topic Attributes: In this mode, topics are configured at the machine group level. If you want to distinguish the logs that are generated by different servers, select this mode.
    • File Path RegEx: In this mode, you must specify a regular expression in the Custom RegEx field. The part of a log path that matches the regular expression is used as the topic. If you want to distinguish the logs that are generated by different users or instances, select this mode.
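For the File Path RegEx mode, the following Python sketch shows one way a topic could be derived from a log path. It assumes that the topic is taken from a capturing group in the expression; the expression, the path, and `topic_from_path` are all hypothetical, so check Log topics for the exact rules.

```python
import re
from typing import Optional


def topic_from_path(log_path: str, custom_regex: str) -> Optional[str]:
    """Return the first captured group as the topic, or None if no match."""
    match = re.fullmatch(custom_regex, log_path)
    return match.group(1) if match else None


# Hypothetical layout: one WordPress site per user directory.
custom_regex = r"/var/www/([^/]+)/wordpress/.*"
print(topic_from_path("/var/www/userA/wordpress/access.log", custom_regex))
```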
    Log File Encoding Select the encoding format of log files. Valid values: utf8 and gbk.
    Timezone Select the time zone in which logs are collected. Valid values:
    • System Timezone: If you select this value, the time zone to which the server belongs is used. This is the default value.
    • Custom: If you select this value, you must select a time zone based on your business requirements.
    Timeout Select a timeout period for log files. If a log file is not updated within the specified period, Logtail considers the file to be timed out. Valid values:
    • Never: All log files are continuously monitored and never time out.
    • 30 Minute Timeout: If a log file is not updated within 30 minutes, Logtail considers the file to be timed out and stops monitoring the file.

      If you select 30 Minute Timeout, you must configure the Maximum Timeout Directory Depth parameter. Valid values: 1 to 3.
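The timeout rule can be pictured with the following Python sketch. It is a simplified model and not Logtail's implementation: `is_timed_out` is a hypothetical helper, and passing `None` stands in for the Never option.

```python
from datetime import datetime, timedelta
from typing import Optional


def is_timed_out(last_modified: datetime, now: datetime,
                 timeout: Optional[timedelta] = timedelta(minutes=30)) -> bool:
    """Return True if the file has not been updated within the timeout."""
    if timeout is None:          # Never: the file is always monitored
        return False
    return now - last_modified > timeout


now = datetime(2016, 1, 7, 22, 0, 0)
print(is_timed_out(datetime(2016, 1, 7, 21, 6, 39), now))
print(is_timed_out(datetime(2016, 1, 7, 21, 45, 0), now))
print(is_timed_out(datetime(2016, 1, 7, 21, 6, 39), now, timeout=None))
```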

    Filter Configuration Specify the filter conditions that you want to use to collect logs. Only the logs that match the specified filter conditions are collected. Examples:
    • Collect the logs that match the specified filter conditions: If you set Key to level and RegEx to WARNING|ERROR, only the logs whose level is WARNING or ERROR are collected.
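    • Filter out the logs that match a specified condition: Use a negative lookahead so that only the logs that do not contain the unwanted value match the regular expression. For more information about the syntax, see Regular-Expressions.info.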
      • If you set Key to level and RegEx to ^(?!.*(INFO|DEBUG)).*, the logs whose level contains INFO or DEBUG are not collected.
      • If you set Key to level and RegEx to ^(?!(INFO|DEBUG)$).*, the logs whose level is INFO or DEBUG are not collected.
      • If you set Key to url and RegEx to .*^(?!.*(healthcheck)).*, the logs whose url contains healthcheck are not collected. For example, if the url field of a log is /inner/healthcheck/jiankong.html, the log is not collected.

    For more information, see regex-exclude-word and regex-exclude-pattern.
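You can test the filter expressions above before you save the configuration. The following Python sketch assumes that a log is collected only if the whole field value matches the expression, which is modeled here with `re.fullmatch`; `is_collected` is a hypothetical helper, and the log records are made up.

```python
import re


def is_collected(log: dict, key: str, regex: str) -> bool:
    """Return True if the value of key fully matches the filter expression."""
    value = log.get(key)
    return value is not None and re.fullmatch(regex, value) is not None


# Keep only WARNING and ERROR logs.
print(is_collected({"level": "ERROR"}, "level", r"WARNING|ERROR"))
print(is_collected({"level": "INFO"}, "level", r"WARNING|ERROR"))

# Exclude logs whose level contains INFO or DEBUG.
print(is_collected({"level": "DEBUG"}, "level", r"^(?!.*(INFO|DEBUG)).*"))

# Exclude logs whose url contains healthcheck.
print(is_collected({"url": "/inner/healthcheck/jiankong.html"}, "url", r".*^(?!.*(healthcheck)).*"))
print(is_collected({"url": "/wp-admin/install.php"}, "url", r".*^(?!.*(healthcheck)).*"))
```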

    First Collection Size Specify the size of data that Logtail can collect from a log file the first time Logtail collects logs from the file. The default value of First Collection Size is 1024. Unit: KB.
    • If the file size is less than 1,024 KB, Logtail collects data from the beginning of the file.
    • If the file size is greater than 1,024 KB, Logtail collects the last 1,024 KB of data in the file.

    You can specify First Collection Size based on your business requirements. Valid values: 0 to 10485760. Unit: KB.
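The effect of First Collection Size on the starting position of the first read can be worked out as follows. This Python sketch only restates the rule above as arithmetic; `first_read_offset` is a hypothetical helper, and 1 KB is taken as 1,024 bytes.

```python
def first_read_offset(file_size_bytes: int, first_collection_kb: int = 1024) -> int:
    """Return the byte offset at which the first collection starts."""
    limit_bytes = first_collection_kb * 1024
    # Small files are read from the beginning; large files from the tail.
    return max(0, file_size_bytes - limit_bytes)


print(first_read_offset(500 * 1024))         # 0
print(first_read_offset(5 * 1024 * 1024))    # 4194304
```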

    More Configurations Specify extended settings for Logtail. For more information, see advanced.

    For example, you can specify the following extended settings to let the current Logtail configuration collect logs from files that already match a different Logtail configuration, and to set the interval at which logs are aggregated and sent to Log Service.

    {
      "force_multiconfig": true,
      "batch_send_interval": 3
    }

    After you complete the Logtail configurations, Log Service starts to collect WordPress logs.

  7. Preview data, configure indexes, and then click Next.
    By default, full-text indexing is enabled for Log Service. You can also configure field indexes based on collected logs in manual or automatic mode. For more information, see Configure indexes.
    Note If you want to query and analyze logs, you must enable full-text indexing or field indexing. If you enable both full-text indexing and field indexing, the system uses only field indexes.