You can create a Logtail configuration in Log Service to collect HTTP data from specified URLs. After the configuration is synchronized to the server on which Logtail is installed, Logtail sends requests to the specified URLs at a regular interval and uploads the content of each response body to Log Service as a data source. This topic describes how to configure Logtail in the Log Service console to collect HTTP data.

Prerequisites

Logtail is installed on the server that you use to collect HTTP data. For more information, see Install Logtail on a Linux server or Install Logtail on a Windows server.
Note Servers that run Linux support Logtail 0.16.0 or later. Servers that run Windows support Logtail 1.0.0.8 or later.

Implementation

Logtail initiates regular HTTP requests based on the URLs, methods, headers, and bodies specified in the Logtail configurations. After Logtail receives a response, Logtail uploads the response status code, the content of the response body, and the response time to Log Service.
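The request cycle described above can be illustrated with a minimal Python sketch. This is not Logtail's implementation: the function name probe, its arguments, and the injectable opener are assumptions made for illustration. Only the field names (_address_, _method_, _response_time_ms_, _http_response_code_, and content) come from this topic.

```python
import time
import urllib.error
import urllib.request


def probe(address, method="GET", headers=None, body=None,
          timeout_s=5.0, include_body=False,
          opener=urllib.request.urlopen):
    """Send one request and build a record shaped like the uploaded fields.

    `opener` is a hypothetical hook so the function can be exercised
    without a network; by default it is urllib's urlopen.
    """
    request = urllib.request.Request(
        address, data=body, headers=headers or {}, method=method)
    record = {"_address_": address, "_method_": method}
    started = time.perf_counter()
    try:
        with opener(request, timeout=timeout_s) as response:
            payload = response.read()
            status = response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses still carry a status code worth recording.
        payload = err.read()
        status = err.code
    elapsed_ms = (time.perf_counter() - started) * 1000.0
    record["_response_time_ms_"] = f"{elapsed_ms:.3f}"
    record["_http_response_code_"] = str(status)
    if include_body:
        # Mirrors IncludeBody: the response body goes into "content".
        record["content"] = payload.decode("utf-8", errors="replace")
    return record
```

Calling probe("http://127.0.0.1/ngx_status", include_body=True) in a loop that sleeps for the configured interval would approximate the behavior of IntervalMs. Timeouts and connection errors are not handled in this sketch.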

Features

  • Supports multiple URLs.
  • Allows you to set multiple HTTP methods.
  • Allows you to set the interval at which HTTP requests are initiated.
  • Allows you to customize request headers.
  • Supports HTTPS.
  • Allows you to check whether the content of the response body matches a fixed pattern.

Scenarios

  • Monitor application status by using HTTP APIs.
    • NGINX
    • Docker
    • Elasticsearch
    • HAProxy
    • Other services that provide monitoring HTTP APIs
  • Monitor service availability.

    Logtail monitors the availability of a service by sending requests at a regular interval to the service and checking the response status code and latency.

  • Retrieve data such as tweets and the number of followers at a regular interval.
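The availability scenario in the list above can be sketched as a check on one collected record. This is an illustrative sketch only: the function is_available and the threshold max_latency_ms are hypothetical, while the field names are the ones that this topic lists for uploaded data.

```python
def is_available(record, max_latency_ms=500.0):
    """Return True for a 2xx status code within the latency budget.

    `max_latency_ms` is an assumed threshold chosen for illustration;
    pick a value that suits the monitored service.
    """
    code = int(record["_http_response_code_"])
    latency_ms = float(record["_response_time_ms_"])
    return 200 <= code < 300 and latency_ms <= max_latency_ms


# A record shaped like the uploaded fields, with illustrative values.
record = {"_http_response_code_": "200", "_response_time_ms_": "1.320"}
print(is_available(record))  # prints True
```

In practice, a check like this is more likely to be expressed as a query or an alert rule on the collected data than as standalone code.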

Limits

  • A URL must start with http or https.
  • Custom certificates are not supported.
  • Interactive communications are not supported.

Procedure

The following procedure shows how to collect data about the NGINX status module. Requests are sent to the URL http://127.0.0.1/ngx_status every 1,000 milliseconds. A regular expression is used to extract the status information from the response body.

  1. Log on to the Log Service console.
  2. In the Import Data section, select Custom Data Plug-in.
  3. Select the project and Logstore. Then, click Next.
  4. Create a machine group.
    • If a machine group is available, click Use Existing Machine Groups.
    • If no machine groups are available, perform the following steps to create a machine group. In this example, an Elastic Compute Service (ECS) instance is used.
      1. On the ECS Instances tab, select Manually Select Instances. Then, select the ECS instance that you want to use and click Execute Now.

        For more information, see Install Logtail on ECS instances.

        Note If you want to collect logs from an ECS instance that belongs to a different Alibaba Cloud account, a server in an on-premises data center, or a server of a third-party cloud service provider, you must manually install Logtail. For more information, see Install Logtail on a Linux server or Install Logtail on a Windows server. After you manually install Logtail, you must configure a user identifier on the server. For more information, see Configure a user identifier.
      2. After Logtail is installed, click Complete Installation.
      3. In the Create Machine Group step, configure Name and click Next.

        Log Service allows you to create IP address-based machine groups and custom identifier-based machine groups. For more information, see Create an IP address-based machine group and Create a custom ID-based machine group.

  5. Select the newly created machine group and move it from the Source Server Groups section to the Applied Server Groups section. Then, click Next.
    Notice If you apply a machine group immediately after it is created, the heartbeat status of the machine group may be FAIL. This issue occurs because the machine group is not connected to Log Service. In this case, you can click Automatic Retry. If the issue persists, see What do I do if no heartbeat connections are detected on Logtail?
  6. In the Specify Data Source step, set the Config Name and Plug-in Config parameters.
    • inputs: Required. The Logtail configurations for log collection.
      Note You can configure only one type of data source in the inputs field.
    • processors: Optional. The Logtail configurations for data processing. You can configure one or more processing methods in the processors field. For more information, see Overview.
    {
     "inputs": [
         {
             "type": "metric_http",
             "detail": {
                 "IntervalMs": 1000,
                 "Addresses": [
                     "http://127.0.0.1/ngx_status"
                 ],
                 "IncludeBody": true
             }
         }
     ],
     "processors" : [
         {
             "type": "processor_regex",
             "detail" : {
                 "SourceKey": "content",
                 "Regex": "Active connections: (\\d+)\\s+server accepts handled requests\\s+(\\d+)\\s+(\\d+)\\s+(\\d+)\\s+Reading: (\\d+) Writing: (\\d+) Waiting: (\\d+). *",
                 "Keys": [
                     "connection",
                     "accepts",
                     "handled",
                     "requests",
                     "reading",
                     "writing",
                     "waiting"
                 ],
                 "FullMatch": true,
                 "NoKeyError": true,
                 "NoMatchError": true,
                 "KeepSource": false
             }
         }
     ]
    }
    | Parameter | Type | Required | Description |
    | --- | --- | --- | --- |
    | type | String | Yes | The type of the data source. Set the value to metric_http. |
    | Addresses | String array | Yes | The URLs to which requests are sent. Note: A URL must start with http or https. |
    | IntervalMs | Int | Yes | The interval between two successive requests. Unit: milliseconds. |
    | Method | String | No | The request method. Default value: GET. The value must be in uppercase. |
    | Body | String | No | The content of the HTTP request body. Default value: null. |
    | Headers | Map (key: string, value: string) | No | The HTTP request headers. Default value: null. |
    | PerAddressSleepMs | Int | No | The wait time between requests to successive URLs that are specified by the Addresses parameter. Unit: milliseconds. Default value: 100. |
    | ResponseTimeoutMs | Int | No | The timeout period for a request. Unit: milliseconds. Default value: 5000. |
    | IncludeBody | Boolean | No | Specifies whether to collect the response body. Default value: false. If you set the value to true, the content of the response body is stored in the field named content. |
    | FollowRedirects | Boolean | No | Specifies whether to automatically follow URL redirects. Default value: false. |
    | InsecureSkipVerify | Boolean | No | Specifies whether to skip the HTTPS security check. Default value: false. |
    | ResponseStringMatch | String | No | A regular expression that is matched against the response body. The result is saved to the field named _response_match_. If the response body matches the regular expression, the value of the field is yes. Otherwise, the value is no. |
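Before you paste a configuration into the console, the constraints listed for these parameters can be checked locally. The following Python sketch is not part of Logtail or Log Service: the helper names validate_detail and with_defaults are hypothetical, and the checks cover only the rules and default values stated in this topic.

```python
# Default values as documented for the optional parameters.
DEFAULTS = {
    "Method": "GET",
    "PerAddressSleepMs": 100,
    "ResponseTimeoutMs": 5000,
    "IncludeBody": False,
    "FollowRedirects": False,
    "InsecureSkipVerify": False,
}


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_detail(detail):
    """Return a list of problems found in a metric_http "detail" object."""
    problems = []
    addresses = detail.get("Addresses")
    if not isinstance(addresses, list):
        problems.append("Addresses is required and must be an array of strings")
    else:
        for url in addresses:
            ok = isinstance(url, str) and url.startswith(("http://", "https://"))
            if not ok:
                problems.append(f"URL must start with http or https: {url!r}")
    if not _is_int(detail.get("IntervalMs")):
        problems.append("IntervalMs is required and must be an integer")
    method = detail.get("Method", DEFAULTS["Method"])
    if not (isinstance(method, str) and method.isupper()):
        problems.append(f"Method must be uppercase: {method!r}")
    return problems


def with_defaults(detail):
    """Return a copy of detail with the documented defaults filled in."""
    return {**DEFAULTS, **detail}


example = {
    "IntervalMs": 1000,
    "Addresses": ["http://127.0.0.1/ngx_status"],
    "IncludeBody": True,
}
print(validate_detail(example))  # prints []
```

The sketch treats "starts with http or https" as requiring the http:// or https:// scheme prefix, which is an interpretation of the limit rather than a documented rule.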

Result

After data is collected, you can view the data in the Log Service console. In addition to the data that is parsed by using the regular expression, you can view the HTTP request method, request URL, response time, status code, and request result.
"Index" : "7"  
"connection" : "1"  
"accepts" : "6079"  
"handled" : "6079"  
"requests" : "11596"  
"reading" : "0"  
"writing" : "1"  
"waiting" : "0"
"_method_" : "GET"  
"_address_" : "http://127.0.0.1/ngx_status"  
"_response_time_ms_" : "1.320"  
"_http_response_code_" : "200"  
"_result_" : "success"
By default, the following fields are uploaded for each request.
| Field | Description |
| --- | --- |
| _address_ | The request URL. |
| _method_ | The request method. |
| _response_time_ms_ | The response latency. Unit: milliseconds. |
| _http_response_code_ | The HTTP status code. |
| _result_ | The request result. Valid values: success, invalid_body, match_regex_invalid, mismatch, and timeout. |
| _response_match_ | Specifies whether the content of the response body matches the regular expression that is specified by the ResponseStringMatch parameter. If the ResponseStringMatch parameter is not specified, the value of this field is null. If the ResponseStringMatch parameter is specified, the value is yes or no. |
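
To see how the Regex and Keys settings of processor_regex in the procedure produce the parsed fields in the sample result, the following Python sketch applies the same pattern to a response body. The body text is an assumption: it imitates typical NGINX status output, with values chosen to match the sample. Logtail uses its own regular expression engine, so Python's re module only illustrates the mapping from capture groups to keys and does not reproduce settings such as FullMatch.

```python
import re

# The pattern from the Plug-in Config, written as a Python raw string.
PATTERN = re.compile(
    r"Active connections: (\d+)\s+server accepts handled requests\s+"
    r"(\d+)\s+(\d+)\s+(\d+)\s+"
    r"Reading: (\d+) Writing: (\d+) Waiting: (\d+). *"
)

# The Keys array from the Plug-in Config, in capture-group order.
KEYS = ["connection", "accepts", "handled", "requests",
        "reading", "writing", "waiting"]

# Assumed response body for http://127.0.0.1/ngx_status.
body = (
    "Active connections: 1 \n"
    "server accepts handled requests\n"
    " 6079 6079 11596 \n"
    "Reading: 0 Writing: 1 Waiting: 0 \n"
)

match = PATTERN.match(body)
# Each capture group is paired with the key at the same position.
fields = dict(zip(KEYS, match.groups())) if match else {}
for key, value in fields.items():
    print(f'"{key}" : "{value}"')
```

Because the keys are assigned by position, the number of entries in Keys must equal the number of capture groups in Regex.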