NGINX provides a built-in status page that allows you to monitor NGINX status. You can use Logtail plug-ins to collect NGINX monitoring logs. You can also perform various operations on the collected logs, such as querying and analyzing the logs and configuring alerts for the logs. This way, you can monitor your NGINX cluster in a comprehensive manner.

Prerequisites

Logtail is installed on your server. For more information, see Install Logtail on a Linux server or Install Logtail on a Windows server.
Note Logtail V0.16.0 and later for Linux and Logtail V1.0.0.8 and later for Windows are supported.

Step 1: Prepare the environment

Perform the following steps to enable the NGINX status module:

  1. Run the following command to check whether the NGINX status module is supported. For more information, see Module ngx_http_stub_status_module.
    nginx -V 2>&1 | grep -o with-http_stub_status_module
    with-http_stub_status_module 

    If with-http_stub_status_module is returned, the NGINX status module is supported.

  2. Configure the NGINX status module.
    Enable the NGINX status module in the NGINX configuration file. The default NGINX configuration file is /etc/nginx/nginx.conf. The following example shows how to enable the module. For more information, see Enable Nginx Status Page.
    Note In the following example, allow 10.10.XX.XX specifies that only a server whose IP address is 10.10.XX.XX can access the NGINX status module.
         location /private/nginx_status {
           stub_status on;
           access_log   off;
           allow 10.10.XX.XX;
           deny all;
         }                       
  3. Run the following command to check whether the server on which Logtail is installed can access the NGINX status module:
    curl http://10.10.XX.XX/private/nginx_status
    If the following information is returned, the NGINX status module is enabled:
    Active connections: 1
    server accepts handled requests
    2507455 2507455 2512972
    Reading: 0 Writing: 1 Waiting: 0                       
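
The allow and deny directives in the sample configuration are checked in order, and the first rule that matches the client address decides the outcome. The following Python sketch only illustrates that first-match behavior; it is not how NGINX is implemented. The address 10.10.0.5 is a hypothetical stand-in for the masked 10.10.XX.XX value, and the function name is_allowed is made up for this example.

```python
import ipaddress

# Hypothetical rule list mirroring "allow <address>; deny all;".
# 10.10.0.5 is a placeholder; the document masks the real address as 10.10.XX.XX.
RULES = [
    ("allow", ipaddress.ip_network("10.10.0.5/32")),
    ("deny", ipaddress.ip_network("0.0.0.0/0")),  # "deny all" for IPv4 clients
]


def is_allowed(client_ip, rules=RULES):
    """Return True if the first rule that matches client_ip is an allow rule."""
    addr = ipaddress.ip_address(client_ip)
    for action, network in rules:
        if addr in network:
            return action == "allow"
    # No rule matched: access is not restricted.
    return True


print(is_allowed("10.10.0.5"))    # the server that runs Logtail in this sketch
print(is_allowed("192.168.1.1"))  # any other client
```

If the curl command in step 3 is rejected, a likely cause under this model is that the request comes from an address other than the one listed in the allow rule.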

Step 2: Collect NGINX monitoring logs

  1. Log on to the Log Service console.
  2. In the Import Data section, click Custom Data Plug-in.
  3. Select the project and Logstore. Then, click Next.
  4. Create a machine group.
    • If a machine group is available, click Use Existing Machine Groups.
    • If no machine groups are available, perform the following steps to create a machine group. In this example, an Elastic Compute Service (ECS) instance is used.
      1. On the ECS Instances tab, select Manually Select Instances. Then, select the ECS instance that you want to use and click Create.

        For more information, see Install Logtail on ECS instances.

        Important If you want to collect logs from an ECS instance that belongs to a different Alibaba Cloud account, a server in an on-premises data center, or a server of a third-party cloud service provider, you must manually install Logtail. For more information, see Install Logtail on a Linux server or Install Logtail on a Windows server.

        After you manually install Logtail, you must configure a user identifier for the server. For more information, see Configure a user identifier.

      2. After Logtail is installed, click Complete Installation.
      3. In the Create Machine Group step, configure the Name parameter and click Next.

        Log Service allows you to create IP address-based machine groups and custom identifier-based machine groups. For more information, see Create an IP address-based machine group and Create a custom identifier-based machine group.

  5. Select the new machine group from Source Server Groups and move the machine group to Applied Server Groups. Then, click Next.
    Important If you apply a machine group immediately after you create the machine group, the heartbeat status of the machine group may be FAIL. This issue occurs because the machine group is not connected to Log Service. To resolve this issue, you can click Automatic Retry. If the issue persists, see What do I do if no heartbeat connections are detected on Logtail?
  6. In the Specify Data Source step, configure Config Name and Plug-in Config. Then, click Next.
    • inputs specifies the collection configurations of your data source. This parameter is required.
      Important You can specify only one type of data source in the inputs parameter.
    • processors specifies the processing configurations that are used to parse data. You can extract fields, extract log time, desensitize data, and filter logs. This parameter is optional. You can specify one or more processing methods. For more information, see Overview.
    {
        "inputs": [
            {
                "type": "metric_http",
                "detail": {
                    "IntervalMs": 60000,
                    "Addresses": [
                        "http://10.10.XX.XX/private/nginx_status",
                        "http://10.10.XX.XX/private/nginx_status",
                        "http://10.10.XX.XX/private/nginx_status"
                    ],
                    "IncludeBody": true
                }
            }
        ],
        "processors": [
            {
                "type": "processor_regex",
                "detail": {
                    "SourceKey": "content",
                    "Regex": "Active connections: (\\d+)\\s+server accepts handled requests\\s+(\\d+)\\s+(\\d+)\\s+(\\d+)\\s+Reading: (\\d+) Writing: (\\d+) Waiting: (\\d+)[\\s\\S]*",
                    "Keys": [
                        "connection",
                        "accepts",
                        "handled",
                        "requests",
                        "reading",
                        "writing",
                        "waiting"
                    ],
                    "FullMatch": true,
                    "NoKeyError": true,
                    "NoMatchError": true,
                    "KeepSource": false
                }
            }
        ]
    }

    The following table describes the key parameters.

    Parameter | Type | Required | Description
    type | String | Yes | The type of the data source. Set the value to metric_http.
    IntervalMs | Int | Yes | The interval between two consecutive requests. Unit: milliseconds.
    Addresses | Array | Yes | The URLs that you want to monitor.
    IncludeBody | Boolean | No | Specifies whether to collect the response body of each request. Default value: false. If you set this parameter to true, the body is collected and stored in the content field.
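
When many NGINX servers are monitored, the Addresses array can be generated instead of typed by hand. The following Python sketch builds the inputs part of the configuration from a list of hosts. The helper name build_plugin_config and the host addresses are hypothetical; the parameter names and the status path are taken from the example above. The processors section is not generated here and still has to be added to the output.

```python
import json


def build_plugin_config(hosts, path="/private/nginx_status", interval_ms=60000):
    """Build the "inputs" part of the plug-in configuration for a list of hosts."""
    return {
        "inputs": [
            {
                "type": "metric_http",
                "detail": {
                    "IntervalMs": interval_ms,
                    # One status URL per monitored NGINX server.
                    "Addresses": [f"http://{host}{path}" for host in hosts],
                    "IncludeBody": True,
                },
            }
        ]
    }


# Hypothetical host addresses, used only for illustration.
config = build_plugin_config(["10.10.0.1", "10.10.0.2"])
print(json.dumps(config, indent=4))
```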
You can view the collected logs 1 minute after the Logtail configuration is created. The following example shows a collected log. By default, Log Service generates the nginx_status dashboard to display the results of query and analysis on the collected logs.
_address_:http://10.10.XX.XX/private/nginx_status  
_http_response_code_:200  
_method_:GET  
_response_time_ms_:1.83716261897  
_result_:success  
accepts:33591200  
connection:450  
handled:33599550  
reading:626  
requests:39149290  
waiting:68  
writing:145                  
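
To see how the processor_regex settings turn the status page body into the seven fields shown in the log, the same pattern can be tried out locally. The following Python sketch applies the pattern from the Regex parameter to the sample output from Step 1 and pairs the capture groups with the names from the Keys parameter. It uses Python's re module as a stand-in, on the assumption that the plug-in's regular expression engine treats this particular pattern the same way; the function name parse_status is made up for this example.

```python
import re

# Same pattern and key order as in the processor_regex configuration.
PATTERN = re.compile(
    r"Active connections: (\d+)\s+server accepts handled requests"
    r"\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"Reading: (\d+) Writing: (\d+) Waiting: (\d+)[\s\S]*"
)
KEYS = ["connection", "accepts", "handled", "requests",
        "reading", "writing", "waiting"]


def parse_status(body):
    """Map the capture groups of the status page body to the configured keys."""
    match = PATTERN.fullmatch(body)  # corresponds to "FullMatch": true
    if match is None:
        raise ValueError("body does not look like stub_status output")
    return dict(zip(KEYS, match.groups()))


# Status page output as shown in Step 1.
SAMPLE = (
    "Active connections: 1\n"
    "server accepts handled requests\n"
    "2507455 2507455 2512972\n"
    "Reading: 0 Writing: 1 Waiting: 0\n"
)

fields = parse_status(SAMPLE)
print(fields)
```

The extracted values are strings in this sketch; convert them to numbers before you do arithmetic on them locally.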

Step 3: Query and analyze logs

  1. Log on to the Log Service console.
  2. In the Projects section, click the project that you want to view.
  3. Choose Log Storage > Logstores. On the Logstores tab, click the Logstore that you want to view.
  4. Enter a query statement in the search box, and then specify a time range.

    A query statement consists of a search statement and an analytic statement in the Search statement|Analytic statement format. For more information, see Search syntax and SQL syntax and functions.

    • Query logs
      • Query the information about an IP address.
        _address_ : 10.10.0.0
      • Query the requests whose response time is greater than 100 ms.
        _response_time_ms_ > 100
      • Query the requests whose HTTP status code is not 200.
        not _http_response_code_ : 200
    • Analyze logs
      • Obtain the average numbers of waiting connections, reading connections, writing connections, and connections at 5-minute intervals.
        *| select  avg(waiting) as waiting, avg(reading)  as reading,  avg(writing)  as writing,  avg(connection)  as connection,  from_unixtime( __time__ - __time__ % 300) as time group by __time__ - __time__ % 300 order by time limit 1440                       
      • Obtain the top 10 servers that have the largest number of waiting connections.
        *| select  max(waiting) as max_waiting, _address_ as address, from_unixtime(max(__time__)) as time group by _address_ order by max_waiting desc limit 10
      • Obtain the number of distinct monitored addresses.
        * | select  count(distinct(_address_)) as total
      • Obtain the number of distinct monitored addresses for which requests failed.
        not _result_ : success | select  count(distinct(_address_))
      • Obtain the addresses of the 10 most recent failed requests.
        not _result_ : success | select _address_ as address, from_unixtime(__time__) as time  order by __time__ desc limit 10                       
      • Obtain the total number of requests at 5-minute intervals.
        *| select  avg(handled) * count(distinct(_address_)) as total_handled, avg(requests) * count(distinct(_address_)) as total_requests,  from_unixtime( __time__ - __time__ % 300) as time group by __time__ - __time__ % 300 order by time limit 1440
      • Obtain the average request latency at 5-minute intervals.
        *| select  avg(_response_time_ms_) as avg_delay,  from_unixtime( __time__ - __time__ % 300) as time group by __time__ - __time__ % 300 order by time limit 1440                      
      • Obtain the number of failed requests (first statement) and the number of successful requests (second statement).
        not _http_response_code_ : 200  | select  count(1)
        _http_response_code_ : 200  | select  count(1)
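
Several of the analytic statements above group logs by __time__ - __time__ % 300, which rounds each timestamp down to the start of its 5-minute window. The following Python sketch reproduces that arithmetic and the per-window average on a few hypothetical samples. The function names and sample values are made up for this example and do not come from Log Service.

```python
from collections import defaultdict


def bucket_start(ts, width=300):
    """Round a Unix timestamp down to the start of its window (300 s = 5 minutes)."""
    return ts - ts % width


def avg_by_bucket(samples, width=300):
    """Average the values of (timestamp, value) pairs per window, ordered by time."""
    groups = defaultdict(list)
    for ts, value in samples:
        groups[bucket_start(ts, width)].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(groups.items())}


# Hypothetical (Unix timestamp, waiting connections) samples.
samples = [
    (1699999800, 10),
    (1699999860, 20),
    (1700000099, 30),
    (1700000100, 40),  # first sample of the next 5-minute window
]
print(avg_by_bucket(samples))
```

The first three samples fall into the window that starts at 1699999800, and the last sample opens the next window, which is the grouping that the group by clause produces.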