Like PHP-FPM, Docker, and Apache, NGINX provides a built-in status page that helps you monitor its state. This topic describes how to use the Logtail feature of Log Service to collect NGINX status information. It also describes how to query and analyze the collected status information, create dashboards, and configure alerts to monitor your NGINX cluster.

Prepare the environment

Perform the following operations to enable the NGINX status page:

  1. Check whether NGINX supports the status feature. For more information, visit Module ngx_http_stub_status_module.
    Run the following command:
    nginx -V 2>&1 | grep -o with-http_stub_status_module
    with-http_stub_status_module 

    If the message with-http_stub_status_module is returned, NGINX supports the status feature.

  2. Configure the NGINX status feature. For more information, visit Enable Nginx Status Page.
    Enable the status feature in the NGINX configuration file, which is typically /etc/nginx/nginx.conf. Use the following example to configure the status feature:
         location /private/nginx_status {
           stub_status on;
           access_log   off;
           allow 10.10.XX.XX;
           deny all;
         }
    						
    Note In the preceding example, only the server whose IP address is 10.10.XX.XX is allowed to access the NGINX status page.
  3. Verify that the machine on which Logtail is installed has access to the NGINX status page.
    Run the following command:
    $ curl http://10.10.XX.XX/private/nginx_status
    Active connections: 1
    server accepts handled requests
    2507455 2507455 2512972
    Reading: 0 Writing: 1 Waiting: 0
    						

Collect data

  1. Install Logtail.
    For more information, see Install Logtail in Linux. Ensure that the version is 0.16.0 or later. If the version is earlier than 0.16.0, upgrade it to the latest version as prompted.
  2. Complete the data collection configurations.
    Collection process
    1. Create a Logstore in the Log Service console, and then click Import Data. In the dialog box that appears, select Nginx - Text Log.
    2. Set the NGINX monitoring URLs and relevant parameters as prompted to collect NGINX monitoring logs through HTTP.
      Note
      • Change the value of the Addresses field in the sample code to the URL list that you want to monitor.
      • If the NGINX status message returned is different from the default one, modify the processors field to parse the body of the returned HTTP message.
      Use the following sample code:
      {
          "inputs": [
              {
                  "type": "metric_http",
                  "detail": {
                      "IntervalMs": 60000,
                      "Addresses": [
                          "http://10.10.XX.XX/private/nginx_status",
                          "http://10.10.XX.XX/private/nginx_status",
                          "http://10.10.XX.XX/private/nginx_status"
                      ],
                      "IncludeBody": true
                  }
              }
          ],
          "processors": [
              {
                  "type": "processor_regex",
                  "detail": {
                      "SourceKey": "content",
                      "Regex": "Active connections: (\\d+)\\s+server accepts handled requests\\s+(\\d+)\\s+(\\d+)\\s+(\\d+)\\s+Reading: (\\d+) Writing: (\\d+) Waiting: (\\d+)[\\s\\S]*",
                      "Keys": [
                          "connection",
                          "accepts",
                          "handled",
                          "requests",
                          "reading",
                          "writing",
                          "waiting"
                      ],
                      "FullMatch": true,
                      "NoKeyError": true,
                      "NoMatchError": true,
                      "KeepSource": false
                  }
              }
          ]
      }
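The way the processors field maps the status page body to keys can be checked offline before the configuration is saved. The following Python sketch is only an illustration, not part of Logtail: it loads an abbreviated copy of the processor section, confirms that the number of capture groups equals the number of entries in Keys, and applies the regular expression to a sample body. The names CONFIG, SAMPLE_BODY, and parse_status are hypothetical, and the use of a full match is an assumption about how FullMatch behaves.

```python
import json
import re

# Abbreviated copy of the processor section from the sample configuration.
CONFIG = r'''
{
  "processors": [
    {
      "type": "processor_regex",
      "detail": {
        "SourceKey": "content",
        "Regex": "Active connections: (\\d+)\\s+server accepts handled requests\\s+(\\d+)\\s+(\\d+)\\s+(\\d+)\\s+Reading: (\\d+) Writing: (\\d+) Waiting: (\\d+)[\\s\\S]*",
        "Keys": ["connection", "accepts", "handled", "requests",
                 "reading", "writing", "waiting"]
      }
    }
  ]
}
'''

# Sample body in the shape returned by the NGINX status page.
SAMPLE_BODY = (
    "Active connections: 1 \n"
    "server accepts handled requests\n"
    " 2507455 2507455 2512972 \n"
    "Reading: 0 Writing: 1 Waiting: 0 \n"
)


def parse_status(body, detail):
    """Apply the configured regex to a body and return a key/value dict.

    Returns None when the body does not match. Raises ValueError when the
    number of capture groups differs from the number of keys.
    """
    pattern = re.compile(detail["Regex"])
    keys = detail["Keys"]
    if pattern.groups != len(keys):
        raise ValueError("capture groups and Keys differ in length")
    match = pattern.fullmatch(body)  # assumption: mimics "FullMatch": true
    if match is None:
        return None
    return dict(zip(keys, match.groups()))


# json.loads raises an error if the configuration is not valid JSON.
detail = json.loads(CONFIG)["processors"][0]["detail"]
fields = parse_status(SAMPLE_BODY, detail)
print(fields)
```

If the status page of your NGINX build returns a different body, adjust the Regex and Keys values together so that each capture group still has a matching key.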
      									

Preview data

After the collection configurations are complete, wait for one minute, and then click Refresh next to Preview Data in the collection configuration wizard to view the collected status data. If you use Logtail to collect logs through HTTP, Logtail parses and uploads the body of each returned HTTP message. In addition, Logtail uploads the URL, HTTP status code, HTTP method name, response time, and request result.
Note If no data is collected, check whether the collection configuration is valid JSON.
_address_:http://10.10.XX.XX/private/nginx_status  
_http_response_code_:200  
_method_:GET  
_response_time_ms_:1.83716261897  
_result_:success  
accepts:33591200  
connection:450  
handled:33599550  
reading:626  
requests:39149290  
waiting:68  
writing:145  
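The accepts, handled, and requests fields are cumulative totals reported by the NGINX status module, so a single sample does not show the current load. A per-interval rate comes from the difference between two consecutive samples. The following Python sketch illustrates that arithmetic under stated assumptions: the function name interval_rates and the sample values are hypothetical, and a negative difference is treated as a counter reset, for example after NGINX restarts.

```python
def interval_rates(prev, curr, interval_s):
    """Return per-second rates between two samples of cumulative counters.

    prev and curr are dicts with integer "accepts", "handled", and
    "requests" values. Returns None if any counter went backwards.
    """
    rates = {}
    deltas = {}
    for key in ("accepts", "handled", "requests"):
        delta = curr[key] - prev[key]
        if delta < 0:
            # Assumption: a decrease means the counters were reset.
            return None
        deltas[key] = delta
        rates[key + "_per_s"] = delta / interval_s
    # Connections accepted but not handled during the interval.
    rates["dropped"] = deltas["accepts"] - deltas["handled"]
    return rates


# Hypothetical samples taken 60 seconds apart (the IntervalMs value above).
previous = {"accepts": 1000, "handled": 1000, "requests": 3000}
current = {"accepts": 1600, "handled": 1600, "requests": 4800}
rates = interval_rates(previous, current, 60)
print(rates)
```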
				

Query and analyze data

Before you can query and analyze data, enable and configure the index of the Logstore. For more information, see Enable and configure the index of a Logstore.

Perform custom queries

For information about how to query data, see Overview.
  • To query the status of the server whose IP address is 10.168.0.0, use the _address_ : 10.168.0.0 statement.
  • To query the requests whose response time is greater than 100 ms, use the _response_time_ms_ > 100 statement.
  • To query the requests whose HTTP status code is not 200, use the not _http_response_code_ : 200 statement.

Perform statistical analysis

For information about the statistical analysis syntax, see Syntax description.
  • To query the average number of waiting, reading, writing, and active connections in each 5-minute window, use the following statement:

    *| select  avg(waiting) as waiting, avg(reading)  as reading,  avg(writing)  as writing,  avg(connection)  as connection,  from_unixtime( __time__ - __time__ % 300) as time group by __time__ - __time__ % 300 order by time limit 1440
    						
  • To query the 10 servers with the highest maximum number of waiting connections, use the following statement:

    *| select  max(waiting) as max_waiting, _address_ as address, from_unixtime(max(__time__)) as time group by _address_ order by max_waiting desc limit 10
    						
  • To query the number of servers in your NGINX cluster and the number of servers that returned failed requests, use the following statements:

    * | select  count(distinct(_address_)) as total

    not _result_ : success | select  count(distinct(_address_))
    						
  • To query the latest 10 failed requests, use the following statement:

    not _result_ : success | select _address_ as address, from_unixtime(__time__) as time  order by __time__ desc limit 10
    						
  • To query the total number of handled connections and requests every 5 minutes, use the following statement:

    *| select  avg(handled) * count(distinct(_address_)) as total_handled, avg(requests) * count(distinct(_address_)) as total_requests,  from_unixtime( __time__ - __time__ % 300) as time group by __time__ - __time__ % 300 order by time limit 1440
    						
  • To query the average latency of requests every 5 minutes, use the following statement:

    *| select  avg(_response_time_ms_) as avg_delay,  from_unixtime( __time__ - __time__ % 300) as time group by __time__ - __time__ % 300 order by time limit 1440
    						
  • To query the number of invalid requests and the number of valid requests, use the following statements:

    not _http_response_code_ : 200  | select  count(1)
    						
    _http_response_code_ : 200  | select  count(1)
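Several of the preceding statements group by the expression __time__ - __time__ % 300, which rounds each timestamp down to the start of its 5-minute window. The following Python sketch reproduces that arithmetic on a few hypothetical samples to show which rows fall into the same window. The function names and sample values are illustrative only.

```python
def bucket_start(ts, width_s=300):
    """Round a Unix timestamp down to the start of its window."""
    return ts - ts % width_s


def avg_by_bucket(samples, width_s=300):
    """Average (timestamp, value) pairs per window, ordered by window start."""
    sums = {}
    for ts, value in samples:
        start = bucket_start(ts, width_s)
        total, count = sums.get(start, (0.0, 0))
        sums[start] = (total + value, count + 1)
    return {start: total / count
            for start, (total, count) in sorted(sums.items())}


# Hypothetical (timestamp, waiting) samples: three land in the window
# that starts at 600, and one lands in the window that starts at 900.
samples = [(600, 10), (660, 20), (899, 30), (900, 40)]
averages = avg_by_bucket(samples)
print(averages)
```

Changing 300 to another width, such as 60 or 3600, changes the window size in the same way in both the sketch and the query statements.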
    						

Create a dashboard

Configure an alert

  1. Save the query statement as a search.
    Save the following query statement as a search named invalid_nginx_status:
    not _http_response_code_ : 200 | select count(1) as invalid_count
  2. Configure an alert for the saved search. For more information, see Configure an alert. The following table provides example alert configurations.
    Parameter                        Value
    Alert Name                       invalid_nginx_alarm
    Add to New Dashboard             nginx_status
    Chart Name                       total
    Query                            * | select count(distinct(__source__)) as total
    Search Period                    15 minutes
    Frequency                        15 minutes
    Trigger Condition                total>100
    Notification Trigger Threshold   1
    Notifications                    Notifications
    Content                          Failed to obtain the NGINX status. Log on to the Log Service console to view the error message. Project: xxxxxxxx, Logstore: nginx_status.