Like PHP-FPM, Docker, and Apache, Nginx provides a built-in status page that reports its current connection and request counters. This topic describes how to use Logtail of Log Service to collect Nginx status information, query and analyze the collected data, create dashboards, and configure alerts so that you can monitor your Nginx cluster.

Prepare the environment

Perform the following steps to enable the Nginx status page:

  1. Check whether Nginx supports the status feature.

    Run the following command. The second line shows the expected output:

    nginx -V 2>&1 | grep -o with-http_stub_status_module
    with-http_stub_status_module
    					

    If the command prints with-http_stub_status_module, Nginx was built with the stub_status module and supports the status feature.

  2. Configure the Nginx status feature.

    Enable the status feature in the Nginx configuration file, which is /etc/nginx/nginx.conf by default. Add a location block such as the following inside a server block:

         location /private/nginx_status {
           stub_status on;
           access_log   off;
           allow 10.10.XX.XX;
           deny all;
         }
    					
    Note: The preceding configuration allows only the server whose IP address is 10.10.XX.XX to access the Nginx status page. Use the IP address of the server where Logtail is installed.
  3. Check whether the server where Logtail is installed has access to the Nginx status feature.

    Run the following command to test the feature:

    $ curl http://10.10.XX.XX/private/nginx_status
    Active connections: 1
    server accepts handled requests
    2507455 2507455 2512972
    Reading: 0 Writing: 1 Waiting: 0
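
The counters on the status page can also be read programmatically. The following Python sketch parses the sample output shown above and derives two illustrative values. The function name parse_stub_status and the derived fields dropped and requests_per_connection are assumptions made for this example; they are not part of Nginx or Logtail.

```python
def parse_stub_status(body: str) -> dict:
    """Parse the default stub_status output into a dict of numbers."""
    lines = [line.strip() for line in body.strip().splitlines()]
    # Line 1: "Active connections: N"
    active = int(lines[0].split(":")[1])
    # Line 3: cumulative counters for accepts, handled, and requests
    accepts, handled, requests = (int(v) for v in lines[2].split())
    # Line 4: "Reading: a Writing: b Waiting: c"
    parts = lines[3].replace(":", "").split()
    state = {parts[i].lower(): int(parts[i + 1]) for i in range(0, len(parts), 2)}
    return {
        "connection": active,
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        **state,
        # Hypothetical derived values, for illustration only
        "dropped": accepts - handled,
        "requests_per_connection": requests / handled if handled else 0.0,
    }


# Sample output copied from the curl command above
SAMPLE = """Active connections: 1
server accepts handled requests
 2507455 2507455 2512972
Reading: 0 Writing: 1 Waiting: 0
"""

status = parse_stub_status(SAMPLE)
print(status["connection"], status["dropped"], status["waiting"])
```

In the Nginx output, accepts and handled are normally equal. A difference between them usually means that connections were dropped because a resource limit such as worker_connections was reached.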
    					

Collect data

  1. Install Logtail.

    After installing Logtail, verify that the Logtail version is 0.16.0 or later. If the version is earlier than 0.16.0, upgrade it to the latest version as prompted.

  2. Create a collection configuration.
    1. Create a Logstore in the Log Service console. To create a collection configuration, select Nginx - Text Log in the Import Data dialog box.
    2. Specify the Nginx status page URLs and the related parameters as prompted. Logtail collects the status data by sending HTTP requests to these URLs.
      Note
      • Change the value of the Addresses field in the sample code to the URL list that you want to monitor.
      • If your Nginx status page returns a body that differs from the default format, modify the processors field so that it can parse the returned body.

      Use the following sample code:

      {
      "inputs": [
       {
            "type": "metric_http",
            "detail": {
                "IntervalMs": 60000,
                "Addresses": [
                    "http://10.10.XX.XX/private/nginx_status",
                    "http://10.10.XX.XX/private/nginx_status",
                    "http://10.10.XX.XX/private/nginx_status"
                ],
                "IncludeBody": true
            }
       }
      ],
      "processors": [
       {
            "type": "processor_regex",
            "detail": {
                "SourceKey": "content",
                "Regex": "Active connections: (\\d+)\\s+server accepts handled requests\\s+(\\d+)\\s+(\\d+)\\s+(\\d+)\\s+Reading: (\\d+) Writing: (\\d+) Waiting: (\\d+)[\\s\\S]*",
                "Keys": [
                    "connection",
                    "accepts",
                    "handled",
                    "requests",
                    "reading",
                    "writing",
                    "waiting"
                ],
                "FullMatch": true,
                "NoKeyError": true,
                "NoMatchError": true,
                "KeepSource": false
            }
       }
      ]
      }
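
To see how the processor_regex settings above turn a status body into fields, the following Python sketch applies the same regular expression to the sample status output and pairs the capture groups with the Keys list. It only approximates the plug-in: Python's re module stands in for the regular expression engine of Logtail, and extract_fields is a hypothetical helper.

```python
import json
import re

# The processor entry from the sample configuration, kept as JSON text so that
# the escaping (\\d, \\s) is identical to what the configuration contains.
PROCESSOR_JSON = r"""
{
    "type": "processor_regex",
    "detail": {
        "SourceKey": "content",
        "Regex": "Active connections: (\\d+)\\s+server accepts handled requests\\s+(\\d+)\\s+(\\d+)\\s+(\\d+)\\s+Reading: (\\d+) Writing: (\\d+) Waiting: (\\d+)[\\s\\S]*",
        "Keys": ["connection", "accepts", "handled", "requests", "reading", "writing", "waiting"]
    }
}
"""


def extract_fields(processor: dict, body: str) -> dict:
    """Approximate processor_regex: full-match the body and name the groups."""
    detail = processor["detail"]
    match = re.fullmatch(detail["Regex"], body)
    if match is None:
        # No match: the body differs from the expected format
        return {}
    return dict(zip(detail["Keys"], match.groups()))


# Status body in the default stub_status layout
BODY = (
    "Active connections: 1 \n"
    "server accepts handled requests\n"
    " 2507455 2507455 2512972 \n"
    "Reading: 0 Writing: 1 Waiting: 0 \n"
)

processor = json.loads(PROCESSOR_JSON)  # raises an error if the JSON is invalid
fields = extract_fields(processor, BODY)
print(fields)
```

Running a check like this before you save the configuration can catch both invalid JSON and a regular expression that no longer matches a customized status page. The extracted values are strings in this sketch.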
      							

Preview data

After the collection configuration is created, wait about 1 minute (IntervalMs is set to 60000 in the sample configuration) and then click Refresh next to Preview Data in the collection configuration wizard to view the collected status data. In addition to parsing and uploading the body of each HTTP response, Logtail uploads the URL, HTTP status code, HTTP method, response time, and request result.
Note: If no data is collected, check whether the collection configuration is valid JSON.
The following example shows the collected data:
_address_:http://10.10.XX.XX/private/nginx_status  
_http_response_code_:200  
_method_:GET  
_response_time_ms_:1.83716261897  
_result_:success  
accepts:33591200  
connection:450  
handled:33599550  
reading:626  
requests:39149290  
waiting:68  
writing:145  
			

Query and analyze data

Before you query and analyze data, enable indexing for the Logstore and create indexes for the fields that you want to query.

Customize queries

For more information about how to query data, see Overview. For example, you can query the following data:
  1. Query the status of the server whose IP address is 10.168.0.0: _address_ : 10.168.0.0.
  2. Query the requests whose response time is greater than 100 ms: _response_time_ms_ > 100.
  3. Query the requests whose HTTP status code is not 200: not _http_response_code_ : 200.

Perform statistical analysis

For more information about the statistical analysis syntax, see Syntax description. For example, you can use the following statements:

  • Query the average number of waiting connections, reading connections, writing connections, and connections every 5 minutes:

    *| select  avg(waiting) as waiting, avg(reading)  as reading,  avg(writing)  as writing,  avg(connection)  as connection,  from_unixtime( __time__ - __time__ % 300) as time group by __time__ - __time__ % 300 order by time limit 1440
    					
  • Query servers whose maximum number of waiting connections ranks top 10:

    *| select  max(waiting) as max_waiting, _address_ as address, from_unixtime(max(__time__)) as time group by _address_ order by max_waiting desc limit 10
    					
  • Query the number of servers in your Nginx cluster and the number of servers whose status requests failed:

    * | select  count(distinct(_address_)) as total

    not _result_ : success | select  count(distinct(_address_))
    					
  • Query the latest 10 failed requests:

    not _result_ : success | select _address_ as address, from_unixtime(__time__) as time  order by __time__ desc limit 10
    					
  • Query the total number of handled connections and requests across all servers every 5 minutes:

    *| select  avg(handled) * count(distinct(_address_)) as total_handled, avg(requests) * count(distinct(_address_)) as total_requests,  from_unixtime( __time__ - __time__ % 300) as time group by __time__ - __time__ % 300 order by time limit 1440
    					
  • Query the average latency of requests every 5 minutes:

    *| select  avg(_response_time_ms_) as avg_delay,  from_unixtime( __time__ - __time__ % 300) as time group by __time__ - __time__ % 300 order by time limit 1440
    					
  • Query the number of invalid requests (HTTP status code is not 200) and the number of valid requests, in that order:

    not _http_response_code_ : 200  | select  count(1)
    					
    _http_response_code_ : 200  | select  count(1)
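
Several of the statements above group by __time__ - __time__ % 300, which rounds the Unix timestamp of each log down to the start of its 5-minute (300-second) window. The following Python sketch reproduces that arithmetic on a few made-up records. The sample values and the helper names bucket_start and average_per_bucket are hypothetical.

```python
from collections import defaultdict
from statistics import mean

WINDOW_SECONDS = 300  # 5 minutes, as in the statements above


def bucket_start(timestamp: int, window: int = WINDOW_SECONDS) -> int:
    """Round a Unix timestamp down to the start of its window."""
    return timestamp - timestamp % window


def average_per_bucket(records: list, field: str) -> dict:
    """Mimic: select avg(field) ... group by __time__ - __time__ % 300."""
    buckets = defaultdict(list)
    for record in records:
        buckets[bucket_start(record["__time__"])].append(record[field])
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Made-up records from a single server
records = [
    {"__time__": 1700000100, "waiting": 60},
    {"__time__": 1700000160, "waiting": 70},
    {"__time__": 1700000399, "waiting": 80},
    {"__time__": 1700000400, "waiting": 100},
]

result = average_per_bucket(records, "waiting")
print(result)
```

The first three records fall into the window that starts at 1700000100, and the last record starts a new window at 1700000400. Changing 300 to another number of seconds changes the window size in the same way in both the sketch and the query statements.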
    					

Create a dashboard

By default, Log Service provides dashboards for you to view Nginx monitoring data. You can also create a custom dashboard for the collected Nginx status data. For more information, see Create and delete a dashboard.

Configure an alert

  1. Save the following query statement as a saved search named invalid_nginx_status: not _http_response_code_ : 200 | select count(1) as invalid_count.
  2. Configure an alert for this saved search. The following table provides an alert configuration example.
    Parameter                        Value
    Alert Name                       invalid_nginx_alarm
    Add to New Dashboard             nginx_status
    Chart Name                       total
    Query                            * | select count(distinct(__source__)) as total
    Search Period                    15 minutes
    Check Frequency                  15 minutes
    Trigger Condition                total>100
    Notification Trigger Threshold   1
    ActionType                       Notifications
    Content                          An error occurred while obtaining the Nginx status. Log on to the Log Service console to view the error message. Project: xxxxxxxx, Logstore: nginx_status.