Simple Log Service: Use SPL regular expressions to parse NGINX logs

Last Updated: Nov 10, 2025

NGINX access logs record each request that clients send to the server. Parsing these logs into structured fields is essential for business analysis and for operations and maintenance (O&M). This topic describes how to use the parse-regexp instruction with regular expressions to parse NGINX access logs.

Parse standard NGINX logs

Simple Log Service lets you parse NGINX logs using regular expressions in Structured Process Language (SPL). The following example shows how to parse an NGINX access log entry for a successful request (status code 200).

  • Raw log

    __source__:  192.168.0.1
    __tag__:__client_ip__:  192.168.254.254
    __tag__:__receive_time__:  1563443076
    content: 192.168.0.2 - - [04/Jan/2019:16:06:38 +0800] "GET http://example.aliyundoc.com/_astats?application=&inf.name=eth0 HTTP/1.1" 200 273932 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.example.com/bot.html)"
  • Parsing requirements

    • Requirement 1: Extract the ip, datetime, verb, request, protocol, code, sendbytes, referer, and useragent fields from the NGINX log. Note that the statement below stores the referer value in a field named refere.

    • Requirement 2: Further parse the request field to extract the uri_proto, uri_domain, and uri_param fields.

    • Requirement 3: Further parse the extracted uri_param field to extract the uri_path and uri_query fields.

  • SLS SPL orchestration

    • Complete orchestration

      * | parse-regexp content, '(\d+\.\d+\.\d+\.\d+) - - \[([\s\S]+)\] \"([A-Z]+) ([\S]*) ([\S]+)["] (\d+) (\d+) ["]([\S]*)["] ["]([\S\s]+)["]' as ip, datetime,verb,request,protocol,code,sendbytes,refere,useragent
        | parse-regexp request, '^(\w+):\/\/([^\/]+)(\/.*)$' as uri_proto, uri_domain, uri_param
        | parse-regexp uri_param, '([^?]*)\?(.*)' as uri_path, uri_query
    • Orchestration breakdown and corresponding results

      • The SPL orchestration for Requirement 1 is as follows.

        * | parse-regexp content, '(\d+\.\d+\.\d+\.\d+) - - \[([\s\S]+)\] \"([A-Z]+) ([\S]*) ([\S]+)["] (\d+) (\d+) ["]([\S]*)["] ["]([\S\s]+)["]' as ip, datetime,verb,request,protocol,code,sendbytes,refere,useragent

        Corresponding result:

        __source__:  192.168.0.1
        __tag__:__receive_time__:  1563443076
        code:  200
        content:  192.168.0.2 - - [04/Jan/2019:16:06:38 +0800] "GET http://example.aliyundoc.com/_astats?application=&inf.name=eth0 HTTP/1.1" 200 273932 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.example.com/bot.html)"
        datetime:  04/Jan/2019:16:06:38 +0800
        ip:  192.168.0.2
        protocol:  HTTP/1.1
        refere:  -
        request:  http://example.aliyundoc.com/_astats?application=&inf.name=eth0
        sendbytes:  273932
        useragent:  Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.example.com/bot.html)
        verb:  GET
      • The SPL orchestration for Requirement 2 is as follows.

        * | parse-regexp request, '^(\w+):\/\/([^\/]+)(\/.*)$' as uri_proto, uri_domain, uri_param

        Corresponding result:

        uri_param: /_astats?application=&inf.name=eth0
        uri_domain: example.aliyundoc.com
        uri_proto: http
      • The SPL orchestration for Requirement 3 is as follows.

        * | parse-regexp uri_param, '([^?]*)\?(.*)' as uri_path, uri_query

        Corresponding result:

        uri_path: /_astats
        uri_query: application=&inf.name=eth0
  • Final SPL processing result

    __source__:  192.168.0.1
    __tag__:__receive_time__:  1563443076
    code:  200
    content:  192.168.0.2 - - [04/Jan/2019:16:06:38 +0800] "GET http://example.aliyundoc.com/_astats?application=&inf.name=eth0 HTTP/1.1" 200 273932 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.example.com/bot.html)"
    datetime:  04/Jan/2019:16:06:38 +0800
    ip:  192.168.0.2
    protocol:  HTTP/1.1
    refere:  -
    request:  http://example.aliyundoc.com/_astats?application=&inf.name=eth0
    sendbytes:  273932
    uri_domain:  example.aliyundoc.com
    uri_proto:  http
    uri_param: /_astats?application=&inf.name=eth0
    uri_path: /_astats
    uri_query: application=&inf.name=eth0
    useragent:  Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.example.com/bot.html)
    verb:  GET
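
To sanity-check the three patterns before running them in Simple Log Service, the chain can be approximated locally. The following is a minimal sketch in Python, not SLS code: the helper `parse_regexp` is a hypothetical stand-in for one parse-regexp step, and Python's `re` module is not the regular expression engine that SLS uses, so a local match is only an indication that the pattern fits the sample log. The sketch also assumes that an unmatched step leaves the record unchanged.

```python
import re

# Patterns copied verbatim from the three parse-regexp steps above.
ACCESS_RE = r'(\d+\.\d+\.\d+\.\d+) - - \[([\s\S]+)\] \"([A-Z]+) ([\S]*) ([\S]+)["] (\d+) (\d+) ["]([\S]*)["] ["]([\S\s]+)["]'
URL_RE = r'^(\w+):\/\/([^\/]+)(\/.*)$'
QUERY_RE = r'([^?]*)\?(.*)'


def parse_regexp(record, field, pattern, names):
    """Hypothetical stand-in for one parse-regexp step.

    Searches record[field] with the pattern and stores each capture group
    under the corresponding name. If nothing matches, the record is left
    unchanged (an assumption of this sketch, not documented SLS behavior).
    """
    match = re.search(pattern, record.get(field, ""))
    if match:
        record.update(zip(names, match.groups()))
    return record


# The content field of the raw log shown above.
record = {
    "content": '192.168.0.2 - - [04/Jan/2019:16:06:38 +0800] '
               '"GET http://example.aliyundoc.com/_astats?application=&inf.name=eth0 HTTP/1.1" '
               '200 273932 "-" '
               '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.example.com/bot.html)"'
}

# Requirement 1: split the access log line into its top-level fields.
parse_regexp(record, "content", ACCESS_RE,
             ["ip", "datetime", "verb", "request", "protocol",
              "code", "sendbytes", "refere", "useragent"])
# Requirement 2: split the request URL into scheme, domain, and the rest.
parse_regexp(record, "request", URL_RE, ["uri_proto", "uri_domain", "uri_param"])
# Requirement 3: split the remainder into path and query string.
parse_regexp(record, "uri_param", QUERY_RE, ["uri_path", "uri_query"])

for key in sorted(record):
    print(f"{key}:  {record[key]}")
```

Each step reads a field produced by the previous one, which mirrors how the SPL statement pipes the output of one parse-regexp into the next. All extracted values are strings in this sketch.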

Parse non-standard NGINX logs

Use case 1: Extract keywords from the middle of a log

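You can use a regular expression with the parse-regexp instruction to extract the `Time`, `Level`, `Server`, and `Info` fields from the middle of the `message` field. In the example, each value is enclosed in square brackets and the bracketed segments are separated by tabs, so the pattern repeats `\[([^[\]]+)\]` four times, joined by `\s+`. The character class `[^[\]]+` matches one or more characters that are not square brackets, which keeps each capture inside a single pair of brackets.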

  • Example

    • Raw log

      {"message": "[2024-10-11 10:30:34.917962]\t[info]\t[SingleWorldService]\t[ResourceManager:testOut for 2, srvClusterId=1009]\t[[]     ...ewEntities/ResourceServiceComponent/ResourceManager.out:190]"}
    • SPL orchestration

      * | parse-regexp message, '\[([^[\]]+)\]\s+\[([^[\]]+)\]\s+\[([^[\]]+)\]\s+\[([^[\]]+)\]' as Time, Level, Server, Info
    • Processing result

      Time:2024-10-11 10:30:34.917962
      Level:info
      Server:SingleWorldService
      Info:ResourceManager:testOut for 2, srvClusterId=1009
      message:[2024-10-11 10:30:34.917962]	[info]	[SingleWorldService]	[ResourceManager:testOut for 2, srvClusterId=1009]	[[]     ...ewEntities/ResourceServiceComponent/ResourceManager.out:190]
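
The same kind of local check works for this pattern. The following minimal Python sketch assumes that the raw log is a JSON object whose `\t` escapes decode to tab characters, and it uses Python's `re` module rather than the engine that SLS runs, so it only indicates that the pattern fits the sample. The elided part of the path (`...ewEntities`) is kept exactly as it appears in the raw log.

```python
import json
import re

# Pattern copied verbatim from the parse-regexp step above.
BRACKET_RE = r'\[([^[\]]+)\]\s+\[([^[\]]+)\]\s+\[([^[\]]+)\]\s+\[([^[\]]+)\]'

# Raw log from the example; the raw string keeps the JSON \t escapes literal
# so that json.loads can decode them into tab characters.
raw_log = r'{"message": "[2024-10-11 10:30:34.917962]\t[info]\t[SingleWorldService]\t[ResourceManager:testOut for 2, srvClusterId=1009]\t[[]     ...ewEntities/ResourceServiceComponent/ResourceManager.out:190]"}'
message = json.loads(raw_log)["message"]

# re.search scans for the first position where four bracketed segments
# appear in a row; the trailing "[[] ..." segment is not part of the match.
match = re.search(BRACKET_RE, message)
fields = dict(zip(("Time", "Level", "Server", "Info"), match.groups())) if match else {}

for name, value in fields.items():
    print(f"{name}:{value}")
```

Because the separators between the brackets are tabs, `\s+` is needed between the groups; a literal space in the pattern would not match this sample.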