The practice cases in this topic are based on the actual data transformation requirements submitted in tickets during daily work. This topic describes how to use LOG domain specific language (DSL) orchestration to transform logs to meet the requirements.

Scenario: convert non-standard JSON objects to JSON data and spread the objects

Assume that you want to perform secondary nesting on the collected dictionary data and spread the data. You can convert the dictionary data to JSON data and then use the e_json function to spread the data.
  • Raw log
    content: {
      'referer': '-',
      'request': 'GET /phpMyAdmin',
      'status': 404,
      'data-1': {
        'aaa': 'Mozilla',
        'bbb': 'asde'
      },
      'data-2': {
        'up_adde': '-',
        'up_host': '-'
      }
    }
  • LOG DSL orchestration
    1. Convert data in the content field to the JSON format.
      e_set("content_json",str_replace(ct_str(v("content")),"'",'"'))
      The log after processing is as follows:
      content: {
        'referer': '-',
        'request': 'GET /phpMyAdmin',
        'status': 404,
        'data-1': {
          'aaa': 'Mozilla',
          'bbb': 'asde'
        },
        'data-2': {
          'up_adde': '-',
          'up_host': '-'
        }
      }
      content_json:  {
        "referer": "-",
        "request": "GET /phpMyAdmin",
        "status": 404,
        "data-1": {
          "aaa": "Mozilla",
          "bbb": "asde"
        },
        "data-2": {
          "up_adde": "-",
          "up_host": "-"
        }
      }
    2. Spread the standardized content_json data generated after the preceding processing. For example, set depth in the JSON data to 1 to spread data at the first layer.
      e_json("content_json",depth=1,fmt='full')
      The log after spreading is as follows:
      content_json.data-1.data-1:  {"aaa": "Mozilla", "bbb": "asde"}
      content_json.data-2.data-2:  {"up_adde": "-", "up_host": "-"}
      content_json.referer:  -
      content_json.request:  GET /phpMyAdmin
      content_json.status:  404
      If you set depth to 2, the log after spreading is as follows:
      content_json.data-1.aaa:  Mozilla
      content_json.data-1.bbb:  asde
      content_json.data-2.up_adde:  -
      content_json.data-2.up_host:  -
      content_json.referer:  -
      content_json.request:  GET /phpMyAdmin
      content_json.status:  404
    3. To sum up, use the following LOG DSL rules:
      e_set("content_json",str_replace(ct_str(v("content")),"'",'"'))
      e_json("content_json",depth=2,fmt='full')
  • Log after transformation
    After the log is transformed by setting depth to 2, the following log is generated:
    content:  {
      'referer': '-',
      'request': 'GET /phpMyAdmin',
      'status': 404,
      'data-1': {
        'aaa': 'Mozilla',
        'bbb': 'asde'
      },
      'data-2': {
        'up_adde': '-',
        'up_host': '-'
      }
    }
    content_json:  {
      "referer": "-",
      "request": "GET /phpMyAdmin",
      "status": 404,
      "data-1": {
        "aaa": "Mozilla",
        "bbb": "asde"
      },
      "data-2": {
        "up_adde": "-",
        "up_host": "-"
      }
    }
    content_json.data-1.aaa:  Mozilla
    content_json.data-1.bbb:  asde
    content_json.data-2.up_adde:  -
    content_json.data-2.up_host:  -
    content_json.referer:  -
    content_json.request:  GET /phpMyAdmin
    content_json.status:  404

Convert logs in other text formats to JSON and spread the data

To spread non-standard JSON data, you can flexibly combine rules.
  • Raw log
    content : {
      "pod" => {
        "name" => "crm-learning-follow-7bc48f8b6b-m6kgb"
      }, "node" => {
        "name" => "tw5"
      }, "labels" => {
        "pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow"
      }, "container" => {
        "name" => "crm-learning-follow"
      }, "namespace" => "testing1"
    }
  • LOG DSL orchestration
    1. Convert the log to the JSON format by using the str_logtash_config_normalize function.
      e_set("normalize_data",str_logtash_config_normalize(v("content")))
    2. Use a JSON function to spread the data.
      e_json("normalize_data",depth=1,fmt='full')
    3. To sum up, use the following LOG DSL rules:
      e_set("normalize_data",str_logtash_config_normalize(v("content")))
      e_json("normalize_data",depth=1,fmt='full')
  • Log after transformation
    content : {
      "pod" => {
        "name" => "crm-learning-follow-7bc48f8b6b-m6kgb"
      }, "node" => {
        "name" => "tw5"
      }, "labels" => {
        "pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow"
      }, "container" => {
        "name" => "crm-learning-follow"
      }, "namespace" => "testing1"
    }
    normalize_data:  {
      "pod": {
        "name": "crm-learning-follow-7bc48f8b6b-m6kgb"
      },
      "node": {
        "name": "tw5"
      },
      "labels": {
        "pod-template-hash": "7bc48f8b6b",
        "app": "crm-learning-follow"
      },
      "container": {
        "name": "crm-learning-follow"
      },
      "namespace": "testing1"
    }
    normalize_data.container.container:  {"name": "crm-learning-follow"}
    normalize_data.labels.labels:  {"pod-template-hash": "7bc48f8b6b", "app": "crm-learning-follow"}
    normalize_data.namespace:  testing1
    normalize_data.node.node:  {"name": "tw5"}
    normalize_data.pod.pod:  {"name": "crm-learning-follow-7bc48f8b6b-m6kgb"}

Convert text written in special encoding formats

Hexadecimal characters that are recorded in daily work need to be decoded before they can be read. Use the str_hex_escape_encode function to perform the escape operation on hexadecimal characters.
  • Raw log
    content : "\xe4\xbd\xa0\xe5\xa5\xbd"
  • LOG DSL orchestration
    e_set("hex_encode",str_hex_escape_encode(v("content")))
  • Log after transformation
    content : "\xe4\xbd\xa0\xe5\xa5\xbd"
    hex_encode : "Hello"

Spread XML fields

You may encounter various types of data during your daily work, such as XML data. To convert XML data to JSON, use the xml_to_json function.
  • Test log
    str : <? xmlversion="1.0"? >
    <data>
        <countryname="Liechtenstein">
            <rank>1</rank>
            <year>2008</year>
            <gdppc>141100</gdppc>
            <neighborname="Austria"direction="E"/>
            <neighborname="Switzerland"direction="W"/>
        </country>
        <countryname="Singapore">
            <rank>4</rank>
            <year>2011</year>
            <gdppc>59900</gdppc>
            <neighborname="Malaysia"direction="N"/>
        </country>
        <countryname="Panama">
            <rank>68</rank>
            <year>2011</year>
            <gdppc>13600</gdppc>
            <neighborname="Costa Rica"direction="W"/>
            <neighborname="Colombia"direction="E"/>
        </country>
    </data>
  • LOG DSL orchestration
    e_set("str_json",xml_to_json(v("str")))
  • Log after transformation
    str : <? xmlversion="1.0"? >
    <data>
        <countryname="Liechtenstein">
            <rank>1</rank>
            <year>2008</year>
            <gdppc>141100</gdppc>
            <neighborname="Austria"direction="E"/>
            <neighborname="Switzerland"direction="W"/>
        </country>
        <countryname="Singapore">
            <rank>4</rank>
            <year>2011</year>
            <gdppc>59900</gdppc>
            <neighborname="Malaysia"direction="N"/>
        </country>
        <countryname="Panama">
            <rank>68</rank>
            <year>2011</year>
            <gdppc>13600</gdppc>
            <neighborname="Costa Rica"direction="W"/>
            <neighborname="Colombia"direction="E"/>
        </country>
    </data>
    str_dict :{
      "data": {
        "country": [{
          "@name": "Liechtenstein",
          "rank": "1",
          "year": "2008",
          "gdppc": "141100",
          "neighbor": [{
            "@name": "Austria",
            "@direction": "E"
          }, {
            "@name": "Switzerland",
            "@direction": "W"
          }]
        }, {
          "@name": "Singapore",
          "rank": "4",
          "year": "2011",
          "gdppc": "59900",
          "neighbor": {
            "@name": "Malaysia",
            "@direction": "N"
          }
        }, {
          "@name": "Panama",
          "rank": "68",
          "year": "2011",
          "gdppc": "13600",
          "neighbor": [{
            "@name": "Costa Rica",
            "@direction": "W"
          }, {
            "@name": "Colombia",
            "@direction": "E"
          }]
        }]
      }
    }