Simple Log Service: Mask data using regular expressions

Last Updated: Dec 12, 2025

Data masking reduces the exposure of sensitive data while it is transformed, transmitted, and used, which lowers the risk of leakage and protects user privacy. This topic describes common data masking scenarios and the corresponding methods, with examples for data transformation in Simple Log Service (SLS).

Background information

Data masking is commonly applied to sensitive information such as phone numbers, bank card numbers, email addresses, IP addresses, AccessKeys (AKs), ID card numbers, URLs, order numbers, and names. In the SLS data transformation service, a common masking method is to use the regexp_replace regular expression function.

Use case 1: Mask phone numbers

  • Masking method

    If a log contains a phone number that you do not want to expose, use the regexp_replace function with a regular expression to mask it.

  • Example 1: Show the first three and last four digits of a phone number and hide the middle digits.

    • Raw log

      message:{"data":{"receivePhoneNo":"13812345678"}}
    • SPL orchestration rule

      * | extend message1 = regexp_replace(message, 'receivePhoneNo":\s*"([\+86]*1[3-9]{1}\d{1})\d{4}(\d{4,11})','receivePhoneNo":"\1****\2')
    • Transformation result

      message:{"data":{"receivePhoneNo":"13812345678"}}
      message1:{"data":{"receivePhoneNo":"138****5678"}}
  • Example 2: For Hong Kong and Macao phone numbers, show the first two and last two digits and hide the middle digits.

    • Raw log

      message:{"data":{"receivePhoneNo":"59092819"}}
    • SPL orchestration rule

      * | extend message1 = regexp_replace(message, 'receivePhoneNo":\s*"(5|6|7|8|9)(\d{1})(\d{4})(\d{2})\"','receivePhoneNo":"\1\2****\4"')
    • Transformation result

      message:{"data":{"receivePhoneNo":"59092819"}}
      message1:{"data":{"receivePhoneNo":"59****19"}}
  • Example 3: For Taiwan phone numbers, show the first two and last two digits and hide the middle digits.

    • Raw log

      message:{"data":{"receivePhoneNo":"020928198"}}
    • SPL orchestration rule

      * | extend message1 = regexp_replace(message, 'receivePhoneNo":\s*"(0[2-9])(\d{5,6})(\d{2})\"','receivePhoneNo":"\1******\3"')
    • Transformation result

      message:{"data":{"receivePhoneNo":"020928198"}}
      message1:{"data":{"receivePhoneNo":"02******98"}}
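
The three phone number rules can be checked offline before they are deployed. The following is a minimal Python sketch, not SLS code. It assumes that Python's `re.sub` handles these patterns and the `\1`-style references the same way regexp_replace does. The `RULES` table and the `mask_phone` helper are hypothetical names introduced for illustration.

```python
import re

# Patterns and replacements copied from the three SPL rules above.
# Assumption: Python's re module behaves like the SLS regex engine for them.
RULES = {
    "mainland": (r'receivePhoneNo":\s*"([\+86]*1[3-9]{1}\d{1})\d{4}(\d{4,11})',
                 r'receivePhoneNo":"\1****\2'),
    "hk_macao": (r'receivePhoneNo":\s*"(5|6|7|8|9)(\d{1})(\d{4})(\d{2})\"',
                 r'receivePhoneNo":"\1\2****\4"'),
    "taiwan": (r'receivePhoneNo":\s*"(0[2-9])(\d{5,6})(\d{2})\"',
               r'receivePhoneNo":"\1******\3"'),
}


def mask_phone(message: str, region: str) -> str:
    """Apply one rule (hypothetical helper); non-matching input is returned unchanged."""
    pattern, replacement = RULES[region]
    return re.sub(pattern, replacement, message)


print(mask_phone('{"data":{"receivePhoneNo":"13812345678"}}', "mainland"))
print(mask_phone('{"data":{"receivePhoneNo":"59092819"}}', "hk_macao"))
print(mask_phone('{"data":{"receivePhoneNo":"020928198"}}', "taiwan"))
```

A rule leaves a message untouched when its pattern does not match, so applying the mainland rule to the Hong Kong sample returns the input as is.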

Use case 2: Mask bank card information

  • Masking method

    If a log contains bank card or credit card information, use the regexp_replace function with a regular expression to mask it.

  • Example

    • Raw log

      content: bank number is 491648411333978312 and credit card number is 4916484113339780
    • SPL orchestration rule

      * | extend bank_number=regexp_replace(content, '([1-9]{1})(\d{11}|\d{13}|\d{14})(\d{4})', '****\3')
    • Transformation result

      content: bank number is 491648411333978312 and credit card number is 4916484113339780
      bank_number: bank number is ****978312 and credit card number is ****9780
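
The 18-digit number keeps six trailing digits because the alternation tries `\d{11}` first, so the match covers only the first 16 digits and the last two are left in place. The Python sketch below reproduces that behavior and compares it with a hypothetical variant that adds word boundaries, so that the whole number must match and only the last four digits stay visible. `BOUNDED` is an illustrative pattern, not part of the SLS documentation, and whether `\b` is available in regexp_replace is an assumption to verify.

```python
import re

CONTENT = ("bank number is 491648411333978312 "
           "and credit card number is 4916484113339780")

# Rule copied from the SPL example: the alternation tries \d{11} first.
ORIGINAL = r'([1-9]{1})(\d{11}|\d{13}|\d{14})(\d{4})'

# Hypothetical variant: word boundaries force the whole number to match,
# so only the last four digits of each number remain.
BOUNDED = r'\b([1-9])(\d{14}|\d{13}|\d{11})(\d{4})\b'

original_result = re.sub(ORIGINAL, r'****\3', CONTENT)
bounded_result = re.sub(BOUNDED, r'****\3', CONTENT)

print(original_result)
print(bounded_result)
```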

Use case 3: Mask email addresses

  • Masking method

    If a log contains an email address, use the regexp_replace function with a regular expression to mask it.

  • Example 1: Hide the email prefix.

    • Raw log

      content: email is twiss2345@aliyun.com
    • SPL orchestration rule

      * | extend email_encrypt=regexp_replace(content, '[A-Za-z\d]+([-_.][A-Za-z\d]+)*(@([A-Za-z\d]+[-.])+[A-Za-z\d]{2,4})', '****\2')
    • Transformation result

      content: email is twiss2345@aliyun.com
      email_encrypt: email is ****@aliyun.com
  • Example 2: Mask an email address where the prefix before the at sign (@) has fewer than three characters and the domain ends in com, net, or org.

    • Raw log

      message:{"data":{"email":"tt@1111.com","icon":"ee@2.png"}}
    • SPL orchestration rule

      * | extend message1 = regexp_replace(message,'":\s*"([A-Za-z0-9._%+-]{1,2})(@\w+\.)(com|net|org)\"','":"\1**\2\3"')
    • Transformation result

      message:{"data":{"email":"tt@1111.com","icon":"ee@2.png"}}
      message1:{"data":{"email":"tt**@1111.com","icon":"ee@2.png"}}
  • Example 3: Mask an email address where the prefix before the at sign (@) has three or more characters.

    • Raw log

      message:{"data":{"email":"ttewew@1111.com","icon":"esdse@2.png"}}
    • SPL orchestration rule

      * | extend message1 = regexp_replace(message, 'email":\s*"([A-Za-z0-9._%+-]{3})([A-Za-z0-9._%+-]*)(@)(\w+\.\w+)"','email":"\1**\3\4"')
    • Transformation result

      message:{"data":{"email":"ttewew@1111.com","icon":"esdse@2.png"}}
      message1:{"data":{"email":"tte**@1111.com","icon":"esdse@2.png"}}
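
Examples 2 and 3 cover complementary prefix lengths, so one way to handle both is to apply the two rules in sequence. The Python sketch below does this under the assumption that `re.sub` is an adequate stand-in for regexp_replace. `mask_email` is a hypothetical helper name.

```python
import re

# The two JSON-field rules from Examples 2 and 3, applied one after the other.
SHORT_PREFIX = (r'":\s*"([A-Za-z0-9._%+-]{1,2})(@\w+\.)(com|net|org)\"',
                r'":"\1**\2\3"')
LONG_PREFIX = (r'email":\s*"([A-Za-z0-9._%+-]{3})([A-Za-z0-9._%+-]*)(@)(\w+\.\w+)"',
               r'email":"\1**\3\4"')


def mask_email(message: str) -> str:
    """Hypothetical helper: chain both rules so either prefix length is handled."""
    for pattern, replacement in (SHORT_PREFIX, LONG_PREFIX):
        message = re.sub(pattern, replacement, message)
    return message


print(mask_email('{"data":{"email":"tt@1111.com","icon":"ee@2.png"}}'))
print(mask_email('{"data":{"email":"ttewew@1111.com","icon":"esdse@2.png"}}'))
```

The second rule does not touch output of the first, because `*` is not in the prefix character class, so the chain is safe to run on every message.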

Use case 4: Mask an AK

  • Masking method

    If a log contains AK information, use the regexp_replace function with a regular expression to mask it.

  • Example

    • Raw log

      content: ak id is rDhc9qxjhIhlBiyphP7buo5yg5h6Eq and ak key is XQr1EPtfnlZLYlQc
    • SPL orchestration rule

      * | extend akid_encrypt=regexp_replace(content, '([a-zA-Z0-9]{4})(([a-zA-Z0-9]{26})|([a-zA-Z0-9]{12}))', '\1****')
    • Transformation result

      content: ak id is rDhc9qxjhIhlBiyphP7buo5yg5h6Eq and ak key is XQr1EPtfnlZLYlQc
      akid_encrypt: ak id is rDhc**** and ak key is XQr1****
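
The AK rule matches any run of 30 or 16 letters and digits, wherever it appears. The Python sketch below reproduces the example and then applies the same pattern to an unrelated 20-character token to show that longer runs are partially rewritten as well. It assumes `re.sub` behaves like regexp_replace for this pattern. `mask_ak` and the sample token are hypothetical.

```python
import re

# Rule copied from the SPL example: keep 4 characters, drop the next 26 or 12.
AK_PATTERN = r'([a-zA-Z0-9]{4})(([a-zA-Z0-9]{26})|([a-zA-Z0-9]{12}))'


def mask_ak(content: str) -> str:
    """Hypothetical helper that applies the AK rule to one field value."""
    return re.sub(AK_PATTERN, r'\1****', content)


print(mask_ak("ak id is rDhc9qxjhIhlBiyphP7buo5yg5h6Eq and ak key is XQr1EPtfnlZLYlQc"))

# A 20-character token is not an AK, but its first 16 characters still match.
print(mask_ak("token is abcdefghij0123456789"))
```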

Use case 5: Mask an IP address

  • Masking method

    If a log contains an IP address, use the regexp_replace function with a regular expression to mask it.

  • Example

    • Raw log

      content: ip is 192.168.1.1
    • SPL orchestration rule

      * | extend ip_encrypt=regexp_replace(content, '(\w+\s+\w+\s+)\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}', '\1****')
    • Transformation result

      content: ip is 192.168.1.1
      ip_encrypt: ip is ****
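
In a regular expression, an unescaped `.` matches any character, so an IP pattern written with bare dots also matches plain digit runs. The Python sketch below compares a bare-dot pattern with an escaped one on the sample log and on a hypothetical value that is not an IP address. It assumes Python's `re` treats `.` the same way the SLS engine does.

```python
import re

# Bare dots match any character; escaped dots match only a literal ".".
LOOSE = r'(\w+\s+\w+\s+)\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}'
STRICT = r'(\w+\s+\w+\s+)\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}'


def mask_ip(content: str, pattern: str) -> str:
    """Hypothetical helper: replace the matched address, keep the two leading words."""
    return re.sub(pattern, r'\1****', content)


print(mask_ip("ip is 192.168.1.1", STRICT))

# Hypothetical non-IP value: seven digits satisfy the bare-dot pattern.
print(mask_ip("code is 1234567", LOOSE))
print(mask_ip("code is 1234567", STRICT))
```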

Use case 6: Mask an ID card number

  • Masking method

    If a log contains ID card information, use the regexp_replace function with a regular expression to mask it.

  • Example 1: Show the first six digits of an ID card number and hide the rest.

    • Raw log

      content: Id card is 11010519491231002X
    • SPL orchestration rule

      * | extend id_encrypt=regexp_replace(content, '(\d{6})\d{9}(\d{2}[\dXx])', '\1****')
    • Transformation result

      content: Id card is 11010519491231002X
      id_encrypt: Id card is 110105****
  • Example 2: Show the first four and last three digits of an ID card number and hide the middle digits.

    • Raw log

      message:{"data":{"cardNumber":"410106171821090234","cardNumber":"E138123451","receivePhoneNo":"13812345678"}}
    • SPL orchestration rule

      * | extend message1 = regexp_replace(message, 'cardNumber":\s*"([\d]{4})[\d]{11}([\d]{2}[\dXx])\"','cardNumber":"\1****\2"')
    • Transformation result

      message:{"data":{"cardNumber":"410106171821090234","cardNumber":"E138123451","receivePhoneNo":"13812345678"}}
      message1:{"data":{"cardNumber":"4101****234","cardNumber":"E138123451","receivePhoneNo":"13812345678"}}
  • Example 3: For a passport number, show the first letter and the last three digits and hide the middle part.

    • Raw log

      message:{"data":{"cardNumber":"410106171821090234","cardNumber":"E138123451","receivePhoneNo":"13812345678"}}
    • SPL orchestration rule

      * | extend message1 = regexp_replace(message, 'cardNumber":\s*"([GEHMPBD])\d{6}(\d{3})\"','cardNumber":"\1****\2"')
    • Transformation result

      message:{"data":{"cardNumber":"410106171821090234","cardNumber":"E138123451","receivePhoneNo":"13812345678"}}
      message1:{"data":{"cardNumber":"410106171821090234","cardNumber":"E****451","receivePhoneNo":"13812345678"}}
  • Example 4: For a Hong Kong and Macao travel permit number, show the last four digits (the fifth to eighth) and hide the first four.

    • Raw log

      message:{"data":{"cardNumber":"18210902","cardNumber":"E138123451","receivePhoneNo":"13812345678"}}
    • SPL orchestration rule

      * | extend message1 = regexp_replace(message, 'cardNumber":\s*"([\d]{4})([\d]{4})\"','cardNumber":"****\2"')
    • Transformation result

      message:{"data":{"cardNumber":"18210902","cardNumber":"E138123451","receivePhoneNo":"13812345678"}}
      message1:{"data":{"cardNumber":"****0902","cardNumber":"E138123451","receivePhoneNo":"13812345678"}}
  • Example 5: For a number made up of one uppercase letter followed by nine digits, show only the first two and last two characters and hide the middle part.

    • Raw log

      message:{"data":{"cardNumber":"18210902","cardNumber":"E138123451","receivePhoneNo":"13812345678"}}
    • SPL orchestration rule

      * | extend message1 = regexp_replace(message, 'cardNumber":\s*"([A-Z])(\d{1})([\d]{6})([\d]{2})\"','cardNumber":"\1\2******\4"')
    • Transformation result

      message:{"data":{"cardNumber":"18210902","cardNumber":"E138123451","receivePhoneNo":"13812345678"}}
      message1:{"data":{"cardNumber":"18210902","cardNumber":"E1******51","receivePhoneNo":"13812345678"}}
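
Two details from the ID card examples are worth checking offline. First, inside a character class the pipe is a literal character rather than alternation, so a class such as `[\d|Xx]` also accepts `|`. Second, because each rule only rewrites the values it matches, the ID card rule and the passport rule can be chained to mask both cardNumber values in one message. The Python sketch below illustrates both points. The pipe-free classes and the `mask_documents` helper are assumptions made for illustration, and `re.sub` stands in for regexp_replace.

```python
import re

MESSAGE = ('{"data":{"cardNumber":"410106171821090234",'
           '"cardNumber":"E138123451","receivePhoneNo":"13812345678"}}')

# Inside [...] the pipe is a literal character, not alternation.
pipe_is_literal = re.fullmatch(r'[\d|Xx]', '|') is not None

# Assumed pipe-free forms of the ID card and passport rules.
ID_RULE = (r'cardNumber":\s*"(\d{4})\d{11}(\d{2}[\dXx])"',
           r'cardNumber":"\1****\2"')
PASSPORT_RULE = (r'cardNumber":\s*"([GEHMPBD])\d{6}(\d{3})"',
                 r'cardNumber":"\1****\2"')


def mask_documents(message: str) -> str:
    """Hypothetical helper: run both rules so each cardNumber value is masked."""
    for pattern, replacement in (ID_RULE, PASSPORT_RULE):
        message = re.sub(pattern, replacement, message)
    return message


print(pipe_is_literal)
print(mask_documents(MESSAGE))
```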

Use case 7: Mask a URL

  • Masking method

    To mask a URL in a log, use the url_encode function to perform URL encoding.

  • Example

    • Raw log

      url: https://www.aliyun.com/sls?logstore
    • SPL orchestration rule

      * | extend encode_url=url_encode(url)
    • Transformation result

      url: https://www.aliyun.com/sls?logstore
      encode_url: https%3A%2F%2Fwww.aliyun.com%2Fsls%3Flogstore
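
URL encoding is reversible, so it changes how the URL looks without hiding its content from anyone who decodes it. The Python sketch below shows the round trip with the standard library. It assumes that url_encode corresponds to percent-encoding with no characters treated as safe, which is what the result above suggests.

```python
from urllib.parse import quote, unquote

url = "https://www.aliyun.com/sls?logstore"

# safe="" percent-encodes ":", "/", and "?" as well.
encoded = quote(url, safe="")

# Decoding restores the original URL exactly.
decoded = unquote(encoded)

print(encoded)
print(decoded)
```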

Use case 8: Mask an order number

  • Masking method

    To mask an order number in a log so that the original value is not exposed, hash it with the md5 function.

  • Example

    • Raw log

      orderId: 15121412314
    • SPL orchestration rule

      * | extend md5_orderId=to_hex(md5(to_utf8(orderId)))
    • Transformation result

      orderId: 15121412314
      md5_orderId: 852751F9AA48303A5691B0D020E52A0A
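
The SPL rule chains three steps: convert the string to UTF-8 bytes, compute the MD5 digest, and render the digest as hexadecimal. The Python sketch below mirrors that chain with `hashlib`. The uppercase conversion is an assumption based on the result shown above, and `md5_hex_upper` is a hypothetical helper name.

```python
import hashlib


def md5_hex_upper(value: str) -> str:
    """Mirror to_hex(md5(to_utf8(value))): UTF-8 bytes, MD5 digest, uppercase hex."""
    return hashlib.md5(value.encode("utf-8")).hexdigest().upper()


digest = md5_hex_upper("15121412314")
print(digest)
```

The same input always yields the same 32-character digest, so masked order numbers can still be grouped or joined on. For the same reason, short numeric values can be recovered by hashing every candidate and comparing the results.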

Use case 9: Mask a name

  • Masking method

    If a log contains a name, use the regexp_replace function with a regular expression to mask it.

  • Example: For an English name, show only the first letter of each part.

    • Raw log

      message:{"data":{"name":"Sam Alice"}}
    • SPL orchestration rule

      * | extend message1 = regexp_replace(message, 'name":\s*"([a-zA-Z])[a-zA-Z]+\s+([a-zA-Z])[a-zA-Z]+\"','name":"\1**** \2****"')
    • Transformation result

      message:{"data":{"name":"Sam Alice"}}
      message1:{"data":{"name":"S**** A****"}}
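
When the masked field sits inside JSON, the replacement has to emit exactly the quotes that the pattern consumes, or the result no longer parses. The Python sketch below compares a rule that consumes and re-emits the closing quote with one that only re-emits it, and checks both outputs with `json.loads`. The helper `is_valid_json` and the two pattern names are hypothetical, and `re.sub` is assumed to behave like regexp_replace here.

```python
import json
import re

MESSAGE = '{"data":{"name":"Sam Alice"}}'
REPLACEMENT = r'name":"\1**** \2****"'

# Consumes the closing quote, then the replacement writes it back.
BALANCED = r'name":\s*"([a-zA-Z])[a-zA-Z]+\s+([a-zA-Z])[a-zA-Z]+"'
# Stops before the closing quote, so the replacement adds a second one.
UNBALANCED = r'name":\s*"([a-zA-Z])[a-zA-Z]+\s+([a-zA-Z])[a-zA-Z]+'


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON (hypothetical helper)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


balanced = re.sub(BALANCED, REPLACEMENT, MESSAGE)
unbalanced = re.sub(UNBALANCED, REPLACEMENT, MESSAGE)

print(balanced, is_valid_json(balanced))
print(unbalanced, is_valid_json(unbalanced))
```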