Simple Log Service: Use the mask function for data masking

Last Updated: Dec 21, 2025

Alibaba Cloud Simple Log Service (SLS) provides the mask function for data masking. Compared with masking based on regular expressions, the function is simpler to configure and faster to run, which makes it easier to protect sensitive data and meet compliance requirements. This topic describes how to use the function in typical use cases.

Limits

The mask function is available only in data transformation (new version) and in ingest processors.

Overview of data masking solutions

Background

In the age of AI, data is growing rapidly. Protecting personal information has become a key part of business compliance. Laws such as the General Data Protection Regulation (GDPR) have strict requirements for handling sensitive data.

Regular expression masking

Alibaba Cloud SLS provides a comprehensive data masking system. It supports flexible pipelines that combine data ingestion and masking to meet a variety of business needs.

  • LoongCollector client-side masking:

    • Configure the data masking plugin in the collection configuration to replace sensitive fields using regular expression matching.

    • Use the regexp_replace function in a Structured Process Language (SPL) statement to perform high-performance masking on the server side.

  • Combined masking with LoongCollector/SDK and ingest processors:

    • The LoongCollector client or a software development kit (SDK) handles data ingestion. An ingest processor then uses the regexp_replace function to perform masking. This method prevents high resource usage on the client side.

Upgraded masking solution: mask function

Regular expression masking can be complex, slow, and difficult to adapt. To address these issues, SLS provides the new mask function. This function offers a simpler, faster, and more intelligent data masking solution.

mask function

Function syntax

mask(field, varchar params)

Parameters

  • field: The name of the source field that contains the data to mask.

  • params: A JSON array, passed as a string, that defines one or more masking rules.
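Because params is a JSON array passed inside a single-quoted string, hand-written rules are easy to break with a stray quote or comma. One way to reduce that risk is to build the rules as data and serialize them, as in the Python sketch below. The helper name build_mask_statement and the sample rules are assumptions for illustration, the rule fields are the ones described under "param rules", and the sketch only produces statement text; it does not call SLS.

```python
import json


def build_mask_statement(field: str, rules: list) -> str:
    """Return SPL text that applies mask() to `field` (hypothetical helper)."""
    params = json.dumps(rules, separators=(",", ":"))
    # Assumption: rather than guess how SPL escapes a single quote inside a
    # string literal, this sketch rejects rules that contain one.
    if "'" in params:
        raise ValueError("single quotes in rule values are not handled here")
    return f"* | extend {field} = mask({field}, '{params}')"


# Sample rules: one keyword rule and one built-in rule in the same array.
rules = [
    {"mode": "keyword", "keys": ["userName", "wallet"],
     "maskChar": "*", "keepPrefix": 3, "keepSuffix": 3},
    {"mode": "buildin", "types": ["EMAIL", "IP_ADDRESS"]},
]

statement = build_mask_statement("content", rules)
print(statement)
```

Serializing with json.dumps guarantees that the array is valid JSON, so any remaining errors are in the rule contents rather than in the quoting.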

param rules

  • mode (required): The masking mode.

    • keyword: Keyword matching. Identifies sensitive information in common key-value pair formats, such as "key":"value", 'key':'value', or key=value, within any text.

    • buildin: Built-in rule matching.

  • types (required if mode is set to buildin): A list of built-in rules:

    • CREDIT_CARD: Matches credit or debit card numbers. These numbers have 16 to 19 digits. The rule supports formats for Visa (starts with 4), Mastercard (starts with 51-55 or 2221-2720), Amex (starts with 34 or 37), Discover (starts with 6011 or 65), and UnionPay (starts with 62).

    • IP_ADDRESS: Matches IPv4 addresses. The format is xxx.xxx.xxx.xxx.

    • EMAIL: Matches standard email addresses.

      • Format: local-part@domain.tld.

      • Rule: The local part supports letters, numbers, underscores, hyphens, and periods. The address must contain exactly one at sign (@). The domain part allows letters, numbers, and hyphens. It must contain a period, and the top-level domain (TLD) at the end must have at least 2 characters.

  • keys (required if mode is set to keyword): A list of keywords to match, such as ["userName", "wallet"].

  • maskChar (optional): The character used for masking. The default is *.

  • keepPrefix (optional): The number of characters to keep at the beginning of the field. For example, keepPrefix:3 keeps the first 3 characters.

  • keepSuffix (optional): The number of characters to keep at the end of the field. For example, keepSuffix:3 keeps the last 3 characters.
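The keepPrefix and keepSuffix arithmetic and the EMAIL rule can be approximated locally to preview what a rule would hide. The Python sketch below is an illustration under assumptions, not the SLS implementation: mask_value, mask_emails, and EMAIL_RE are hypothetical names, the regular expression is only a reading of the EMAIL rule as described above, and the handling of values too short to keep both ends is a guess.

```python
import re


def mask_value(value: str, mask_char: str = "*",
               keep_prefix: int = 0, keep_suffix: int = 0) -> str:
    """Keep the first keep_prefix and last keep_suffix characters, mask the rest."""
    hidden = len(value) - keep_prefix - keep_suffix
    if hidden <= 0:
        # Assumption: a value too short to keep both ends is masked entirely.
        return mask_char * len(value)
    return value[:keep_prefix] + mask_char * hidden + value[len(value) - keep_suffix:]


# Approximation of the described EMAIL rule: a local part of letters, digits,
# "_", "-", and "."; one "@"; domain labels of letters, digits, and "-"
# separated by periods; a final label (TLD) of at least 2 characters.
EMAIL_RE = re.compile(r"[A-Za-z0-9_.-]+@(?:[A-Za-z0-9-]+\.)+[A-Za-z0-9-]{2,}")


def mask_emails(text: str, **options) -> str:
    """Mask every substring that matches EMAIL_RE using mask_value."""
    return EMAIL_RE.sub(lambda m: mask_value(m.group(0), **options), text)


print(mask_value("203.0.113.55", keep_prefix=3, keep_suffix=3))
print(mask_emails("contact alice@example.com now", keep_prefix=2, keep_suffix=4))
```

The number of mask characters equals the value length minus keepPrefix minus keepSuffix, so the masked value keeps its original length.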

Examples

Example 1: Mask transaction data

A DeFi platform processes thousands of on-chain transactions every day. Each transaction generates a detailed log. These logs contain sensitive data, such as wallet addresses, transaction hashes, and user profile data. To comply with data protection regulations, you must mask this sensitive data while ensuring the logs can still be used for business analysis and troubleshooting. This example masks the values of the wallet, address, sourceIp, phone, and transactionHash keys, retaining the first 3 and last 3 characters of each value for traceability.

  • Raw data

    2025-08-20 18:04:40,998 INFO  blockchain-event-poller-3 [10.0.1.20] [com.service.listener.TransactionStatusListener:65] [TransactionStatusListener#handleSuccessfulTransaction]{"message":"On-chain transaction successfully confirmed","confirmationDetails":{"transactionHash":"0x2baf892e9a164b1979","status":"success","blockNumber":45101239,"gasUsed":189543,"effectiveGasPrice":"58.2 Gwei","userProfileSnapshot":{"wallet":"0x71C7656EC7a5f6d8A7C4","sourceIp":"203.0.113.55","phone":"19901012345","address":"No. 1000 Wenming Road, Pudong New Area, Shanghai","birthday":null}}}
  • SPL statement

    Use the following SPL statement in the data processor.

    * | extend content = mask(content, '[
    {"mode": "keyword", "keys": ["wallet", "address", "sourceIp", "phone", "transactionHash"], "maskChar": "*", "keepPrefix": 3, "keepSuffix": 3}
    ]')
  • Output

    2025-08-20 18:04:40,998 INFO  blockchain-event-poller-3 [10.0.1.20] [com.service.listener.TransactionStatusListener:65] [TransactionStatusListener#handleSuccessfulTransaction]{"message":"On-chain transaction successfully confirmed","confirmationDetails":{"transactionHash":"0x2**************979","status":"success","blockNumber":45101239,"gasUsed":189543,"effectiveGasPrice":"58.2 Gwei","userProfileSnapshot":{"wallet":"0x7****************7C4","sourceIp":"203******.55","phone":"199*****345","address":"No.******************************************hai","birthday":null}}}
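Masking replaces characters inside the values only, so the JSON payload in the log line should remain parseable and the unmasked fields stay available for analysis. The Python sketch below checks this on a shortened line in the shape of the output above; the helper name extract_payload and the shortened sample are assumptions for illustration.

```python
import json

# Shortened masked line: log prefix followed by a JSON payload.
masked_line = (
    "2025-08-20 18:04:40,998 INFO  blockchain-event-poller-3 [10.0.1.20] "
    '{"confirmationDetails":{"transactionHash":"0x2**************979",'
    '"status":"success","blockNumber":45101239,"gasUsed":189543}}'
)


def extract_payload(line: str) -> dict:
    """Parse the JSON object that starts at the first '{' in the line."""
    return json.loads(line[line.index("{"):])


details = extract_payload(masked_line)["confirmationDetails"]
# Unmasked fields can still be filtered and aggregated.
print(details["status"], details["blockNumber"], details["gasUsed"])
```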

Example 2: Mask sensitive URI parameters in NGINX logs

An e-commerce platform's API gateway processes millions of requests every day. This example demonstrates how to mask the uid and token parameters in the request URI, retaining the first 2 and last 2 characters of each parameter.

  • Raw data

    A typical API access log entry whose uri field contains user identity and session authentication information.

    http_protocol: HTTP/1.1
    remote_addrs: 127.0.0.1
    request_time: 5000
    status: 302
    time_local: 2025-08-19T18:52:03+08:00
    uri: uid=user12345&token=bf81639a41d604&from=web
    user_agent: Mozilla/5.0 (Windows NT 5.2; WOW64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.41 Safari/535.1
  • SPL statement

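    Use the keyword mode in the data processor to find and selectively mask the URI parameters. The keys list also includes loginIp, which does not occur in this URI and therefore leaves the output unchanged.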

    * | extend uri=mask(uri, '[
    {"mode": "keyword", "keys": ["uid", "loginIp", "token"],  "maskChar": "*", "keepPrefix": 2, "keepSuffix": 2}
    ]')
  • Output

    http_protocol: HTTP/1.1
    remote_addrs: 127.0.0.1
    request_time: 5000
    status: 302
    time_local: 2025-08-19T18:52:03+08:00
    uri: uid=us*****45&token=bf**********04&from=web
    user_agent: Mozilla/5.0 (Windows NT 5.2; WOW64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.41 Safari/535.1
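To preview a keyword rule on sample text before deploying it, the three key-value formats listed for keyword mode can be approximated with a regular expression. The Python sketch below is a rough local stand-in, not how SLS implements the function: mask_keywords is a hypothetical name, the set of characters that ends a key=value value is an assumption, and values too short to keep both ends are assumed to be masked entirely.

```python
import re


def mask_keywords(text, keys, mask_char="*", keep_prefix=0, keep_suffix=0):
    """Mask the values of the given keys in "k":"v", 'k':'v', and k=v forms."""
    if not keys:
        return text
    alt = "|".join(re.escape(k) for k in keys)
    pattern = re.compile(
        rf'''"(?:{alt})"\s*:\s*"(?P<dq>[^"]*)"'''   # "key":"value"
        rf"""|'(?:{alt})'\s*:\s*'(?P<sq>[^']*)'"""   # 'key':'value'
        rf'''|\b(?:{alt})=(?P<bare>[^&\s"',;]+)'''   # key=value (assumed terminators)
    )

    def mask(value):
        hidden = len(value) - keep_prefix - keep_suffix
        if hidden <= 0:
            # Assumption: values too short to keep both ends are fully masked.
            return mask_char * len(value)
        return value[:keep_prefix] + mask_char * hidden + value[len(value) - keep_suffix:]

    def replace(match):
        group = match.lastgroup  # the one named group that took part in the match
        start = match.start(group) - match.start()
        end = match.end(group) - match.start()
        whole = match.group(0)
        return whole[:start] + mask(whole[start:end]) + whole[end:]

    return pattern.sub(replace, text)


uri = "uid=user12345&token=bf81639a41d604&from=web"
print(mask_keywords(uri, ["uid", "loginIp", "token"], keep_prefix=2, keep_suffix=2))
```

Keys that are listed but absent from the text, such as loginIp here, simply produce no match, and parameters that are not listed, such as from, pass through unchanged.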