Alibaba Cloud Simple Log Service (SLS) provides the mask function for data masking. This function overcomes the limitations of traditional regular expression methods and is more versatile and effective for protecting sensitive data and meeting compliance requirements. This topic describes how to use the function in typical use cases.
Limits
The mask function is available only in data transformation (new version) and ingest processors.
Overview of data masking solutions
Background
In the age of AI, data is growing rapidly. Protecting personal information has become a key part of business compliance. Laws such as the General Data Protection Regulation (GDPR) have strict requirements for handling sensitive data.
Regular expression masking
Alibaba Cloud SLS provides a comprehensive data masking system. It supports flexible pipelines that combine data ingestion and masking to meet a variety of business needs.
LoongCollector client-side masking:
Configure the data masking plugin in the collection configuration to replace sensitive fields using regular expression matching.
Use the regexp_replace function in a Structured Process Language (SPL) statement to perform high-performance masking on the server side.
Combined masking with LoongCollector/SDK and ingest processors:
The LoongCollector client or a software development kit (SDK) handles data ingestion. An ingest processor then uses the regexp_replace function to perform masking. This method prevents high resource usage on the client side.
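To make the regular expression approach concrete, the following is a minimal Python sketch of what a single regex-based rule does. It is an illustration only: regexp_replace runs inside SPL, not Python, and the pattern and the helper name mask_phone_regex are assumptions made for this example.

```python
import re

# Hypothetical pattern: a "phone" JSON key followed by an 11-digit value.
# Groups 1 and 3 are kept; group 2 (the middle five digits) is masked.
PHONE_PATTERN = re.compile(r'("phone":"\d{3})(\d{5})(\d{3}")')


def mask_phone_regex(line: str) -> str:
    """Replace the middle five digits of a phone value with asterisks."""
    return PHONE_PATTERN.sub(
        lambda m: m.group(1) + "*" * 5 + m.group(3), line
    )


sample = '{"phone":"19901012345","status":"success"}'
masked = mask_phone_regex(sample)
print(masked)
```

Each additional field, or each new format of the same field, needs its own pattern of this kind, which is where the maintenance cost of regex-based masking comes from.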
Upgraded masking solution: mask function
Regular expression masking can be complex, slow, and difficult to adapt. To address these issues, SLS provides the new mask function. This function offers a simpler, faster, and more intelligent data masking solution.
mask function
Function syntax
mask(field, varchar params)
Parameters
Parameter | Description |
field | The name of the source field that contains the data to mask. |
params | A JSON array, passed as a varchar (string) value. The array defines one or more param rules. |
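Because params is a JSON array passed as a string, it can be convenient to build it programmatically instead of writing it by hand. The following Python sketch shows one way to do that. The helper name build_mask_statement is hypothetical, and the rule keys are taken from the examples in this topic. How SPL escapes single quotes inside a string literal is not covered here, so the sketch simply rejects such input.

```python
import json


def build_mask_statement(field: str, rules: list) -> str:
    """Render an SPL extend statement that calls mask() on one field."""
    # Compact JSON keeps the statement short; dict key order is preserved.
    params = json.dumps(rules, separators=(",", ":"))
    if "'" in params:
        # Assumption: quoting rules for SPL string literals are out of scope.
        raise ValueError("single quotes in rules are not handled by this sketch")
    return f"* | extend {field} = mask({field}, '{params}')"


rules = [
    {
        "mode": "keyword",
        "keys": ["wallet", "phone"],
        "maskChar": "*",
        "keepPrefix": 3,
        "keepSuffix": 3,
    }
]
statement = build_mask_statement("content", rules)
print(statement)
```

Generating the string from a data structure guarantees that the array is valid JSON, which is easy to get wrong when rules are edited by hand.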
param rules
Parameter | Required | Description |
mode | Yes | Select a masking mode:
|
types | Required if | A list of built-in rules:
|
keys | Required if | Defines a list of keywords to match, such as |
maskChar | No | The character used for masking. The default is |
keepPrefix | No | The number of characters to keep at the beginning of the field. For example, |
keepSuffix | No | The number of characters to keep at the end of the field. For example, |
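The effect of maskChar, keepPrefix, and keepSuffix on a single matched value can be sketched in a few lines of Python. This is an illustration of the documented behavior, not the SLS implementation. The function name mask_middle is hypothetical, and the handling of values that are too short to keep both ends is an assumption.

```python
def mask_middle(value: str, mask_char: str, keep_prefix: int, keep_suffix: int) -> str:
    """Keep the first and last characters of a value and mask the rest."""
    hidden = len(value) - keep_prefix - keep_suffix
    if hidden <= 0:
        # Assumption: a value too short to keep both ends is masked entirely.
        return mask_char * len(value)
    # Slice from an explicit index so that keep_suffix == 0 keeps nothing.
    suffix = value[len(value) - keep_suffix:]
    return value[:keep_prefix] + mask_char * hidden + suffix


print(mask_middle("19901012345", "*", 3, 3))
print(mask_middle("0x2baf892e9a164b1979", "*", 3, 3))
```

The two sample values are the phone number and transaction hash used in Example 1 of this topic.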
Examples
Example 1: Mask transaction data
A DeFi platform processes thousands of on-chain transactions every day. Each transaction generates a detailed log. These logs contain sensitive data, such as wallet addresses, transaction hashes, and user profiles. To comply with data protection regulations, you must mask this sensitive data while ensuring that the logs can still be used for business analysis and troubleshooting. This example masks the wallet, address, sourceIp, phone, and transactionHash fields, retaining the first 3 and last 3 characters of each value for traceability.
Raw data
2025-08-20 18:04:40,998 INFO blockchain-event-poller-3 [10.0.1.20] [com.service.listener.TransactionStatusListener:65] [TransactionStatusListener#handleSuccessfulTransaction]{"message":"On-chain transaction successfully confirmed","confirmationDetails":{"transactionHash":"0x2baf892e9a164b1979","status":"success","blockNumber":45101239,"gasUsed":189543,"effectiveGasPrice":"58.2 Gwei","userProfileSnapshot":{"wallet":"0x71C7656EC7a5f6d8A7C4","sourceIp":"203.0.113.55","phone":"19901012345","address":"No. 1000 Wenming Road, Pudong New Area, Shanghai","birthday":null}}}
SPL statement
Use the following SPL statement in the data processor.
* | extend content = mask(content, '[ {"mode":"keyword","keys":["wallet","address","sourceIp","phone","transactionHash"], "maskChar":"*","keepPrefix":3,"keepSuffix":3} ]')
Output
2025-08-20 18:04:40,998 INFO blockchain-event-poller-3 [10.0.1.20] [com.service.listener.TransactionStatusListener:65] [TransactionStatusListener#handleSuccessfulTransaction]{"message":"On-chain transaction successfully confirmed","confirmationDetails":{"transactionHash":"0x2**************979","status":"success","blockNumber":45101239,"gasUsed":189543,"effectiveGasPrice":"58.2 Gwei","userProfileSnapshot":{"wallet":"0x7****************7C4","sourceIp":"203******.55","phone":"199*****345","address":"Shanghai*********No. 00","birthday":null}}}
Example 2: Mask sensitive URI parameters in NGINX logs
An e-commerce platform's API gateway processes millions of requests every day. This example demonstrates how to mask the uid and token parameters in the request URI, retaining the first 2 and last 2 characters of each parameter.
Raw data
This is a typical API access log URI that contains user identity and session authentication information.
http_protocol: HTTP/1.1
remote_addrs: 127.0.0.1
request_time: 5000
status: 302
time_local: 2025-08-19T18:52:03+08:00
uri: "uid=user12345&token=bf81639a41d604&from=web"
user_agent: Mozilla/5.0 (Windows NT 5.2; WOW64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.41 Safari/535.1
SPL statement
Use the keyword mode in the data processor to find and selectively mask the URI parameters.
* | extend uri=mask(uri, '[ {"mode": "keyword", "keys": ["uid", "loginIp", "token"], "maskChar": "*", "keepPrefix": 2, "keepSuffix": 2} ]')
Output
http_protocol: HTTP/1.1
remote_addrs: 127.0.0.1
request_time: 5000
status: 302
time_local: 2025-08-19T18:52:03+08:00
uri: uid=us*****45&token=bf**********04&from=web
user_agent: Mozilla/5.0 (Windows NT 5.2; WOW64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.41 Safari/535.1
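For local testing of expected results, the keyword-mode behavior on this URI can be approximated in Python. This is a rough sketch under stated assumptions, not a description of how the mask function locates keywords internally: it assumes simple key=value pairs joined by &, and the function name mask_query_params is hypothetical.

```python
def mask_query_params(uri: str, keys: set, mask_char: str,
                      keep_prefix: int, keep_suffix: int) -> str:
    """Mask the values of selected key=value pairs in a query string."""
    parts = []
    for pair in uri.split("&"):
        name, sep, value = pair.partition("=")
        hidden = len(value) - keep_prefix - keep_suffix
        # Only mask listed keys whose value is long enough to keep both ends
        # (assumption: shorter values are left untouched in this sketch).
        if sep and name in keys and hidden > 0:
            value = (value[:keep_prefix] + mask_char * hidden
                     + value[len(value) - keep_suffix:])
        parts.append(name + sep + value)
    return "&".join(parts)


uri = "uid=user12345&token=bf81639a41d604&from=web"
masked_uri = mask_query_params(uri, {"uid", "loginIp", "token"}, "*", 2, 2)
print(masked_uri)
```

Keys that do not occur in the input, such as loginIp here, have no effect, and parameters that are not listed, such as from, pass through unchanged.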