The message data cleansing feature provides common message processing templates, such as message splitting, dynamic routing, and message enrichment. You can directly use these templates to process messages, or modify the code as needed. This topic describes the types and usage of message data cleansing templates for ApsaraMQ for RocketMQ.
Background information
A message data cleansing task provides basic operator capabilities and uses Function Compute for its underlying logic. After you create an ApsaraMQ for RocketMQ message data cleansing task, you can log on to the Function Compute console to customize the code and modify the corresponding function configuration.
Operator | Operator Capabilities |
Message filtering | Match message content using a regular expression. Send matched messages to the target. For more information, see event patterns. |
Message transformation | Replace message content based on string matching. For example, convert characters between uppercase and lowercase. Send the transformed messages to the target. For more information, see event content transformation. |
Content splitting | Split message content using a regular expression. Send each split message to the target. |
Dynamic routing | Match message content using a regular expression. Route matched messages to their corresponding targets. Route unmatched messages to the default target. |
Content enrichment | Enrich message content using an enrichment source. For example, if the original message contains an AccountID, query a database using that AccountID. Then add the customer region to the message body before sending it to the target service. |
Content mapping | Map message content using a regular expression. For example, mask sensitive fields or reduce message size to the minimum standard. |
Content splitting
Example
For example, consider the following student list.
message:
[Zhao San, male, 17, Class 4; Li Si, female, 17, Class 3; Wang Wu, male, 17, Class 4]
You can split this message into individual student records and send each record to its target service as a separate message. The result is as follows.
message:
[Zhao San, male, 17, Class 4]
message:
[Li Si, female, 17, Class 3]
message:
[Wang Wu, male, 17, Class 4]
Procedure
Log on to the ApsaraMQ for RocketMQ console.
In the navigation pane on the left, choose . Then, in the top menu bar, select a region.
On the Message sink page, click Create task.
In the Create message sink panel, configure the following items. Then click OK.
The key configuration items are described below. Keep all other settings at their default values.
Basic information
Configuration item | Description |
Task name | Enter a task name. |
Egress Type | Select RocketMQ. Supported message services include RocketMQ, RabbitMQ, Simple Message Queue (formerly MNS), and Kafka. |
Resource configuration
Configuration item | Description |
Source: Region | Select China (Hangzhou). |
Source: Version | Select a RocketMQ version. This example uses RocketMQ 5.x. |
Source: RocketMQ instance | Select the RocketMQ instance that produces messages. |
Source: Topic | Select the topic from the source instance. |
Target: Version | Select the RocketMQ version that receives messages. This example uses RocketMQ 5.x. |
Target: Instance ID | Select the instance ID of the RocketMQ instance that receives messages. |
Target: Topic | Select the topic from the target instance. |
Data processing
Message filtering: Select No filtering.
Message transformation: Select Custom configuration. For Message body (body), select Data cleansing. Then select Create function template. For Function template, select Content splitting transform_split. Then, modify the function code as needed.
After creation, you can log on to the Function Compute console to view the automatically created service and function.
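The splitting behavior in the student list example can be sketched in a few lines of Python. This is only an illustration, not the code of the transform_split template: the function name `split_records` and the choice of a semicolon as the record delimiter are assumptions based on the sample message.

```python
import re


def split_records(message: str) -> list[str]:
    """Split a bracketed, semicolon-separated list into one bracketed message per record.

    Hypothetical helper for illustration; not the code of the transform_split template.
    """
    inner = message.strip()
    # Drop the outer square brackets if both are present.
    if inner.startswith("[") and inner.endswith("]"):
        inner = inner[1:-1]
    # Split on semicolons, tolerating whitespace around each one.
    parts = re.split(r"\s*;\s*", inner)
    # Re-wrap every non-empty record in brackets.
    return [f"[{part.strip()}]" for part in parts if part.strip()]


students = "[Zhao San, male, 17, Class 4; Li Si, female, 17, Class 3; Wang Wu, male, 17, Class 4]"
records = split_records(students)
for record in records:
    print(record)
```

Each element of `records` corresponds to one outgoing message. A different delimiter only requires changing the regular expression passed to `re.split`.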
Dynamic routing
Example
For example, consider the following toothpaste inventory list.
message:
[BrandA, toothpaste, $12.98, 100g
BrandB, toothpaste, $7.99, 80g
BrandC, toothpaste, $1.99, 100g]
You can route this list to target topics using a custom dynamic rule. The rule is as follows.
If a message starts with BrandA, send it to BrandA-item-topic and BrandA-discount-topic.
If a message starts with BrandB, send it to BrandB-item-topic and BrandB-discount-topic.
Otherwise, send the message to Unknown-brand-topic.
The JSON representation of the rule is as follows.
{
  "defaultTopic": "Unknown-brand-topic",
  "rules": [
    {
      "regex": "^BrandA",
      "targetTopics": [
        "BrandA-item-topic",
        "BrandA-discount-topic"
      ]
    },
    {
      "regex": "^BrandB",
      "targetTopics": [
        "BrandB-item-topic",
        "BrandB-discount-topic"
      ]
    }
  ]
}
Procedure
For detailed steps, see the procedure for content splitting. For Function template, select Dynamic routing dynamic_routing.
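To show how a rule of this shape could be evaluated, the following Python sketch applies the JSON rule from the example to each inventory record. It is not the code of the dynamic_routing template. The function name `route` is hypothetical, and the sketch assumes that the first matching rule wins and that each line of the list is routed as its own message.

```python
import json
import re

# The routing rule from the example, embedded as a JSON string.
RULE = json.loads("""
{
  "defaultTopic": "Unknown-brand-topic",
  "rules": [
    {"regex": "^BrandA", "targetTopics": ["BrandA-item-topic", "BrandA-discount-topic"]},
    {"regex": "^BrandB", "targetTopics": ["BrandB-item-topic", "BrandB-discount-topic"]}
  ]
}
""")


def route(message: str, rule: dict) -> list[str]:
    """Return the target topics for one message (hypothetical helper).

    Assumption: rules are checked in order and the first match wins.
    """
    for entry in rule["rules"]:
        if re.search(entry["regex"], message):
            return list(entry["targetTopics"])
    # No rule matched, so fall back to the default topic.
    return [rule["defaultTopic"]]


inventory = [
    "BrandA, toothpaste, $12.98, 100g",
    "BrandB, toothpaste, $7.99, 80g",
    "BrandC, toothpaste, $1.99, 100g",
]
routed = {item: route(item, RULE) for item in inventory}
for item, topics in routed.items():
    print(item, "->", topics)
```

In this sketch the BrandC record matches neither rule, so it falls through to the default topic.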
Content enrichment
Example
This example demonstrates how to enrich an IP address segment. Assume the access log for a service is as follows.
{
  "accountID": "164901546557****",
  "hostIP": "192.168.XX.XX"
}
You must identify the origin of the IP address. You can store the mapping in a MySQL database.
CREATE TABLE `tb_ip` (
  `IP` VARCHAR(256) NOT NULL,
  `Region` VARCHAR(256) NOT NULL,
  `ISP` VARCHAR(256) NOT NULL,
  PRIMARY KEY (`IP`)
);
The enriched message is as follows.
{
  "accountID": "164901546557****",
  "hostIP": "192.168.XX.XX",
  "region": "beijing"
}
Procedure
For detailed steps, see the procedure for content splitting. For Function template, select Content enrichment transform_enrichment.
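The lookup-and-merge step behind enrichment can be sketched as follows. This is not the code of the transform_enrichment template. To keep the example self-contained, it uses an in-memory SQLite database from the Python standard library as a stand-in for MySQL, matches the IP address exactly rather than by segment, and uses the placeholder address 192.0.2.10 and the ISP value example-isp, none of which come from the original example. The function names are hypothetical.

```python
import sqlite3


def build_lookup() -> sqlite3.Connection:
    """Create an in-memory stand-in for the tb_ip table (assumption: SQLite replaces MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tb_ip ("
        "IP VARCHAR(256) NOT NULL, "
        "Region VARCHAR(256) NOT NULL, "
        "ISP VARCHAR(256) NOT NULL, "
        "PRIMARY KEY (IP))"
    )
    # Placeholder row for illustration only.
    conn.execute(
        "INSERT INTO tb_ip (IP, Region, ISP) VALUES (?, ?, ?)",
        ("192.0.2.10", "beijing", "example-isp"),
    )
    conn.commit()
    return conn


def enrich(message: dict, conn: sqlite3.Connection) -> dict:
    """Return a copy of the message with a region field added when the IP is known."""
    row = conn.execute(
        "SELECT Region FROM tb_ip WHERE IP = ?", (message["hostIP"],)
    ).fetchone()
    enriched = dict(message)
    if row is not None:
        enriched["region"] = row[0]
    return enriched


conn = build_lookup()
log_entry = {"accountID": "164901546557****", "hostIP": "192.0.2.10"}
enriched_entry = enrich(log_entry, conn)
print(enriched_entry)
```

Messages whose IP address is not in the table pass through unchanged in this sketch. A real function would need to decide whether to drop, flag, or forward such messages.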
Content mapping
Example
For example, consider the following employee registration data for a company. It includes private information, such as employee IDs and phone numbers.
Zhao San, ID 1, 131 1111 1111
Li Si, ID 2, 132 2222 2222
Wang Wu, ID 3, 133 3333 3333
You can mask the private information in the messages and then send them to the target service. The result is as follows.
Zhao*, ID *, *** **** ****
Li*, ID *, *** **** ****
Wang*, ID *, *** **** ****
Procedure
For detailed steps, see the procedure for content splitting. For Function template, select Content mapping transform_projection.
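A masking rule of this kind can be sketched with a regular expression. This is not the code of the transform_projection template. The function name `mask_record` is hypothetical, and the sketch assumes comma-separated fields in which the first field is a name and every digit in the remaining fields is private.

```python
import re


def mask_record(line: str) -> str:
    """Mask the second part of the name and every digit in the other fields.

    Hypothetical helper; assumes comma-separated fields with the name first.
    """
    name, *rest = [field.strip() for field in line.split(",")]
    name_parts = name.split()
    # Keep only the first word of the name and replace the remainder with "*".
    masked_name = (name_parts[0] if name_parts else "") + "*"
    # Replace each digit with "*" and leave labels and spacing untouched.
    masked_rest = [re.sub(r"\d", "*", field) for field in rest]
    return ", ".join([masked_name, *masked_rest])


employees = [
    "Zhao San, ID 1, 131 1111 1111",
    "Li Si, ID 2, 132 2222 2222",
    "Wang Wu, ID 3, 133 3333 3333",
]
masked = [mask_record(line) for line in employees]
for line in masked:
    print(line)
```

Because only digits are replaced, field labels such as ID and the spacing inside phone numbers are preserved, which keeps the masked records readable downstream.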