This topic describes the best practices for using mapping tables.
A mapping table stores static data. It has the following features:
- Obtains target fields by statically joining them to the cleansed fields after log cleansing.
- Supports combined queries against the dimension table during dataset queries.
Assume that a user log is in the following format:
The following figure shows the original splitting logic. The split fields include
The business requirement is to count the traffic volume of each country per minute, but the splitting model does not have a country field. Therefore, we can use a mapping table to perform the join operation. The mapping table defines the mapping relationship between the IP address and the country, province, and city. The following figure shows the logic after the mapping table is used.
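The static join described above can be sketched in Python. This is only an illustration of the idea, not how the service implements it: the field names `time` and `ip`, the sample IP addresses, and the placeholder region names are all assumptions.

```python
from collections import Counter
from datetime import datetime

# Hypothetical mapping table content: IP -> (country, province, city).
IP_MAPPING = {
    "192.0.2.1": ("CountryA", "ProvinceA", "CityA"),
    "198.51.100.7": ("CountryB", "ProvinceB", "CityB"),
}


def enrich(record, mapping):
    """Statically join one cleansed record with the mapping table on its ip field."""
    country, province, city = mapping.get(record["ip"], (None, None, None))
    return {**record, "country": country, "province": province, "city": city}


def traffic_per_country_per_minute(records, mapping):
    """Count records per (minute, country) after the join."""
    counts = Counter()
    for record in records:
        row = enrich(record, mapping)
        minute = datetime.fromisoformat(row["time"]).strftime("%Y-%m-%d %H:%M")
        counts[(minute, row["country"])] += 1
    return counts


# Hypothetical cleansed log records (only the fields needed here).
LOGS = [
    {"time": "2024-01-01T10:00:05", "ip": "192.0.2.1"},
    {"time": "2024-01-01T10:00:40", "ip": "192.0.2.1"},
    {"time": "2024-01-01T10:01:10", "ip": "198.51.100.7"},
]

COUNTS = traffic_per_country_per_minute(LOGS, IP_MAPPING)
```

In this sketch, an IP address that is missing from the mapping yields `None` for the country, province, and city, so such records are still counted but under an unknown country.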
In the preceding figure, the country, province, and city fields are derived from the original IP field. The remaining question is how to create the mapping table and obtain its ID.
In the left-side navigation pane of the console, choose Custom Monitoring > Mapping Tables to enter the Instance List page. Click Create Mapping Table in the upper-right corner of the page.
In the Create Mapping Table dialog box, enter the mapping table name.
Enter the schema information of the mapping table. Because the business requirement is to convert an IP address to a country, province, and city, the schema defines the mapping between the source field (name and type) and the target fields (name and type).
NOTE: Only the String, Long, and Double types are supported.
Select the resource type of the mapping table. Currently, only the TEXT type is supported. Support for more resource types is planned.
If the resource type is set to TEXT, add the text content, using the sample data as a reference. As shown in the preceding figure, the text contains only the IP address mapping needed for the requirement described above. The text content must strictly follow the format defined in the schema. Otherwise, it cannot be saved.
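The requirement that the text follow the schema can be illustrated with a small validation sketch. The exact text format is not shown here, so the comma delimiter, the column order (source field first), and the `parse_row` and `parse_text` helpers are assumptions made for illustration. Only the three type names come from the NOTE above.

```python
# Type names from the NOTE above; the Python casts are an illustrative choice.
CASTS = {"String": str, "Long": int, "Double": float}

# Hypothetical schema: source field first, then the target fields.
SCHEMA = [
    ("ip", "String"),
    ("country", "String"),
    ("province", "String"),
    ("city", "String"),
]


def parse_row(line, schema, delimiter=","):
    """Parse one text line into a dict, rejecting rows that do not match the schema."""
    values = [value.strip() for value in line.split(delimiter)]
    if len(values) != len(schema):
        raise ValueError(f"expected {len(schema)} columns, got {len(values)}: {line!r}")
    row = {}
    for (name, type_name), value in zip(schema, values):
        try:
            row[name] = CASTS[type_name](value)
        except ValueError as exc:
            raise ValueError(f"column {name!r} is not a valid {type_name}: {value!r}") from exc
    return row


def parse_text(text, schema):
    """Parse every non-empty line of the mapping table text."""
    return [parse_row(line, schema) for line in text.splitlines() if line.strip()]


ROWS = parse_text("192.0.2.1,CountryA,ProvinceA,CityA\n", SCHEMA)
```

A row with the wrong number of columns, or a value that cannot be converted to the declared type, raises `ValueError` in this sketch. That mirrors the console refusing to save text that does not match the schema.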
After the mapping table is saved, the system generates a unique ID for the mapping table (dimension table). Enter this ID in the building block.
In the preceding case, the mapping table text contains a mapping only for the IP address 42.**.**.**. Suppose the job is already running and the real logs contain other IP addresses. Can the static join still be performed with that mapping table?
It can, and you do not need to stop the job. You only need to update the text in the mapping table.
When modifying a mapping table, do not change its schema or resource type.
When the resource type is set to TEXT, you only need to modify the text content.
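The effect of updating only the text can be pictured with a conceptual sketch. How the service reloads an updated table is not described here, so the `TextMappingTable` class, its comma-separated text format, and the sample addresses are purely hypothetical.

```python
class TextMappingTable:
    """Hypothetical in-memory stand-in for a TEXT mapping table with a fixed schema."""

    def __init__(self, columns, text):
        self.columns = tuple(columns)  # the schema stays fixed after creation
        self._rows = self._parse(text)

    def _parse(self, text):
        rows = {}
        for line in text.splitlines():
            if not line.strip():
                continue
            values = [value.strip() for value in line.split(",")]
            if len(values) != len(self.columns):
                raise ValueError(f"row does not match schema: {line!r}")
            # The first column is the source field; the rest are target fields.
            rows[values[0]] = dict(zip(self.columns[1:], values[1:]))
        return rows

    def update_text(self, text):
        """Replace only the text content; the schema is left untouched."""
        self._rows = self._parse(text)

    def lookup(self, key):
        """Return the target fields for a source value, or None if it is unmapped."""
        return self._rows.get(key)


table = TextMappingTable(
    ["ip", "country", "province", "city"],
    "192.0.2.1,CountryA,ProvinceA,CityA",
)
before = table.lookup("198.51.100.7")  # not mapped yet

table.update_text(
    "192.0.2.1,CountryA,ProvinceA,CityA\n"
    "198.51.100.7,CountryB,ProvinceB,CityB"
)
after = table.lookup("198.51.100.7")  # resolved after the text update
```

The caller keeps using the same `table` object throughout, which corresponds to the job continuing to run while the mapping content changes underneath it.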
In the left-side navigation pane of the console, choose Custom Monitoring > Datasets, and click Query Data on the right.
Here is a simple example:
The preceding dataset contains the dimension _hostIp. Select “Drill Down”. The result is displayed as follows:
To query the detailed data of each province and address, perform a join query on _hostIp. This works because the detailed mapping between IP addresses and regions has already been configured in the local dimension table.
The detailed data obtained through the query is as follows:
NOTE: Currently, a combined query supports the join operation only on dimensions, not on other fields.
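The join query on _hostIp, including the restriction that only dimensions can be joined, can be sketched as follows. The dataset rows, the `count` field, the dimension table content, and the `join_query` helper are assumptions for illustration and do not reflect the console's actual query engine.

```python
# Hypothetical dataset rows aggregated by the _hostIp dimension.
DATASET_ROWS = [
    {"_hostIp": "192.0.2.1", "count": 12},
    {"_hostIp": "198.51.100.7", "count": 5},
]

# Hypothetical dimension table keyed by IP address.
DIMENSION_TABLE = {
    "192.0.2.1": {"province": "ProvinceA", "city": "CityA"},
}

# Fields of the dataset that are dimensions (assumed for this example).
DIMENSIONS = {"_hostIp"}


def join_query(rows, table, key):
    """Join dataset rows with the dimension table; only a dimension may be the key."""
    if key not in DIMENSIONS:
        raise ValueError(f"{key!r} is not a dimension of the dataset")
    joined = []
    for row in rows:
        extra = table.get(row[key], {})  # unmapped values keep the original row only
        joined.append({**row, **extra})
    return joined


RESULT = join_query(DATASET_ROWS, DIMENSION_TABLE, "_hostIp")
```

Rows whose _hostIp value has no entry in the dimension table are returned unchanged in this sketch, and attempting to join on a non-dimension field such as `count` raises `ValueError`.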