edit-icon download-icon

Best practices for using mapping tables

Last Updated: Apr 28, 2018

This topic describes the best practices for using mapping tables.

A mapping table stores static data. It has the following features:

  • Get the target field by statically join the cleansed fields after the log cleansing.
  • Performs combined query based on the dimension table during dataset query.

Cases of the static join operation

Create and use a mapping table

Assuming that a user log is in the following format:

  1. 2017-01-09 16:02:49|42.**.**.**|Kevin.yang

The following figure shows the original splitting logic. The split fields include date, ip, and username.

Original Splitting

The business requirement is to count the traffic volume of each country per minute, but the splitting model does not have a country field. Therefore, we can use a mapping table to perform the join operation. The mapping table defines the mapping relationship between the IP address and the country, province, and city. The following figure shows the logic after the mapping table is used.

Splitting with Static Join

In the preceding figure, the country, province, and city fields are obtained from the original IP field. The question is, how to set up the mapping table ID?

  1. In the left-side navigation pane of the console, choose Custom Monitoring > Mapping Tables to enter the Instance List page. Click Create Mapping Table in the upper-right corner of the page.

  2. In the Create Mapping Table dialog box, enter the mapping table name.

    Create Mapping Table

  3. Enter the schema information of the mapping table: Given that the business requirement is to have the IP converted to the country, province, and city, the schema is basically a mapping relationship between the source field (name and type) and the target field (name and type).

    NOTE: Only the String, Long, and Double types are supported.

  4. Select the resource type of the mapping table. Currently, only the text type TEXT is supported. It’s planned to support more resource types.

  5. If the mapping table type is set to TEXT, you must add the text content by referencing the sample data. As shown in the preceding figure, only the IP address mapping is input in the text based on the aforementioned requirement. The text content must strictly follow the format set in the schema. Otherwise, it cannot be saved.

  6. After the mapping table is saved, the system generates a unique dimension table ID. Enter the ID in the building block.

Update a mapping table

In the preceding case, the mapping relationship is only configured for IP address 42.**.**.** in the text of the mapping table, and the user’s job is already running while the real log contains other IP addresses. You may wonder: does it still work to perform the static join operation with that mapping table in this case?

Don’t worry. You don’t need to stop your job. Instead, you only need to update the text in the mapping table.

Edit Mapping Table

  • When modifying a mapping table, do not modify the schema and resource type of the mapping table.

  • When the resource type is set to TEXT, you only need to modify the text content.

Perform combined query on a dataset based on a mapping table

In the left-side navigation pane of the console, choose Custom Monitoring > Datasets, and click Query Data on the right.

Here is a simple example:

The preceding dataset contains the dimension _hostIp. Select “Drill Down”. The result is displayed as follows:

555

To query the detailed data of each province and address, perform the join query on _hostIp, because the detailed mapping between the IP address and the region has been configured in the local dimension table.

Mapping Table Settings

The detailed data obtained through the query is as follows:

666

NOTE: A combined query only supports the join operation on the dimension, but not other fields currently.

Thank you! We've received your feedback.