This topic describes the syntax and parameters of the mapping and enrichment functions and provides examples of how to use them.
Functions
Category | Function | Description |
---|---|---|
Field-based mapping | e_dict_map | Maps the value of an input field to a value in a specified data dictionary and returns a new field. This function can be used together with other functions. For more information, see Use the e_dict_map and e_search_dict_map functions to enrich log data. |
Field-based mapping | e_table_map | Maps the value of an input field to a row in a specified table and returns a new field. This function can be used together with other functions. For more information, see Use the e_table_map function to enrich HTTP response status codes. |
Field-based mapping | e_tablestore_map | Enriches a raw log by using a data table in Tablestore as the dimension table. |
Field-based mapping | e_redis_map | Enriches a raw log by using a data table in ApsaraDB for Redis as the dimension table. |
Search-based mapping | e_search_dict_map | Searches the keywords in a specified data dictionary for a raw log field, maps the field to a value in the data dictionary, and returns a new field. The keywords must be query strings. This function can be used together with other functions. For more information, see Use the e_dict_map and e_search_dict_map functions to enrich log data. |
Search-based mapping | e_search_table_map | Searches a specified column in a specified table for a raw log field, maps the field to a row in the table, and returns a new field. The values of the column must be query strings. |
e_dict_map
The e_dict_map function maps the value of an input field to a value in a specified data dictionary and returns a new field.
-
Syntax
e_dict_map(data, field, output_field, case_insensitive=True, missing=None, mode="overwrite")
-
Parameters
Parameter | Type | Required | Description |
---|---|---|---|
data | Dict | Yes | The data dictionary that is used for mapping. The value of this parameter must be in the standard {key01:value01,key02:value02,...} format. The keys must be strings. Example: {"1": "TCP", "2": "UDP", "3": "HTTP", "*": "Unknown"}. |
field | String or string list | Yes | One or more field names. If multiple field names are specified, the system maps the fields in sequence. If multiple fields are matched and the mode parameter is set to overwrite, the system returns the value that is last matched. If no field is matched, the system returns the value of the missing parameter. |
output_field | String | Yes | The name of the field that you want the function to return. |
case_insensitive | Boolean | No | Specifies whether to ignore case during mapping. True (default): case is ignored. False: matching is case-sensitive. Note: If the data dictionary contains multiple keys that differ only in case and the case_insensitive parameter is set to True, the system first maps the value of the input field to the key whose case matches the value exactly. If no such key exists, the system maps the value to one of these keys at random. |
missing | String | No | The value that is assigned to the field specified by output_field when no match is found for the input field. Default value: None, which indicates that no assignment is performed. Note: If the data dictionary contains an asterisk (*) key, the missing parameter does not take effect because the asterisk (*) key has a higher priority than the missing parameter. |
mode | String | No | The overwrite mode of fields. Default value: overwrite. For more information, see Field extraction check and overwrite modes. |
-
Response
A log that contains a new field is returned.
-
Examples
- Example 1: Map the value of the pro field in the raw log to a value in a data dictionary and generate a new field named protocol.
- Raw log
data: 123 pro: 1
- Transformation rule
e_dict_map( {"1": "TCP", "2": "UDP", "3": "HTTP", "6": "HTTPS", "*": "Unknown"}, "pro", "protocol", )
- Result
data: 123 pro: 1 protocol: TCP
- Example 2: Map the value of the status field in the raw logs to values in a data dictionary and generate a new field named message.
- Raw logs
status: 500
status: 400
status: 200
- Transformation rule
e_dict_map({"400": "Error", "200": "Success", "*": "Other"}, "status", "message")
- Result
status: 500 message: Other
status: 400 message: Error
status: 200 message: Success
-
References
This function can be used together with other functions. For more information, see Use the e_dict_map and e_search_dict_map functions to enrich log data.
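The matching rules in the parameter table can be approximated in plain Python. The following is a minimal sketch, not the Simple Log Service implementation: the helper name dict_map is hypothetical, only overwrite behavior is modeled, and leaving a log unchanged when none of the input fields exist is an assumption. Where the documentation says a key is chosen at random among keys that differ only in case, the sketch simply uses whichever key was defined last.

```python
def dict_map(event, data, field, output_field, case_insensitive=True, missing=None):
    """Approximate the documented e_dict_map rules on a plain dict (overwrite mode only)."""
    fields = [field] if isinstance(field, str) else list(field)
    present = [name for name in fields if name in event]
    if not present:
        return dict(event)  # assumption: logs without the input field are left unchanged
    lowered = {key.lower(): value for key, value in data.items()}
    result = dict(event)
    matched = False
    for name in present:  # fields are tried in order, so the last match wins
        value = str(event[name])
        if value in data:  # a key with exactly the same case is preferred
            result[output_field] = data[value]
            matched = True
        elif case_insensitive and value.lower() in lowered:
            result[output_field] = lowered[value.lower()]
            matched = True
    if not matched:
        if "*" in data:  # the asterisk key takes priority over `missing`
            result[output_field] = data["*"]
        elif missing is not None:
            result[output_field] = missing
    return result


PROTOCOLS = {"1": "TCP", "2": "UDP", "3": "HTTP", "6": "HTTPS", "*": "Unknown"}
print(dict_map({"data": "123", "pro": "1"}, PROTOCOLS, "pro", "protocol"))
```

Running the sketch on the raw log from Example 1 adds protocol with the value TCP, which is the same outcome the example shows for the real function.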
e_table_map
The e_table_map function maps the value of an input field to a row in a specified table and returns a new field.
-
Syntax
e_table_map(data, field, output_fields, missing=None, mode="fill-auto")
-
Parameters
Parameter | Type | Required | Description |
---|---|---|---|
data | Table | Yes | The table that is used for mapping. Note: If you use a resource function such as res_rds_mysql or res_log_logstore_pull as a data source, we recommend that you configure the primary_keys parameter to improve the performance of data transformation. For more information, see Resource functions. |
field | String, string list, or tuple list | Yes | The input field. If a log does not contain the field, no operations are performed on the log. |
output_fields | String, string list, or tuple list | Yes | The output fields. Example: ["province", "pop"]. |
missing | String | No | The value that is assigned to the fields specified by output_fields when no match is found for the input field. Default value: None, which indicates that no assignment is performed. If you want to map the input field to multiple columns, you can set the missing parameter to a list of default values that correspond to the input field. The number of the default values must be the same as the number of the columns. Note: If the table contains a column of an asterisk (*), the missing parameter becomes invalid. This is because an asterisk (*) has a higher priority than the missing parameter. |
mode | String | No | The overwrite mode of fields. Default value: fill-auto. For more information, see Field extraction check and overwrite modes. |
-
Response
A log that contains new fields is returned.
-
Examples
- Example 1: Map the value of the city field to a row in a table and return the value of the province field for the row.
- Raw log
data: 123 city: nj
- Transformation rule
e_table_map( tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"), "city", "province" )
- Result
data: 123 city: nj province: js
- Example 2: Map the value of the city field to a row in a table and return the values of the province and pop fields for the row.
- Raw log
data: 123 city: nj
- Transformation rule
e_table_map( tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"), "city", ["province", "pop"], )
- Result
data: 123 city: nj province: js pop: 800
- Example 3: Use the tab_parse_csv function with a custom delimiter (sep="#") to build a table, map the value of the city field to a row in the table, and return the values of the province and pop fields for the row.
- Raw log
data: 123 city: nj
- Transformation rule
e_table_map( tab_parse_csv("city#pop#province\nnj#800#js\nsh#2000#sh", sep="#"), "city", ["province", "pop"], )
- Result
data: 123 city: nj province: js pop: 800
- Example 4: Use the tab_parse_csv function with a custom quote character (quote="|") to build a table, map the value of the city field to a row in the table, and return the values of the province and pop fields for the row.
- Raw log
data: 123 city: nj
- Transformation rule
e_table_map( tab_parse_csv( "city,pop,province\n|nj|,|800|,|js|\n|shang hai|,2000,|SHANG,HAI|", quote="|" ), "city", ["province", "pop"], )
- Result
data: 123 city: nj province: js pop: 800
- Example 5: The name of the input field differs from the name of the corresponding column in the table that is used for mapping. Match the cty field in the log against the city column in the table and return the value of the province field for the matched row.
- Raw log
data: 123 cty: nj
- Transformation rule
e_table_map( tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"), [("cty", "city")], "province", )
- Result
data: 123 cty: nj province: js
- Example 6: The name of the input field differs from the name of the corresponding column in the table that is used for mapping. Map data and rename the output field from province to pro.
- Raw log
data: 123 cty: nj
- Transformation rule
e_table_map( tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"), [("cty", "city")], [("province", "pro")], )
- Result
data: 123 cty: nj pro: js
- Example 7: Map the values of multiple fields to a row in a table.
- Raw log
data: 123 city: nj pop: 800
- Transformation rule
e_table_map( tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"), ["city", "pop"], "province", )
- Result
data: 123 city: nj pop: 800 province: js
- Example 8: Map the values of multiple fields to a row in a table. The input fields are different from the corresponding fields in the table that is used for mapping.
- Raw log
data: 123 cty: nj pp: 800
- Transformation rule
e_table_map( tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"), [("cty", "city"), ("pp", "pop")], "province", )
- Result
data: 123 cty: nj pp: 800 province: js
-
References
This function can be used together with other functions. For more information, see Use the e_table_map function to enrich HTTP response status codes.
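The different shapes accepted by field and output_fields (a string, a string list, or a tuple list) are easier to follow when they are normalized to pairs. The sketch below is an illustration only and assumes a few things that are not part of the product: the helpers parse_csv, as_pairs, and table_map are hypothetical names, the standard csv module stands in for tab_parse_csv, the mode parameter is ignored (output fields are always written), and missing is handled as a single string.

```python
import csv
import io


def parse_csv(text, sep=","):
    """Parse CSV text with a header row into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=sep))


def as_pairs(spec):
    """Normalize "a", ["a", "b"], or [("a", "x")] into a list of 2-tuples."""
    items = [spec] if isinstance(spec, (str, tuple)) else list(spec)
    return [(item, item) if isinstance(item, str) else tuple(item) for item in items]


def table_map(event, rows, field, output_fields, missing=None):
    inputs = as_pairs(field)            # (log field, table column)
    outputs = as_pairs(output_fields)   # (table column, output field)
    result = dict(event)
    if any(log_field not in event for log_field, _ in inputs):
        return result                   # documented: no operation when the field is absent
    for row in rows:
        if all(row[column] == str(event[log_field]) for log_field, column in inputs):
            for column, out_name in outputs:
                result[out_name] = row[column]
            return result
    if missing is not None:             # assumption: one default for every output field
        for _, out_name in outputs:
            result[out_name] = missing
    return result


ROWS = parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh")
print(table_map({"data": "123", "cty": "nj"}, ROWS, [("cty", "city")], [("province", "pro")]))
```

The final call mirrors Example 6: the cty log field is compared with the city column, and the province column is written to the log as pro.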
e_tablestore_map
The e_tablestore_map function enriches a raw log by using a data table in Tablestore as the dimension table.
-
Syntax
e_tablestore_map( fields, endpoint, ak_id, ak_secret, instance_name, table_names, output_fields=None, output_table_name=None, encoding="utf8", mode="fill-auto", )
-
Parameters
Parameter | Type | Required | Description |
---|---|---|---|
fields | String, number, list, or tuple list | Yes | The raw log fields that are used to map data between the raw log and the data table. The function maps the raw log fields to the primary keys in the data table one by one. Examples: If the data table has the primary key a and the raw log contains the field a, you can use fields="a". If the data table has the primary keys a, b, and c and the raw log contains the fields a, b, and c, you can use fields=["a", "b", "c"]. If the data table has the primary keys a, b, and c and the raw log contains the fields a1, b1, and c1, you can use fields=[("a1", "a"), ("b1", "b"), ("c1", "c")]. |
endpoint | String | Yes | The endpoint of the Tablestore instance in which the data table is created. For more information, see Endpoint. Note: You can use the virtual private cloud (VPC) endpoint or public endpoint of the Tablestore instance. A VPC endpoint is used for access within the same region, and a public endpoint is used for access over the Internet regardless of regions. |
ak_id | String | Yes | The AccessKey ID of the account that has permissions to access the Tablestore instance. For more information, see Create an AccessKey pair. If you use a RAM user, make sure that the RAM user is granted the access permissions, such as AliyunOTSReadOnlyAccess. For more information, see Grant permissions to the RAM user. |
ak_secret | String | Yes | The AccessKey secret of the account that has permissions to access the Tablestore instance. For more information, see Create an AccessKey pair. |
instance_name | String | Yes | The name of the Tablestore instance. |
table_names | String, string list, or tuple list | Yes | The name of the data table. If the data table uses a secondary index, set this parameter to the name of the index. For more information about the secondary index feature, see Global secondary index. For example, if the index1 secondary index is created for the data table, set this parameter to "index1". |
output_fields | List | No | The output fields. You can specify the names of primary key columns or attribute columns. Example: ["province", "pop"]. If you do not configure this parameter, all columns of the row that is matched based on the input fields are returned. Note: If multiple data tables are created in the Tablestore instance, the function returns only the data in the data table that is first used for matching. |
output_table_name | String | No | The name of the output field that stores the name of the data table from which the returned data is read. Default value: None, which indicates that the output fields do not contain the table name. If you set this parameter to a string, the output fields include the table name. For example, if the data table named test is used and the transformation rule includes output_fields=["province", "pop"],output_table_name="table_name", the output fields are province: xxx, pop:xxx,table_name:test when the data columns ["province", "pop"] in the test data table are matched. |
encoding | String | No | The encoding method of the HTTPS request parameters. Default value: utf8. |
mode | String | No | The overwrite mode of fields. Default value: fill-auto. For more information, see Field extraction check and overwrite modes. |
-
Response
A log that contains new fields is returned.
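The three accepted forms of the fields parameter all reduce to a list of (raw log field, primary key) pairs. The following sketch shows that normalization in isolation. It does not call Tablestore, the helper names normalize_fields and build_primary_key are hypothetical, and returning None when a log lacks one of the fields is an assumption made for the illustration.

```python
def normalize_fields(fields):
    """Return a list of (log_field, primary_key) pairs for any accepted form."""
    if isinstance(fields, (str, int, float)):
        fields = [fields]
    pairs = []
    for item in fields:
        if isinstance(item, (tuple, list)):
            log_field, primary_key = item   # ("a1", "a"): names differ
        else:
            log_field = primary_key = item  # "a": same name on both sides
        pairs.append((str(log_field), str(primary_key)))
    return pairs


def build_primary_key(event, fields):
    """Build the primary-key lookup for one log, or None if a field is absent."""
    pairs = normalize_fields(fields)
    if any(log_field not in event for log_field, _ in pairs):
        return None                         # assumption: such logs are not looked up
    return [(primary_key, event[log_field]) for log_field, primary_key in pairs]


log = {"city1": "sh", "name": "maki", "pop1": "200"}
print(build_primary_key(log, [("city1", "city"), ("pop1", "pop")]))
```

For the log above, the lookup key pairs the city primary key with sh and the pop primary key with 200, which is the row that Example 3 below retrieves.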
-
Examples
The following examples are based on the table_name_test data table.
city (primary key) | pop (primary key) | cid | province | region |
---|---|---|---|---|
bj | 300 | 1 | bj | huabei |
nj | 800 | 2 | js | huadong |
sh | 200 | 3 | sh | huadong |
- Example 1: Find a row in the data table based on the city field and return the value of the province column for the row.
- Raw log
city:sh name:maki pop:200
- Transformation rule
e_tablestore_map( "city", "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com", "LTA3****", "VIH9****", "d00s0dxa****", "table_name_test", output_fields=["province"])
- Result
city:sh name:maki pop:200 province:sh
- Example 2: Find a row in the data table based on the city and pop fields and return the values of the province and cid columns for the row.
- Raw log
city:sh name:maki pop:200
- Transformation rule
e_tablestore_map( ["city","pop"], "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com", "LTA3****", "VIH9****", "d00s0dxa****", "table_name_test", output_fields=["province","cid"])
- Result
city:sh name:maki pop:200 cid:3 province:sh
- Example 3: Map the city1 and pop1 fields in the raw log to the city and pop primary keys in the data table, find a row in the data table based on the fields, and return the values of all columns for the row.
- Raw log
city1:sh name:maki pop1:200
- Transformation rule
e_tablestore_map( [("city1","city"), ("pop1", "pop")], "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com", "LTA3****", "VIH9****", "d00s0dxa****", "table_name_test")
- Result
city:sh name:maki pop:200 cid:3 province:sh region:huadong
- Example 4: Find a row in the data table based on the city and pop fields and return the values of all columns for the row. Set output_table_name to "table_name". In the returned result, you can view the name of the data table in which the returned data is stored.
- Raw log
city:sh name:maki pop:200
- Transformation rule
e_tablestore_map( ["city","pop"], "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com", "LTA3****", "VIH9****", "d00s0dxa****", "table_name_test", output_table_name="table_name" )
- Result
city:sh name:maki pop:200 cid:3 province:sh region:huadong table_name:table_name_test
- Example 5: Find a row in the table_name_test, table_name_test1, and table_name_test2 data tables based on the city and pop fields, and return the values of all columns for the row. The returned result contains only the data from table_name_test, which is the first data table used for matching.
- Raw log
city:sh name:maki pop:200
- Transformation rule
e_tablestore_map( ["city","pop"], "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com", "LTA3****", "VIH9****", "d00s0dxa****", ["table_name_test","table_name_test1","table_name_test2"], output_table_name="table_name" )
- Result
city:sh name:maki pop:200 cid:3 province:sh region:huadong table_name:table_name_test
- Example 6: Find a row in the data table based on the pk1 and pk2 primary keys for the index1 secondary index, and return the value of the definedcol2 predefined column for the row. The predefined column is specified for the index1 secondary index.
- Data table (index1)
pk1 (primary key) | pk2 (primary key) | definedcol2 (predefined column) | definedcol3 (predefined column) |
---|---|---|---|
pk1_1 | pk2_1 | definedcol2_1 | definedcol3_1 |
pk1_2 | pk2_2 | definedcol2_2 | definedcol3_2 |
- Raw log
pk1:pk1_1 pk2:pk2_1
- Transformation rule
e_tablestore_map( ["pk1","pk2"], "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com", "LTA3****", "VIH9****", "d00s0dxa****", "index1", output_fields= ["definedcol2"], output_table_name="table_name", )
- Result
pk1:pk1_1 pk2:pk2_1 definedcol2:definedcol2_1 table_name:index1
e_redis_map
The e_redis_map function enriches a raw log by using a data table in ApsaraDB for Redis as the dimension table.
-
Syntax
e_redis_map(field, output_field, host, port=6379, db=0, username=None, password=None, encoding="utf-8", max_retries=5, mode="fill-auto")
-
Parameters
Parameter | Type | Required | Description |
---|---|---|---|
field | String | Yes | The raw log field that is used to map data between the raw log and the data table. If the raw log does not contain the field, no operations are performed on the log. |
output_field | String | Yes | The output field. |
host | String | Yes | The endpoint of the ApsaraDB for Redis database. |
username | String | No | The username of the account that you want to use to connect to the ApsaraDB for Redis database. This parameter is empty by default, which indicates that authentication is not performed. |
password | String | No | The password of the account that you want to use to connect to the ApsaraDB for Redis database. This parameter is empty by default, which indicates that authentication is not performed. |
port | Integer | No | The port of the ApsaraDB for Redis database. Default value: 6379. |
db | Integer | No | The name of the ApsaraDB for Redis database. Default value: 0. |
encoding | String | No | The encoding method of data in the ApsaraDB for Redis database. Default value: utf-8. |
max_retries | Integer | No | The maximum number of retries allowed when a request to connect to the ApsaraDB for Redis database fails. Default value: 5. If the connection request fails after the maximum number of retries, the function skips the current log in the transformation process. Subsequent transformation is not affected. Each retry interval is double the previous interval. The intervals range from 1s to 120s. |
mode | String | No | The overwrite mode of fields. Default value: fill-auto. For more information, see Field extraction check and overwrite modes. |
-
Response
A log that contains a new field is returned.
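The retry behavior described for max_retries (each interval doubles, bounded between 1s and 120s) can be worked out with a few lines of Python. This is only an arithmetic sketch: the function name retry_intervals is hypothetical, and starting at exactly 1 second and clamping at 120 seconds is one reading of the stated range, not a confirmed detail of the implementation.

```python
def retry_intervals(max_retries=5, first=1, cap=120):
    """List the wait time in seconds before each retry under doubling backoff."""
    intervals = []
    delay = first
    for _ in range(max_retries):
        intervals.append(min(delay, cap))  # never wait longer than the cap
        delay *= 2                         # each interval doubles the previous one
    return intervals


print(retry_intervals())    # default of 5 retries
print(retry_intervals(9))   # long enough for the cap to apply
```

Under these assumptions, the default of five retries waits 1, 2, 4, 8, and 16 seconds, and the 120-second cap is first reached on the eighth retry.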
-
Examples
The following examples are based on the following data table in ApsaraDB for Redis.
Important: Only values of the string type are supported.
Key | Value |
---|---|
i1001 | { "name": "Orange", "price": 10 } |
i1002 | { "name": "Apple", "price": 12 } |
i1003 | { "name": "Mango", "price": 16 } |
- Example 1: Find a value in the data table based on the item field and return the value. The username and password of the account that is used to connect to the ApsaraDB for Redis database are not specified in the transformation rule.
- Raw log
item: i1002 count: 7
- Transformation rule
e_redis_map("item", "detail", host="r-bp1olrdor8353v4s.redis.rds.aliyuncs.com")
- Result
item: i1002 count: 7 detail: { "name": "Apple", "price": 12 }
- Example 2: Find a value in the data table based on the item field and return the value. The username and password of the account that is used to connect to the ApsaraDB for Redis database are specified in the transformation rule.
- Raw log
item: i1003 count: 7
- Transformation rule
e_redis_map("item", "detail", host="r-bp1olrdor8353v4s****.redis.rds.aliyuncs.com", username="r-bp****", password="***")
- Result
item: i1003 count: 7 detail: { "name": "Mango", "price": 16 }
e_search_dict_map
The e_search_dict_map function searches the keywords in a specified data dictionary for a raw log field, maps the field to a value in the data dictionary, and returns a new field. The keywords must be query strings.
-
Syntax
e_search_dict_map(data, output_field, multi_match=False, multi_join=" ", missing=None, mode="overwrite")
-
Parameters
Parameter | Type | Required | Description |
---|---|---|---|
data | Dict | Yes | The data dictionary that is used for mapping. The value must be in the standard {key01:value01,key02:value02,...} format. The keys must be query strings. |
output_field | String | Yes | The name of the field that you want the function to return. |
multi_match | Boolean | No | Specifies whether the system can match multiple values for the input field. Default value: False, which indicates that the system does not match multiple values and returns only the value that is matched for the last value of the input field. You can configure the multi_join parameter to concatenate multiple values that are matched for the input field. |
multi_join | String | No | The character that is used to concatenate the multiple values that are matched for the input field. The default value is a space. This parameter takes effect only when the multi_match parameter is set to True. |
missing | String | No | The value that is assigned to the field specified by output_field when no match is found for the input field. Default value: None, which indicates that no assignment is performed. Note: If the data dictionary contains an asterisk (*) key, the missing parameter does not take effect because the asterisk (*) key has a higher priority than the missing parameter. |
mode | String | No | The overwrite mode of fields. Default value: overwrite. For more information, see Field extraction check and overwrite modes. |
-
Response
The value that is matched for the input field is returned.
-
Examples
- Example 1: Map data.
- Raw log
data:123 pro:1
- Transformation rule
e_search_dict_map({"pro==1": "TCP", "pro==2": "UDP", "pro==3": "HTTP"}, "protocol")
- Result
data:123 pro:1 protocol:TCP
- Example 2: Map data based on the first character of each field value.
- Raw log
status:200,300
- Transformation rule
e_search_dict_map( { "status:2??": "ok", "status:3??": "redirect", "status:4??": "auth", "status:5??": "server_error", }, "status_desc", multi_match=True, multi_join="Test", )
- Result
status:200,300 status_desc: okTestredirect
-
References
This function can be used together with other functions. For more information, see Use the e_dict_map and e_search_dict_map functions to enrich log data.
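To make the role of multi_match and multi_join concrete, the sketch below evaluates a deliberately small subset of query strings against a log held in a plain dict. Everything here is an assumption for illustration: the helper names matches and search_dict_map are hypothetical, only the field==value and field:pattern forms are recognized, wildcard patterns are checked against comma- or space-separated tokens of the field value by using fnmatch, and the asterisk (*) default key and the mode parameter are not modeled. When multi_match is False, the sketch keeps the last hit, following the wording in the parameter table.

```python
import fnmatch
import re


def matches(event, query):
    """Evaluate a tiny subset of query strings: `field==value` or `field:pattern`."""
    if "==" in query:
        name, expected = query.split("==", 1)
        return event.get(name) == expected          # exact match on the whole value
    name, pattern = query.split(":", 1)
    tokens = re.split(r"[^0-9A-Za-z_]+", event.get(name, ""))
    return any(fnmatch.fnmatchcase(token, pattern) for token in tokens if token)


def search_dict_map(event, data, output_field, multi_match=False, multi_join=" ", missing=None):
    hits = [value for query, value in data.items() if matches(event, query)]
    result = dict(event)
    if hits:
        # With multi_match, every hit is kept and joined; otherwise one hit is kept.
        result[output_field] = multi_join.join(hits) if multi_match else hits[-1]
    elif missing is not None:
        result[output_field] = missing
    return result


STATUS = {"status:2??": "ok", "status:3??": "redirect", "status:4??": "auth", "status:5??": "server_error"}
print(search_dict_map({"status": "200,300"}, STATUS, "status_desc", multi_match=True, multi_join="Test"))
```

With the raw log from Example 2, both the 2?? and 3?? patterns hit, so the two mapped values are joined with the multi_join string into okTestredirect.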
e_search_table_map
The e_search_table_map function searches a specified column in a specified table for a raw log field, maps the field to a row in the table, and returns a new field. The values of the column must be query strings.
-
Syntax
e_search_table_map(data, inpt, output_fields, multi_match=False, multi_join=" ", missing=None, mode="fill-auto")
-
Parameters
Parameter | Type | Required | Description |
---|---|---|---|
data | Table | Yes | The table that is used for mapping. The table must contain a column whose values are query strings. |
inpt | String | Yes | The name of the column in which the system searches for data. The field that is indicated by the column is considered the input field. |
output_fields | String, string list, or tuple list | Yes | The output fields that you want the function to return. |
multi_match | Boolean | No | Specifies whether the system can match multiple values for the input field. Default value: False, which indicates that the system does not match multiple values and returns only the value that is matched for the first value of the input field. You can configure the multi_join parameter to concatenate multiple values that are matched for the input field. |
multi_join | String | No | The character that is used to concatenate the multiple values that are matched for the input field. The default value is a space. This parameter takes effect only when the multi_match parameter is set to True. |
missing | String | No | The value that is assigned to the fields specified by output_fields when no match is found for the input field. Default value: None, which indicates that no assignment is performed. Note: If the table contains a column of an asterisk (*), the missing parameter becomes invalid. This is because an asterisk (*) has a higher priority than the missing parameter. |
mode | String | No | The overwrite mode of fields. Default value: fill-auto. For more information, see Field extraction check and overwrite modes. |
-
Response
The value that is matched for the input field is returned.
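The row-based variant differs from the dictionary variant mainly in that the query strings live in one table column and the other columns supply the output values. The sketch below illustrates only that idea under stated assumptions: the helper name search_table_map is hypothetical, the standard csv module stands in for tab_parse_csv, only field==value query strings are recognized, the first matching row is used, and the multi_match and mode parameters are not modeled.

```python
import csv
import io


def search_table_map(event, csv_text, inpt, output_fields, missing=None):
    """Return a copy of `event` enriched from the first row whose query string matches."""
    outputs = [output_fields] if isinstance(output_fields, str) else list(output_fields)
    result = dict(event)
    for row in csv.DictReader(io.StringIO(csv_text)):
        name, _, expected = row[inpt].partition("==")  # e.g. "city==sh" -> ("city", "sh")
        if event.get(name) == expected:
            for column in outputs:
                result[column] = row[column]
            return result
    if missing is not None:                            # no row matched
        for column in outputs:
            result[column] = missing
    return result


TABLE = "search,pop,province\ncity==nj,800,js\ncity==sh,2000,sh"
print(search_table_map({"data": "123", "city": "sh"}, TABLE, "search", ["pop", "province"]))
```

For a log with city set to sh, the second row matches, so pop and province are filled from that row, as in Example 1 below.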
-
Examples
- Example 1: Map the value of the city field to a row in a table and return the values of the pop and province fields for the row.
- Raw log
data: 123 city: sh
The following table is used for mapping in this example. The values in the search column are query strings.
search | pop | province |
---|---|---|
city==nj | 800 | js |
city==sh | 2000 | sh |
- Transformation rule
e_search_table_map( tab_parse_csv("search,pop,province\ncity==nj,800,js\ncity==sh,2000,sh"), "search", ["pop", "province"], )
- Result
data: 123 city: sh province: sh pop: 2000
- Example 2: Map data in overwrite mode.
- Raw log
data: 123 city: nj province:
- Transformation rule
e_search_table_map( tab_parse_csv("search,pop,province\ncity==nj,800,js\ncity==sh,2000,sh"), "search", "province", mode="overwrite", )
- Result
data: 123 city: nj province: js
- Example 3: Map data by specifying a value for the missing parameter.
- Raw log
data: 123 city: wh province:
- Transformation rule
e_search_table_map( tab_parse_csv("search,pop,province\ncity==nj,800,\ncity==sh,2000,sh"), "search", "province", missing="Unknown", )
- Result
data: 123 city: wh province: Unknown
- Example 4: Map data by setting the multi_match parameter to True.
- Raw log
data: 123 city: nj,sh province:
- Transformation rule
e_search_table_map( tab_parse_csv("search,pop,province\ncity:nj,800,js\ncity:sh,2000,sh"), "search", "province", multi_match=True, multi_join=",", )
- Result
data: 123 city: nj,sh province: js,sh