This topic describes the syntax and parameters of the mapping and enrichment functions and provides examples of how to use them.
Functions
Category | Function | Description |
Field-based mapping | e_dict_map | Maps the value of an input field to a value in a specified data dictionary and returns a new field. This function can be used together with other functions. For more information, see Use the e_dict_map and e_search_dict_map functions to enrich log data. |
Field-based mapping | e_table_map | Maps the value of an input field to a row in a specified table and returns a new field. This function can be used together with other functions. For more information, see Use the e_table_map function to enrich HTTP response status codes. |
Field-based mapping | e_tablestore_map | Enriches a raw log by using a data table in Tablestore as the dimension table. |
Field-based mapping | e_redis_map | Enriches a raw log by using a data table in ApsaraDB for Redis as the dimension table. |
e_dict_map
The e_dict_map function maps the value of an input field to a value in a specified data dictionary and returns a new field.
Syntax
e_dict_map(data, field, output_field, case_insensitive=True, missing=None, mode="overwrite")
Parameters
Parameter
Type
Required
Description
data
Dict
Yes
The data dictionary that is used for mapping. The value of this parameter must be a dictionary in the standard {key01:value01,key02:value02,...} format. The keys must be strings. Example: {"1": "TCP", "2": "UDP", "3": "HTTP", "*": "Unknown"}.
field
String or string list
Yes
One or more field names. If the value of this parameter contains multiple field names, the system performs the following operations:
The system performs mapping on the field names in sequence.
If the system matches multiple values for the fields and the mode parameter is set to overwrite, the system returns the value that is last matched.
If the system matches no values for the fields, the system returns the value of the missing parameter.
output_field
String
Yes
The name of the field that you want the function to return.
case_insensitive
Boolean
No
Specifies whether to disable case sensitivity during mapping.
True: disables case sensitivity. This is the default value.
False: enables case sensitivity.
Note: If the data dictionary contains multiple keys that differ only in letter case and the case_insensitive parameter is set to True, the system first maps the value of the input field to the key whose case exactly matches the value. If no such key exists, the system maps the value to one of those keys at random.
missing
String
No
The value that is assigned to the field specified by output_field when no match is found for the input field. Default value: None, which indicates that no assignment is performed.
Note: If the data dictionary contains an asterisk (*) key, the missing parameter has no effect because the asterisk (*) key has a higher priority than the missing parameter.
mode
String
No
The overwrite mode of fields. Default value: overwrite. For more information, see Field extraction check and overwrite modes.
Response
A log that contains a new field is returned.
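The lookup rules in the parameter descriptions above can be sketched in plain Python. This is only an illustration under stated assumptions, not the implementation of e_dict_map: dict_map_sketch is a hypothetical helper, the mode parameter is not modeled, a log that contains none of the input fields is assumed to be left unchanged, and the asterisk (*) key is assumed to apply only when no field matches a regular key.

```python
from typing import Dict, List, Optional, Union


def dict_map_sketch(
    log: Dict[str, str],
    data: Dict[str, str],
    field: Union[str, List[str]],
    output_field: str,
    case_insensitive: bool = True,
    missing: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical stand-in that mimics the documented lookup rules."""
    fields = [field] if isinstance(field, str) else list(field)
    result = dict(log)
    present = [name for name in fields if name in log]
    if not present:
        # Assumption: a log without any of the input fields is left unchanged.
        return result

    matched = None
    for name in present:  # the fields are processed in sequence
        value = log[name]
        if value in data:
            matched = data[value]  # a key with the exact same case is preferred
        elif case_insensitive:
            for key, mapped in data.items():
                if key.lower() == value.lower():
                    # The text says the pick is random; this sketch takes the first.
                    matched = mapped
                    break
    # A later field that matches overwrites the value matched by an earlier field.

    if matched is None:
        # The "*" key has a higher priority than the missing parameter.
        matched = data.get("*", missing)
    if matched is not None:
        result[output_field] = matched
    return result


protocols = {"1": "TCP", "2": "UDP", "3": "HTTP", "6": "HTTPS", "*": "Unknown"}
print(dict_map_sketch({"data": "123", "pro": "1"}, protocols, "pro", "protocol"))
```

The sketch keeps the fallback order explicit: regular keys first, then the asterisk (*) key, then the missing value.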
Examples
Example 1: Map the value of the pro field in the raw log to a value in a data dictionary and generate a new field named protocol.
Raw log
data: 123 pro: 1
Transformation rule
e_dict_map({"1": "TCP", "2": "UDP", "3": "HTTP", "6": "HTTPS", "*": "Unknown"}, "pro", "protocol")
Result
data: 123 pro: 1 protocol: TCP
Example 2: Map the value of the status field in the raw logs to values in a data dictionary and generate a new field named message.
Raw logs
status: 500
status: 400
status: 200
Transformation rule
e_dict_map({"400": "Error", "200": "Success", "*": "Other"}, "status", "message")
Result
status: 500 message: Other
status: 400 message: Error
status: 200 message: Success
References
This function can be used together with other functions. For more information, see Use the e_dict_map and e_search_dict_map functions to enrich log data.
e_table_map
The e_table_map function maps the value of an input field to a row in a specified table and returns a new field.
Syntax
e_table_map(data, field, output_fields, missing=None, mode="fill-auto")
Parameters
Parameter
Type
Required
Description
data
Table
Yes
The table that is used for mapping.
Note: If you use the res_rds_mysql or res_log_logstore_pull resource function as the data source, you must set the primary_keys parameter. Otherwise, performance is severely degraded and tasks may be delayed. For more information about how to set the primary_keys parameter, see Resource functions.
field
String, string list, or tuple list
Yes
The input field. If a log does not contain the field, no operations are performed on the log.
output_fields
String, string list, or tuple list
Yes
The output fields. Example: ["province", "pop"].
missing
String
No
The value that is assigned to the fields specified by output_fields when no match is found for the input field. Default value: None, which indicates that no assignment is performed. If you want to map the input field to multiple columns, you can set the missing parameter to a list of default values, one for each column. The number of the default values must be the same as the number of the columns.
Note: If the table contains an asterisk (*) in the column that is used for matching, the missing parameter has no effect because the asterisk (*) has a higher priority than the missing parameter.
mode
String
No
The overwrite mode of fields. Default value: fill-auto. For more information, see Field extraction check and overwrite modes.
Response
A log that contains new fields is returned.
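The way the field and output_fields parameters accept a string, a string list, or a tuple list can be sketched as follows. This is a minimal illustration, not the implementation of e_table_map: table_map_sketch and normalize_pairs are hypothetical helpers, the table is parsed with the csv module of the Python standard library instead of tab_parse_csv, and the missing and mode parameters and asterisk (*) matching are not modeled.

```python
import csv
import io
from typing import Dict, List, Tuple, Union

FieldSpec = Union[str, List[Union[str, Tuple[str, str]]]]


def normalize_pairs(spec: FieldSpec) -> List[Tuple[str, str]]:
    """Turn "a", ["a", "b"], or [("a1", "a")] into a list of (left, right) pairs."""
    items = [spec] if isinstance(spec, str) else list(spec)
    return [(item, item) if isinstance(item, str) else tuple(item) for item in items]


def table_map_sketch(
    log: Dict[str, str],
    csv_text: str,
    field: FieldSpec,
    output_fields: FieldSpec,
    sep: str = ",",
) -> Dict[str, str]:
    """Hypothetical stand-in: find the first row that matches and copy columns."""
    rows = list(csv.DictReader(io.StringIO(csv_text), delimiter=sep))
    inputs = normalize_pairs(field)            # (log field, table column)
    outputs = normalize_pairs(output_fields)   # (table column, output field)
    result = dict(log)
    if any(log_field not in log for log_field, _ in inputs):
        return result  # the log lacks an input field: no operation is performed
    for row in rows:
        if all(row[column] == log[log_field] for log_field, column in inputs):
            for column, out_name in outputs:
                result[out_name] = row[column]
            break
    return result


TABLE = "city,pop,province\nnj,800,js\nsh,2000,sh"
print(table_map_sketch({"data": "123", "cty": "nj"}, TABLE, [("cty", "city")], [("province", "pro")]))
```

A tuple in field reads as (log field, table column), and a tuple in output_fields reads as (table column, output field), which matches the renaming shown in Example 5 and Example 6 below.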
Examples
Example 1: Map the value of the city field to a row in a table and return the value of the province field for the row.
Raw log
data: 123 city: nj
Transformation rule
e_table_map(tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"), "city", "province")
Result
data: 123 city: nj province: js
Example 2: Map the value of the city field to a row in a table and return the values of the province and pop fields for the row.
Raw log
data: 123 city: nj
Transformation rule
e_table_map(tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"), "city", ["province", "pop"])
Result
data: 123 city: nj province: js pop: 800
Example 3: Use the tab_parse_csv function to build a table whose columns are separated by number signs (#), map the value of the city field to a row in the table, and return the values of the province and pop fields for the row.
Raw log
data: 123 city: nj
Transformation rule
e_table_map(tab_parse_csv("city#pop#province\nnj#800#js\nsh#2000#sh", sep="#"), "city", ["province", "pop"])
Result
data: 123 city: nj province: js pop: 800
Example 4: Use the tab_parse_csv function to build a table whose values are quoted with vertical bars (|), map the value of the city field to a row in the table, and return the values of the province and pop fields for the row.
Raw log
data: 123 city: nj
Transformation rule
e_table_map(tab_parse_csv("city,pop,province\n|nj|,|800|,|js|\n|shang hai|,2000,|SHANG,HAI|", quote="|"), "city", ["province", "pop"])
Result
data: 123 city: nj province: js pop: 800
Example 5: The input field is different from the corresponding field in the table that is used for mapping. Map the cty field in the log to the city field in the table, find the matching row, and return the value of the province field for the row.
Raw log
data: 123 cty: nj
Transformation rule
e_table_map(tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"), [("cty", "city")], "province")
Result
data: 123 cty: nj province: js
Example 6: The input field is different from the corresponding field in the table that is used for mapping. Map data and rename the output field.
Raw log
data: 123 cty: nj
Transformation rule
e_table_map(tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"), [("cty", "city")], [("province", "pro")])
Result
data: 123 cty: nj pro: js
Example 7: Map the values of multiple fields to a row in a table.
Raw log
data: 123 city: nj pop: 800
Transformation rule
e_table_map(tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"), ["city", "pop"], "province")
Result
data: 123 city: nj pop: 800 province: js
Example 8: Map the values of multiple fields to a row in a table. The input fields are different from the corresponding fields in the table that is used for mapping.
Raw log
data: 123 cty: nj pp: 800
Transformation rule
e_table_map(tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"), [("cty", "city"), ("pp", "pop")], "province")
Result
data: 123 cty: nj pp: 800 province: js
References
This function can be used together with other functions. For more information, see Use the e_table_map function to enrich HTTP response status codes.
e_tablestore_map
The e_tablestore_map function enriches a raw log by using a data table in Tablestore as the dimension table.
Syntax
e_tablestore_map(fields, endpoint, ak_id, ak_secret, instance_name, table_names, output_fields=None, output_table_name=None, encoding="utf8", mode="fill-auto")
Parameters
Parameter
Type
Required
Description
fields
String, number, list, or tuple list
Yes
The raw log fields that are used to map data between the raw log and the data table. The function maps multiple raw log fields to primary keys in the data table one by one. Examples:
If the data table contains the a primary key and the raw log contains the a field, you can use fields="a".
If the data table contains the a, b, and c primary keys and the raw log contains the a, b, and c fields, you can use fields=["a", "b", "c"].
If the data table contains the a, b, and c primary keys and the raw log contains the a1, b1, and c1 fields, you can use fields=[("a1", "a"), ("b1", "b"), ("c1", "c")].
endpoint
String
Yes
The endpoint of the Tablestore instance in which the data table is created. For more information, see Endpoints.
Note: You can use the virtual private cloud (VPC) endpoint or public endpoint of the Tablestore instance. A VPC endpoint is used for access within the same region, and a public endpoint is used for access over the Internet regardless of regions.
ak_id
String
Yes
The AccessKey ID of the account that has permissions to access the Tablestore instance. For more information, see Create an AccessKey pair.
If you use a RAM user, make sure that the RAM user is granted the access permissions, such as AliyunOTSReadOnlyAccess. For more information, see Grant permissions to a RAM user.
ak_secret
String
Yes
The AccessKey secret of the account that has permissions to access the Tablestore instance. For more information, see Create an AccessKey pair.
instance_name
String
Yes
The name of the Tablestore instance.
table_names
String, string list, or tuple list
Yes
The name of the data table. If the data table uses a secondary index, set this parameter to the name of the index. For more information about the secondary index feature, see Create a secondary index.
For example, if the index1 secondary index is created for the data table, set this parameter to "index1".
output_fields
List
No
The output fields. You can specify the names of primary key columns or attribute columns. Example: ["province", "pop"]. If you do not configure this parameter, all columns of the row that is matched based on the input fields are returned.
Note: If multiple data tables are created in the Tablestore instance, the function returns only the data in the data table that is first used for matching.
output_table_name
String
No
The name of the output field that records the data table from which the returned data is read. Default value: None, which indicates that the output fields do not contain the table name. If you set this parameter to a string, the output fields include a field with that name whose value is the table name.
For example, the data table named test is used, and the transformation rule includes output_fields=["province", "pop"], output_table_name="table_name". If the province and pop columns in the test data table are matched, the output fields are province: xxx, pop: xxx, table_name: test.
encoding
String
No
The encoding method of the HTTPS request parameters. Default value: utf8.
mode
String
No
The overwrite mode of fields. Default value: fill-auto. For more information, see Field extraction check and overwrite modes.
Response
A log that contains new fields is returned.
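The interaction of table_names, output_fields, and output_table_name can be sketched with an in-memory stand-in for Tablestore. This is only an illustration under stated assumptions: tablestore_map_sketch and TABLES are hypothetical, the contents of table_name_test1 are invented for the sketch, fields is limited to a list of field names, and "first used for matching" is assumed to mean the first table in the listed order that contains a matching row.

```python
from typing import Dict, List, Optional, Tuple, Union

# Hypothetical stand-in for Tablestore:
# table name -> {primary key tuple -> attribute columns}
TABLES: Dict[str, Dict[Tuple[str, ...], Dict[str, str]]] = {
    "table_name_test": {
        ("sh", "200"): {"cid": "3", "province": "sh", "region": "huadong"},
    },
    # Invented contents, used only to show that later tables are ignored.
    "table_name_test1": {
        ("sh", "200"): {"cid": "9", "province": "zz", "region": "other"},
    },
}


def tablestore_map_sketch(
    log: Dict[str, str],
    fields: List[str],
    table_names: Union[str, List[str]],
    output_fields: Optional[List[str]] = None,
    output_table_name: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical stand-in that mimics the documented output rules."""
    names = [table_names] if isinstance(table_names, str) else list(table_names)
    result = dict(log)
    if any(name not in log for name in fields):
        return result  # assumption: a log without the key fields is left unchanged
    key = tuple(log[name] for name in fields)
    for table in names:  # tables are tried in the listed order
        row = TABLES.get(table, {}).get(key)
        if row is None:
            continue
        if output_fields is None:
            result.update(row)  # no output_fields: all columns are returned
        else:
            result.update({c: row[c] for c in output_fields if c in row})
        if output_table_name is not None:
            result[output_table_name] = table  # record which table matched
        break  # only the first matching table contributes
    return result


log = {"city": "sh", "name": "maki", "pop": "200"}
print(tablestore_map_sketch(log, ["city", "pop"], ["table_name_test", "table_name_test1"],
                            output_table_name="table_name"))
```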
Examples
The following examples use the table_name_test data table.
city (primary key) | pop (primary key) | cid | province | region |
bj | 300 | 1 | bj | huabei |
nj | 800 | 2 | js | huadong |
sh | 200 | 3 | sh | huadong |
Example 1: Find a row in the data table based on the city and pop fields and return the values of the province and cid columns for the row.
Raw log
city:sh name:maki pop:200
Transformation rule
e_tablestore_map(["city", "pop"], "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com", "LTA3****", "VIH9****", "d00s0dxa****", "table_name_test", output_fields=["province", "cid"])
Result
city:sh name:maki pop:200 cid:3 province:sh
Example 2: Map the city1 and pop1 fields in the raw log to the city and pop primary keys in the data table, find a row in the data table based on the fields, and return the values of all columns for the row.
Raw log
city1:sh name:maki pop1:200
Transformation rule
e_tablestore_map([("city1", "city"), ("pop1", "pop")], "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com", "LTA3****", "VIH9****", "d00s0dxa****", "table_name_test")
Result
city:sh name:maki pop:200 cid:3 province:sh region:huadong
Example 3: Find a row in the data table based on the city and pop fields and return the values of all columns for the row. Set output_table_name to "table_name". In the returned result, you can view the name of the data table from which the returned data is read.
Raw log
city:sh name:maki pop:200
Transformation rule
e_tablestore_map(["city", "pop"], "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com", "LTA3****", "VIH9****", "d00s0dxa****", "table_name_test", output_table_name="table_name")
Result
city:sh name:maki pop:200 cid:3 province:sh region:huadong table_name:table_name_test
Example 4: Find a row in the table_name_test, table_name_test1, and table_name_test2 data tables based on the city and pop fields, and return the values of all columns for the row. In the returned result, you can view only the data in the table_name_test data table that is first used for matching.
Raw log
city:sh name:maki pop:200
Transformation rule
e_tablestore_map(["city", "pop"], "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com", "LTA3****", "VIH9****", "d00s0dxa****", ["table_name_test", "table_name_test1", "table_name_test2"], output_table_name="table_name")
Result
city:sh name:maki pop:200 cid:3 province:sh region:huadong table_name:table_name_test
Example 5: Find a row in the data table based on the pk1 and pk2 primary keys for the index1 secondary index, and return the value of the definedcol2 predefined column for the row. The predefined column is specified for the index1 secondary index.
Data table (index1)
pk1 (primary key) | pk2 (primary key) | definedcol2 (predefined column) | definedcol3 (predefined column) |
pk1_1 | pk2_1 | definedcol2_1 | definedcol3_1 |
pk1_2 | pk2_2 | definedcol2_2 | definedcol3_2 |
Raw log
pk1:pk1_1 pk2:pk2_1
Transformation rule
e_tablestore_map(["pk1", "pk2"], "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com", "LTA3****", "VIH9****", "d00s0dxa****", "index1", output_fields=["definedcol2"], output_table_name="table_name")
Result
pk1:pk1_1 pk2:pk2_1 definedcol2:definedcol2_1 table_name:index1
e_redis_map
The e_redis_map function enriches a raw log by using a data table in ApsaraDB for Redis as the dimension table.
Syntax
e_redis_map(field, output_field, host, port=6379, db=0, username=None, password=None, encoding="utf-8", max_retries=5, mode="fill-auto")
Parameters
Parameter
Type
Required
Description
field
String
Yes
The raw log field that is used to map data between the raw log and the data table. If the raw log does not contain the field, no operations are performed on the log.
output_field
String
Yes
The output field.
host
String
Yes
The endpoint of the ApsaraDB for Redis database.
username
String
No
The username of the account that you want to use to connect to the ApsaraDB for Redis database. This parameter is empty by default, which indicates that authentication is not performed.
password
String
No
The password of the account that you want to use to connect to the ApsaraDB for Redis database. This parameter is empty by default, which indicates that authentication is not performed.
port
Integer
No
The port of the ApsaraDB for Redis database. Default value: 6379.
db
Integer
No
The index of the ApsaraDB for Redis database. Default value: 0.
encoding
String
No
The encoding method of data in the ApsaraDB for Redis database. Default value: utf-8.
max_retries
Integer
No
The maximum number of retries allowed when a request to connect to the ApsaraDB for Redis database fails. Default value: 5.
If the connection request fails after the maximum number of retries, the function skips the current log in the transformation process. Subsequent transformation is not affected.
The interval between retries doubles after each failed attempt. The intervals range from 1 second to 120 seconds.
mode
String
No
The overwrite mode of fields. Default value: fill-auto. For more information, see Field extraction check and overwrite modes.
Response
A log that contains a new field is returned.
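The retry behavior described for the max_retries parameter amounts to a capped exponential backoff. The following sketch computes such a schedule. It is an illustration only: retry_intervals is a hypothetical helper, and it assumes that the first wait is 1 second and that the 120-second upper bound acts as a hard cap.

```python
from typing import List


def retry_intervals(max_retries: int = 5, first: float = 1.0, cap: float = 120.0) -> List[float]:
    """Seconds to wait before each retry: doubling from `first`, never above `cap`."""
    return [min(first * 2 ** attempt, cap) for attempt in range(max_retries)]


print(retry_intervals())   # [1.0, 2.0, 4.0, 8.0, 16.0]
print(retry_intervals(9))  # the last two values are capped at 120.0
```

Under these assumptions, the default of five retries adds at most 31 seconds of waiting before the function skips the current log.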
Examples
The following examples use the following data table in ApsaraDB for Redis.
Important: Only values of the string type are supported.
Key | Value |
i1001 | { "name": "Orange", "price": 10 } |
i1002 | { "name": "Apple", "price": 12 } |
i1003 | { "name": "Mango", "price": 16 } |
Example 1: Find a value in the data table based on the item field and return the value. The username and password of the account that is used to connect to the ApsaraDB for Redis database are not specified in the transformation rule.
Raw log
item: i1002 count: 7
Transformation rule
e_redis_map("item", "detail", host="r-bp1olrdor8353v4s.redis.rds.aliyuncs.com")
Result
item: i1002 count: 7 detail: { "name": "Apple", "price": 12 }
Example 2: Find a value in the data table based on the item field and return the value. The username and password of the account that is used to connect to the ApsaraDB for Redis database are specified in the transformation rule.
Raw log
item: i1003 count: 7
Transformation rule
e_redis_map("item", "detail", host="r-bp1olrdor8353v4s****.redis.rds.aliyuncs.com", username="r-bp****", password="***")
Result
item: i1003 count: 7 detail: { "name": "Mango", "price": 16 }
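Because only string values are supported, a structured value such as the ones in the preceding examples is written to the output field as a single string. The following sketch shows that effect with an in-memory dictionary in place of the Redis database. It is an illustration only: redis_map_sketch and STORE are hypothetical, no Redis connection is made, the mode parameter is not modeled, and decoding the string with json.loads is one possible way to reach the individual values.

```python
import json
from typing import Dict

# Hypothetical in-memory stand-in for the key-value data in the examples.
STORE: Dict[str, str] = {
    "i1001": '{ "name": "Orange", "price": 10 }',
    "i1002": '{ "name": "Apple", "price": 12 }',
    "i1003": '{ "name": "Mango", "price": 16 }',
}


def redis_map_sketch(log: Dict[str, str], field: str, output_field: str) -> Dict[str, str]:
    """Hypothetical stand-in: copy the stored string into the output field."""
    result = dict(log)
    # If the log lacks the field or the key is unknown, the log is left unchanged.
    if field in log and log[field] in STORE:
        result[output_field] = STORE[log[field]]  # the value stays a plain string
    return result


enriched = redis_map_sketch({"item": "i1002", "count": "7"}, "item", "detail")
detail = json.loads(enriched["detail"])  # decode only when the parts are needed
print(detail["name"], detail["price"])
```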