This topic describes the syntax and parameters of the mapping and enrichment functions and provides examples of how to use them.

Functions

Category Function Description
Field-based mapping e_dict_map Maps the value of an input field to a value in a specified data dictionary and returns a new field.

This function can be used together with other functions. For more information, see Use the e_dict_map and e_search_dict_map functions to enrich log data.

e_table_map Maps the value of an input field to a row in a specified table and returns a new field.

This function can be used together with other functions. For more information, see Use the e_table_map function to enrich HTTP response status codes.

e_tablestore_map Enriches a raw log by using a data table in Tablestore as the dimension table.
e_redis_map Enriches a raw log by using a data table in ApsaraDB for Redis as the dimension table.
Search-based mapping e_search_dict_map Searches the keywords in a specified data dictionary for a raw log field, maps the field to a value in the data dictionary, and returns a new field. The keywords must be query strings.

This function can be used together with other functions. For more information, see Use the e_dict_map and e_search_dict_map functions to enrich log data.

e_search_table_map Searches a specified column in a specified table for a raw log field, maps the field to a row in the table, and returns a new field. The values of the column must be query strings.

e_dict_map

The e_dict_map function maps the value of an input field to a value in a specified data dictionary and returns a new field.

  • Syntax

    e_dict_map(data, field, output_field, case_insensitive=True, missing=None, mode="overwrite")
  • Parameters

    Parameter Type Required Description
    data Dict Yes The data dictionary that is used for mapping. The value of this parameter must be in the {key01:value01,key02:value02,...} standard format. The keys must be strings. Example: {"1": "TCP", "2": "UDP", "3": "HTTP", "*": "Unknown"}.
    field String or string list Yes One or more field names. If the value of this parameter contains multiple field names, the system performs the following operations:
    • The system performs mapping on the field names in sequence.
    • If the system matches multiple values for the fields and the mode parameter is set to overwrite, the system returns the value that is last matched.
    • If the system matches no values for the fields, the system returns the value of the missing parameter.
    output_field String Yes The name of the field that you want the function to return.
    case_insensitive Boolean No Specifies whether to disable case sensitivity during mapping.
    • True: disables case sensitivity. This is the default value.
    • False: enables case sensitivity.
    Note If the data dictionary contains multiple keys that differ only in letter case and the case_insensitive parameter is set to True, the system first maps the value of the input field to the key whose case exactly matches the value. If no such key exists, the system randomly maps the value to one of these keys.
    missing String No The value that is assigned to the field specified by output_field when no match is found for the input field. Default value: None, which indicates that no assignment is performed.
    Note If the data dictionary contains an asterisk (*) key, the missing parameter does not take effect because the asterisk (*) has a higher priority than the missing parameter.
    mode String No The overwrite mode of fields. Default value: overwrite. For more information, see Field extraction check and overwrite modes.
  • Response

    A log that contains a new field is returned.

  • Examples

    • Example 1: Map the value of the pro field in the raw log to a value in a data dictionary and generate a new field named protocol.
      • Raw log
        data:  123
        pro:  1
      • Transformation rule
        e_dict_map(
            {"1": "TCP", "2": "UDP", "3": "HTTP", "6": "HTTPS", "*": "Unknown"},
            "pro",
            "protocol",
        )
      • Result
        data:  123
        pro:  1
        protocol:  TCP
    • Example 2: Map the value of the status field in the raw logs to values in a data dictionary and generate a new field named message.
      • Raw logs
        status:  500
        status:  400
        status:  200
      • Transformation rule
        e_dict_map({"400": "Error", "200": "Success", "*": "Other"}, "status", "message")
      • Result
        status:  500
        message: Other
        status:  400
        message: Error
        status:  200
        message: Success
  • References

    This function can be used together with other functions. For more information, see Use the e_dict_map and e_search_dict_map functions to enrich log data.
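
The lookup order described in the e_dict_map parameters (exact key, then a case-insensitive key, then the asterisk key, then missing) can be sketched in plain Python. This is a minimal model under stated assumptions, not the actual implementation: dict_map_sketch is a hypothetical name, logs are modeled as dictionaries of strings, the mode parameter is not modeled, and where the documentation says a key is picked randomly the sketch simply takes the first one found.

```python
def dict_map_sketch(log, data, field, output_field,
                    case_insensitive=True, missing=None):
    """Toy model of the documented lookup order; not the real e_dict_map."""
    fields = [field] if isinstance(field, str) else list(field)
    result = None
    for name in fields:                      # fields are tried in sequence
        if name not in log:
            continue
        value = log[name]
        hit = None
        if value in data:                    # exact (same-case) key first
            hit = data[value]
        elif case_insensitive:
            for key, mapped in data.items():
                if key != "*" and key.lower() == value.lower():
                    hit = mapped             # assumption: first such key wins
                    break
        if hit is None and "*" in data:      # "*" takes priority over missing
            hit = data["*"]
        if hit is not None:
            result = hit                     # overwrite: the last match wins
    if result is None:
        result = missing                     # nothing matched for any field
    out = dict(log)
    if result is not None:
        out[output_field] = result
    return out


protocols = {"1": "TCP", "2": "UDP", "3": "HTTP", "*": "Unknown"}
print(dict_map_sketch({"pro": "1"}, protocols, "pro", "protocol"))
print(dict_map_sketch({"pro": "9"}, protocols, "pro", "protocol", missing="n/a"))
```

In the second call the asterisk key supplies "Unknown", so the missing value is never used.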

e_table_map

The e_table_map function maps the value of an input field to a row in a specified table and returns a new field.

  • Syntax

    e_table_map(data, field, output_fields, missing=None, mode="fill-auto")
  • Parameters

    Parameter Type Required Description
    data Table Yes The table that is used for mapping.
    Note If you use a resource function such as res_rds_mysql or res_log_logstore_pull as a data source, we recommend that you configure the primary_keys parameter to improve the performance of data transformation. For more information, see Resource functions.
    field String, string list, or tuple list Yes The input field. If a log does not contain the field, no operations are performed on the log.
    output_fields String, string list, or tuple list Yes The output fields. Example: ["province", "pop"].
    missing String No The value that is assigned to the fields specified by output_fields when no match is found for the input field. Default value: None, which indicates that no assignment is performed. If you want to map the input field to multiple columns, you can set the missing parameter to a list of default values that correspond to the input field. The number of the default values must be the same as the number of the columns.
    Note If the table column that is used for matching contains an asterisk (*), the missing parameter does not take effect because the asterisk (*) has a higher priority than the missing parameter.
    mode String No The overwrite mode of fields. Default value: fill-auto. For more information, see Field extraction check and overwrite modes.
  • Response

    A log that contains new fields is returned.

  • Examples

    • Example 1: Map the value of the city field to a row in a table and return the value of the province field for the row.
      • Raw log
        data: 123
        city: nj
      • Transformation rule
        e_table_map(
            tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"), "city", "province"
        )
      • Result
        data: 123
        city: nj
        province: js
    • Example 2: Map the value of the city field to a row in a table and return the values of the province and pop fields for the row.
      • Raw log
        data: 123
        city: nj
      • Transformation rule
        e_table_map(
            tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"),
            "city",
            ["province", "pop"],
        )
      • Result
        data: 123
        city: nj
        province: js
        pop: 800
    • Example 3: Use the tab_parse_csv function with a custom delimiter (sep="#") to build a table, map the value of the city field to a row in the table, and return the values of the province and pop fields for the row.
      • Raw log
        data: 123
        city: nj
      • Transformation rule
        e_table_map(
            tab_parse_csv("city#pop#province\nnj#800#js\nsh#2000#sh", sep="#"),
            "city",
            ["province", "pop"],
        )
      • Result
        data: 123
        city: nj
        province: js
        pop: 800
    • Example 4: Use the tab_parse_csv function with a custom quote character (quote="|") to build a table, map the value of the city field to a row in the table, and return the values of the province and pop fields for the row.
      • Raw log
        data: 123
        city: nj
      • Transformation rule
        e_table_map(
            tab_parse_csv(
                "city,pop,province\n|nj|,|800|,|js|\n|shang hai|,2000,|SHANG,HAI|", quote="|"
            ),
            "city",
            ["province", "pop"],
        )
      • Result
        data: 123
        city: nj
        province: js
        pop: 800
    • Example 5: The input field is different from the corresponding field in the table that is used for mapping. Find a row in the table based on the cty and city fields and return the value of the province field for the row.
      • Raw log
        data: 123
        cty: nj
      • Transformation rule
        e_table_map(
            tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"),
            [("cty", "city")],
            "province",
        )
      • Result
        data: 123
        cty: nj
        province: js
    • Example 6: The input field is different from the corresponding field in the table that is used for mapping. Map data and rename the output field.
      • Raw log
        data: 123
        cty: nj
      • Transformation rule
        e_table_map(
            tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"),
            [("cty", "city")],
            [("province", "pro")],
        )
                                            
      • Result
        data: 123
        cty: nj
        pro: js
    • Example 7: Map the values of multiple fields to a row in a table.
      • Raw log
        data: 123
        city: nj
        pop: 800
      • Transformation rule
        e_table_map(
            tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"),
            ["city", "pop"],
            "province",
        )
      • Result
        data: 123
        city: nj
        pop: 800
        province: js
    • Example 8: Map the values of multiple fields to a row in a table. The input fields are different from the corresponding fields in the table that is used for mapping.
      • Raw log
        data: 123
        cty: nj
        pp: 800
      • Transformation rule
        e_table_map(
            tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"),
            [("cty", "city"), ("pp", "pop")],
            "province",
        )
      • Result
        data: 123
        cty: nj
        pp: 800
        province: js
  • References

    This function can be used together with other functions. For more information, see Use the e_table_map function to enrich HTTP response status codes.
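
The three accepted shapes of field and output_fields (a string, a string list, or a tuple list) can be illustrated with a small plain-Python model. This is only a sketch under assumptions: table_map_sketch and normalize_pairs are hypothetical names, the table is parsed with the standard csv module rather than tab_parse_csv, the mode parameter and the asterisk rule are not modeled, and missing is treated as a single string.

```python
import csv
import io


def normalize_pairs(spec):
    """Turn "a", ["a", "b"], or [("a", "x")] into a list of (left, right) pairs."""
    if isinstance(spec, str):
        spec = [spec]
    return [(s, s) if isinstance(s, str) else tuple(s) for s in spec]


def table_map_sketch(log, rows, field, output_fields, missing=None):
    inputs = normalize_pairs(field)            # (log field, table column)
    outputs = normalize_pairs(output_fields)   # (table column, new log field)
    if any(log_field not in log for log_field, _ in inputs):
        return dict(log)                       # input field absent: log unchanged
    out = dict(log)
    for row in rows:
        # Every input field must equal the corresponding column of the row.
        if all(row[col] == log[log_field] for log_field, col in inputs):
            for col, new_name in outputs:
                out[new_name] = row[col]
            return out
    if missing is not None:                    # no row matched
        for _, new_name in outputs:
            out[new_name] = missing
    return out


rows = list(csv.DictReader(io.StringIO("city,pop,province\nnj,800,js\nsh,2000,sh")))
print(table_map_sketch({"data": "123", "cty": "nj"}, rows,
                       [("cty", "city")], [("province", "pro")]))
```

The call mirrors Example 6: the cty log field is matched against the city column, and the province column is written to a field named pro.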

e_tablestore_map

The e_tablestore_map function enriches a raw log by using a data table in Tablestore as the dimension table.

  • Syntax

    e_tablestore_map(
        fields,
        endpoint,
        ak_id,
        ak_secret,
        instance_name,
        table_names,
        output_fields=None,
        output_table_name=None,
        encoding="utf8",
        mode="fill-auto",
    )
  • Parameters

    Parameter Type Required Description
    fields String, number, list, or tuple list Yes The raw log fields that are used to map data between the raw log and the data table. The function maps multiple raw log fields to primary keys in the data table one by one. Examples:
    • If the data table contains the a primary key and the raw log contains the a field, you can use fields="a".
    • If the data table contains the a, b, and c primary keys and the raw log contains the a, b, and c fields, you can use fields=["a", "b", "c"].
    • If the data table contains the a, b, and c primary keys and the raw log contains the a1, b1, and c1 fields, you can use fields=[("a1", "a"), ("b1", "b"), ("c1", "c")].
    endpoint String Yes The endpoint of the Tablestore instance in which the data table is created. For more information, see Endpoint.
    Note You can use the virtual private cloud (VPC) endpoint or public endpoint of the Tablestore instance. A VPC endpoint is used for access within the same region, and a public endpoint is used for access over the Internet regardless of regions.
    ak_id String Yes The AccessKey ID of the account that has permissions to access the Tablestore instance. For more information, see Create an AccessKey pair.

    If you use a RAM user, make sure that the RAM user is granted the access permissions, such as AliyunOTSReadOnlyAccess. For more information, see Grant permissions to the RAM user.

    ak_secret String Yes The AccessKey secret of the account that has permissions to access the Tablestore instance. For more information, see Create an AccessKey pair.
    instance_name String Yes The name of the Tablestore instance.
    table_names String, string list, or tuple list Yes The name of the data table. If the data table uses a secondary index, set this parameter to the name of the index. For more information about the secondary index feature, see Global secondary index.

    For example, if the index1 secondary index is created for the data table, set this parameter to "index1".

    output_fields List No The output fields. You can specify the names of primary key columns or attribute columns. Example: ["province", "pop"]. If you do not configure this parameter, all columns of the row that is matched based on the input fields are returned.
    Note If multiple data tables are created in the Tablestore instance, the function returns only the data in the data table that is first used for matching.
    output_table_name String No The name of the data table in which the returned data is stored. Default value: None, which indicates that the output fields do not contain the table name. If you set this parameter to a string, the output fields include the table name.

    For example, the data table named test is used, and the transformation rule includes output_fields=["province", "pop"], output_table_name="table_name". If the data columns ["province", "pop"] in the test data table are matched, the output fields are province:xxx, pop:xxx, table_name:test.

    encoding String No The encoding method of the HTTPS request parameters. Default value: utf8.
    mode String No The overwrite mode of fields. Default value: fill-auto. For more information, see Field extraction check and overwrite modes.
  • Response

    A log that contains new fields is returned.

  • Examples

    The following examples are based on the following table_name_test data table.
    city (primary key)   pop (primary key)   cid   province   region
    bj                   300                 1     bj         huabei
    nj                   800                 2     js         huadong
    sh                   200                 3     sh         huadong
    • Example 1: Find a row in the data table based on the city field and return the value of the province column for the row.
      • Raw log
        city:sh
        name:maki
        pop:200
      • Transformation rule
        e_tablestore_map(
            "city",
            "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com",
            "LTA3****",
            "VIH9****",
            "d00s0dxa****",
            "table_name_test",
            output_fields=["province"],
        )
      • Result
        city:sh
        name:maki
        pop:200
        province:sh
    • Example 2: Find a row in the data table based on the city and pop fields and return the values of the province and cid columns for the row.
      • Raw log
        city:sh
        name:maki
        pop:200
      • Transformation rule
        e_tablestore_map(
            ["city","pop"],
            "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com",
            "LTA3****",
            "VIH9****",
            "d00s0dxa****",
            "table_name_test",
            output_fields=["province", "cid"],
        )
      • Result
        city:sh
        name:maki
        pop:200
        cid:3
        province:sh
    • Example 3: Map the city1 and pop1 fields in the raw log to the city and pop primary keys in the data table, find a row in the data table based on the fields, and return the values of all columns for the row.
      • Raw log
        city1:sh
        name:maki
        pop1:200
      • Transformation rule
        e_tablestore_map(
            [("city1", "city"), ("pop1", "pop")],
            "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com",
            "LTA3****",
            "VIH9****",
            "d00s0dxa****",
            "table_name_test",
        )
      • Result
        city:sh
        name:maki
        pop:200
        cid:3
        province:sh
        region:huadong
    • Example 4: Find a row in the data table based on the city and pop fields and return the values of all columns for the row. Set output_table_name to "table_name". In the returned result, you can view the name of the data table in which the returned data is stored.
      • Raw log
        city:sh
        name:maki
        pop:200
      • Transformation rule
        e_tablestore_map(
            ["city","pop"],
            "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com",
            "LTA3****",
            "VIH9****",
            "d00s0dxa****",
            "table_name_test",
            output_table_name="table_name"
        )
      • Result
        city:sh
        name:maki
        pop:200
        cid:3
        province:sh
        region:huadong
        table_name:table_name_test
    • Example 5: Find a row in the table_name_test, table_name_test1, and table_name_test2 data tables based on the city and pop fields, and return the values of all columns for the row. In the returned result, you can view only the data in the table_name_test data table that is first used for matching.
      • Raw log
        city:sh
        name:maki
        pop:200
      • Transformation rule
        e_tablestore_map(
            ["city","pop"],
            "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com",
            "LTA3****",
            "VIH9****",
            "d00s0dxa****",
            ["table_name_test","table_name_test1","table_name_test2"],
            output_table_name="table_name"
        )
      • Result
        city:sh
        name:maki
        pop:200
        cid:3
        province:sh
        region:huadong
        table_name:table_name_test
    • Example 6: Find a row in the data table based on the pk1 and pk2 primary keys for the index1 secondary index, and return the value of the definedcol2 predefined column for the row. The predefined column is specified for the index1 secondary index.
      • Data table (index1)
        pk1 (primary key)   pk2 (primary key)   definedcol2 (predefined column)   definedcol3 (predefined column)
        pk1_1               pk2_1               definedcol2_1                     definedcol3_1
        pk1_2               pk2_2               definedcol2_2                     definedcol3_2
      • Raw log
        pk1:pk1_1
        pk2:pk2_1
      • Transformation rule
        e_tablestore_map(
            ["pk1","pk2"],
            "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com",
            "LTA3****",
            "VIH9****",
            "d00s0dxa****",
            "index1",
            output_fields=["definedcol2"],
            output_table_name="table_name",
        )
      • Result
        pk1:pk1_1
        pk2:pk2_1
        definedcol2:definedcol2_1
        table_name:index1
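
The way table_names, output_fields, and output_table_name interact can be modeled without a Tablestore connection. The sketch below is an illustration under assumptions, not the real function: tablestore_map_sketch is a hypothetical name, the tables argument is an in-memory stand-in for the instance, the mode parameter is not modeled, and the sketch assumes that tables are tried in list order and that the first table containing a matching row supplies the result.

```python
def tablestore_map_sketch(log, tables, fields, table_names,
                          output_fields=None, output_table_name=None):
    """tables: hypothetical in-memory stand-in, {table name: list of row dicts}."""
    if isinstance(fields, str):
        fields = [fields]
    # Each entry becomes (log field, primary key column).
    pairs = [(f, f) if isinstance(f, str) else tuple(f) for f in fields]
    if isinstance(table_names, str):
        table_names = [table_names]
    out = dict(log)
    if any(log_field not in log for log_field, _ in pairs):
        return out
    for name in table_names:                   # assumption: tried in list order
        for row in tables.get(name, []):
            if all(str(row[pk]) == str(log[log_field]) for log_field, pk in pairs):
                # No output_fields: return every column of the matched row.
                wanted = output_fields if output_fields is not None else list(row)
                for col in wanted:
                    out[col] = row[col]
                if output_table_name is not None:
                    out[output_table_name] = name   # record the source table
                return out
    return out


tables = {
    "table_name_test": [
        {"city": "bj", "pop": "300", "cid": "1", "province": "bj", "region": "huabei"},
        {"city": "nj", "pop": "800", "cid": "2", "province": "js", "region": "huadong"},
        {"city": "sh", "pop": "200", "cid": "3", "province": "sh", "region": "huadong"},
    ],
}
print(tablestore_map_sketch({"city": "sh", "name": "maki", "pop": "200"}, tables,
                            ["city", "pop"], ["table_name_test", "table_name_test1"],
                            output_table_name="table_name"))
```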

e_redis_map

The e_redis_map function enriches a raw log by using a data table in ApsaraDB for Redis as the dimension table.

  • Syntax

    e_redis_map(field, output_field, host, port=6379, db=0, username=None,
                password=None, encoding="utf-8", max_retries=5, mode="fill-auto")
  • Parameters

    Parameter Type Required Description
    field String Yes The raw log field that is used to map data between the raw log and the data table. If the raw log does not contain the field, no operations are performed on the log.
    output_field String Yes The output field.
    host String Yes The endpoint of the ApsaraDB for Redis database.
    username String No The username of the account that you want to use to connect to the ApsaraDB for Redis database. This parameter is empty by default, which indicates that authentication is not performed.
    password String No The password of the account that you want to use to connect to the ApsaraDB for Redis database. This parameter is empty by default, which indicates that authentication is not performed.
    port Integer No The port of the ApsaraDB for Redis database. Default value: 6379.
    db Integer No The index of the ApsaraDB for Redis database to connect to. Default value: 0.
    encoding String No The encoding method of data in the ApsaraDB for Redis database. Default value: utf-8.
    max_retries Integer No The maximum number of retries allowed when a request to connect to the ApsaraDB for Redis database fails. Default value: 5.

    If the connection request fails after the maximum number of retries, the function skips the current log in the transformation process. Subsequent transformation is not affected.

    The interval between retries doubles after each failed attempt. The intervals range from 1 second to 120 seconds.

    mode String No The overwrite mode of fields. Default value: fill-auto. For more information, see Field extraction check and overwrite modes.
  • Response

    A log that contains a new field is returned.

  • Examples

    The following examples are based on the following data table in ApsaraDB for Redis.

    Important Only the values of the string type are supported.
    Key     Value
    i1001   { "name": "Orange", "price": 10 }
    i1002   { "name": "Apple", "price": 12 }
    i1003   { "name": "Mango", "price": 16 }
    • Example 1: Find a value in the data table based on the item field and return the value. The username and password of the account that is used to connect to the ApsaraDB for Redis database are not specified in the transformation rule.
      • Raw log
        item: i1002
        count: 7
      • Transformation rule
        e_redis_map("item", "detail", host="r-bp1olrdor8353v4s.redis.rds.aliyuncs.com")
      • Result
        item: i1002
        count: 7
        detail: {
           "name": "Apple",
           "price": 12
          }
    • Example 2: Find a value in the data table based on the item field and return the value. The username and password of the account that is used to connect to the ApsaraDB for Redis database are specified in the transformation rule.
      • Raw log
        item: i1003
        count: 7
      • Transformation rule
        e_redis_map("item", "detail", host="r-bp1olrdor8353v4s****.redis.rds.aliyuncs.com", username="r-bp****", password="***")
      • Result
        item: i1003
        count: 7
        detail: {
           "name": "Mango",
           "price": 16
          }
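
The retry behavior described for max_retries (intervals that double, within a range of 1 second to 120 seconds) can be written out as a small schedule calculation. This is a sketch, not the function's internal logic: retry_intervals is a hypothetical helper, and it assumes that the first wait is exactly 1 second and that the doubled value is capped at 120 seconds.

```python
def retry_intervals(max_retries=5, first=1, cap=120):
    """Seconds to wait before each retry: doubles every time, capped at `cap`."""
    intervals = []
    delay = first
    for _ in range(max_retries):
        intervals.append(delay)
        delay = min(delay * 2, cap)   # double, but never exceed the cap
    return intervals


print(retry_intervals())     # [1, 2, 4, 8, 16]
print(retry_intervals(10))   # [1, 2, 4, 8, 16, 32, 64, 120, 120, 120]
```

Under these assumptions, the default of five retries adds at most 31 seconds of waiting before the log is skipped.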

e_search_dict_map

The e_search_dict_map function searches the keywords in a specified data dictionary for a raw log field, maps the field to a value in the data dictionary, and returns a new field. The keywords must be query strings.

  • Syntax

    e_search_dict_map(data, output_field, multi_match=False, multi_join=" ", missing=None, mode="overwrite")
  • Parameters

    Parameter Type Required Description
    data Dict Yes The data dictionary that is used for mapping. The value must be in the {key01:value01,key02:value02,...} standard format. The keys must be query strings.
    output_field String Yes The name of the field that you want the function to return.
    multi_match Boolean No Specifies whether the system can match multiple values for the input field. Default value: False, which indicates that the system does not match multiple values and returns only the value that is matched for the last value of the input field. You can configure the multi_join parameter to concatenate multiple values that are matched for the input field.
    multi_join String No The character that is used to concatenate the multiple values that are matched for the input field. The default value is a space. This parameter takes effect only when the multi_match parameter is set to True.
    missing String No The value that is assigned to the field specified by output_field when no match is found for the input field. Default value: None, which indicates that no assignment is performed.
    Note If the data dictionary contains an asterisk (*) key, the missing parameter does not take effect because the asterisk (*) has a higher priority than the missing parameter.
    mode String No The overwrite mode of fields. Default value: overwrite. For more information, see Field extraction check and overwrite modes.
  • Response

    The value that is matched for the input field is returned.

  • Examples

    • Example 1: Map data.
      • Raw log
        data:123
        pro:1
      • Transformation rule
        e_search_dict_map({"pro==1": "TCP", "pro==2": "UDP", "pro==3": "HTTP"}, "protocol")
      • Result
        data:123
        pro:1
        protocol:TCP
    • Example 2: Map data based on the first character of each field value.
      • Raw log
        status:200,300
      • Transformation rule
        e_search_dict_map(
            {
                "status:2??": "ok",
                "status:3??": "redirect",
                "status:4??": "auth",
                "status:5??": "server_error",
            },
            "status_desc",
            multi_match=True,
            multi_join="Test",
        )
      • Result
        status:200,300
        status_desc: okTestredirect
  • References

    This function can be used together with other functions. For more information, see Use the e_dict_map and e_search_dict_map functions to enrich log data.
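
How multi_match=True combines with multi_join can be shown with a toy model. The sketch below makes several simplifying assumptions: search_dict_map_multi and matches are hypothetical names, only two query forms are understood ("field==value" for an exact match and "field:pattern" with ? and * wildcards applied to each word of the field value), and the multi_match=False path, the asterisk key, the missing parameter, and the mode parameter are not modeled. The real query-string syntax is much richer.

```python
from fnmatch import fnmatchcase
import re


def matches(query, log):
    """Toy matcher for two query forms only: "field==value" and "field:pattern"."""
    if "==" in query:
        name, expected = query.split("==", 1)
        return log.get(name) == expected
    name, pattern = query.split(":", 1)
    # Split the field value into words and test each word against the pattern.
    words = re.split(r"[^\w]+", log.get(name, ""))
    return any(fnmatchcase(word, pattern) for word in words)


def search_dict_map_multi(log, data, output_field, multi_join=" "):
    """Models only the multi_match=True path: join every matched value."""
    hits = [value for query, value in data.items() if matches(query, log)]
    out = dict(log)
    if hits:
        out[output_field] = multi_join.join(hits)
    return out


rules = {
    "status:2??": "ok",
    "status:3??": "redirect",
    "status:4??": "auth",
    "status:5??": "server_error",
}
print(search_dict_map_multi({"status": "200,300"}, rules, "status_desc",
                            multi_join="Test"))
```

With the raw log from Example 2, both the 2?? and 3?? rules match, and the two values are concatenated with the multi_join string.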

e_search_table_map

The e_search_table_map function searches a specified column in a specified table for a raw log field, maps the field to a row in the table, and returns a new field. The values of the column must be query strings.

  • Syntax

    e_search_table_map(data, inpt, output_fields, multi_match=False, multi_join=" ", missing=None, mode="fill-auto")
  • Parameters

    Parameter Type Required Description
    data Table Yes The table that is used for mapping. The table must contain a column whose values are query strings.
    inpt String Yes The name of the column in which the system searches for data. The field that is indicated by the column is considered the input field.
    output_fields String, string list, or tuple list Yes The output fields that you want the function to return.
    multi_match Boolean No Specifies whether the system can match multiple values for the input field. Default value: False, which indicates that the system does not match multiple values and returns only the value that is matched for the first value of the input field. You can configure the multi_join parameter to concatenate multiple values that are matched for the input field.
    multi_join String No The character that is used to concatenate the multiple values that are matched for the input field. The default value is a space. This parameter takes effect only when the multi_match parameter is set to True.
    missing String No The value that is assigned to the fields specified by output_fields when no match is found for the input field. Default value: None, which indicates that no assignment is performed.
    Note If the table column that is used for matching contains an asterisk (*), the missing parameter does not take effect because the asterisk (*) has a higher priority than the missing parameter.
    mode String No The overwrite mode of fields. Default value: fill-auto. For more information, see Field extraction check and overwrite modes.
  • Response

    The value that is matched for the input field is returned.

  • Examples

    • Example 1: Map the value of the city field to a row in a table and return the values of the pop and province fields for the row.
      • Raw log
        data: 123
        city: sh
        The following table is used for mapping in this example. The values in the search column are query strings.
        search     pop    province
        city==nj   800    js
        city==sh   2000   sh
      • Transformation rule
        e_search_table_map(
            tab_parse_csv("search,pop,province\ncity==nj,800,js\ncity==sh,2000,sh"),
            "search",
            ["pop", "province"],
        )
      • Result
        data: 123
        city: sh
        province: sh
        pop: 2000
    • Example 2: Map data in overwrite mode.
      • Raw log
        data: 123
        city: nj
        province:
      • Transformation rule
        e_search_table_map(
            tab_parse_csv("search,pop,province\ncity==nj,800,js\ncity==sh,2000,sh"),
            "search",
            "province",
            mode="overwrite",
        )
      • Result
        data: 123
        city: nj
        province: js
    • Example 3: Map data by specifying a value for the missing parameter.
      • Raw log
        data: 123
        city: wh
        province: 
      • Transformation rule
        e_search_table_map(
            tab_parse_csv("search,pop,province\ncity==nj,800,\ncity==sh,2000,sh"),
            "search",
            "province",
            missing="Unknown",
        )
      • Result
        data: 123
        city: wh
        province: Unknown
    • Example 4: Map data by setting the multi_match parameter to True.
      • Raw log
        data: 123
        city: nj,sh
        province: 
      • Transformation rule
        e_search_table_map(
            tab_parse_csv("search,pop,province\ncity:nj,800,js\ncity:sh,2000,sh"),
            "search",
            "province",
            multi_match=True,
            multi_join=",",
        )
      • Result
        data: 123
        city: nj,sh
        province: js,sh