Simple Log Service: Mapping and enrichment functions

Last Updated: Oct 30, 2024

This topic describes the syntax and parameters of the mapping and enrichment functions and provides examples of how to use them.

Functions

Category

Function

Description

Field-based mapping

e_dict_map

Maps the value of an input field to a value in a specified data dictionary and returns a new field.

This function can be used together with other functions. For more information, see Use the e_dict_map and e_search_dict_map functions to enrich log data.

e_table_map

Maps the value of an input field to a row in a specified table and returns a new field.

This function can be used together with other functions. For more information, see Use the e_table_map function to enrich HTTP response status codes.

e_tablestore_map

Enriches a raw log by using a data table in Tablestore as the dimension table.

e_redis_map

Enriches a raw log by using a data table in ApsaraDB for Redis as the dimension table.

e_dict_map

The e_dict_map function maps the value of an input field to a value in a specified data dictionary and returns a new field.

  • Syntax

    e_dict_map(data, field, output_field, case_insensitive=True, missing=None, mode="overwrite")
  • Parameters

    Parameter

    Type

    Required

    Description

    data

    Dict

    Yes

    The data dictionary that is used for mapping. The value of this parameter must be in the standard {key01:value01,key02:value02,...} format. The keys must be strings. Example: {"1": "TCP", "2": "UDP", "3": "HTTP", "*": "Unknown"}.

    field

    String or string list

    Yes

    One or more field names. If the value of this parameter contains multiple field names, the system performs the following operations:

    • The system performs mapping on the field names in sequence.

    • If the system matches multiple values for the fields and the mode parameter is set to overwrite, the system returns the value that is last matched.

    • If the system matches no values for the fields, the system returns the value of the missing parameter.

    output_field

    String

    Yes

    The name of the field that you want the function to return.

    case_insensitive

    Boolean

    No

    Specifies whether to ignore letter case during mapping.

    • True: ignores letter case. This is the default value.

    • False: matches letter case exactly.

    Note

    If the data dictionary contains multiple keys that differ only in letter case and the case_insensitive parameter is set to True, the system first looks for a key whose letter case exactly matches the value of the input field. If no such key exists, the system maps the value to one of those keys at random.

    missing

    String

    No

    The value that is assigned to the field specified by output_field when no match is found for the input field. Default value: None, which indicates that no assignment is performed.

    Note

    If the data dictionary contains an asterisk (*) key, the missing parameter is ignored, because the asterisk (*) key takes priority over the missing parameter.

    mode

    String

    No

    The overwrite mode of fields. Default value: overwrite. For more information, see Field extraction check and overwrite modes.

  • Response

    A log that contains a new field is returned.

  • Examples

    • Example 1: Map the value of the pro field in the raw log to a value in a data dictionary and generate a new field named protocol.

      • Raw log

        data:  123
        pro:  1
      • Transformation rule

        e_dict_map(
            {"1": "TCP", "2": "UDP", "3": "HTTP", "6": "HTTPS", "*": "Unknown"},
            "pro",
            "protocol",
        )
      • Result

        data:  123
        pro:  1
        protocol:  TCP
    • Example 2: Map the value of the status field in the raw logs to values in a data dictionary and generate a new field named message.

      • Raw logs

        status:  500
        status:  400
        status:  200
      • Transformation rule

        e_dict_map({"400": "Error", "200": "Success", "*": "Other"}, "status", "message")
      • Result

        status:  500
        message: Other
        status:  400
        message: Error
        status:  200
        message: Success
  • References

    This function can be used together with other functions. For more information, see Use the e_dict_map and e_search_dict_map functions to enrich log data.
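The lookup rules described in the parameter table above can be sketched in plain Python. This is only an illustrative emulation, not the Simple Log Service implementation. The helper names dict_lookup and dict_map are hypothetical, a log is modelled as a plain dict, only mode="overwrite" is emulated, and the tie-break between keys that differ only in letter case is simplified to "first key found", whereas the service picks one at random.

```python
def dict_lookup(data, value, case_insensitive=True):
    """Return the mapped value for one input value, or None if nothing matches."""
    # A key with exactly the same letter case always wins.
    if value in data:
        return data[value]
    if case_insensitive:
        # Fall back to a key that differs only in letter case.
        # Assumption: take the first such key; the service picks one at random.
        for key, mapped in data.items():
            if key != "*" and key.lower() == value.lower():
                return mapped
    # The "*" key acts as a catch-all and takes priority over `missing`.
    return data.get("*")


def dict_map(event, data, field, output_field,
             case_insensitive=True, missing=None):
    """Emulate the documented behaviour for mode="overwrite" on a dict-shaped log."""
    fields = [field] if isinstance(field, str) else list(field)
    result = None
    for name in fields:
        if name not in event:
            continue  # assumption: a field absent from the log is skipped
        matched = dict_lookup(data, str(event[name]), case_insensitive)
        if matched is not None:
            result = matched  # the last match wins in overwrite mode
    if result is None:
        result = missing
    out = dict(event)
    if result is not None:
        out[output_field] = result
    return out


protocols = {"1": "TCP", "2": "UDP", "3": "HTTP", "6": "HTTPS", "*": "Unknown"}
print(dict_map({"data": "123", "pro": "1"}, protocols, "pro", "protocol"))
print(dict_map({"status": "500"}, {"400": "Error", "200": "Success", "*": "Other"},
               "status", "message"))
```

The two calls at the end mirror Example 1 and the first log of Example 2. Passing a list such as ["a", "b"] as field exercises the "last match wins" rule.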

e_table_map

The e_table_map function maps the value of an input field to a row in a specified table and returns a new field.

  • Syntax

    e_table_map(data, field, output_fields, missing=None, mode="fill-auto")
  • Parameters

    Parameter

    Type

    Required

    Description

    data

    Table

    Yes

    The table that is used for mapping.

    Note

    If you use the resource functions res_rds_mysql and res_log_logstore_pull as data sources, you must set the primary_keys parameter. Failing to do so will severely impact performance and may cause task delays. For more information about how to set the primary_keys parameter, see Resource functions.

    field

    String, string list, or tuple list

    Yes

    The input field. If a log does not contain the field, no operations are performed on the log.

    output_fields

    String, string list, or tuple list

    Yes

    The output fields. Example: ["province", "pop"].

    missing

    String

    No

    The value that is assigned to the fields specified by output_fields when no match is found for the input field. Default value: None, which indicates that no assignment is performed. If you want to map the input field to multiple columns, you can set the missing parameter to a list of default values that correspond to the input field. The number of the default values must be the same as the number of the columns.

    Note

    If the table contains a column of an asterisk (*), the missing parameter is ignored, because the asterisk (*) takes priority over the missing parameter.

    mode

    String

    No

    The overwrite mode of fields. Default value: fill-auto. For more information, see Field extraction check and overwrite modes.

  • Response

    A log that contains new fields is returned.

  • Examples

    • Example 1: Map the value of the city field to a row in a table and return the value of the province field for the row.

      • Raw log

        data: 123
        city: nj
      • Transformation rule

        e_table_map(
            tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"), "city", "province"
        )
      • Result

        data: 123
        city: nj
        province: js
    • Example 2: Map the value of the city field to a row in a table and return the values of the province and pop fields for the row.

      • Raw log

        data: 123
        city: nj
      • Transformation rule

        e_table_map(
            tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"),
            "city",
            ["province", "pop"],
        )
      • Result

        data: 123
        city: nj
        province: js
        pop: 800
    • Example 3: Use the tab_parse_csv function with a custom separator (sep="#") to build a table, map the value of the city field to a row in the table, and return the values of the province and pop fields for the row.

      • Raw log

        data: 123
        city: nj
      • Transformation rule

        e_table_map(
            tab_parse_csv("city#pop#province\nnj#800#js\nsh#2000#sh", sep="#"),
            "city",
            ["province", "pop"],
        )
      • Result

        data: 123
        city: nj
        province: js
        pop: 800
    • Example 4: Use the tab_parse_csv function with a custom quote character (quote="|") to build a table, map the value of the city field to a row in the table, and return the values of the province and pop fields for the row.

      • Raw log

        data: 123
        city: nj
      • Transformation rule

        e_table_map(
            tab_parse_csv(
                "city,pop,province\n|nj|,|800|,|js|\n|shang hai|,2000,|SHANG,HAI|", quote="|"
            ),
            "city",
            ["province", "pop"],
        )
      • Result

        data: 123
        city: nj
        province: js
        pop: 800
    • Example 5: The input field name differs from the name of the corresponding column in the table that is used for mapping. Match the cty field in the log against the city column in the table and return the value of the province column for the matching row.

      • Raw log

        data: 123
        cty: nj
      • Transformation rule

        e_table_map(
            tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"),
            [("cty", "city")],
            "province",
        )
      • Result

        data: 123
        cty: nj
        province: js
    • Example 6: The input field is different from the corresponding field in the table that is used for mapping. Map data and rename the output field.

      • Raw log

        data: 123
        cty: nj
      • Transformation rule

        e_table_map(
            tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"),
            [("cty", "city")],
            [("province", "pro")],
        )
      • Result

        data: 123
        cty: nj
        pro: js
    • Example 7: Map the values of multiple fields to a row in a table.

      • Raw log

        data: 123
        city: nj
        pop: 800
      • Transformation rule

        e_table_map(
            tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"),
            ["city", "pop"],
            "province",
        )
      • Result

        data: 123
        city: nj
        pop: 800
        province: js
    • Example 8: Map the values of multiple fields to a row in a table. The input fields are different from the corresponding fields in the table that is used for mapping.

      • Raw log

        data: 123
        cty: nj
        pp: 800
      • Transformation rule

        e_table_map(
            tab_parse_csv("city,pop,province\nnj,800,js\nsh,2000,sh"),
            [("cty", "city"), ("pp", "pop")],
            "province",
        )
      • Result

        data: 123
        cty: nj
        pp: 800
        province: js
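Examples 3 and 4 rely on the sep and quote arguments of tab_parse_csv. As a rough illustration of what a separator and a quote character do, the following sketch parses the same strings with Python's standard csv module. It assumes that tab_parse_csv follows ordinary CSV conventions. parse_table is a hypothetical helper and is not part of the transformation syntax.

```python
import csv
import io


def parse_table(text, sep=",", quote='"'):
    """Parse CSV text into a list of row dicts keyed by the header line."""
    reader = csv.DictReader(io.StringIO(text), delimiter=sep, quotechar=quote)
    return [dict(row) for row in reader]


# Custom separator, as in Example 3.
rows = parse_table("city#pop#province\nnj#800#js\nsh#2000#sh", sep="#")
print(rows[0])

# Custom quote character, as in Example 4: the comma inside |SHANG,HAI|
# stays part of the value instead of starting a new column.
rows = parse_table(
    "city,pop,province\n|nj|,|800|,|js|\n|shang hai|,2000,|SHANG,HAI|", quote="|"
)
print(rows[1])
```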
  • References

    This function can be used together with other functions. For more information, see Use the e_table_map function to enrich HTTP response status codes.
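The matching and renaming rules shown in Examples 5 through 8 can be approximated in plain Python as follows. This is a simplified sketch under stated assumptions, not the service's implementation: table_map and _pairs are hypothetical names, the table is a list of dicts, the mode parameter is not emulated (output fields are always written), and the asterisk (*) rule is left out.

```python
def _pairs(spec):
    """Normalise "a", ["a", "b"] or [("a", "x")] into a list of (left, right) pairs."""
    items = [spec] if isinstance(spec, str) else list(spec)
    return [(i, i) if isinstance(i, str) else tuple(i) for i in items]


def table_map(event, table, field, output_fields, missing=None):
    """Copy columns of the first matching row into a copy of the log."""
    keys = _pairs(field)             # (log field, table column)
    outputs = _pairs(output_fields)  # (table column, output field)
    out = dict(event)
    # A log that lacks any input field is returned unchanged.
    if any(log_field not in event for log_field, _ in keys):
        return out
    for row in table:
        if all(str(event[log_field]) == row[column] for log_field, column in keys):
            for column, target in outputs:
                out[target] = row[column]
            return out
    # No row matched: fall back to `missing`, if given.
    if missing is not None:
        defaults = [missing] * len(outputs) if isinstance(missing, str) else list(missing)
        for (_, target), default in zip(outputs, defaults):
            out[target] = default
    return out


table = [
    {"city": "nj", "pop": "800", "province": "js"},
    {"city": "sh", "pop": "2000", "province": "sh"},
]
# Mirrors Example 6: input field cty maps to column city, output province is renamed to pro.
print(table_map({"data": "123", "cty": "nj"}, table, [("cty", "city")], [("province", "pro")]))
# Mirrors Example 8: two renamed input fields must both match the same row.
print(table_map({"data": "123", "cty": "nj", "pp": "800"}, table,
                [("cty", "city"), ("pp", "pop")], "province"))
```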

e_tablestore_map

The e_tablestore_map function enriches a raw log by using a data table in Tablestore as the dimension table.

  • Syntax

    e_tablestore_map(
        fields,
        endpoint,
        ak_id,
        ak_secret,
        instance_name,
        table_names,
        output_fields=None,
        output_table_name=None,
        encoding="utf8",
        mode="fill-auto",
    )
  • Parameters

    Parameter

    Type

    Required

    Description

    fields

    String, number, list, or tuple list

    Yes

    The raw log fields that are used to map data between the raw log and the data table. The function maps multiple raw log fields to primary keys in the data table one by one. Examples:

    • If the data table has the primary key a and the raw log contains the field a, you can use fields="a".

    • If the data table has the primary keys a, b, and c and the raw log contains the fields a, b, and c, you can use fields=["a", "b", "c"].

    • If the data table has the primary keys a, b, and c and the raw log contains the fields a1, b1, and c1, you can use fields=[("a1", "a"), ("b1", "b"), ("c1", "c")].

    endpoint

    String

    Yes

    The endpoint of the Tablestore instance in which the data table is created. For more information, see Endpoints.

    Note

    You can use the virtual private cloud (VPC) endpoint or public endpoint of the Tablestore instance. A VPC endpoint is used for access within the same region, and a public endpoint is used for access over the Internet regardless of regions.

    ak_id

    String

    Yes

    The AccessKey ID of the account that has permissions to access the Tablestore instance. For more information, see Create an AccessKey pair.

    If you use a RAM user, make sure that the RAM user is granted the access permissions, such as AliyunOTSReadOnlyAccess. For more information, see Grant permissions to a RAM user.

    ak_secret

    String

    Yes

    The AccessKey secret of the account that has permissions to access the Tablestore instance. For more information, see Create an AccessKey pair.

    instance_name

    String

    Yes

    The name of the Tablestore instance.

    table_names

    String, string list, or tuple list

    Yes

    The name of the data table. If the data table uses a secondary index, set this parameter to the name of the index. For more information about the secondary index feature, see Create a secondary index.

    For example, if the index1 secondary index is created for the data table, set this parameter to "index1".

    output_fields

    List

    No

    The output fields. You can specify the names of primary key columns or attribute columns. Example: ["province", "pop"]. If you do not configure this parameter, all columns of the row that is matched based on the input fields are returned.

    Note

    If you specify multiple data tables in table_names, the function returns only the data from the data table that is first used for matching.

    output_table_name

    String

    No

    The name of the data table in which the returned data is stored. Default value: None, which indicates that the output fields do not contain the table name. If you set this parameter to a string, the output fields include the table name.

    For example, the data table named test is used, and the transformation rule includes output_fields=["province", "pop"], output_table_name="table_name". If the province and pop columns in the test data table are matched, the output fields are province: xxx, pop: xxx, table_name: test.

    encoding

    String

    No

    The encoding method of the HTTPS request parameters. Default value: utf8.

    mode

    String

    No

    The overwrite mode of fields. Default value: fill-auto. For more information, see Field extraction check and overwrite modes.

  • Response

    A log that contains new fields is returned.

  • Examples

    The following examples use the table_name_test data table shown here.

    city (primary key)  pop (primary key)  cid  province  region
    bj                  300                1    bj        huabei
    nj                  800                2    js        huadong
    sh                  200                3    sh        huadong

    • Example 1: Find a row in the data table based on the city and pop fields and return the values of the province and cid columns for the row.

      • Raw log

        city:sh
        name:maki
        pop:200
      • Transformation rule

        e_tablestore_map(
            ["city","pop"],
            "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com",
            "LTA3****",
            "VIH9****",
            "d00s0dxa****",
            "table_name_test",
            output_fields=["province","cid"])
      • Result

        city:sh
        name:maki
        pop:200
        cid:3
        province:sh
    • Example 2: Map the city1 and pop1 fields in the raw log to the city and pop primary keys in the data table, find a row in the data table based on the fields, and return the values of all columns for the row.

      • Raw log

        city1:sh
        name:maki
        pop1:200
      • Transformation rule

        e_tablestore_map(
            [("city1","city"), ("pop1", "pop")],
            "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com",
            "LTA3****",
            "VIH9****",
            "d00s0dxa****",
            "table_name_test")
      • Result

        city:sh
        name:maki
        pop:200
        cid:3
        province:sh
        region:huadong
    • Example 3: Find a row in the data table based on the city and pop fields and return the values of all columns for the row. Set output_table_name to "table_name". In the returned result, you can view the name of the data table in which the returned data is stored.

      • Raw log

        city:sh
        name:maki
        pop:200
      • Transformation rule

        e_tablestore_map(
            ["city","pop"],
            "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com",
            "LTA3****",
            "VIH9****",
            "d00s0dxa****",
            "table_name_test",
            output_table_name="table_name"
        )
      • Result

        city:sh
        name:maki
        pop:200
        cid:3
        province:sh
        region:huadong
        table_name:table_name_test
    • Example 4: Find a row in the table_name_test, table_name_test1, and table_name_test2 data tables based on the city and pop fields, and return the values of all columns for the row. In the returned result, you can view only the data in the table_name_test data table that is first used for matching.

      • Raw log

        city:sh
        name:maki
        pop:200
      • Transformation rule

        e_tablestore_map(
            ["city","pop"],
            "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com",
            "LTA3****",
            "VIH9****",
            "d00s0dxa****",
            ["table_name_test","table_name_test1","table_name_test2"],
            output_table_name="table_name"
        )
      • Result

        city:sh
        name:maki
        pop:200
        cid:3
        province:sh
        region:huadong
        table_name:table_name_test
    • Example 5: Find a row in the data table based on the pk1 and pk2 primary keys for the index1 secondary index, and return the value of the definedcol2 predefined column for the row. The predefined column is specified for the index1 secondary index.

      • Data table (index1)

        pk1 (primary key)  pk2 (primary key)  definedcol2 (predefined column)  definedcol3 (predefined column)
        pk1_1              pk2_1              definedcol2_1                    definedcol3_1
        pk1_2              pk2_2              definedcol2_2                    definedcol3_2

      • Raw log

        pk1:pk1_1
        pk2:pk2_1
      • Transformation rule

        e_tablestore_map(
            ["pk1","pk2"],
            "https://d00s0dxa****.cn-hangzhou.ots.aliyuncs.com",
            "LTA3****",
            "VIH9****",
            "d00s0dxa****",
            "index1",
            output_fields= ["definedcol2"],
            output_table_name="table_name",
        )
      • Result

        pk1:pk1_1
        pk2:pk2_1
        definedcol2:definedcol2_1
        table_name:index1
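The way the fields, table_names, output_fields, and output_table_name parameters interact can be sketched with an in-memory stand-in for Tablestore. Nothing here calls the Tablestore service: tablestore_map and normalize_fields are hypothetical helpers, tables are plain lists of dicts, the mode parameter is not emulated, and the sketch assumes that "first used for matching" means the first listed table in which a row matches.

```python
def normalize_fields(fields):
    """Turn "a", ["a", "b"] or [("a1", "a")] into (log field, primary key) pairs."""
    items = [fields] if isinstance(fields, str) else list(fields)
    return [i if isinstance(i, tuple) else (i, i) for i in items]


def tablestore_map(event, fields, tables, table_names,
                   output_fields=None, output_table_name=None):
    """`tables` maps a table name to a list of row dicts (an in-memory stand-in)."""
    pairs = normalize_fields(fields)
    names = [table_names] if isinstance(table_names, str) else list(table_names)
    out = dict(event)
    for name in names:
        for row in tables.get(name, []):
            if all(log_field in event and str(event[log_field]) == str(row[pk])
                   for log_field, pk in pairs):
                # Without output_fields, every column of the matched row is returned.
                columns = output_fields if output_fields is not None else list(row)
                for column in columns:
                    out[column] = row[column]
                if output_table_name is not None:
                    out[output_table_name] = name
                return out  # only the first table that yields a match is used
    return out


tables = {
    "table_name_test": [
        {"city": "bj", "pop": "300", "cid": "1", "province": "bj", "region": "huabei"},
        {"city": "nj", "pop": "800", "cid": "2", "province": "js", "region": "huadong"},
        {"city": "sh", "pop": "200", "cid": "3", "province": "sh", "region": "huadong"},
    ],
    "table_name_test1": [],
}
log = {"city": "sh", "name": "maki", "pop": "200"}
print(tablestore_map(log, ["city", "pop"], tables, "table_name_test",
                     output_fields=["province", "cid"]))
print(tablestore_map(log, ["city", "pop"], tables,
                     ["table_name_test", "table_name_test1"],
                     output_table_name="table_name"))
```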

e_redis_map

The e_redis_map function enriches a raw log by using a data table in ApsaraDB for Redis as the dimension table.

  • Syntax

    e_redis_map(field, output_field, host, port=6379, db=0, username=None,
                password=None, encoding="utf-8", max_retries=5, mode="fill-auto")
  • Parameters

    Parameter

    Type

    Required

    Description

    field

    String

    Yes

    The raw log field that is used to map data between the raw log and the data table. If the raw log does not contain the field, no operations are performed on the log.

    output_field

    String

    Yes

    The output field.

    host

    String

    Yes

    The endpoint of the ApsaraDB for Redis database.

    username

    String

    No

    The username of the account that you want to use to connect to the ApsaraDB for Redis database. This parameter is empty by default, which indicates that authentication is not performed.

    password

    String

    No

    The password of the account that you want to use to connect to the ApsaraDB for Redis database. This parameter is empty by default, which indicates that authentication is not performed.

    port

    Integer

    No

    The port of the ApsaraDB for Redis database. Default value: 6379.

    db

    Integer

    No

    The index of the ApsaraDB for Redis database. Default value: 0.

    encoding

    String

    No

    The encoding method of data in the ApsaraDB for Redis database. Default value: utf-8.

    max_retries

    Integer

    No

    The maximum number of retries allowed when a request to connect to the ApsaraDB for Redis database fails. Default value: 5.

    If the connection request fails after the maximum number of retries, the function skips the current log in the transformation process. Subsequent transformation is not affected.

    The interval between retries doubles after each retry. The intervals range from 1s to 120s.

    mode

    String

    No

    The overwrite mode of fields. Default value: fill-auto. For more information, see Field extraction check and overwrite modes.

  • Response

    A log that contains a new field is returned.

  • Examples

    The following examples use the data table in ApsaraDB for Redis shown here.

    Important

    Only values of the string type are supported.

    Key    Value
    i1001  { "name": "Orange", "price": 10 }
    i1002  { "name": "Apple", "price": 12 }
    i1003  { "name": "Mango", "price": 16 }

    • Example 1: Find a value in the data table based on the item field and return the value. The username and password of the account that is used to connect to the ApsaraDB for Redis database are not specified in the transformation rule.

      • Raw log

        item: i1002
        count: 7
      • Transformation rule

        e_redis_map("item", "detail", host="r-bp1olrdor8353v4s.redis.rds.aliyuncs.com")
      • Result

        item: i1002
        count: 7
        detail: {
           "name": "Apple",
           "price": 12
          }
    • Example 2: Find a value in the data table based on the item field and return the value. The username and password of the account that is used to connect to the ApsaraDB for Redis database are specified in the transformation rule.

      • Raw log

        item: i1003
        count: 7
      • Transformation rule

        e_redis_map("item", "detail", host="r-bp1olrdor8353v4s****.redis.rds.aliyuncs.com", username="r-bp****", password="***")
      • Result

        item: i1003
        count: 7
        detail: {
           "name": "Mango",
           "price": 16
          }
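The retry behaviour described for max_retries can be illustrated with a small Python sketch. It does not use a Redis client: fetch is a stand-in callable for a key lookup, and retry_delays and redis_map are hypothetical names. The sketch assumes that the first wait is 1s, that waits are capped at 120s, and that "skipping" a log means returning it unchanged.

```python
import time


def retry_delays(max_retries=5, first=1, cap=120):
    """Waiting times in seconds: each doubles the previous one, capped at `cap`."""
    delays, delay = [], first
    for _ in range(max_retries):
        delays.append(delay)
        delay = min(delay * 2, cap)
    return delays


def redis_map(event, field, output_field, fetch, max_retries=5, sleep=time.sleep):
    """`fetch(key)` stands in for a Redis lookup and may raise ConnectionError."""
    if field not in event:
        return event  # no input field: the log is left untouched
    delays = retry_delays(max_retries)
    for attempt in range(max_retries + 1):
        try:
            value = fetch(event[field])
            break
        except ConnectionError:
            if attempt == max_retries:
                return event  # give up on this log; later logs are still processed
            sleep(delays[attempt])
    out = dict(event)
    if value is not None:
        out[output_field] = value
    return out


store = {"i1002": '{ "name": "Apple", "price": 12 }'}
print(retry_delays(5))   # [1, 2, 4, 8, 16]
print(retry_delays(9))   # [1, 2, 4, 8, 16, 32, 64, 120, 120]
print(redis_map({"item": "i1002", "count": "7"}, "item", "detail", store.get))
```

Passing sleep as a parameter keeps the sketch testable without real waiting. The returned detail value is still a string, which matches the note that only string values are supported.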