Simple Log Service: Obtain data from an ApsaraDB RDS for MySQL database for data enrichment

Last Updated:Feb 19, 2024

The data transformation feature of Simple Log Service allows you to obtain data from ApsaraDB RDS for MySQL databases and enrich the data based on data transformation rules.

Background information

When you analyze data, you may need to obtain data from different storage sources. For example, the data of user operations and user behavior is stored in Simple Log Service, and the data of user properties and registration is stored in an ApsaraDB RDS for MySQL database. In this case, you can use the data transformation feature to obtain data from the database and store the data in a Logstore.

You can use the res_rds_mysql function to obtain data from an ApsaraDB RDS for MySQL database and then use the e_table_map or e_search_table_map function to enrich the data.

Note
  • The instance on which your ApsaraDB RDS for MySQL database is created must reside in the same region as your Simple Log Service project. Otherwise, you cannot obtain data from the database.

  • You can access and obtain data from an ApsaraDB RDS for MySQL database by using an internal endpoint of the instance on which the database is created. For more information, see Obtain data from an ApsaraDB RDS for MySQL database over the internal network.

Use the e_table_map function to enrich data

In this example, the e_table_map and res_rds_mysql functions are used to enrich data.

  • Raw data

    • Sample data records in a table of an ApsaraDB RDS for MySQL database

      province   city        population   cid   eid
      Shanghai   Shanghai    2000         1     00001
      Tianjin    Tianjin     800          1     00002
      Beijing    Beijing     4000         1     00003
      Henan      Zhengzhou   3000         2     00004
      Jiangsu    Nanjing     1500         2     00005

    • Sample logs in a Simple Log Service Logstore

      time:"1566379109"
      data:"test-one"
      cid:"1"
      eid:"00001"
      
      time:"1566379111"
      data:"test_second"
      cid:"1"
      eid:"12345"
      
      time:"1566379111"
      data:"test_three"
      cid:"2"
      eid:"12345"
      
      time:"1566379113"
      data:"test_four"
      cid:"2"
      eid:"12345"
  • Transformation rule

    You can configure a transformation rule to match the cid field in the Logstore against the cid field in the table. A log matches a data record when both have the same cid value. For each matched log, the system takes the province, city, and population fields and their values from the matched data record and appends them to the log to generate a new log.

    Note
    • If multiple data records in the table match the field value of a log, the e_table_map function uses only the first matched data record. In this example, multiple data records in the table have a cid value of 1, so only the first of them is used.

    • The e_table_map function supports only single-row matching. If you want to implement multi-row matching and combine the matched data into a new log, you can use the e_search_table_map function. For more information, see Use the e_search_table_map function to enrich data.

    e_table_map(res_rds_mysql(address="rds-host", username="mysql-username",password="xxx",database="xxx",table="xx",refresh_interval=60),"cid",["province","city","population"])

    For more information about how to configure an ApsaraDB RDS for MySQL database in the res_rds_mysql function, see res_rds_mysql.

  • Transformation result

    time:"1566379109"
    data:"test-one"
    cid:"1"
    eid:"00001"
    province:"Shanghai"
    city:"Shanghai"
    population:"2000"
    
    time:"1566379111"
    data:"test_second"
    cid:"1"
    eid:"12345"
    province:"Shanghai"
    city:"Shanghai"
    population:"2000"
    
    time:"1566379111"
    data:"test_three"
    cid:"2"
    eid:"12345"
    province:"Henan"
    city:"Zhengzhou"
    population:"3000"
    
    time:"1566379113"
    data:"test_four"
    cid:"2"
    eid:"12345"
    province:"Henan"
    city:"Zhengzhou"
    population:"3000"
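
The first-row matching described above can be reproduced outside Simple Log Service with a short plain-Python sketch. This is only a simulation of the lookup semantics under the assumptions of this example, not the implementation of e_table_map; the helper name enrich_first_match is hypothetical, and the table rows and logs are copied from the sample data.

```python
# Plain-Python simulation of first-row lookup enrichment.
# Assumption: this mimics the behavior shown in the example, it is not Simple Log Service code.
TABLE = [
    {"province": "Shanghai", "city": "Shanghai", "population": "2000", "cid": "1", "eid": "00001"},
    {"province": "Tianjin", "city": "Tianjin", "population": "800", "cid": "1", "eid": "00002"},
    {"province": "Beijing", "city": "Beijing", "population": "4000", "cid": "1", "eid": "00003"},
    {"province": "Henan", "city": "Zhengzhou", "population": "3000", "cid": "2", "eid": "00004"},
    {"province": "Jiangsu", "city": "Nanjing", "population": "1500", "cid": "2", "eid": "00005"},
]

LOGS = [
    {"time": "1566379109", "data": "test-one", "cid": "1", "eid": "00001"},
    {"time": "1566379111", "data": "test_second", "cid": "1", "eid": "12345"},
    {"time": "1566379111", "data": "test_three", "cid": "2", "eid": "12345"},
    {"time": "1566379113", "data": "test_four", "cid": "2", "eid": "12345"},
]


def enrich_first_match(log, table, key, output_fields):
    """Copy output_fields from the first table row whose key value equals the log's."""
    enriched = dict(log)
    if key not in log:
        return enriched  # nothing to look up, keep the log unchanged
    for row in table:
        if row.get(key) == log[key]:
            for field in output_fields:
                enriched[field] = row[field]
            break  # single-row matching: later rows with the same key are ignored
    return enriched


results = [
    enrich_first_match(log, TABLE, "cid", ["province", "city", "population"])
    for log in LOGS
]
for result in results:
    print(result)
```

Because the loop stops at the first hit, every log with a cid of 1 receives the Shanghai row and every log with a cid of 2 receives the Henan row, which agrees with the transformation result above.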

Use the e_search_table_map function to enrich data

In this example, the e_search_table_map and res_rds_mysql functions are used to enrich data.

  • Raw data

    • Sample data records in a table of an ApsaraDB RDS for MySQL database

      content         name     age
      city~=n*        aliyun   10
      province~=su$   Maki     18
      city:nanjing    vicky    20

    • Sample log in a Simple Log Service Logstore

      time:1563436326
      data:123
      city:nanjing
      province:jiangsu
  • Transformation rule

    You can configure a transformation rule to match the values of the content field in the table against the log in the Logstore. Each value of the content field is a search expression in key-value form. The key is a field name in the log, and the value is a condition on the value of that field, such as a regular expression. For each data record whose expression matches the log, the system appends the specified fields and field values from the table to the log to generate a new log.

    Note
    • For more information about how to configure an ApsaraDB RDS for MySQL database in the res_rds_mysql function, see res_rds_mysql.

    • The content field is included in the table. When the system matches the values of the field against the log, various matching modes are supported, such as regular expression match, exact match, and fuzzy match. For more information about matching rules, see e_search.

    • Single-row matching

      The system returns the transformation result as soon as one data record in the table matches the log. Only that data record is used.

      e_search_table_map(res_rds_mysql(address="rds-host", username="mysql-username",password="xxx",database="xxx",table="xx",refresh_interval=60),"content","name")
    • Multi-row matching

      The system traverses all data records in the table and adds all matched data to the specified field.

      Note

      The following parameter settings are required:

      • multi_match=True: enables multi-row matching.

      • multi_join=",": concatenates multiple matched values with commas (,).

      e_search_table_map(res_rds_mysql(address="rds-host", username="mysql-username",password="xxx",database="xxx",table="xx",refresh_interval=60),"content","name",multi_match=True,multi_join=",")
  • Transformation result

    • Single-row matching

      In this example, the system checks whether the value of the city field in the log matches the n* expression. If the match is successful, the system returns the name field and field value for the matched data record in the table to generate a new log.

      time:1563436326
      data:123
      city:nanjing
      province:jiangsu
      name:aliyun
    • Multi-row matching

      In this example, the system checks three conditions: whether the value of the city field in the log matches the n* expression, whether the value of the province field in the log matches the su$ expression, and whether the value of the city field in the log includes nanjing. In the content field, ~= precedes a regular expression, and a colon (:) checks whether the field value includes the string that follows. All three data records match, so the system returns the name field with the three matched values from the table, separated by commas (,), and concatenates it with the log to generate a new log.

      time:1563436326
      data:123
      city:nanjing
      province:jiangsu
      name:aliyun,Maki,vicky
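
The difference between single-row and multi-row matching can be sketched in plain Python. This is a simplified simulation, not the implementation of e_search_table_map: the helper names matches and search_enrich are hypothetical, only the two expression forms used in this example (~= for a regular expression and : for a substring test) are handled, and the regular expression is assumed to match anywhere in the field value.

```python
import re

# Rows copied from the sample table; each content value is a search expression.
TABLE = [
    {"content": "city~=n*", "name": "aliyun", "age": "10"},
    {"content": "province~=su$", "name": "Maki", "age": "18"},
    {"content": "city:nanjing", "name": "vicky", "age": "20"},
]


def matches(expression, log):
    """Evaluate a simplified search expression against a log (assumed subset of the syntax)."""
    if "~=" in expression:
        field, pattern = expression.split("~=", 1)
        # Assumption: the regular expression may match anywhere in the value.
        return re.search(pattern, log.get(field, "")) is not None
    field, text = expression.split(":", 1)
    return text in log.get(field, "")  # substring test


def search_enrich(log, table, expr_field, output_field, multi_match=False, multi_join=","):
    """Append output_field from the matching rows: first hit only, or all hits joined."""
    hits = []
    for row in table:
        if matches(row[expr_field], log):
            hits.append(row[output_field])
            if not multi_match:
                break  # single-row matching stops at the first matching row
    enriched = dict(log)
    if hits:
        enriched[output_field] = multi_join.join(hits)
    return enriched


log = {"time": "1563436326", "data": "123", "city": "nanjing", "province": "jiangsu"}
single = search_enrich(log, TABLE, "content", "name")
multi = search_enrich(log, TABLE, "content", "name", multi_match=True, multi_join=",")
print(single["name"])
print(multi["name"])
```

With multi_match left at False the loop stops at the first matching row, and with multi_match=True every matching row contributes a value that is joined with the multi_join separator.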