Log Service allows you to upload a CSV file from your computer to Log Service by using an SDK. This way, the CSV file is hosted on Log Service and can be associated with a Logstore of Log Service. This topic describes how to perform a JOIN query on a CSV file that is hosted on Log Service and data in a Logstore of Log Service.

Prerequisites

  • Logs are collected. For more information, see Log collection methods.
  • Indexes are created. For more information, see Configure indexes.
  • A CSV file is created.
  • Log Service SDK for Python is installed. For more information, see Install Log Service SDK for Python.

    aliyun-log-python-sdk V0.7.3 and later are supported. You can use the pip install aliyun-log-python-sdk -U command to upgrade the SDK.

Limits

  • Only one CSV file can be associated at a time.
  • You can associate Log Service with a CSV file that contains no more than 50 MB of data. The CSV file is uploaded to Log Service after it is compressed by using the SDK. The size of the file after compression must be less than 9.9 MB.

Sample data

The Logstore stores the logon operations of a user, and the CSV file records the basic information of the user, such as the gender and age. After you associate the Logstore with the CSV file, you can analyze the metrics for user properties.

  • Logstore
    userid:100001
    action:login
    __time__:1637737306
  • CSV fileCSV file

Procedure

  1. Use Log Service SDK for Python to create an external store.
    For more information about Log Service SDK for Python, see Overview.
    from aliyun.log import *
    
    endpoint='cn-shanghai.log.aliyuncs.com'
    
    accessKeyId='test-project'
    accessKey='TAI****YDw'
    project='lr****VM'
    ext_logstore='user_meta'
    csv_file='./user.csv'
    
    
    client = LogClient(endpoint, accessKeyId, accessKey)
    
    res = client.create_external_store(project,
        ExternalStoreCsvConfig(ext_logstore, csv_file,
            [
                {"name" : "userid", "type" : "bigint"},
                {"name" : "nick", "type" : "varchar"},
                {"name" : "gender", "type" : "varchar"},
                {"name" : "province", "type" : "varchar"},
                null
            ]))
    
    res.log_print()
    Parameter Description
    endpoint The Log Service endpoint. For more information, see Endpoints.
    accessKeyId The AccessKey ID of your Alibaba Cloud account. For more information, see AccessKey pair.
    Warning We recommend that you use the AccessKey pair of a RAM user to call API operations. This reduces the risks caused by the leak of an AccessKey pair.
    accessKey The AccessKey secret of your Alibaba Cloud account. For more information, see AccessKey pair.
    project The project to which the Logstore belongs.
    ext_logstore The name of the external store, which is the same as the name of the virtual external table. The name must meet the following requirements:
    • The name can contain only lowercase letters, digits, hyphens (-), and underscores (_).
    • The name must start and end with a lowercase letter or a digit.
    • The name must be 3 to 63 characters in length.
    csv_file The path and name of the CSV file.
    Table schema The properties of the virtual external table, including the column names and data types. The following table schema is an example. You can replace the table schema based on your business requirements.
    [
         {"name" : "userid", "type" : "bigint"},
         {"name" : "nick", "type" : "varchar"},
         {"name" : "gender", "type" : "varchar"},
         {"name" : "province", "type" : "varchar"},
         null
    ]
  2. Log on to the Log Service console.
  3. In the Projects section, click the name of the project that you want to view.
  4. Choose Log Storage > Logstores. On the Logstores tab, click the Logstore that you want to view.
  5. Execute the following statement to check whether the external store is created.
    In the following statement, user_meta specifies the name of the external store. Replace the value with the name that you specify when you create the external store.
    * | SELECT * FROM user_meta
    If the content of the CSV file is returned, the external store is created. Associated CSV file
  6. Execute the following statement to perform a JOIN query on the Logstore and the CSV file.
    The Logstore is associated with the CSV file based on the userid field in the Logstore and the userid field in the CSV file. website_log is the name of the Logstore. user_meta is the name of the external store that you create. You can configure the parameters based on your business requirements.
    * | SELECT * FROM website_log JOIN user_meta ON website_log.userid = user_meta.userid
    Associated query