This topic describes how to use the data demo to generate test data that simulates a real environment. The generated data is used in subsequent data analytics.

Prerequisites

  • A Table Store instance is created in the China (Beijing) region, and the instance name and endpoint are obtained. To obtain the endpoint, log on to the Table Store console and click the instance on the Overview or All Instances page. If you access the instance from a different region, we recommend that you use the public endpoint.
  • The AccessKey ID and AccessKey secret of your Alibaba Cloud account are obtained. You can log on to the Alibaba Cloud console with your Alibaba Cloud account and view the AccessKey ID and AccessKey secret on the Security Management page.
    Note The AccessKey ID and AccessKey secret of your Alibaba Cloud account are the credentials for accessing Alibaba Cloud APIs. Keep them safe.

Procedure

  1. Download the data demo package.
    Download the demo package for your operating system. This topic uses the Windows 7 64-bit operating system as an example.
  2. Configure the data demo.
    Decompress the data demo package and edit the app.conf file in the conf directory.

    The following is an example of the content in the app.conf file:

    endpoint = "https://workshop-bj-001.cn-beijing.ots.aliyuncs.com"
    instanceName = "workshop-bj-001"
    accessKeyId = "LTAIF24u7g******"
    accessKeySecret = "CcwFeF3sWTPy0wsKULMw34Px******"
    usercount = "200"
    daysCount = "7"
    You must modify the following parameters:
    • endpoint: the endpoint used to access the Table Store instance. We recommend that you use the public endpoint.
    • instanceName: the name of the Table Store instance.
    • accessKeyId and accessKeySecret: the AccessKey used to access Alibaba Cloud APIs.
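
Before you start the demo, it can help to confirm that the edited file contains every required parameter. The following Python sketch is not part of the data demo. It assumes that app.conf consists only of key = "value" lines, as in the example above, and the helper names parse_conf and missing_params are hypothetical.

```python
import re

# Parameters that this topic says must be modified in app.conf.
REQUIRED = ("endpoint", "instanceName", "accessKeyId", "accessKeySecret")

# Assumption: every setting is a single line of the form key = "value".
LINE = re.compile(r'^\s*(\w+)\s*=\s*"(.*)"\s*$')


def parse_conf(text):
    """Return a dict of the key = "value" pairs found in text."""
    settings = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def missing_params(settings):
    """Return the required parameters that are absent or empty."""
    return [name for name in REQUIRED if not settings.get(name)]


# In-memory sample with one empty and one absent credential (illustrative only).
SAMPLE = '''
endpoint = "https://workshop-bj-001.cn-beijing.ots.aliyuncs.com"
instanceName = "workshop-bj-001"
accessKeyId = ""
usercount = "200"
'''

print(missing_params(parse_conf(SAMPLE)))
```

To check a real file, read its content into a string and pass that string to parse_conf in place of SAMPLE.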
  3. Start the data demo to prepare test data.
    1. Start the Windows command-line tool, go to the directory where the data demo resides, and run the following command to view the usage of demo-related commands:
      workshop_demo.exe -h
      This command lists the following demo-related commands:
      * prepare: Prepare test data. This command creates the data tables and generates one week of behavior logs for the number of users specified in the app.conf file.
      * raw ${userid} ${date} ${Top log count}: Query a specified number of logs of the specified user on the specified date.
      * new/day_active/month_active/day_pv/month_pv: Query data of the specified type in the result table. new: new users. day_active: daily active users. month_active: monthly active users. day_pv: daily page views (PVs). month_pv: monthly PVs.
    2. Run the following command to generate test data:
      workshop_demo.exe prepare

    In this process, the data demo automatically creates two tables in Table Store. The following tables describe the columns in the created tables.

    • Raw log table: user_trace_log
      Column                   Data type  Description
      md5                      STRING     The MD5 value of the user ID. This column is a primary key column.
      uid                      STRING     The user ID. This column is a primary key column.
      ts                       BIGINT     The timestamp when the user performed the operation. This column is a primary key column.
      ip                       STRING     The IP address of the client that sends the request.
      status                   BIGINT     The status code returned by the server.
      bytes                    BIGINT     The number of bytes returned to the client.
      device                   STRING     The model of the terminal used by the user.
      system                   STRING     The version of the operating system used by the user, in the format of iosxxx or androidxxx.
      customize_event          STRING     The custom event. Valid events: logon, exit, purchase, registration, click, background running, user switch, and browse.
      use_time                 BIGINT     The duration of a single use of the application. This column is available when the custom event is exit, background running, or user switch.
      customize_event_content  STRING     The content of the custom event.
    • Analysis result table: analysis_result
      Column  Data type  Description
      metric  STRING     The metric of data. Valid values: new, day_active, month_active, day_pv, month_pv. This column is a primary key column.
      ds      STRING     The data timestamp, in the format of yyyy-mm-dd or yyyy-mm. This column is a primary key column.
      num     BIGINT     The value of the specified metric.
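
To make the primary key layout of the user_trace_log table concrete, the following Python sketch builds the three primary key values for one log record. It is an illustration only. This topic does not document how the demo encodes these values, so the hexadecimal MD5 digest of the UTF-8 user ID and the millisecond timestamp are assumptions, and primary_key is a hypothetical helper.

```python
import hashlib
from datetime import datetime, timezone


def primary_key(uid, when):
    """Return (md5, uid, ts) in the primary key order of user_trace_log."""
    # Assumption: md5 is the hex digest of the UTF-8 encoded user ID.
    digest = hashlib.md5(uid.encode("utf-8")).hexdigest()
    # Assumption: ts is a Unix timestamp in milliseconds.
    ts = int(when.timestamp() * 1000)
    return digest, uid, ts


# Example record: user 00010 at midnight UTC on June 15, 2019.
key = primary_key("00010", datetime(2019, 6, 15, tzinfo=timezone.utc))
print(key)
```

The same user ID always yields the same md5 value, so all logs of one user share the leading primary key column and differ only in ts.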
  4. Verify the data.
    • Query detailed logs of the specified user.
      Run the following command to query a specified number of logs of the specified user on the specified date. Set the date to the date on which the logs were generated.
      workshop_demo.exe raw ${userid} ${date} ${Top log count}

      In the preceding command, ${userid} indicates the ID of the user, ${date} indicates the date when the logs were generated, and ${Top log count} indicates the number of logs to query. For example, if the table was created on June 15, 2019, you can run the workshop_demo.exe raw 00010 "2019-06-15" 20 command to query 20 logs for the user whose ID is 00010.

      Note Table Store is schema-free, so you do not need to predefine attribute columns. Because different events in the customize_event column carry different content, the demo writes both the custom event and its content in each data record.
    • Query data in the analysis result table.

      You can run the workshop_demo.exe day_active command to query the number of daily active users.
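
The verification commands in this step can also be assembled in a script, for example to run several queries in a row. The following Python sketch only builds the argument lists and does not run the demo. The helper names raw_command and metric_command are hypothetical, and the date check assumes the yyyy-mm-dd format used in the example above.

```python
from datetime import datetime

DEMO = "workshop_demo.exe"
# Metric commands listed by workshop_demo.exe -h in this topic.
METRICS = ("new", "day_active", "month_active", "day_pv", "month_pv")


def raw_command(userid, date, top):
    """Build the argument list for a raw log query."""
    # Raises ValueError if date cannot be parsed as a year-month-day date.
    datetime.strptime(date, "%Y-%m-%d")
    if top <= 0:
        raise ValueError("top must be a positive log count")
    return [DEMO, "raw", userid, date, str(top)]


def metric_command(metric):
    """Build the argument list for a query on the analysis result table."""
    if metric not in METRICS:
        raise ValueError("unknown metric: " + metric)
    return [DEMO, metric]


print(raw_command("00010", "2019-06-15", 20))
print(metric_command("day_active"))
```

Each returned list can be passed to subprocess.run from the directory where the data demo resides.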