All Products
Search
Document Center

MaxCompute:Use MaxCompute to perform funnel analysis in e-commerce scenarios

Last Updated:Jun 07, 2024

This topic describes how to use MaxCompute to perform offline computing and connect to Quick BI to perform funnel analysis in e-commerce scenarios.

Background information

The funnel model helps analyze the product operations based on the conversion rates of data at different stages. A conversion funnel determines the stage at which an issue occurs based on data conversion rates at different stages. This helps you continuously optimize the product. The e-commerce funnel model shows the conversion of user behavior from browsing a commodity page to purchasing a commodity. This topic describes how to perform funnel analysis on user behavior from browsing a commodity page, clicking the commodity, to purchasing the commodity. This topic also describes how to display data in a funnel chart.

Prerequisites

Procedure

  1. Use Simple Log Service to collect log data.

    For more information about how to collect log data by using Simple Log Service, see Data collection overview. In this example, test data is used. To download test data, click TestData.

  2. Use MaxCompute to build an offline computing data model.

    1. On the DataStudio page of the DataWorks console, execute the following statement to create an operational data store (ODS) table named ods_user_trace_data:

      -- The dt column is used to partition the table by time in the unit of days. 
      CREATE TABLE IF NOT EXISTS ods_user_trace_data
      (
          md5                     STRING COMMENT 'First eight characters of the MD5 value of the user ID',
          uid                     STRING COMMENT 'User ID',
          ts                      BIGINT COMMENT 'Timestamp of a user operation',
          ip                      STRING COMMENT 'IP address',
          status                  BIGINT COMMENT 'Status code returned by the server',
          bytes                   BIGINT COMMENT 'Number of bytes sent to the client',
          device_brand            STRING COMMENT 'Device brand',
          system_type             STRING COMMENT 'OS type: Android, IOS, ipad, or Windows_phone',
          customize_event         STRING COMMENT 'Custom event: logon, logoff, purchase, registration, click, background running, user switching, browsing, or comment',
          use_time                BIGINT COMMENT 'The duration for which the app is used at a time. This field is required when the custom event is logoff, background running, or user switching.',
          customize_event_content STRING COMMENT 'The content of the custom event. This field is required when the custom event is browsing or comment.'
      ) 
      PARTITIONED BY
      (
          dt STRING  
      );
      Note
      • The fields in the preceding table creation statement are constructed based on the test data. For more information about how to create a table in the DataWorks console, see Create an ODPS SQL node.

    2. Execute the following statement to add a partition to the ods_user_trace_data table:

      ALTER TABLE ods_user_trace_data ADD PARTITION (dt=${bdp.system.bizdate});
    3. Migrate the collected log data to MaxCompute.

      For more information about how to migrate the collected log data to MaxCompute, see Migrate log data to MaxCompute.

    4. On the DataStudio page of the DataWorks console, execute the following statement to create a data warehouse detail (DWD) table named dw_user_trace_data:

      -- The dt column is used to partition the table by time in the unit of days. 
      CREATE TABLE IF NOT EXISTS dw_user_trace_data
      (
          uid                     STRING COMMENT 'User ID',
          device_brand            STRING COMMENT 'Device brand',
          system_type             STRING COMMENT 'OS type: Android, IOS, ipad, or Windows_phone',
          customize_event         STRING COMMENT 'Custom event: logon, logoff, purchase, registration, click, background running, user switching, browsing, or comment',
          use_time                BIGINT COMMENT 'The duration for which the app is used at a time. This field is required when the custom event is logoff, background running, or user switching.',
          customize_event_content STRING COMMENT 'The content of the custom event. This field is required when the custom event is browsing or comment.'
      ) 
      PARTITIONED BY
      (
          dt STRING 
      );
      Note

      For more information about DWD, see Data warehouse detail layer.

    5. Execute the following statement to insert data into the dw_user_trace_data table:

      INSERT INTO dw_user_trace_data PARTITION (dt = '${bdp.system.bizdate}')
      SELECT  uid
              ,device_brand
              ,system_type
              ,customize_event
              ,use_time
              ,customize_event_content
      FROM    ods_user_trace_data
      WHERE   dt = '${bdp.system.bizdate}'
      ;
    6. On the DataStudio page of the DataWorks console, execute the following statement to create an application data service (ADS) table named rpt_user_trace_data:

      -- The dt column is used to partition the table by time in the unit of days. 
      CREATE TABLE IF NOT EXISTS rpt_user_trace_data
      (
          browse      STRING COMMENT 'Page views',
          click       STRING COMMENT 'Clicks',
          purchase    STRING COMMENT 'Purchase quantity',
          browse_rate STRING COMMENT 'Browse conversion rate',
          click_rate  STRING COMMENT 'Click conversion rate'
      ) 
      PARTITIONED BY
      (
          dt STRING 
      );
      Note

      For more information about ADS, see Divide a data warehouse into layers.

    7. Execute the following SQL statements on the DataStudio page to write business code logic:

      INSERT OVERWRITE TABLE rpt_user_trace_data PARTITION (dt=${bdp.system.bizdate})
      SELECT browse AS Page views
            ,click AS Clicks
            ,purchase AS Purchase quantity
            ,concat(round((click/browse)*100,2),'%') AS Click conversion rate
            ,concat(round((purchase/click)*100,2),'%') AS Purchase conversion rate 
      FROM
      (SELECT dt,count(1) browse FROM dw_user_trace_data WHERE customize_event='browse' 
       AND dt = ${bdp.system.bizdate} GROUP BY dt) a
      LEFT JOIN
      (SELECT dt,count(1) click FROM dw_user_trace_data WHERE customize_event='click' 
       AND dt = ${bdp.system.bizdate} GROUP BY dt) b
      ON a.dt=b.dt
      LEFT JOIN
      (SELECT dt,count(1) purchase FROM dw_user_trace_data WHERE customize_event='purchase' 
      AND dt = ${bdp.system.bizdate} GROUP BY dt)c 
      ON  a.dt=c.dt 
      ;
      Note

      The user behavior starts from browsing a commodity page, clicking a commodity, and then to purchasing the commodity. The conversion rate at each stage is the proportion of users who go from one page to the next page. For example, the click conversion rate is calculated based on the following formula: Number of users who click the commodity/Number of users who browse the page.

    8. Query the offline data calculation results in the rpt_user_trace_data table.

      • Sample statement:

        SELECT * FROM rpt_user_trace_data WHERE dt='20231126';
      • Returned results:

        +------------+------------+------------+-------------+------------+------------+
        | browse     | click      | purchase   | browse_rate | click_rate | dt         |
        +------------+------------+------------+-------------+------------+------------+
        | 35         | 16         | 2          | 45.71%      | 12.5%      | 20231126   |
        +------------+------------+------------+-------------+------------+------------+
  3. Display data on a dashboard.

    You can use Quick BI to create a dashboard that displays the table data for website user profile analysis. For more information, see Add a cloud data source MaxCompute and Funnel Chart.

    Important
    • Before you create a dataset in the Quick BI console, you must check whether the three-layer model is enabled for your MaxCompute project.

      You can run the setproject; command in MaxCompute to query the value of the odps.namespace.schema parameter in the project properties.

      • If the value of this parameter is true, the three-layer model is enabled for the project. In this case, you must use custom SQL statements to create a dataset. Quick BI does not support the three-layer model of MaxCompute. If you drag fields to create a dataset, the SQL statements automatically generated in Quick BI fail to be executed and an error message indicating that the table cannot be found appears.

      • If the value of this parameter is false, the three-layer model is not enabled for the project. You can create a dataset by using custom SQL statements or by dragging fields.

    • To generate the following funnel chart, you must select a dataset whose table object is ods_user_trace_data and drag the customize_event field to the Funnel Layering/Dimension field.

    image.png