Event time series analysis functions are used to obtain the time series of events based on the time that the events occurred and the timestamps of the events.

Perform the following steps to prepare test data (the sample table) in Data Lake Analytics (DLA):

  1. Upload the sample table to Object Storage Service (OSS). For more information, see Upload files.

  2. Create a schema in DLA.

    ​CREATE SCHEMA event_schema with DBPROPERTIES(
    catalog='oss',
    location = 'oss://bucket-name/event/'
    );​
  3. Create an OSS external table in DLA.

    ​ CREATE EXTERNAL TABLE IF NOT EXISTS sample (
         user_id bigint COMMENT '',
         event_time bigint COMMENT '',
         event_id int COMMENT '',
         event_name varchar COMMENT '',
         event_attr varchar COMMENT '',
         event_date date COMMENT ''
     )
     ROW FORMAT DELIMITED
         FIELDS TERMINATED BY '|'
     STORED AS TEXTFILE
     LOCATION 'oss://bucket-name/event/'
     TBLPROPERTIES (
         'auto.create.location' = 'true'
     );​

EVENT_SEQUENCE

​event_sequence(
  boolean desc,
  bigint limit,
  bigint timeInMS
)​
  • Description: returns an array that contains the events listed in ascending or descending order of the time these events occurred.

  • Parameters:

    • boolean desc: specifies whether the events contained in the array are listed in descending order of the time these events occurred. Default value: false. The default value indicates that the events contained in the array are listed in ascending order of the time these events occurred.

    • bigint limit: the size of the returned array that contains the events listed in ascending or descending order of the time these events occurred. Default value: 100. Maximum value: 1000.

    • bigint timeInMS: the time and timestamps when events occurred. The timestamp is of the LONG type and the time is accurate to milliseconds.

  • Return value type: ARRAY[BIGINT].

  • Examples:

    ​  SELECT event_name, event_sequence(event_time) AS a 
      FROM sample
      GROUP BY 1 
      ORDER BY 1;
      +----------|-------------------------------+
      |event_name| event_sequence                |
      +----------|-------------------------------+
      | Add a product to the shopping cart| [1483254745351]               |
      |  Add a product to favorites | [1483220795802]               |
      |  Browse a product | [1483233554546, 1483240004108]|
      |  Generate an order | [1483222204303] |
      |    Log on   | [1483220124362, 1483233099062, 1483266703322, 1483279486394]        |
      |  Unsubscribe from a product  |  [1483267998231]              |
    SELECT event_name, event_sequence(true, 3, event_time) AS a 
      FROM sample
      GROUP BY 1 
      ORDER BY 1;
      +-----------------+-----------------------------------------------+
      | event_name      | a                                             |
      +-----------------+-----------------------------------------------+
      | Add a product to the shopping cart      | [1483254745351]                               |
      | Add a product to favorites        | [1483220795802]                               |
      | Browse a product        | [1483240004108, 1483233554546]                |
      | Generate an order        | [1483222204303]                               |
      | Log on            | [1483279486394, 1483266703322, 1483233099062] |
      | Unsubscribe from a product        | [1483267998231]                               |

EVENT_INTERVAL

​event_interval(
  boolean desc,
  bigint limit, 
  bigint timeInMS,
  varchar timeZoneKey,
  varchar timeUnit
)​
  • Description: returns an array that contains time intervals between the events that occurred.

  • Parameters:

    • boolean desc: specifies whether the events contained in the array are listed in descending order of the time these events occurred. Default value: false. The default value indicates that the events contained in the array are listed in ascending order of the time these events occurred.

    • bigint limit: the size of the returned array that contains the events listed in ascending or descending order of the time these events occurred. Default value: 100. Maximum value: 1000.

    • bigint timeInMS: the time and timestamps when events occurred. The timestamp is of the LONG type and the time is accurate to milliseconds.

    • varchar timeZoneKey: the name of the specified time zone. Default value: the current time zone of the system (the region where the DLA service resides).

    • varchar timeUnit: the unit of time intervals.

      Valid values: YEAR, QUARTER, MONTH, WEEK, DAY, HOUR, MINUTE, SECOND, and MILLISECOND.

  • Return value type: ARRAY[BIGINT].

  • Examples:

    SELECT event_name,
             event_interval(event_time, 'second') as a
      FROM sample
      GROUP BY 1
      ORDER BY 1;
      +-----------------+-----------------------+
      | event_name      | a                     |
      +-----------------+-----------------------+
      | Add a product to the shopping cart      | []                    |
      | Add a product to favorites        | []                    |
      | Browse a product        | [6449]                |
      | Generate an order        | []                    |
      | Log on            | [12974, 33604, 12783] |
      | Unsubscribe from a product        | []                    |
    SELECT event_name,
             event_interval(true, 3, event_time, current_timezone(), 'millisecond') as a
      FROM sample
      GROUP BY 1
      ORDER BY 1;
      +-----------------+-----------------------------------+
      | event_name      | a                                 |
      +-----------------+-----------------------------------+
      | Add a product to the shopping cart      | []                                |
      | Add a product to favorites        | []                                |
      | Browse a product        | [-6449562]                        |
      | Generate an order        | []                                |
      | Log on            | [-12783072, -33604260, -12974700] |
      | Unsubscribe from a product        | []                                |
    ​  SELECT event_name,
             event_interval(event_time, 'UTC', 'day') as a
      FROM sample
      GROUP BY 1
      ORDER BY 1;
      +-----------------+-----------+
      | event_name      | a         |
      +-----------------+-----------+
      | Add a product to the shopping cart      | []        |
      | Add a product to favorites        | []        |
      | Browse a product        | [0]       |
      | Generate an order        | []        |
      | Log on            | [0, 0, 0] |
      | Unsubscribe from a product        | []        |