This topic describes how to use Realtime Compute to obtain the latest transaction records.

Background

The order transaction table of an e-commerce merchant may contain operation records about the same order at different times. For example, the order transaction table records a user's operations such as modifying the quantity of the product to be ordered and returning the product. The merchant only wants to obtain valid transaction records. The following section describes how to use Realtime Compute to obtain the latest transaction records.

Procedure

  • Syntax
    SELECT col1, col2, col3
    FROM (
      SELECT col1, col2, col3
        ROW_NUMBER() OVER ([PARTITION BY col1[, col2..]]
        ORDER BY col1 [asc|desc][, col2 [asc|desc]...]) AS rownum
      FROM table_name)
    WHERE rownum <= N [AND conditions]
    Parameter Description
    ROW_NUMBER() The OVER window function used to compute the number of a row. The value starts from 1.
    PARTITION BY col1[, col2..] The columns by which you want to partition the table. This parameter is optional.
    ORDER BY col1 [asc|desc][, col2 [asc|desc]...] The columns by which you want to sort your data. You can specify different orders for multiple columns.
  • Examples
    • Test data
      id (BIGINT) TIME (VARCHAR) VALUE (BIGINT)
      1 1517898096 5
      1 1517887296 44
      1 1517872896 32
      2 1517872896 10
    • Test statements
      create table source_table (
        id      BIGINT,
        `TIME`  VARCHAR,
        `VALUE` BIGINT
      )with(
        type='datahub',
        endPoint='yourEndpointURL',
        project='yourProjectName',
        topic='yourTableName',
        accessId='yourAccessId',
        accessKey='yourAccessSecret'
      );
      CREATE TABLE result_table (
           id   BIGINT,
        `TIME`  VARCHAR,
        `VALUE` BIGINT
      ) WITH (
          type= 'rds',
          url = 'yourRDSDatabaseURL',
          userName = 'yourDatabaseName',
          password = 'yourDatabasePassword',
          tableName = 'yourTableName'
      );
      INSERT INTO result_table
      SELECT id,`TIME`,`VALUE`
      FROM (
          SELECT *,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY `TIME` desc) AS rownume
          FROM 
          source_table
      ) 
      WHERE rownume <= 1                
      ;
    • Test results
      id (BIGINT) TIME (VARCHAR) VALUE (BIGINT)
      1 1517898096 5
      2 1517872896 10
  • Key points
    SELECT id,`TIME`,`VALUE`
    FROM (
        SELECT *,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY `TIME` desc) AS rownume
        FROM 
        source_table
    ) 
    WHERE rownume <= 1 

    Some orders in the order table have multiple records generated at different times. To obtain only the last generated record for each order, use row_number() OVER (PARTITION BY id ORDER BY TIME DESC) to group records by order ID and sort the records in each group in descending order of time. The computing result lists the sequential numbers (which are continuous and unique in a group) of sorted records in each group. The first record in each group is the last generated record of the corresponding order.