All Products
Search
Document Center

AnalyticDB:Path analysis functions

Last Updated:Aug 11, 2025

AnalyticDB for MySQL provides path analysis functions, such as SEQUENCE_MATCH and SEQUENCE_COUNT, to efficiently process path computation tasks. These functions match and count event sequences. You can then use the analysis results to detect and analyze complex event sequences. This topic describes how to use these path analysis functions.

AnalyticDB for MySQL supports the following path analysis functions:

  • SEQUENCE_MATCH(): checks whether a sequence of events occurs in a specific order.

  • SEQUENCE_COUNT(): counts the number of occurrences of a specific sequence of events.

Prerequisites

An AnalyticDB for MySQL cluster of V3.2.1.0 or later is created.

Note

To view and update the minor version of an AnalyticDB for MySQL cluster, log on to the AnalyticDB for MySQL console and go to the Configuration Information section of the Cluster Information page.

Test dataset

The following example shows how to prepare a dataset for testing the path analysis functions:

  1. Create an AnalyticDB for MySQL test table.

    CREATE TABLE sequence_test(
      `uid` INT COMMENT 'user id',
      `ts` DATETIME COMMENT 'date time', 
      `number` INT NULL COMMENT 'number' 
    );
  2. Insert data into the test table.

    INSERT INTO sequence_test  VALUES (1, '2022-11-02 10:41:00', 1);
    INSERT INTO sequence_test  VALUES (2, '2022-11-02 13:28:02', 2); 
    INSERT INTO sequence_test  VALUES (3, '2022-11-02 16:15:01', 1);                                      
    INSERT INTO sequence_test  VALUES (4, '2022-11-02 19:05:04', 2); 
    INSERT INTO sequence_test  VALUES (5, '2022-11-02 20:08:44', 3); 

SEQUENCE_MATCH

The SEQUENCE_MATCH() function checks whether a sequence of events occurs in a specific order. This function matches a sequence of events against the specified regular expression and returns a Boolean value. A value of 1 indicates that the sequence of events matches the specified regular expression. A value of 0 indicates that the sequence of events does not match the specified regular expression.

Syntax

SEQUENCE_MATCH(pattern, timestamp, cond1, cond2, ...)

Parameters

Parameter

Description

pattern

The regular expression used to match the order of event occurrences.

The supported syntax for pattern is '(?n)(?n)', '(?n).*(?n)', and '(?n)(?t operator value)(?n)'. The parameters are described as follows:

  • (?n): Matches a conditional event. It represents the nth condition in the list of cond constraints. The value of n cannot be 0 or exceed the number of cond constraints.

  • .*: Allows any number of other events to occur between two specified events.

  • (?t operator value): Specifies the time difference between two events.

    • t: A fixed reserved word that represents the timestamp field.

    • operator: The operator. Supported operators include >=, >, <, <=, ==, <>, and !=.

    • value: The value of the timestamp field. This parameter supports floating-point, positive, and negative numbers.

      • For DATETIME and TIMESTAMP types, the unit is milliseconds (ms).

      • For the DATE type, the unit is days (d).

      • For the BIGINT type, the unit depends on the time unit of the event.

timestamp

The timestamp field of the event, which indicates when the event occurred. This field supports the BIGINT, DATETIME, TIMESTAMP, and DATE data types.

cond

The constraints for an event. Each constraint is typically a comparison operation on a field.

The data type is BOOLEAN.

You can pass up to 32 cond constraints.

Note

The function first filters for events that meet the cond constraints, and then matches them against the pattern.

Examples

  • This example filters an event sequence for three types of events: number=1, number=2, and number=3. It then checks if an event where number=1 is immediately followed by an event where number=3.

    Sample statement:

    SELECT SEQUENCE_MATCH('(?1)(?3)', ts, number = 1, number = 2, number = 3) FROM sequence_test;

    Sample result:

    +-------------------------------------------------------------------+
    | sequence_match('(?1)(?3)', ts, number = 1, number = 2, number = 3)|
    +-------------------------------------------------------------------+
    |                                0                                  |
    +-------------------------------------------------------------------+
  • This example filters an event sequence for three types of events: number=1, number=2, and number=3. It then checks if an event where number=3 occurs after an event where number=1, with any number of other events allowed in between.

    Sample statement:

    SELECT SEQUENCE_MATCH('(?1).*(?3)', ts, number = 1, number = 2, number = 3) FROM sequence_test;

    Sample result:

    +----------------------------------------------------------------------+
    | sequence_match('(?1).*(?3)', ts, number = 1, number = 2, number = 3) |
    +----------------------------------------------------------------------+
    |                                  1                                   |
    +----------------------------------------------------------------------+
  • This example filters an event sequence for two types of events: number=1 and number=2. It then checks if an event where number=1 is immediately followed by an event where number=2, and the time difference between them is 3,600 ms or greater.

    Sample statement:

    SELECT SEQUENCE_MATCH('(?1)(?t>=3600)(?2)', ts, number = 1, number = 2) FROM sequence_test;

    Sample result:

    +------------------------------------------------------------------+
    | sequence_match('(?1)(?t>=3600)(?2)', ts, number = 1, number = 2) |
    +------------------------------------------------------------------+
    |                                1                                 |
    +------------------------------------------------------------------+

SEQUENCE_COUNT

The SEQUENCE_COUNT() function counts the number of occurrences of a specific sequence of events.

Syntax

SEQUENCE_COUNT(pattern, timestamp, cond1, cond2, ...)

Parameters

Parameter

Description

pattern

The regular expression used to match the order of event occurrences.

The supported syntax for pattern is '(?n)(?n)', '(?n).*(?n)', and '(?n)(?t operator value)(?n)'. The parameters are described as follows:

  • (?n): Matches a conditional event. It represents the nth condition in the list of cond constraints. The value of n cannot be 0 or exceed the number of cond constraints.

  • .*: Allows any number of other events to occur between two specified events.

  • (?t operator value): Specifies the time difference between two events.

    • t: A fixed reserved word that represents the timestamp field.

    • operator: The operator. Supported operators include >=, >, <, <=, ==, <>, and !=.

    • value: The value of the timestamp field. This parameter supports floating-point, positive, and negative numbers.

      • For DATETIME and TIMESTAMP types, the unit is milliseconds (ms).

      • For the DATE type, the unit is days (d).

      • For the BIGINT type, the unit depends on the time unit of the event.

timestamp

The timestamp field of the event, which indicates when the event occurred. This field supports the BIGINT, DATETIME, TIMESTAMP, and DATE data types.

cond

The constraints for an event. Each constraint is typically a comparison operation on a field.

The data type is BOOLEAN.

You can pass up to 32 cond constraints.

Note

The function first filters for events that meet the cond constraints, and then matches them against the pattern.

Examples

  • This example filters an event sequence for three types of events: number=1, number=2, and number=3. It then counts the number of times an event where number=1 is immediately followed by an event where number=3.

    Sample statement:

    SELECT SEQUENCE_COUNT('(?1)(?3)', ts, number = 1, number = 2, number = 3) FROM sequence_test;

    Sample result:

    +--------------------------------------------------------------------+
    | sequence_count('(?1)(?3)', ts, number = 1, number = 2, number = 3) |
    +--------------------------------------------------------------------+
    |                                0                                   |
    +--------------------------------------------------------------------+
  • This example filters an event sequence for three types of events: number=1, number=2, and number=3. It then counts the number of times an event where number=3 occurs after an event where number=1, with any number of other events allowed in between.

    Sample statement:

    SELECT SEQUENCE_COUNT('(?1).*(?3)', ts, number = 1, number = 2, number = 3) FROM sequence_test;

    Sample result:

    +----------------------------------------------------------------------+
    | sequence_count('(?1).*(?3)', ts, number = 1, number = 2, number = 3) |
    +----------------------------------------------------------------------+
    |                                  1                                   |
    +----------------------------------------------------------------------+
  • This example filters an event sequence for two types of events: number=1 and number=2. It then counts the number of times an event where number=1 is immediately followed by an event where number=2, and the time difference between them is 3,600 ms or greater.

    Sample statement:

    SELECT SEQUENCE_COUNT('(?1)(?t>=3600)(?2)', ts, number = 1, number = 2) FROM sequence_test;

    Sample result:

    +------------------------------------------------------------------+
    | sequence_count('(?1)(?t>=3600)(?2)', ts, number = 1, number = 2) |
    +------------------------------------------------------------------+
    |                              2                                   |
    +------------------------------------------------------------------+