All Products
Search
Document Center

Analytic functions

Last Updated: Jun 18, 2021

Overview

Analytic functions, also called window functions in some database services, are similar to aggregate functions. The similarity lies in that the calculations are performed based on a set of rows. The difference is that an aggregate function returns only one row for each set of rows, but an analytic function returns one or more rows for each set of rows. Each of the rows returned by the analytic function is the result of window-based logic computing. Analytic functions can optimize queries that contain self joins in a significant way.

Features

ApsaraDB for OceanBase supports the following analytic functions:

  • SUM

  • MIN

  • MAX

  • COUNT

  • AVG

  • GROUP_CONCAT, a function that is specific to ApsaraDB for OceanBase

  • ROW_NUMBER

  • RANK

  • DENSE_RANK

  • PERCENT_RANK

  • CUME_DIST

  • FIRST_VALUE

  • LAST_VALUE

  • NTH_VALUE

  • NTILE

  • LEAD

  • LAG

  • RATIO_TO_REPORT

  • STDDEV

  • VARIANCE

  • STDDEV_SAMP

  • STDDEV_POP

  • LISTAGG

ApsaraDB for OceanBase does not support the following window functions that are supported by Oracle:

  • CORR

  • COVAR_POP

  • COVAR_SAMP

  • MEDIAN

  • PERCENTILE_CONT

  • PERCENTILE_DISC

  • VAR_POP

  • VAR_SAMP

  • REGR_(Linear Regression) Functions

Syntax of analytic functions

A window is also called a frame. ApsaraDB for OceanBase supports the ROWS and RANGE options for the FRAME clauses. The ROWS option defines frames based on the offsets that are the differences in physical row numbers from the current row number. The RANGE option defines frames based on the offsets that are the differences in logical row values from the current row value.

Syntax of analytic_function: analytic_function([ arguments ]) OVER (analytic_clause)

Syntax of analytic_clause: [ query_partition_clause ] [ order_by_clause [ windowing_clause ] ]

Syntax of query_partition_clause: PARTITION BY { expr[, expr ]... | ( expr[, expr ]... ) }

Syntax of order_by_clause: ORDER [ SIBLINGS ] BY{ expr | position | c_alias } [ ASC | DESC ] [ NULLS FIRST | NULLS LAST ] [, { expr | position | c_alias } [ ASC | DESC ][ NULLS FIRST | NULLS LAST ]]...

Syntax of windowing_clause

Syntax of windowing_clause:

{ ROWS | RANGE } { BETWEEN { UNBOUNDED PRECEDING | CURRENT ROW | value_expr { PRECEDING | FOLLOWING } } AND{ UNBOUNDED FOLLOWING | CURRENT ROW | value_expr { PRECEDING | FOLLOWING } } | { UNBOUNDED PRECEDING | CURRENT ROW| value_expr PRECEDING}}

The syntax of windowing_clause is compatible with that of Oracle.

Description

The following content describes the syntax and features of analytic functions, and also compares analytic functions used in ApsaraDB for OceanBase and Oracle.

sum/min/max/count/avg

Syntax of the AVG function: AVG([ DISTINCT | ALL ] expr) [ OVER(analytic_clause) ]

Syntax of the SUM function: SUM([ DISTINCT | ALL ] expr) [ OVER (analytic_clause) ]

Syntax of the MIN function: MIN([ DISTINCT | ALL ] expr) [ OVER (analytic_clause) ]

Syntax of the MAX function: MAX([ DISTINCT | ALL ] expr) [ OVER (analytic_clause) ]

Syntax of the COUNT function: COUNT({ * | [ DISTINCT | ALL ] expr }) [ OVER (analytic_clause) ]

The preceding analytic functions have corresponding aggregate functions. The SUM function returns the sum of the expr values. The MIN function returns the minimum expr value. The MAX function returns the maximum expr value. The COUNT function returns the number of rows queried for a window. The AVG function returns the average expr value.

The COUNT function does not return null. If you specify expr, the function returns the number of non-null expr values. If you use an asterisk (*), the COUNT(*) function returns the number of all the rows.

create table exployees(last_name char(10), salary decimal, job_id char(32));
insert into exployees values('jim', 2000, 'cleaner');
insert into exployees values('mike', 12000, 'engineering');
insert into exployees values('lily', 13000, 'engineering');
insert into exployees values('tom', 11000, 'engineering');
OceanBase(TEST@TEST)>select last_name, sum(salary) over(partition by job_id) totol_s, min(salary) over(partition by job_id) min_s, max(salary) over(partition by job_id) max_s, count(*) over(partition by job_id) count_s from exployees;
+------------+---------+-------+-------+---------+
| LAST_NAME  | TOTOL_S | MIN_S | MAX_S | COUNT_S |
+------------+---------+-------+-------+---------+
| jim        |    2000 |  2000 |  2000 |       1 |
| mike       |   36000 | 11000 | 13000 |       3 |
| lily       |   36000 | 11000 | 13000 |       3 |
| tom        |   36000 | 11000 | 13000 |       3 |
+------------+---------+-------+-------+---------+

The syntax of the preceding five functions in ApsaraDB for OceanBase is not fully compatible with that in Oracle.

If you use the DISTINCT keyword in the SUM, COUNT, or AVG function, you can specify only query_partition_clause in analytic_clause. order_by_clause and windowing_clause are not allowed. ApsaraDB for OceanBase does not support this option. If you use DISTINCT in ApsaraDB for OceanBase, you can also specify order_by_clause and windowing_clause.

ApsaraDB for OceanBase does not support NULLS FIRST or NULLS LAST in order_by_clause.

For more information about the support for analytic_clause, see the "Syntax of windowing_clause" section.

nth_value/first_value/last_value

Syntax of the NTH_VALUE function: NTH_VALUE (measure_expr, n) [ FROM { FIRST | LAST } ][ { RESPECT | IGNORE } NULLS ] OVER (analytic_clause)

The NTH_VALUE function, as its name suggests, returns the Nth value in the window defined by analytic_clause. [ FROM { FIRST | LAST } ] specifies whether to begin the calculation from the first or last row of the window. By default, FROM FIRST is used. [ {RESPECT | IGNORE} ] NULLS specifies whether to ignore null during the calculation. The N value must be a positive integer. If the N value is null, the function returns an error. If the N value is greater than the number of all the rows in the window, the function returns null.

Syntax of the FIRST_VALUE or LAST_VALUE function: FIRST_VALUE or LAST_VALUE { (expr) [ {RESPECT | IGNORE} NULLS ] | (expr [ {RESPECT | IGNORE} NULLS ])} OVER (analytic_clause)

The FIRST_VALUE function, as its name suggests, returns the first value in a set of values. The LAST_VALUE function returns the last value in a set of values.

create table exployees(last_name char(10), salary decimal, job_id char(32));
insert into exployees values('jim', 2000, 'cleaner');
insert into exployees values('mike', 12000, 'engineering');
insert into exployees values('lily', 13000, 'engineering');
insert into exployees values('tom', 11000, 'engineering');
OceanBase(TEST@TEST)>select last_name, first_value(salary) over(partition by job_id) totol_s, last_value(salary) over(partition by job_id) min_s, max(salary) over(partition by job_id) max_s from exployees;
+------------+---------+-------+-------+
| LAST_NAME  | TOTOL_S | MIN_S | MAX_S |
+------------+---------+-------+-------+
| jim        |    2000 |  2000 |  2000 |
| mike       |   12000 | 11000 | 13000 |
| lily       |   12000 | 11000 | 13000 |
| tom        |   12000 | 11000 | 13000 |
+------------+---------+-------+-------+

The syntax of the preceding three functions is compatible with that of Oracle. No corresponding aggregate functions are available for the three functions.

lead/lag

The LAG or LEAD function allows you to query the value of the same field from a row that appears before or after the current row at the N offset. Self joins can also perform these operations. However, the LAG and LEAD analytic functions enable more efficient queries than self joins. The following syntax is used:

lag/lead { ( value_expr [, offset [, default]]) [ { RESPECT | IGNORE } NULLS ] | ( value_expr [ { RESPECT | IGNORE } NULLS ] [, offset [, default]] )} OVER ([ query_partition_clause ] order_by_clause)

value_expr specifies the field whose values are to be compared, and offset specifies the offset from the current row that is defined by value_expr. The default value of the default parameter is null. This indicates that if you do not specify the default parameter for the LAG or LEAD function, a null value is returned. Assume that you specify parameters for the LAG function and the current row is the fourth row in the table. If you set the offset parameter to 6, the serial number of the row that you want to query is -2. This goes beyond the scope of the table. In this case, the function returns the default value.

[ { RESPECT | IGNORE } NULLS ] specifies whether to ignore null during the calculation. The default value is RESPECT. This indicates that the null value is not ignored.

Take note of this point: The LEAD or LAG function must be followed by order_by_clause because this clause specifies the order in which the values in a column are sorted. The function can calculate the positions of the current row and the previous or subsequent rows only after the values are sorted. query_partition_clause is optional. If you do not specify this clause, the operation of the function applies to the global data.

create table exployees(last_name char(10), salary decimal, job_id char(32));
insert into exployees values('jim', 2000, 'cleaner');
insert into exployees values('mike', 12000, 'engineering');
insert into exployees values('lily', 13000, 'engineering');
insert into exployees values('tom', 11000, 'engineering');
OceanBase(TEST@TEST)>select last_name, lead(salary) over(order by salary) lead, lag(salary) over(order by salary) lag from exployees;
+------------+-------+-------+
| LAST_NAME  | LEAD  | LAG   |
+------------+-------+-------+
| jim        | 11000 |  NULL |
| tom        | 12000 |  2000 |
| mike       | 13000 | 11000 |
| lily       |  NULL | 12000 |
+------------+-------+-------+

The LAG and LEAD functions in ApsaraDB for OceanBase are compatible with those in Oracle. Take note of the following point: In Oracle and ApsaraDB for OceanBase, you can specify whether to ignore null in the results returned by the LEAD or LAG function. No corresponding aggregate function is available for the LEAD or LAG function.

stddev/variance/stddev_samp/stddev_pop

Syntax of the VARIANCE function: VARIANCE([ DISTINCT | ALL ] expr) [ OVER (analytic_clause) ]. The VARIANCE function returns the variance of the expr value. You can set expr to a value of a numeric data type or non-numeric data type that can be converted to a numeric data type. The function returns the variance of the same data type as the argument.

Syntax of the STDDEV function: STDDEV([ DISTINCT | ALL ] expr) [ OVER (analytic_clause) ]. The STDDEV function returns the standard deviation of the expr value. The STDDEV and VARIANCE functions have the same syntax and parameter types.

Syntax of the STDDEV_SAMP function: STDDEV_SAMP(expr) [ OVER (analytic_clause) ]

Syntax of the STDDEV_POP function: STDDEV_POP(expr) [ OVER (analytic_clause) ]

create table exployees(last_name char(10), salary decimal, job_id char(32));
insert into exployees values('jim', 2000, 'cleaner');
insert into exployees values('mike', 12000, 'engineering');
insert into exployees values('lily', 13000, 'engineering');
insert into exployees values('tom', 11000, 'engineering');
OceanBase(TEST@TEST)>select last_name, stddev(salary) over(order by salary) std, variance(salary) over(order by salary) var, stddev_pop(salary) over() std_pop, stddev_samp(salary) over() from exployees;
+------------+-------------------------------------------+-----------------------------------------+-------------------------------------------+-------------------------------------------+
| LAST_NAME  | STD                                       | VAR                                     | STD_POP                                   | STDDEV_SAMP(SALARY)OVER()                 |
+------------+-------------------------------------------+-----------------------------------------+-------------------------------------------+-------------------------------------------+
| jim        |                                         0 |                                       0 | 4387.482193696061030203194153708154780438 | 5066.228051190221210556248068679881679227 |
| tom        | 6363.961030678927719607599258943641353564 |                                40500000 | 4387.482193696061030203194153708154780438 | 5066.228051190221210556248068679881679227 |
| mike       | 5507.570547286102020605808770872073470916 | 30333333.333333333333333333333333333333 | 4387.482193696061030203194153708154780438 | 5066.228051190221210556248068679881679227 |
| lily       | 5066.228051190221210556248068679881679227 | 25666666.666666666666666666666666666667 | 4387.482193696061030203194153708154780438 | 5066.228051190221210556248068679881679227 |
+------------+-------------------------------------------+-----------------------------------------+-------------------------------------------+-------------------------------------------+

The syntax of the preceding four functions in ApsaraDB for OceanBase is not fully compatible with that in Oracle.

If you use the DISTINCT keyword in the VARIANCE or STDDEV function, you can specify only query_partition_clause in analytic_clause. order_by_clause and windowing_clause are not allowed. ApsaraDB for OceanBase does not support this option. If you use DISTINCT in ApsaraDB for OceanBase, you can also specify order_by_clause and windowing_clause. For the STDDEV_POP and STDDEV_SAMP functions, you cannot use the DISTINCT or ALL keyword. ApsaraDB for OceanBase does not support this option.

ApsaraDB for OceanBase does not support NULLS FIRST or NULLS LAST in order_by_clause.

For more information about the support for analytic_clause, see the "Syntax of windowing_clause" section.

ntile

Syntax of the NTILE function: NTILE(expr) OVER ([ query_partition_clause ] order_by_clause). The NTILE function divides sorted partition rows into a specified number of groups, and the size of each group is the same if possible. The function assigns each row to a group by giving each row a group number. If the expr value is null, the function returns null.

create table exployees(last_name char(10), salary decimal, job_id char(32));
insert into exployees values('jim', 2000, 'cleaner');
insert into exployees values('mike', 12000, 'engineering');
insert into exployees values('lily', 13000, 'engineering');
insert into exployees values('tom', 11000, 'engineering');
OceanBase(TEST@TEST)>select last_name, ntile(10) over(partition by job_id order by salary) ntl from exployees;
+------------+------+
| LAST_NAME  | NTL  |
+------------+------+
| jim        |    1 |
| tom        |    1 |
| mike       |    2 |
| lily       |    3 |
+------------+------+

The syntax of the NTILE function in ApsaraDB for OceanBase is compatible with that in Oracle. Take note of this point: You must specify order_by_clause for the NTILE analytic function. No corresponding aggregate function is available for the NTILE function.

row_number

Syntax of the ROW_NUMBER function: ROW_NUMBER( ) OVER ([ query_partition_clause ] order_by_clause). query_partition_clause is optional. The ROW_NUMBER function can run as expected only after the values are sorted. This function assigns a unique serial number to each row after the rows are sorted by applying the ORDER BY clause on expr values.

create table exployees(last_name char(10), salary decimal, job_id char(32));
insert into exployees values('jim', 2000, 'cleaner');
insert into exployees values('mike', 12000, 'engineering');
insert into exployees values('lily', 13000, 'engineering');
insert into exployees values('tom', 11000, 'engineering');
OceanBase(TEST@TEST)>select last_name, row_number() over(partition by job_id order by salary) ntl from exployees;
+------------+------+
| LAST_NAME  | NTL  |
+------------+------+
| jim        |    1 |
| tom        |    1 |
| mike       |    2 |
| lily       |    3 |
+------------+------+

The syntax of the ROW_NUMBER function in ApsaraDB for OceanBase is compatible with that in Oracle. Take note of this point: You must specify order_by_clause for the ROW_NUMBER analytic function. No corresponding aggregate function is available for the ROW_NUMBER function.

rank/dense_rank/percent_rank

Syntax of the RANK function: RANK( ) OVER ([ query_partition_clause ] order_by_clause). The RANK function is an analytic function and ranks all the rows for a given column that is specified by order_by_clause. In the following example, all the employees are ranked by salary:

Syntax of the DENSE_RANK function: DENSE_RANK( ) OVER([ query_partition_clause ] order_by_clause). The DENSE_RANK function acts like the RANK function except that DENSE_RANK assigns consecutive ranks.

Syntax of the PERCENT_RANK function: PERCENT_RANK( ) OVER ([ query_partition_clause ] order_by_clause). The PERCENT_RANK function acts like the RANK function except that PERCENT_RANK calculates the rank of a given row as a percentage.

create table exployees(last_name char(10), salary decimal, job_id char(32));
insert into exployees values('jim', 2000, 'cleaner');
insert into exployees values('mike', 12000, 'engineering');
insert into exployees values('lily', 13000, 'engineering');
insert into exployees values('tom', 11000, 'engineering');
OceanBase(TEST@TEST)>select last_name, rank() over(partition by job_id order by salary) rank, dense_rank() over(partition by job_id order by salary) dense_rank, percent_rank() over(partition by job_id order by salary) percent_rank from exployees;
+------------+------+------------+--------------+
| LAST_NAME  | RANK | DENSE_RANK | PERCENT_RANK |
+------------+------+------------+--------------+
| jim        |    1 |          1 |            0 |
| tom        |    1 |          1 |            0 |
| mike       |    2 |          2 |           .5 |
| lily       |    3 |          3 |            1 |
+------------+------+------------+--------------+

In the preceding three functions, the syntax of the RANK function in ApsaraDB for OceanBase is compatible with that in Oracle. Take note of this point: You must specify order_by_clause for the RANK function. Take note of this point: The syntax of the RANK aggregate function differs from that of the RANK analytic function. For more information, see the syntax of the RANK function in the "Aggregate functions" section. The syntax of the DENSE_RANK function in ApsaraDB for OceanBase is compatible with that in Oracle. Take note of this point: You must specify order_by_clause for the DENSE_RANK function. Take note of this point: The syntax of the DENSE_RANK aggregate function differs from that of the DENSE_RANK analytic function. For more information, see the syntax of the DENSE_RANK function in the "Aggregate functions" section. The syntax of the PERCENT_RANK function in ApsaraDB for OceanBase is compatible with that in Oracle. Take note of this point: You must specify order_by_clause for the PERCENT_RANK function. Take note of this point: The syntax of the PERCENT_RANK aggregate function differs from that of the PERCENT_RANK analytic function. For more information, see the syntax of the PERCENT_RANK function in the "Aggregate functions" section.

cume_dist

Syntax of the CUME_DIST function: CUME_DIST() OVER ([ query_partition_clause ] order_by_clause). The CUME_DIST function calculates the cumulative distribution of a value. CUME_DIST returns a value that is greater than 0 and less than or equal to 1. Assume that the values in a specified column are sorted in ascending order. The CUME_DIST analytic function calculates the percentage of each value that is less than or equal to the current value for the specified column. The following example groups the employees by job_id and ranks the salary percentile of each employee in a specified window.

create table exployees(last_name char(10), salary decimal, job_id char(32));
insert into exployees values('jim', 2000, 'cleaner');
insert into exployees values('mike', 12000, 'engineering');
insert into exployees values('lily', 13000, 'engineering');
insert into exployees values('tom', 11000, 'engineering');
OceanBase(TEST@TEST)>select last_name, cume_dist() over(partition by job_id order by salary) rank from exployees;
+------------+-------------------------------------------+
| LAST_NAME  | RANK                                      |
+------------+-------------------------------------------+
| jim        |                                         1 |
| tom        | .3333333333333333333333333333333333333333 |
| mike       | .6666666666666666666666666666666666666667 |
| lily       |                                         1 |
+------------+-------------------------------------------+

The syntax of the CUME_DIST function in ApsaraDB for OceanBase is compatible with that in Oracle. Take note of this point: The syntax of the CUME_DIST function includes query_partition_clause and order_by_clause. Take note of this point: The syntax of the CUME_DIST aggregate function differs from that of the CUME_DIST analytic function. For more information, see the syntax of the CUME_DIST function in the "Aggregate functions" section.

ratio_to_report

Syntax of the RATIO_TO_REPORT function: RATIO_TO_REPORT(expr) OVER ([ query_partition_clause ]). The RATIO_TO_REPORT function calculates the expr value and the ratio of the value to the sum of a set of values. If the expr value is null, the RATIO_TO_REPORT function returns null.

The sum of a set of values is determined by query_partition_clause. If you omit this clause, RATIO_TO_REPORT calculates the ratio of a value to the sum of all the expr values. You cannot use RATIO_TO_REPORT or other analytic functions in expr. However, you can use built-in functions in expr. This is similar to the expr values in aggregate functions. The following example is used to describe how to use RATIO_TO_REPORT to calculate the ratio of each employee's salary to the total salary of all the employees in the company:

create table exployees(last_name char(10), salary decimal, job_id char(32));
insert into exployees values('jim', 2000, 'cleaner');
insert into exployees values('mike', 12000, 'engineering');
insert into exployees values('lily', 13000, 'engineering');
insert into exployees values('tom', 11000, 'engineering');
OceanBase(TEST@TEST)>select last_name, salary, ratio_to_report(salary) over() as salary_ratio from exployees where job_id = 'engineering';
+------------+--------+-------------------------------------------+
| LAST_NAME  | SALARY | SALARY_RATIO                              |
+------------+--------+-------------------------------------------+
| mike       |  12000 | .3333333333333333333333333333333333333333 |
| lily       |  13000 | .3611111111111111111111111111111111111111 |
| tom        |  11000 | .3055555555555555555555555555555555555556 |
+------------+--------+-------------------------------------------+

The syntax of the RATIO_TO_REPORT function in ApsaraDB for OceanBase is compatible with that in Oracle. Take note of this point: You cannot specify order_by_clause for the RATIO_TO_REPORT analytic function.

Execution plans

The WINDOW FUNCTION operator is used to implement analytic functions in ApsaraDB for OceanBase. WINDOW FUNCTION works based on the partitioned and sorted output of the lower-level operator. Therefore, the system assigns a SORT operator if the SORT operator is required.

Example

mysql> explain select row_number() over (partition by table_id order by partition_id) from __all_root_table;
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query Plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ====================================================
|ID|OPERATOR       |NAME            |EST. ROWS|COST|
----------------------------------------------------
|0 |WINDOW FUNCTION|                |1000     |2240|
|1 | SORT          |                |1000     |2049|
|2 |  TABLE SCAN   |__all_root_table|1000     |499 |
====================================================

Outputs & filters:
-------------------------------------
  0 - output([T_WIN_FUN_ROW_NUMBER()]), filter(nil),
      win_expr(T_WIN_FUN_ROW_NUMBER()), partition_by([__all_root_table.table_id]), order_by([__all_root_table.partition_id, ASC]), window_type(RANGE), upper(UNBOUNDED PRECEDING), lower(UNBOUNDED FOLLOWING)
  1 - output([__all_root_table.table_id], [__all_root_table.partition_id]), filter(nil), sort_keys([__all_root_table.table_id, ASC], [__all_root_table.partition_id, ASC])
  2 - output([__all_root_table.table_id], [__all_root_table.partition_id]), filter(nil),
      access([__all_root_table.table_id], [__all_root_table.partition_id]), partitions(p0)

Remove a SORT operator

If the optimizer recognizes that the output of the lower-level operator is sorted, the optimizer does not assign a SORT operator. In this case, (tenant_id, table_id) is the prefix of the primary key in the __all_root_table table.

explain select row_number() over (partition by tenant_id, table_id order by partition_id) from __all_root_table;
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query Plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ====================================================
|ID|OPERATOR       |NAME            |EST. ROWS|COST|
----------------------------------------------------
|0 |WINDOW FUNCTION|                |1000     |690 |
|1 | TABLE SCAN    |__all_root_table|1000     |499 |
====================================================

Outputs & filters:
-------------------------------------
  0 - output([T_WIN_FUN_ROW_NUMBER()]), filter(nil),
      win_expr(T_WIN_FUN_ROW_NUMBER()), partition_by([__all_root_table.tenant_id], [__all_root_table.table_id]), order_by([__all_root_table.partition_id, ASC]), window_type(RANGE), upper(UNBOUNDED PRECEDING), lower(UNBOUNDED FOLLOWING)
  1 - output([__all_root_table.tenant_id], [__all_root_table.table_id], [__all_root_table.partition_id]), filter(nil),
      access([__all_root_table.tenant_id], [__all_root_table.table_id], [__all_root_table.partition_id]), partitions(p0)
 |

Combine calculations

When multiple analytic functions use the same sorted subset for calculation, all the functions use the same operator instead of multiple independent operators. This reduces memory overhead and range checks in an effective way.

In the following example, the calculation of the two analytic functions ROW_NUMBER() and RANK() can use the same sorted output. Therefore, only one SORT operator is assigned.

mysql> explain select row_number() over (partition by table_id order by partition_id), rank() over (partition by table_id, partition_id order by svr_ip) from __all_root_table;
| ====================================================
|ID|OPERATOR       |NAME            |EST. ROWS|COST|
----------------------------------------------------
|0 |WINDOW FUNCTION|                |1000     |3017|
|1 | SORT          |                |1000     |2826|
|2 |  TABLE SCAN   |__all_root_table|1000     |499 |
====================================================

Outputs & filters:
-------------------------------------
  0 - output([T_WIN_FUN_ROW_NUMBER()], [T_WIN_FUN_RANK()]), filter(nil),
      win_expr(T_WIN_FUN_RANK()), partition_by([__all_root_table.table_id], [__all_root_table.partition_id]), order_by([__all_root_table.svr_ip, ASC]), window_type(RANGE), uppe                   r(UNBOUNDED PRECEDING), lower(UNBOUNDED FOLLOWING)
      win_expr(T_WIN_FUN_ROW_NUMBER()), partition_by([__all_root_table.table_id]), order_by([__all_root_table.partition_id, ASC]), window_type(RANGE), upper(UNBOUNDED PRECEDING                   ), lower(UNBOUNDED FOLLOWING)
  1 - output([__all_root_table.table_id], [__all_root_table.partition_id], [__all_root_table.svr_ip]), filter(nil), sort_keys([__all_root_table.table_id, ASC], [__all_root_tabl                   e.partition_id, ASC], [__all_root_table.svr_ip, ASC])
  2 - output([__all_root_table.table_id], [__all_root_table.partition_id], [__all_root_table.svr_ip]), filter(nil),
      access([__all_root_table.table_id], [__all_root_table.partition_id], [__all_root_table.svr_ip]), partitions(p0)

Store data on disks

Analytic functions also support storing intermediate data on disks during execution. This implements the same logic as the SORT operator.