Use window functions to flexibly analyze and process data of specified columns in a window. Syntax, parameters, and examples are provided for each window function supported by the SQL analysis feature.
The SQL analysis feature supports the following window functions.
|
Function |
Description |
|
Calculates the average value of data in a window. |
|
|
Samples random rows of data. If true is returned, the specified row of data is sampled. |
|
|
Calculates the number of rows in a window. |
|
|
Calculates the cumulative distribution of data in a partition. |
|
|
Calculates the rank of a row in a group of rows. The ranks are consecutive. |
|
|
Obtains the calculated result of the first row of data in the window to which the current row belongs. |
|
|
Obtains the calculated result of the Nth row of data that precedes the current row at a given offset in a window. |
|
|
Obtains the calculated result of the last row of data in the window to which the current row belongs. |
|
|
Obtains the calculated result of the Nth row of data that follows the current row at a given offset in a window. |
|
|
Calculates the maximum value in a window. |
|
|
Calculates the median in a window. |
|
|
Calculates the minimum value in a window. |
|
|
Splits rows of data in a partition into N groups of equal size and returns the number of the group to which the current row belongs. The group number ranges from 1 to N. |
|
|
Obtains the calculated result of the Nth row of data in a window to which the current row belongs. |
|
|
Calculates the percentile rank of a row in an ordered group of rows. The function returns a percentage value. |
|
|
Calculates the rank of a row in an ordered group of rows. The ranks may not be consecutive integers. |
|
|
Calculates the sequence number of a row. The row number starts from 1. |
|
|
Returns the population standard deviation of all the input values. This function is also called STDDEV_POP. |
|
|
Returns the sample standard deviation of all the input values. |
|
|
Calculates the sum of data in a window. |
Limits
Before you use window functions, take note of the following limits:
-
Window functions are supported only in
SELECTstatements. -
A window function cannot contain nested window functions or nested aggregate functions.
-
You cannot use window functions together with aggregate functions of the same level.
Syntax
Syntax of window functions:
<function_name>([distinct][<expression> [, ...]]) over (<window_definition>)
<function_name>([distinct][<expression> [, ...]]) over <window_name>
-
function_name: the name of a built-in window function.
-
expression: the format of the window function. The format is subject to the function syntax.
-
windowing_definition: the definition of a window. For information about the syntax of windowing_definition, see windowing_definition.
-
window_name: the name of a window. You can use the
windowkeyword to configure a window and use windowing_definition to specify the name of the window. Syntax of named_window_def:window <window_name> as (<window_definition>)Position of named_window_def in an SQL statement:
select ... from ... [where ...] [group by ...] [having ...] named_window_def [order by ...] [limit ...]
windowing_definition
Syntax
--partition_clause:
[partition by <expression> [, ...]]
--orderby_clause:
[order by <expression> [asc|desc][nulls {first|last}] [, ...]]
[<frame_clause>]
If you use a window function in a SELECT statement, data is partitioned and sorted based on PARTITION BY and ORDER BY in windowing_definition when the window function is executed. If the SELECT statement does not include partition by, only one partition exists and the partition contains all data. If the SELECT statement does not include order by, data in a partition is arranged in a random order, and a data stream is generated. After the data stream is generated, a group of rows is extracted from the data stream based on frame_clause in windowing_definition to create a window for the current row. The window function calculates the data included in the window to which the current row belongs.
-
partition by <expression> [, ...]: optional. This parameter specifies the partition information. If the values of partition key columns are the same for a group of rows, these rows are included in the same window. For more information about the format of PARTITION BY, see Table operations.
-
order by <expression> [asc|desc][nulls {first|last}] [, ...]: optional. This parameter specifies how to sort rows of data in a window.
NoteIf the values of the column that is specified in
order byare the same, the sorting result may not be accurate. To reduce the random ordering of data, make sure that the values of the column that is specified inorder byare unique. -
frame_clause: optional. This parameter is used to determine the data boundaries of a window. For more information about frame_clause, see frame_clause.
frame_clause
Syntax
-- Syntax 1
{ROWS|RANGE|GROUPS} <frame_start> [<frame_exclusion>]
-- Syntax 2
{ROWS|RANGE|GROUPS} between <frame_start> and <frame_end> [<frame_exclusion>]
frame_clause is a closed interval that is used to determine the data boundaries of a window. The data boundaries are determined based on the rows that are specified by frame_start and frame_end.
-
ROWS|RANGE|GROUPS: required. ROWS, RANGE, and GROUPS indicate the types of frame_clause. The implementation rules of frame_start and frame_end vary based on the type of frame_clause. Description:
-
ROWS: The data boundaries of a window are determined based on the number of rows.
-
RANGE: The data boundaries of a window are determined based on the comparison results of the values of the column that is specified in
order by. In most cases,order byis specified in windowing_definition. Iforder byis not specified in windowing_definition, the values of the column that is specified inorder byare the same for all rows in a partition. NULL values are considered equivalent. -
GROUPS: In a partition, rows that have the same value of the column specified in
order byform a group. Iforder byis not specified, all rows in the partition form a group. NULL values are considered equivalent.
-
-
frame_start and frame_end: the start and end rows of a window. frame_start is required. frame_end is optional. If frame_end is not specified, the default value CURRENT ROW is used.
The row specified by frame_start must precede or be the same as the row specified by frame_end. Compared with the row specified by frame_end, the row specified by frame_start is closer to the first row in a window after all data in the window is sorted based on the column that is specified in
order byof windowing_definition. The following table describes the valid values and logic of frame_start and frame_end when the type of frame_clause is ROWS, RANGE, or GROUPS.frame_clause type
Value of frame_start or frame_end
Description
ROWS, RANGE, and GROUPS
UNBOUNDED PRECEDING
Indicates the first row of a partition. Rows are counted from 1.
UNBOUNDED FOLLOWING
Indicates the last row of a partition.
ROWS
CURRENT ROW
Indicates the current row. Each row of data corresponds to a result calculated by a window function. The current row is the row whose data is calculated by using a window function.
offset PRECEDING
Indicates the Nth row that precedes the current row at a specified
offset. For example, if0 PRECEDINGindicates the current row,1 PRECEDINGindicates the previous row.offsetmust be a non-negative integer.offset FOLLOWING
Indicates the Nth row that follows the current row at a specified
offset. For example, if0 FOLLOWINGindicates the current row,1 FOLLOWINGindicates the next row.offsetmust be a non-negative integer.RANGE
CURRENT ROW
-
If frame_start is set to CURRENT ROW, the first row that has the same value of the column specified in
order byis used as the current row. -
If frame_end is set to CURRENT ROW, the last row that has the same value of the column specified in
order byis used as the current row.
offset PRECEDING
The rows that are specified by frame_start and frame_end are determined based on the sorting order that is specified by
order by. For example, if data in a window is sorted by X, Xi specifies the X value that corresponds to the ith row, and Xc specifies the X value that corresponds to the current row. Positions of rows specified by frame_start and frame_end:-
order byis set to asc:-
frame_start specifies the first row that meets the following requirement:
Xc - Xi ≤ offset. -
frame_end specifies the last row that meets the following requirement:
Xc - Xi ≥ offset.
-
-
order byis set to desc:-
frame_start specifies the first row that meets the following requirement:
Xi - Xc ≤ offset. -
frame_end specifies the last row that meets the following requirement:
Xi - Xc ≥ offset.
-
The column that is specified in
order bycan be of the following data types: TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, DECIMAL, DATETIME, DATE, and TIMESTAMP.Syntax for
offsetof the DATE type:-
N: indicates N days or N seconds. It must be a non-negative integer. For an offset of the DATETIME or TIMESTAMP type, N is N seconds. For an offset of the DATE type, N is N days. -
interval 'N' {YEAR\MONTH\DAY\HOUR\MINUTE\SECOND}: indicates N years, months, days, hours, minutes, or seconds. For example,INTERVAL '3' YEARindicates 3 years. -
INTERVAL 'N-M' YEAR TO MONTH: indicates N years and M months. For example,INTERVAL '1-3' YEAR TO MONTHindicates 1 year and 3 months. -
INTERVAL 'D[ H[:M[:S[:N]]]]' DAY TO SECOND: indicates D days, H hours, M minutes, S seconds, and N nanoseconds. For example,INTERVAL '1 2:3:4:5' DAY TO SECONDindicates 1 day, 2 hours, 3 minutes, 4 seconds, and 5 nanoseconds.
offset FOLLOWING
The rows that are specified by frame_start and frame_end are determined based on the sorting order that is specified by
order by. For example, if data in a window is sorted by X, Xi specifies the X value that corresponds to the ith row, and Xc specifies the X value that corresponds to the current row. Positions of rows specified by frame_start and frame_end:-
order byis set to asc:-
frame_start specifies the first row that meets the following requirement:
Xi - Xc ≥ offset. -
frame_end specifies the last row that meets the following requirement:
Xi - Xc ≤ offset.
-
-
order byis set to desc:-
frame_start specifies the first row that meets the following requirement:
Xc - Xi >= offset. -
frame_end specifies the last row that meets the following requirement:
Xc - Xi <= offset.
-
GROUPS
CURRENT ROW
-
If frame_start is set to CURRENT ROW, the first row of the group to which the current row belongs is used.
-
If frame_end is set to CURRENT ROW, the last row of the group to which the current row belongs is used.
offset PRECEDING
-
If frame_start is set to offset PRECEDING, the first row of the Nth group that precedes the group of the current row at a specified
offsetis used. -
If frame_end is set to offset PRECEDING, the last row of the Nth group that precedes the group of the current row at a specified
offsetis used.
NoteYou cannot set frame_start to UNBOUNDED FOLLOWING, and you cannot set frame_end to UNBOUNED PRECEDING.
offset FOLLOWING
-
If frame_start is set to offset FOLLOWING, the first row of the Nth group that follows the group of the current row at a specified
offsetis used. -
If frame_end is set to offset FOLLOWING, the last row of the Nth group that follows the group of the current row at a specified
offsetis used.
NoteYou cannot set frame_start to UNBOUNDED FOLLOWING, and you cannot set frame_end to UNBOUNED PRECEDING.
-
-
frame_exclusion: optional. This parameter is used to remove specific rows from a window. Valid values:
-
EXCLUDE NO OTHERS: No rows are excluded from the window.
-
EXCLUDE CURRENT ROW: The current row is excluded from the window.
-
EXCLUDE GROUP: indicates that an entire group of rows in a partition is excluded from the window. In the group, all rows have the same value of the column that is specified in
order byas the current row. -
EXCLUDE TIES: An entire group of rows, except for the current row, are excluded from the window.
-
Default frame_clause
If you do not configure frame_clause, the default value of frame_clause in the following format. The format is the same as the format that is used by most SQL-based systems.
RANGE between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE NO OTHERS
Example of data boundaries of a window
In this example, a table named tbl contains three columns that are of the BIGINT type: pid, oid, and rid. The tbl table contains the following data:
+------------+------------+------------+
| pid | oid | rid |
+------------+------------+------------+
| 1 | NULL | 1 |
| 1 | NULL | 2 |
| 1 | 1 | 3 |
| 1 | 1 | 4 |
| 1 | 2 | 5 |
| 1 | 4 | 6 |
| 1 | 7 | 7 |
| 1 | 11 | 8 |
| 2 | NULL | 9 |
| 2 | NULL | 10 |
+------------+------------+------------+
You can replace ellipses (...) in the following SQL statements with windowing_definition to display the data in the windows in which each row of data is included.
If a value in the window column in the returned result is NULL, no data is contained in the window.
-
Windows in ROWS mode
-
windowing_definition 1
partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW -- Sample SQL statement: select pid, oid, rid, collect_list(rid) over(partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW) as window from tbl;Sample result:
+------------+------------+------------+--------+ | pid | oid | rid | window | +------------+------------+------------+--------+ | 1 | NULL | 1 | [1] | | 1 | NULL | 2 | [1, 2] | | 1 | 1 | 3 | [1, 2, 3] | | 1 | 1 | 4 | [1, 2, 3, 4] | | 1 | 2 | 5 | [1, 2, 3, 4, 5] | | 1 | 4 | 6 | [1, 2, 3, 4, 5, 6] | | 1 | 7 | 7 | [1, 2, 3, 4, 5, 6, 7] | | 1 | 11 | 8 | [1, 2, 3, 4, 5, 6, 7, 8] | | 2 | NULL | 9 | [9] | | 2 | NULL | 10 | [9, 10] | +------------+------------+------------+--------+ -
windowing_definition 2
partition by pid order by oid ROWS between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING -- Sample SQL statement: select pid, oid, rid, collect_list(rid) over(partition by pid order by oid ROWS between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING) as window from tbl;Sample result:
+------------+------------+------------+--------+ | pid | oid | rid | window | +------------+------------+------------+--------+ | 1 | NULL | 1 | [1, 2, 3, 4, 5, 6, 7, 8] | | 1 | NULL | 2 | [1, 2, 3, 4, 5, 6, 7, 8] | | 1 | 1 | 3 | [1, 2, 3, 4, 5, 6, 7, 8] | | 1 | 1 | 4 | [1, 2, 3, 4, 5, 6, 7, 8] | | 1 | 2 | 5 | [1, 2, 3, 4, 5, 6, 7, 8] | | 1 | 4 | 6 | [1, 2, 3, 4, 5, 6, 7, 8] | | 1 | 7 | 7 | [1, 2, 3, 4, 5, 6, 7, 8] | | 1 | 11 | 8 | [1, 2, 3, 4, 5, 6, 7, 8] | | 2 | NULL | 9 | [9, 10] | | 2 | NULL | 10 | [9, 10] | +------------+------------+------------+--------+ -
windowing_definition 3
partition by pid order by oid ROWS between 1 FOLLOWING and 3 FOLLOWING -- Sample SQL statement: select pid, oid, rid, collect_list(rid) over(partition by pid order by oid ROWS between 1 FOLLOWING and 3 FOLLOWING) as window from tbl;Sample result:
+------------+------------+------------+--------+ | pid | oid | rid | window | +------------+------------+------------+--------+ | 1 | NULL | 1 | [2, 3, 4] | | 1 | NULL | 2 | [3, 4, 5] | | 1 | 1 | 3 | [4, 5, 6] | | 1 | 1 | 4 | [5, 6, 7] | | 1 | 2 | 5 | [6, 7, 8] | | 1 | 4 | 6 | [7, 8] | | 1 | 7 | 7 | [8] | | 1 | 11 | 8 | NULL | | 2 | NULL | 9 | [10] | | 2 | NULL | 10 | NULL | +------------+------------+------------+--------+ -
windowing_definition 4
partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE CURRENT ROW -- Sample SQL statement: select pid, oid, rid, collect_list(rid) over(partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE CURRENT ROW) as window from tbl;Sample result:
+------------+------------+------------+--------+ | pid | oid | rid | window | +------------+------------+------------+--------+ | 1 | NULL | 1 | NULL | | 1 | NULL | 2 | [1] | | 1 | 1 | 3 | [1, 2] | | 1 | 1 | 4 | [1, 2, 3] | | 1 | 2 | 5 | [1, 2, 3, 4] | | 1 | 4 | 6 | [1, 2, 3, 4, 5] | | 1 | 7 | 7 | [1, 2, 3, 4, 5, 6] | | 1 | 11 | 8 | [1, 2, 3, 4, 5, 6, 7] | | 2 | NULL | 9 | NULL | | 2 | NULL | 10 | [9] | +------------+------------+------------+--------+ -
windowing_definition 5
partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE GROUP -- Sample SQL statement: select pid, oid, rid, collect_list(rid) over(partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE GROUP) as window from tbl;Sample result:
+------------+------------+------------+--------+ | pid | oid | rid | window | +------------+------------+------------+--------+ | 1 | NULL | 1 | NULL | | 1 | NULL | 2 | NULL | | 1 | 1 | 3 | [1, 2] | | 1 | 1 | 4 | [1, 2] | | 1 | 2 | 5 | [1, 2, 3, 4] | | 1 | 4 | 6 | [1, 2, 3, 4, 5] | | 1 | 7 | 7 | [1, 2, 3, 4, 5, 6] | | 1 | 11 | 8 | [1, 2, 3, 4, 5, 6, 7] | | 2 | NULL | 9 | NULL | | 2 | NULL | 10 | NULL | +------------+------------+------------+--------+ -
windowing_definition 6
partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE TIES -- Sample SQL statement: select pid, oid, rid, collect_list(rid) over(partition by pid order by oid ROWS between UNBOUNDED PRECEDING and CURRENT ROW EXCLUDE TIES) as window from tbl;Sample result:
+------------+------------+------------+--------+ | pid | oid | rid | window | +------------+------------+------------+--------+ | 1 | NULL | 1 | [1] | | 1 | NULL | 2 | [2] | | 1 | 1 | 3 | [1, 2, 3] | | 1 | 1 | 4 | [1, 2, 4] | | 1 | 2 | 5 | [1, 2, 3, 4, 5] | | 1 | 4 | 6 | [1, 2, 3, 4, 5, 6] | | 1 | 7 | 7 | [1, 2, 3, 4, 5, 6, 7] | | 1 | 11 | 8 | [1, 2, 3, 4, 5, 6, 7, 8] | | 2 | NULL | 9 | [9] | | 2 | NULL | 10 | [10] | +------------+------------+------------+--------+The differences between EXCLUDE CURRENT ROW and EXCLUDE GROUP can be obtained based on the comparison between the
windowcolumn values of rows with theridcolumn values of 2, 4, and 10 in windowing_definition 5 and windowing_definition 6. If frame_exclusion is set to EXCLUDE GROUP, the rows that have the samepidcolumn value in a partition are extracted when the rows have the sameoidcolumn value as the current row.
-
-
Windows in RANGE mode
-
windowing_definition 1
partition by pid order by oid RANGE between UNBOUNDED PRECEDING and CURRENT ROW -- Sample SQL statement: select pid, oid, rid, collect_list(rid) over(partition by pid order by oid RANGE between UNBOUNDED PRECEDING and CURRENT ROW) as window from tbl;Sample result:
+------------+------------+------------+--------+ | pid | oid | rid | window | +------------+------------+------------+--------+ | 1 | NULL | 1 | [1, 2] | | 1 | NULL | 2 | [1, 2] | | 1 | 1 | 3 | [1, 2, 3, 4] | | 1 | 1 | 4 | [1, 2, 3, 4] | | 1 | 2 | 5 | [1, 2, 3, 4, 5] | | 1 | 4 | 6 | [1, 2, 3, 4, 5, 6] | | 1 | 7 | 7 | [1, 2, 3, 4, 5, 6, 7] | | 1 | 11 | 8 | [1, 2, 3, 4, 5, 6, 7, 8] | | 2 | NULL | 9 | [9, 10] | | 2 | NULL | 10 | [9, 10] | +------------+------------+------------+--------+If frame_end is set to CURRENT ROW, the last row that has the same value of the
oidcolumn inorder byas the current row is obtained. Therefore, thewindowcolumn value of the row whoseridcolumn value is 1 is [1, 2]. -
windowing_definition 2
partition by pid order by oid RANGE between CURRENT ROW and UNBOUNDED FOLLOWING -- Sample SQL statement: select pid, oid, rid, collect_list(rid) over(partition by pid order by oid RANGE between CURRENT ROW and UNBOUNDED FOLLOWING) as window from tbl;Sample result:
+------------+------------+------------+--------+ | pid | oid | rid | window | +------------+------------+------------+--------+ | 1 | NULL | 1 | [1, 2, 3, 4, 5, 6, 7, 8] | | 1 | NULL | 2 | [1, 2, 3, 4, 5, 6, 7, 8] | | 1 | 1 | 3 | [3, 4, 5, 6, 7, 8] | | 1 | 1 | 4 | [3, 4, 5, 6, 7, 8] | | 1 | 2 | 5 | [5, 6, 7, 8] | | 1 | 4 | 6 | [6, 7, 8] | | 1 | 7 | 7 | [7, 8] | | 1 | 11 | 8 | [8] | | 2 | NULL | 9 | [9, 10] | | 2 | NULL | 10 | [9, 10] | +------------+------------+------------+--------+ -
windowing_definition 3
partition by pid order by oid RANGE between 3 PRECEDING and 1 PRECEDING -- Sample SQL statement: select pid, oid, rid, collect_list(rid) over(partition by pid order by oid RANGE between 3 PRECEDING and 1 PRECEDING) as window from tbl;Sample result:
+------------+------------+------------+--------+ | pid | oid | rid | window | +------------+------------+------------+--------+ | 1 | NULL | 1 | [1, 2] | | 1 | NULL | 2 | [1, 2] | | 1 | 1 | 3 | NULL | | 1 | 1 | 4 | NULL | | 1 | 2 | 5 | [3, 4] | | 1 | 4 | 6 | [3, 4, 5] | | 1 | 7 | 7 | [6] | | 1 | 11 | 8 | NULL | | 2 | NULL | 9 | [9, 10] | | 2 | NULL | 10 | [9, 10] | +------------+------------+------------+--------+For the row whose value of
oidinorder byis NULL, if frame_start is set tooffset PRECEDINGor offset FOLLOWING, the row is the first row whose value of oid inorder byis NULL only if theoffsetis not unbounded. If frame_end is set to offset PRECEDING or offset FOLLOWING, the row is the last row whose value of oid inorder byis NULL only if the offset is not unbounded.
-
-
Windows in GROUPS mode
windowing_definition
partition by pid order by oid GROUPS between 2 PRECEDING and CURRENT ROW -- Sample SQL statement: select pid, oid, rid, collect_list(rid) over(partition by pid order by oid GROUPS between 2 PRECEDING and CURRENT ROW) as window from tbl;Sample result:
+------------+------------+------------+--------+ | pid | oid | rid | window | +------------+------------+------------+--------+ | 1 | NULL | 1 | [1, 2] | | 1 | NULL | 2 | [1, 2] | | 1 | 1 | 3 | [1, 2, 3, 4] | | 1 | 1 | 4 | [1, 2, 3, 4] | | 1 | 2 | 5 | [1, 2, 3, 4, 5] | | 1 | 4 | 6 | [3, 4, 5, 6] | | 1 | 7 | 7 | [5, 6, 7] | | 1 | 11 | 8 | [6, 7, 8] | | 2 | NULL | 9 | [9, 10] | | 2 | NULL | 10 | [9, 10] | +------------+------------+------------+--------+
Sample data
This section provides sample source data for the function examples below. Create a table named emp and insert the sample data into the table. Sample statement:
create table if not exists emp
(empno bigint,
ename string,
job string,
mgr bigint,
hiredate datetime,
sal bigint,
comm bigint,
deptno bigint);
tunnel upload emp.txt emp;
The emp.txt file contains the following sample data:
7369,SMITH,CLERK,7902,1980-12-17 00:00:00,800,,20
7499,ALLEN,SALESMAN,7698,1981-02-20 00:00:00,1600,300,30
7521,WARD,SALESMAN,7698,1981-02-22 00:00:00,1250,500,30
7566,JONES,MANAGER,7839,1981-04-02 00:00:00,2975,,20
7654,MARTIN,SALESMAN,7698,1981-09-28 00:00:00,1250,1400,30
7698,BLAKE,MANAGER,7839,1981-05-01 00:00:00,2850,,30
7782,CLARK,MANAGER,7839,1981-06-09 00:00:00,2450,,10
7788,SCOTT,ANALYST,7566,1987-04-19 00:00:00,3000,,20
7839,KING,PRESIDENT,,1981-11-17 00:00:00,5000,,10
7844,TURNER,SALESMAN,7698,1981-09-08 00:00:00,1500,0,30
7876,ADAMS,CLERK,7788,1987-05-23 00:00:00,1100,,20
7900,JAMES,CLERK,7698,1981-12-03 00:00:00,950,,30
7902,FORD,ANALYST,7566,1981-12-03 00:00:00,3000,,20
7934,MILLER,CLERK,7782,1982-01-23 00:00:00,1300,,10
7948,JACCKA,CLERK,7782,1981-04-12 00:00:00,5000,,10
7956,WELAN,CLERK,7649,1982-07-20 00:00:00,2450,,10
7956,TEBAGE,CLERK,7748,1982-12-30 00:00:00,1300,,10
AVG
-
Syntax
double avg([distinct] double <expr>) over ([partition_clause] [orderby_clause] [frame_clause]) decimal avg([distinct] decimal <expr>) over ([partition_clause] [orderby_clause] [frame_clause]) -
Description
Returns the average value of expr in a window.
-
Description
-
expr: required. This parameter specifies the expression that is used to calculate the returned result. The value is of the DOUBLE or DECIMAL type.
-
If the input value is of the STRING or BIGINT type, it is implicitly converted into a value of the DOUBLE type before calculation. If it is of another data type, an error is returned.
-
If the input value is NULL, the row that contains the value is not used for calculation.
-
If the distinct keyword is specified, the average value of distinct values is calculated.
-
-
partition_clause, orderby_clause, and frame_clause: For more information about these parameters, see windowing_definition.
-
-
Return value
If the input value of expr is of the DECIMAL type, a value of the DECIMAL type is returned. If the input value is of another data type, a value of the DOUBLE type is returned. If the input value of expr is NULL, NULL is returned.
-
Examples
-
Example 1: Use the deptno column to define a window and calculate the average value of the sal column. ORDER BY is not specified. This function returns the cumulative average value of the values from the first row to the last row in the current window. The current window includes the rows that have the same deptno column value. Sample statement:
select deptno, sal, avg(sal) over (partition by deptno) from emp;Sample result:
+------------+------------+------------+ | deptno | sal | _c2 | +------------+------------+------------+ | 10 | 1300 | 2916.6666666666665 | -- This row is the first row of this window. The return value is the cumulative average value of the values from the first row to the sixth row. | 10 | 2450 | 2916.6666666666665 | -- The return value is the cumulative average value of the values from the first row to the sixth row. | 10 | 5000 | 2916.6666666666665 | -- The return value is the cumulative average value of the values from the first row to the sixth row. | 10 | 1300 | 2916.6666666666665 | | 10 | 5000 | 2916.6666666666665 | | 10 | 2450 | 2916.6666666666665 | | 20 | 3000 | 2175.0 | | 20 | 3000 | 2175.0 | | 20 | 800 | 2175.0 | | 20 | 1100 | 2175.0 | | 20 | 2975 | 2175.0 | | 30 | 1500 | 1566.6666666666667 | | 30 | 950 | 1566.6666666666667 | | 30 | 1600 | 1566.6666666666667 | | 30 | 1250 | 1566.6666666666667 | | 30 | 1250 | 1566.6666666666667 | | 30 | 2850 | 1566.6666666666667 | +------------+------------+------------+ -
Example 2: In Hive-compatible mode, use the deptno column to define a window and calculate the average value of the sal column. ORDER BY is specified. This function returns the cumulative average value of the values from the first row to the row that has the same sal value as the current row in the current window. The average values for the rows that have the same sal value are the same. The current window includes the rows that have the same deptno value. Sample statement:
-- Execute the following statement: select deptno, sal, avg(sal) over (partition by deptno order by sal) from emp;Sample result:
+------------+------------+------------+ | deptno | sal | _c2 | +------------+------------+------------+ | 10 | 1300 | 1300.0 | -- This row is the first row of this window. The average value for the first row is the cumulative average value of the values in the first and second rows because the two rows have the same sal value. | 10 | 1300 | 1300.0 | -- The return value is the cumulative average value of the values in the first and second rows. | 10 | 2450 | 1875.0 | -- The average value for the third row is the cumulative average value of the values from the first row to the fourth row because the third and fourth rows have the same sal value. | 10 | 2450 | 1875.0 | -- The return value is the cumulative average value of the values from the first row to the fourth row. | 10 | 5000 | 2916.6666666666665 | | 10 | 5000 | 2916.6666666666665 | | 20 | 800 | 800.0 | | 20 | 1100 | 950.0 | | 20 | 2975 | 1625.0 | | 20 | 3000 | 2175.0 | | 20 | 3000 | 2175.0 | | 30 | 950 | 950.0 | | 30 | 1250 | 1150.0 | | 30 | 1250 | 1150.0 | | 30 | 1500 | 1237.5 | | 30 | 1600 | 1310.0 | | 30 | 2850 | 1566.6666666666667 | +------------+------------+------------+
-
CLUSTER_SAMPLE
-
Syntax
boolean cluster_sample(bigint <N>) OVER ([partition_clause]) boolean cluster_sample(bigint <N>, bigint <M>) OVER ([partition_clause]) -
Description
-
cluster_sample(bigint <N>): specifies that N random rows of data are sampled. -
cluster_sample(bigint <N>, bigint <M>): specifies that rows are sampled based on a specified ratio (M/N). The number of rows that are sampled is calculated by using the following formula:partition_row_count × M/N.partition_row_countspecifies the number of rows in a partition.
-
-
Description
-
N: required. A constant of the BIGINT type. If N is set to NULL, NULL is returned.
-
M: required. A constant of the BIGINT type. If M is set to NULL, NULL is returned.
-
partition_clause: optional. For more information, see windowing_definition.
-
-
Return value
The return value is a value of the BOOLEAN type.
-
Examples
Sample about 20% of data entries in each group. Sample statement:
select deptno, sal from ( select deptno, sal, cluster_sample(5, 1) over (partition by deptno) as flag from emp ) sub where flag = true;Sample result:
+------------+------------+ | deptno | sal | +------------+------------+ | 10 | 1300 | | 20 | 3000 | | 30 | 950 | +------------+------------+
COUNT
-
Syntax
bigint count(*) over ([partition_clause] [orderby_clause] [frame_clause]) bigint count([distinct] <expr>[,...]) over ([partition_clause] [orderby_clause] [frame_clause]) -
Description
-
count(*): the total number of rows to return. -
count([distinct] <expr>[,...]): When you calculate the number of rows, the rows whose expr is NULL are ignored. If multiple expr parameters exist, any rows whose expr is NULL are ignored. In addition, if the distinct keyword is specified, the number of rows after deduplication is calculated. Any rows whose expr is NULL are ignored.
-
-
Description
-
expr: required. This parameter specifies the column whose values you want to count. All data types are supported. If the input value is NULL, the row that contains this value is not used for the calculation. If the DISTINCT keyword is specified, the count value of distinct values is obtained.
-
partition_clause, orderby_clause, and frame_clause: For more information about these parameters, see windowing_definition.
-
-
Return value
A value of the BIGINT type is returned.
-
Examples
-
Example 1: Use the sal column to define a window. ORDER BY is not specified. This function returns the cumulative count from the first row to the last row in the current window. The current window includes the rows that have the same sal value. Sample statement:
select sal, count(sal) over (partition by sal) as count from emp;Sample result:
+------------+------------+ | sal | count | +------------+------------+ | 800 | 1 | | 950 | 1 | | 1100 | 1 | | 1250 | 2 | -- This row is the first row of this window. The value of COUNT for the first row is the cumulative count for the second row because the two rows have the same sal value. | 1250 | 2 | -- The return value is the cumulative count from the first row to the second row in the current window. | 1300 | 2 | | 1300 | 2 | | 1500 | 1 | | 1600 | 1 | | 2450 | 2 | | 2450 | 2 | | 2850 | 1 | | 2975 | 1 | | 3000 | 2 | | 3000 | 2 | | 5000 | 2 | | 5000 | 2 | +------------+------------+ -
Example 2: Use the sal column to define a window. ORDER BY is specified. This function returns the cumulative count from the first row to the last row in the current window. The current window includes the rows that have the same sal value. Sample statement:
-- Execute the following statement: select sal, count(sal) over (partition by sal order by sal) as count from emp;Sample result:
+------------+------------+ | sal | count | +------------+------------+ | 800 | 1 | | 950 | 1 | | 1100 | 1 | | 1250 | 2 | -- This row is the first row of this window. The value of COUNT for the first row is the cumulative count for the second row because the two rows have the same sal value. | 1250 | 2 | -- The return value is the cumulative count from the first row to the second row in the current window. | 1300 | 2 | | 1300 | 2 | | 1500 | 1 | | 1600 | 1 | | 2450 | 2 | | 2450 | 2 | | 2850 | 1 | | 2975 | 1 | | 3000 | 2 | | 3000 | 2 | | 5000 | 2 | | 5000 | 2 | +------------+------------+
-
CUME_DIST
-
Syntax
double cume_dist() over([partition_clause] [orderby_clause]) -
Description
This function calculates the cumulative distribution. The cumulative distribution indicates the ratio of rows whose values are greater than or equal to the values of the current row to all rows in a partition. The ratio is determined based on orderby_clause.
-
Description
partition_clause and orderby_clause: For more information about these parameters, see windowing_definition.
-
Return value
A value of the DOUBLE type is returned. The return value is calculated by using the following formula:
row_number_of_last_peer/partition_row_count.row_number_of_last_peerindicates the value returned by ROW_NUMBER that corresponds to the last row of the group to which the current row belongs.partition_row_countindicates the number of rows in a partition to which the current row belongs. -
Examples
Group all employees by department (deptno) and calculate the cumulative distribution of employees in each group by salary. Sample statement:
select deptno, ename, sal, concat(round(cume_dist() over (partition by deptno order by sal desc)*100,2),'%') as cume_dist from emp;Sample result:
+------------+------------+------------+------------+ | deptno | ename | sal | cume_dist | +------------+------------+------------+------------+ | 10 | JACCKA | 5000 | 33.33% | | 10 | KING | 5000 | 33.33% | | 10 | CLARK | 2450 | 66.67% | | 10 | WELAN | 2450 | 66.67% | | 10 | TEBAGE | 1300 | 100.0% | | 10 | MILLER | 1300 | 100.0% | | 20 | SCOTT | 3000 | 40.0% | | 20 | FORD | 3000 | 40.0% | | 20 | JONES | 2975 | 60.0% | | 20 | ADAMS | 1100 | 80.0% | | 20 | SMITH | 800 | 100.0% | | 30 | BLAKE | 2850 | 16.67% | | 30 | ALLEN | 1600 | 33.33% | | 30 | TURNER | 1500 | 50.0% | | 30 | MARTIN | 1250 | 83.33% | | 30 | WARD | 1250 | 83.33% | | 30 | JAMES | 950 | 100.0% | +------------+------------+------------+------------+
DENSE_RANK
-
Syntax
bigint dense_rank() over ([partition_clause] [orderby_clause]) -
Description
Calculates the rank of a row in an ordered group of rows that are sorted based on orderby_clause. The rank counts from 1. In a partition, rows with the same value of the column that is specified in
order byhave the same rank. The rank increases by 1 each time the value of the column that is specified inorder bychanges. -
Description
partition_clause and orderby_clause: For more information about these parameters, see windowing_definition.
-
Return value
A value of the BIGINT type is returned. If orderby_clause is not specified, the values in the returned results are all 1.
-
Examples
Group all employees based on the deptno column. In each group, sort the employees in descending order based on the sal value to obtain the sequence numbers of the employees in their respective groups. Sample statement:
select deptno, ename, sal, dense_rank() over (partition by deptno order by sal desc) as nums from emp;Sample result:
+------------+------------+------------+------------+ | deptno | ename | sal | nums | +------------+------------+------------+------------+ | 10 | JACCKA | 5000 | 1 | | 10 | KING | 5000 | 1 | | 10 | CLARK | 2450 | 2 | | 10 | WELAN | 2450 | 2 | | 10 | TEBAGE | 1300 | 3 | | 10 | MILLER | 1300 | 3 | | 20 | SCOTT | 3000 | 1 | | 20 | FORD | 3000 | 1 | | 20 | JONES | 2975 | 2 | | 20 | ADAMS | 1100 | 3 | | 20 | SMITH | 800 | 4 | | 30 | BLAKE | 2850 | 1 | | 30 | ALLEN | 1600 | 2 | | 30 | TURNER | 1500 | 3 | | 30 | MARTIN | 1250 | 4 | | 30 | WARD | 1250 | 4 | | 30 | JAMES | 950 | 5 | +------------+------------+------------+------------+
FIRST_VALUE
-
Syntax
first_value(<expr>[, <ignore_nulls>]) over ([partition_clause] [orderby_clause] [frame_clause]) -
Description
Returns the value of expr that corresponds to the first row of a window.
-
Description
-
expr: required. The expression that is used to calculate the returned result.
-
ignore_nulls: optional. A value of the BOOLEAN type. This parameter specifies whether to ignore null values. Default value: false. If this parameter is set to true, a non-null value of expr that corresponds to the first row of a window is returned.
-
partition_clause, orderby_clause, and frame_clause: For more information about these parameters, see windowing_definition.
-
-
Return value
A value of the same data type as expr is returned.
-
Examples
Group all employees by department and return the first row of data in each group. Sample statement:
-
ORDER BY is not specified.
select deptno, ename, sal, first_value(sal) over (partition by deptno) as first_value from emp;Sample result:
+------------+------------+------------+-------------+ | deptno | ename | sal | first_value | +------------+------------+------------+-------------+ | 10 | TEBAGE | 1300 | 1300 | -- This row is the first row of the current window. | 10 | CLARK | 2450 | 1300 | | 10 | KING | 5000 | 1300 | | 10 | MILLER | 1300 | 1300 | | 10 | JACCKA | 5000 | 1300 | | 10 | WELAN | 2450 | 1300 | | 20 | FORD | 3000 | 3000 | -- This row is the first row of the current window. | 20 | SCOTT | 3000 | 3000 | | 20 | SMITH | 800 | 3000 | | 20 | ADAMS | 1100 | 3000 | | 20 | JONES | 2975 | 3000 | | 30 | TURNER | 1500 | 1500 | -- This row is the first row of the current window. | 30 | JAMES | 950 | 1500 | | 30 | ALLEN | 1600 | 1500 | | 30 | WARD | 1250 | 1500 | | 30 | MARTIN | 1250 | 1500 | | 30 | BLAKE | 2850 | 1500 | +------------+------------+------------+-------------+ -
ORDER BY is specified.
select deptno, ename, sal, first_value(sal) over (partition by deptno order by sal desc) as first_value from emp;Sample result:
+------------+------------+------------+-------------+ | deptno | ename | sal | first_value | +------------+------------+------------+-------------+ | 10 | JACCKA | 5000 | 5000 | -- This row is the first row of the current window. | 10 | KING | 5000 | 5000 | | 10 | CLARK | 2450 | 5000 | | 10 | WELAN | 2450 | 5000 | | 10 | TEBAGE | 1300 | 5000 | | 10 | MILLER | 1300 | 5000 | | 20 | SCOTT | 3000 | 3000 | -- This row is the first row of the current window. | 20 | FORD | 3000 | 3000 | | 20 | JONES | 2975 | 3000 | | 20 | ADAMS | 1100 | 3000 | | 20 | SMITH | 800 | 3000 | | 30 | BLAKE | 2850 | 2850 | -- This row is the first row of the current window. | 30 | ALLEN | 1600 | 2850 | | 30 | TURNER | 1500 | 2850 | | 30 | MARTIN | 1250 | 2850 | | 30 | WARD | 1250 | 2850 | | 30 | JAMES | 950 | 2850 | +------------+------------+------------+-------------+
-
LAG
-
Syntax
lag(<expr>[,bigint <offset>[, <default>]]) over([partition_clause] orderby_clause) -
Description
Returns the value of expr that precedes the current row at a given offset. The expression expr can be a column, a column operation, or a function operation.
-
Description
-
expr: required. The expression that is used to calculate the returned result.
-
offset: optional. The value is a constant of the BIGINT type and must be greater than or equal to 0. The value 0 indicates the current row, and the value 1 indicates the previous row. Default value: 1. If the input value is of the STRING or DOUBLE type, it is implicitly converted into a value of the BIGINT type before calculation.
-
default: optional. The default value when the value of offset is out of the valid range. The value of this parameter must be a constant. The default value of this parameter is null. The value of this parameter must be of the same data type as the value of expr. If the value of expr is not a constant, the parameter value is determined based on the current row.
-
partition_clause and orderby_clause: For more information about these parameters, see windowing_definition.
-
-
Return value
A value of the same data type as expr is returned.
-
Examples
Group all employees based on the deptno column in a window and calculate the value of sal for each employee at a given offset. Sample statement:
select deptno, ename, sal, lag(sal, 1) over (partition by deptno order by sal) as sal_new from emp;Sample result:
+------------+------------+------------+------------+ | deptno | ename | sal | sal_new | +------------+------------+------------+------------+ | 10 | TEBAGE | 1300 | NULL | | 10 | MILLER | 1300 | 1300 | | 10 | CLARK | 2450 | 1300 | | 10 | WELAN | 2450 | 2450 | | 10 | KING | 5000 | 2450 | | 10 | JACCKA | 5000 | 5000 | | 20 | SMITH | 800 | NULL | | 20 | ADAMS | 1100 | 800 | | 20 | JONES | 2975 | 1100 | | 20 | SCOTT | 3000 | 2975 | | 20 | FORD | 3000 | 3000 | | 30 | JAMES | 950 | NULL | | 30 | MARTIN | 1250 | 950 | | 30 | WARD | 1250 | 1250 | | 30 | TURNER | 1500 | 1250 | | 30 | ALLEN | 1600 | 1500 | | 30 | BLAKE | 2850 | 1600 | +------------+------------+------------+------------+
LAST_VALUE
-
Syntax
last_value(<expr>[, <ignore_nulls>]) over([partition_clause] [orderby_clause] [frame_clause]) -
Description
Returns the calculated result of the last row of data in a window using the expression that is specified by expr.
-
Description
-
expr: required. The expression that is used to calculate the returned result.
-
ignore_nulls: optional. A value of the BOOLEAN type. This parameter specifies whether to ignore null values. Default value: false. If this parameter is set to true, a non-null value of expr that corresponds to the last row of a window is returned.
-
partition_clause, orderby_clause, and frame_clause: For more information about these parameters, see windowing_definition.
-
-
Return value
A value of the same data type as expr is returned.
-
Examples
Group all employees based on the deptno column and return the last row of data in each group. Sample statement:
-
If order by is not specified, the rows from the first row to the last row belong to the current window. The value of the last row in the current window is returned.
select deptno, ename, sal, last_value(sal) over (partition by deptno) as last_value from emp;Sample result:
+------------+------------+------------+-------------+ | deptno | ename | sal | last_value | +------------+------------+------------+-------------+ | 10 | TEBAGE | 1300 | 2450 | | 10 | CLARK | 2450 | 2450 | | 10 | KING | 5000 | 2450 | | 10 | MILLER | 1300 | 2450 | | 10 | JACCKA | 5000 | 2450 | | 10 | WELAN | 2450 | 2450 | -- This row is the last row of the current window. | 20 | FORD | 3000 | 2975 | | 20 | SCOTT | 3000 | 2975 | | 20 | SMITH | 800 | 2975 | | 20 | ADAMS | 1100 | 2975 | | 20 | JONES | 2975 | 2975 | -- This row is the last row of the current window. | 30 | TURNER | 1500 | 2850 | | 30 | JAMES | 950 | 2850 | | 30 | ALLEN | 1600 | 2850 | | 30 | WARD | 1250 | 2850 | | 30 | MARTIN | 1250 | 2850 | | 30 | BLAKE | 2850 | 2850 | -- This row is the last row of the current window. +------------+------------+------------+-------------+ -
If order by is specified, the rows from the first row to the current row belong to the current window. The value of the current row in the current window is returned.
select deptno, ename, sal, last_value(sal) over (partition by deptno order by sal desc) as last_value from emp;Sample result:
+------------+------------+------------+-------------+ | deptno | ename | sal | last_value | +------------+------------+------------+-------------+ | 10 | JACCKA | 5000 | 5000 | -- This row is the current row of the current window. | 10 | KING | 5000 | 5000 | -- This row is the current row of the current window. | 10 | CLARK | 2450 | 2450 | -- This row is the current row of the current window. | 10 | WELAN | 2450 | 2450 | -- This row is the current row of the current window. | 10 | TEBAGE | 1300 | 1300 | -- This row is the current row of the current window. | 10 | MILLER | 1300 | 1300 | -- This row is the current row of the current window. | 20 | SCOTT | 3000 | 3000 | -- This row is the current row of the current window. | 20 | FORD | 3000 | 3000 | -- This row is the current row of the current window. | 20 | JONES | 2975 | 2975 | -- This row is the current row of the current window. | 20 | ADAMS | 1100 | 1100 | -- This row is the current row of the current window. | 20 | SMITH | 800 | 800 | -- This row is the current row of the current window. | 30 | BLAKE | 2850 | 2850 | -- This row is the current row of the current window. | 30 | ALLEN | 1600 | 1600 | -- This row is the current row of the current window. | 30 | TURNER | 1500 | 1500 | -- This row is the current row of the current window. | 30 | MARTIN | 1250 | 1250 | -- This row is the current row of the current window. | 30 | WARD | 1250 | 1250 | -- This row is the current row of the current window. | 30 | JAMES | 950 | 950 | -- This row is the current row of the current window. +------------+------------+------------+-------------+
-
LEAD
-
Syntax
lead(<expr>[, bigint <offset>[, <default>]]) over([partition_clause] orderby_clause) -
Description
Returns the value of expr that corresponds to the Nth row following the current row at a given offset. The expression expr can be a column, a column operation, or a function operation.
-
Description
-
expr: required. The expression that is used to calculate the returned result.
-
offset: optional. The value is a constant of the BIGINT type and must be greater than or equal to 0. The value 0 indicates the current row, and the value 1 indicates the next row. Default value: 1. If the input value is of the STRING or DOUBLE type, it is implicitly converted into a value of the BIGINT type before calculation.
-
default: optional. The default value when the value of offset is out of the valid range. The value of this parameter must be a constant. The default value of this parameter is null. The value of this parameter must be of the same data type as the value of expr. If the value of expr is not a constant, the parameter value is determined based on the current row.
-
partition_clause and orderby_clause: For more information about these parameters, see windowing_definition.
-
-
Return value
A value of the same data type as expr is returned.
-
Examples
Group all employees based on the deptno column in a window and calculate the value of sal for each employee at a given offset. Sample statement:
select deptno, ename, sal, lead(sal, 1) over (partition by deptno order by sal) as sal_new from emp;Sample result:
+------------+------------+------------+------------+ | deptno | ename | sal | sal_new | +------------+------------+------------+------------+ | 10 | TEBAGE | 1300 | 1300 | | 10 | MILLER | 1300 | 2450 | | 10 | CLARK | 2450 | 2450 | | 10 | WELAN | 2450 | 5000 | | 10 | KING | 5000 | 5000 | | 10 | JACCKA | 5000 | NULL | | 20 | SMITH | 800 | 1100 | | 20 | ADAMS | 1100 | 2975 | | 20 | JONES | 2975 | 3000 | | 20 | SCOTT | 3000 | 3000 | | 20 | FORD | 3000 | NULL | | 30 | JAMES | 950 | 1250 | | 30 | MARTIN | 1250 | 1250 | | 30 | WARD | 1250 | 1500 | | 30 | TURNER | 1500 | 1600 | | 30 | ALLEN | 1600 | 2850 | | 30 | BLAKE | 2850 | NULL | +------------+------------+------------+------------+
MAX
-
Syntax
max(<expr>) over([partition_clause] [orderby_clause] [frame_clause]) -
Description
Returns the maximum value of expr in a window.
-
Description
-
expr: required. The expression that is used to calculate the maximum value. The input value can be of any data type other than BOOLEAN. If the input value for a row is NULL, this row is not used for calculation.
-
partition_clause, orderby_clause, and frame_clause: For more information about these parameters, see windowing_definition.
-
-
Return value
A value of the same type as expr is returned.
-
Examples
-
Example 1: Use the deptno column to define a window and obtain the maximum value of the sal column. The ORDER BY clause is not specified. This function returns the maximum value of the current window. The current window includes the rows that have the same deptno value. Sample statement:
select deptno, sal, max(sal) over (partition by deptno) from emp;Sample result:
+------------+------------+------------+ | deptno | sal | _c2 | +------------+------------+------------+ | 10 | 1300 | 5000 | -- This row is the first row of this window. The return value is the maximum value among the values from the first row to the sixth row. | 10 | 2450 | 5000 | -- The return value is the maximum value among the values from the first row to the sixth row. | 10 | 5000 | 5000 | -- The return value is the maximum value among the values from the first row to the sixth row. | 10 | 1300 | 5000 | | 10 | 5000 | 5000 | | 10 | 2450 | 5000 | | 20 | 3000 | 3000 | | 20 | 3000 | 3000 | | 20 | 800 | 3000 | | 20 | 1100 | 3000 | | 20 | 2975 | 3000 | | 30 | 1500 | 2850 | | 30 | 950 | 2850 | | 30 | 1600 | 2850 | | 30 | 1250 | 2850 | | 30 | 1250 | 2850 | | 30 | 2850 | 2850 | +------------+------------+------------+ -
Example 2: Use the deptno column to define a window and obtain the maximum value of the sal column. The ORDER BY clause is specified. This function returns the maximum value among the values from the first row to the current row in the current window. The current window includes the rows that have the same deptno value. Run the following command:
select deptno, sal, max(sal) over (partition by deptno order by sal) from emp;Sample result:
+------------+------------+------------+ | deptno | sal | _c2 | +------------+------------+------------+ | 10 | 1300 | 1300 | -- This row is the first row of this window. | 10 | 1300 | 1300 | -- The return value is the maximum value among the values in the first and second rows. | 10 | 2450 | 2450 | -- The return value is the maximum value among the values from the first row to the third row. | 10 | 2450 | 2450 | -- The return value is the maximum value among the values from the first row to the fourth row. | 10 | 5000 | 5000 | | 10 | 5000 | 5000 | | 20 | 800 | 800 | | 20 | 1100 | 1100 | | 20 | 2975 | 2975 | | 20 | 3000 | 3000 | | 20 | 3000 | 3000 | | 30 | 950 | 950 | | 30 | 1250 | 1250 | | 30 | 1250 | 1250 | | 30 | 1500 | 1500 | | 30 | 1600 | 1600 | | 30 | 2850 | 2850 | +------------+------------+------------+
-
MEDIAN
-
Syntax
median(<expr>) over ([partition_clause] [orderby_clause] [frame_clause]) -
Description
Returns the median of expr in a window.
-
Description
-
expr: required. The expression that is used to calculate the median. A value of the DOUBLE or DECIMAL type. The value of this parameter must be 1 to 255 digits in length.
-
If the input value is of the STRING or BIGINT type, it is implicitly converted into a value of the DOUBLE type before calculation. If it is of another data type, an error is returned.
-
If the input value is NULL, NULL is returned.
-
-
partition_clause, orderby_clause, and frame_clause: For more information about these parameters, see windowing_definition.
-
-
Return value
A value of the DOUBLE or DECIMAL type is returned. If the values of all expressions that are specified by expr are NULL, NULL is returned.
-
Examples
Use the deptno column to define a window and calculate the median value of the sal column. This function returns the median value of the current window. The current window includes the rows that have the same deptno value. Run the following command:
select deptno, sal, median(sal) over (partition by deptno) from emp;Sample result:
+------------+------------+------------+ | deptno | sal | _c2 | +------------+------------+------------+ | 10 | 1300 | 2450.0 | -- This row is the first row of this window. The return value is the median value of the values from the first row to the sixth row. | 10 | 2450 | 2450.0 | | 10 | 5000 | 2450.0 | | 10 | 1300 | 2450.0 | | 10 | 5000 | 2450.0 | | 10 | 2450 | 2450.0 | | 20 | 3000 | 2975.0 | | 20 | 3000 | 2975.0 | | 20 | 800 | 2975.0 | | 20 | 1100 | 2975.0 | | 20 | 2975 | 2975.0 | | 30 | 1500 | 1375.0 | | 30 | 950 | 1375.0 | | 30 | 1600 | 1375.0 | | 30 | 1250 | 1375.0 | | 30 | 1250 | 1375.0 | | 30 | 2850 | 1375.0 | +------------+------------+------------+
MIN
-
Syntax
min(<expr>) over([partition_clause] [orderby_clause] [frame_clause]) -
Description
Returns the minimum value of expr in a window.
-
Description
-
expr: required. The expression that is used to calculate the minimum value. The input value can be of any data type other than BOOLEAN. If the value for a row is null, this row is not used for calculation.
-
partition_clause, orderby_clause, and frame_clause: For more information about these parameters, see windowing_definition.
-
-
Return value
A value of the same data type as expr is returned.
-
Examples
-
Example 1: Use the deptno column to define a window and obtain the minimum value of the sal column. The ORDER BY clause is not specified. This function returns the minimum value of the current window. The current window includes the rows that have the same deptno value. Run the following command:
select deptno, sal, min(sal) over (partition by deptno) from emp;Sample result:
+------------+------------+------------+ | deptno | sal | _c2 | +------------+------------+------------+ | 10 | 1300 | 1300 | -- This row is the first row of this window. The return value is the minimum value among the values from the first row to the sixth row. | 10 | 2450 | 1300 | -- The return value is the minimum value among the values from the first row to the sixth row. | 10 | 5000 | 1300 | -- The return value is the minimum value among the values from the first row to the sixth row. | 10 | 1300 | 1300 | | 10 | 5000 | 1300 | | 10 | 2450 | 1300 | | 20 | 3000 | 800 | | 20 | 3000 | 800 | | 20 | 800 | 800 | | 20 | 1100 | 800 | | 20 | 2975 | 800 | | 30 | 1500 | 950 | | 30 | 950 | 950 | | 30 | 1600 | 950 | | 30 | 1250 | 950 | | 30 | 1250 | 950 | | 30 | 2850 | 950 | +------------+------------+------------+ -
Example 2: Use the deptno column to define a window and obtain the minimum value of the sal column. The ORDER BY clause is specified. This function returns the minimum value among the values from the first row to the current row in the current window. The current window includes the rows that have the same deptno value. Run the following command:
select deptno, sal, min(sal) over (partition by deptno order by sal) from emp;Sample result:
+------------+------------+------------+ | deptno | sal | _c2 | +------------+------------+------------+ | 10 | 1300 | 1300 | -- This row is the first row of this window. | 10 | 1300 | 1300 | -- The return value is the minimum value among the values in the first and second rows. | 10 | 2450 | 1300 | -- The return value is the minimum value among the values from the first row to the third row. | 10 | 2450 | 1300 | | 10 | 5000 | 1300 | | 10 | 5000 | 1300 | | 20 | 800 | 800 | | 20 | 1100 | 800 | | 20 | 2975 | 800 | | 20 | 3000 | 800 | | 20 | 3000 | 800 | | 30 | 950 | 950 | | 30 | 1250 | 950 | | 30 | 1250 | 950 | | 30 | 1500 | 950 | | 30 | 1600 | 950 | | 30 | 2850 | 950 | +------------+------------+------------+
-
NTILE
-
Syntax
bigint ntile(bigint <N>) over ([partition_clause] [orderby_clause]) -
Description
Splits rows of data in a partition into N groups of equal size and returns the number of the group to which the specified row belongs. If data in the partition cannot be split into N groups of equal size, one more row is preferentially allocated to the first M groups.
-
Description
-
N: required. This parameter specifies the number of splits. The input value is of the BIGINT type.
-
partition_clause and orderby_clause: For more information about these parameters, see windowing_definition.
-
-
Return value
A value of the BIGINT type is returned.
-
Examples
Divide all employees into three groups based on the sal column in descending order and obtain the number of the group to which each employee belongs. Sample statement:
select deptno, ename, sal, ntile(3) over (partition by deptno order by sal desc) as nt3 from emp;Sample result:
+------------+------------+------------+------------+ | deptno | ename | sal | nt3 | +------------+------------+------------+------------+ | 10 | JACCKA | 5000 | 1 | | 10 | KING | 5000 | 1 | | 10 | CLARK | 2450 | 2 | | 10 | WELAN | 2450 | 2 | | 10 | TEBAGE | 1300 | 3 | | 10 | MILLER | 1300 | 3 | | 20 | SCOTT | 3000 | 1 | | 20 | FORD | 3000 | 1 | | 20 | JONES | 2975 | 2 | | 20 | ADAMS | 1100 | 2 | | 20 | SMITH | 800 | 3 | | 30 | BLAKE | 2850 | 1 | | 30 | ALLEN | 1600 | 1 | | 30 | TURNER | 1500 | 2 | | 30 | MARTIN | 1250 | 2 | | 30 | WARD | 1250 | 3 | | 30 | JAMES | 950 | 3 | +------------+------------+------------+------------+
NTH_VALUE
-
Syntax
nth_value(<expr>, <number> [, <ignore_nulls>]) over ([partition_clause] [orderby_clause] [frame_clause]) -
Description
Returns the value of expr that corresponds to the Nth row in a window.
-
Description
-
expr: required. The expression that is used to calculate the returned result.
-
number: required. The value is of the BIGINT type The value must be an integer greater than or equal to 1. If the input value is 1, this function is equivalent to FIRST_VALUE.
-
ignore_nulls: optional. A value of the BOOLEAN type. This parameter specifies whether to ignore null values. Default value: false. If this parameter is set to true, a non-null value of expr that corresponds to the Nth row of a window is returned.
-
partition_clause, orderby_clause, and frame_clause: For more information about these parameters, see windowing_definition.
-
-
Return value
A value of the same data type as expr is returned.
-
Examples
Group all employees by department and return the sixth row of data in each group. Sample statement:
-
If order by is not specified, the rows from the first row to the last row belong to the current window. The value of the sixth row in the current window is returned.
select deptno, ename, sal, nth_value(sal,6) over (partition by deptno) as nth_value from emp;Sample result:
+------------+------------+------------+------------+ | deptno | ename | sal | nth_value | +------------+------------+------------+------------+ | 10 | TEBAGE | 1300 | 2450 | | 10 | CLARK | 2450 | 2450 | | 10 | KING | 5000 | 2450 | | 10 | MILLER | 1300 | 2450 | | 10 | JACCKA | 5000 | 2450 | | 10 | WELAN | 2450 | 2450 | -- This row is the sixth row of the current window. | 20 | FORD | 3000 | NULL | | 20 | SCOTT | 3000 | NULL | | 20 | SMITH | 800 | NULL | | 20 | ADAMS | 1100 | NULL | | 20 | JONES | 2975 | NULL | -- This current window has less than six rows, and null is returned. | 30 | TURNER | 1500 | 2850 | | 30 | JAMES | 950 | 2850 | | 30 | ALLEN | 1600 | 2850 | | 30 | WARD | 1250 | 2850 | | 30 | MARTIN | 1250 | 2850 | | 30 | BLAKE | 2850 | 2850 | -- This row is the sixth row of the current window. +------------+------------+------------+------------+ -
If order by is specified, the rows from the first row to the current row belong to the current window. The value of the sixth row in the current window is returned.
select deptno, ename, sal, nth_value(sal,6) over (partition by deptno order by sal) as nth_value from emp;Sample result:
+------------+------------+------------+------------+ | deptno | ename | sal | nth_value | +------------+------------+------------+------------+ | 10 | TEBAGE | 1300 | NULL | | 10 | MILLER | 1300 | NULL | -- This window has only two rows, and NULL is returned. | 10 | CLARK | 2450 | NULL | | 10 | WELAN | 2450 | NULL | | 10 | KING | 5000 | 5000 | | 10 | JACCKA | 5000 | 5000 | | 20 | SMITH | 800 | NULL | | 20 | ADAMS | 1100 | NULL | | 20 | JONES | 2975 | NULL | | 20 | SCOTT | 3000 | NULL | | 20 | FORD | 3000 | NULL | | 30 | JAMES | 950 | NULL | | 30 | MARTIN | 1250 | NULL | | 30 | WARD | 1250 | NULL | | 30 | TURNER | 1500 | NULL | | 30 | ALLEN | 1600 | NULL | | 30 | BLAKE | 2850 | 2850 | +------------+------------+------------+------------+
-
PERCENT_RANK
-
Syntax
double percent_rank() over([partition_clause] [orderby_clause]) -
Description
Calculates the percentile rank of the current row in a partition based on orderby_clause.
-
Description
partition_clause and orderby_clause: For more information about these parameters, see windowing_definition.
-
Return value
A value of the DOUBLE type is returned. The valid value range is [0.0, 1.0]. The return value is calculated by using the following formula:
(rank - 1)/(partition_row_count - 1).rankindicates the return value of the RANK function that corresponds to the current row.partition_row_countindicates the number of rows in the partition to which the current row belongs. If the partition contains only one row of data, 0.0 is returned. -
Examples
Calculate the percentile rank of each employee in a group based on the sal column. Sample statement:
select deptno, ename, sal, percent_rank() over (partition by deptno order by sal desc) as sal_new from emp;Sample result:
+------------+------------+------------+------------+ | deptno | ename | sal | sal_new | +------------+------------+------------+------------+ | 10 | JACCKA | 5000 | 0.0 | | 10 | KING | 5000 | 0.0 | | 10 | CLARK | 2450 | 0.4 | | 10 | WELAN | 2450 | 0.4 | | 10 | TEBAGE | 1300 | 0.8 | | 10 | MILLER | 1300 | 0.8 | | 20 | SCOTT | 3000 | 0.0 | | 20 | FORD | 3000 | 0.0 | | 20 | JONES | 2975 | 0.5 | | 20 | ADAMS | 1100 | 0.75 | | 20 | SMITH | 800 | 1.0 | | 30 | BLAKE | 2850 | 0.0 | | 30 | ALLEN | 1600 | 0.2 | | 30 | TURNER | 1500 | 0.4 | | 30 | MARTIN | 1250 | 0.6 | | 30 | WARD | 1250 | 0.6 | | 30 | JAMES | 950 | 1.0 | +------------+------------+------------+------------+
RANK
-
Syntax
bigint rank() over ([partition_clause] [orderby_clause]) -
Description
Calculates the rank of a row in an ordered group of rows that are sorted based on orderby_clause. The rank counts from 1.
-
Description
partition_clause and orderby_clause: For more information about these parameters, see windowing_definition.
-
Return value
A value of the BIGINT type is returned. The return value may be duplicate and inconsecutive. The return value is the sequence number of the first row in the group to which the current row belongs. The sequence number of the first row is calculated by using the
ROW_NUMBER()function. If orderby_clause is not specified, the values in the returned results are all 1. -
Examples
Group all employees based on the deptno column. In each group, sort the employees in descending order based on the sal value to obtain the sequence numbers of the employees in their respective groups. Sample statement:
select deptno, ename, sal, rank() over (partition by deptno order by sal desc) as nums from emp;Sample result:
+------------+------------+------------+------------+ | deptno | ename | sal | nums | +------------+------------+------------+------------+ | 10 | JACCKA | 5000 | 1 | | 10 | KING | 5000 | 1 | | 10 | CLARK | 2450 | 3 | | 10 | WELAN | 2450 | 3 | | 10 | TEBAGE | 1300 | 5 | | 10 | MILLER | 1300 | 5 | | 20 | SCOTT | 3000 | 1 | | 20 | FORD | 3000 | 1 | | 20 | JONES | 2975 | 3 | | 20 | ADAMS | 1100 | 4 | | 20 | SMITH | 800 | 5 | | 30 | BLAKE | 2850 | 1 | | 30 | ALLEN | 1600 | 2 | | 30 | TURNER | 1500 | 3 | | 30 | MARTIN | 1250 | 4 | | 30 | WARD | 1250 | 4 | | 30 | JAMES | 950 | 6 | +------------+------------+------------+------------+
ROW_NUMBER
-
Syntax
row_number() over([partition_clause] [orderby_clause]) -
Description
Returns the sequence number of the current row in a partition. The sequence number counts from 1.
-
Description
For more information, see windowing_definition. frame_clause is not supported.
-
Return value
A value of the BIGINT type is returned.
-
Examples
Group all employees based on the deptno column. In each group, sort the employees in descending order based on the sal value, and obtain the sequence numbers of the employees in their respective groups. Sample statement:
select deptno, ename, sal, row_number() over (partition by deptno order by sal desc) as nums from emp;Sample result:
+------------+------------+------------+------------+ | deptno | ename | sal | nums | +------------+------------+------------+------------+ | 10 | JACCKA | 5000 | 1 | | 10 | KING | 5000 | 2 | | 10 | CLARK | 2450 | 3 | | 10 | WELAN | 2450 | 4 | | 10 | TEBAGE | 1300 | 5 | | 10 | MILLER | 1300 | 6 | | 20 | SCOTT | 3000 | 1 | | 20 | FORD | 3000 | 2 | | 20 | JONES | 2975 | 3 | | 20 | ADAMS | 1100 | 4 | | 20 | SMITH | 800 | 5 | | 30 | BLAKE | 2850 | 1 | | 30 | ALLEN | 1600 | 2 | | 30 | TURNER | 1500 | 3 | | 30 | MARTIN | 1250 | 4 | | 30 | WARD | 1250 | 5 | | 30 | JAMES | 950 | 6 | +------------+------------+------------+------------+
STDDEV
-
Syntax
double stddev|stddev_pop([distinct] <expr>) over ([partition_clause] [orderby_clause] [frame_clause]) decimal stddev|stddev_pop([distinct] <expr>) over ([partition_clause] [orderby_clause] [frame_clause]) -
Description
Returns the population standard deviation of all input values. This function is also called STDDEV_POP.
-
Description
-
expr: required. The expression that is used to calculate the population standard deviation. The input values can be of the DOUBLE or DECIMAL type.
-
If the input value is of the STRING or BIGINT type, it is implicitly converted into a value of the DOUBLE type before calculation. If it is of another data type, an error is returned.
-
If the value in a row is null, this row is not used for calculation.
-
If the distinct keyword is specified, the population standard deviation of distinct values is calculated.
-
-
partition_clause, orderby_clause, and frame_clause: For more information about these parameters, see windowing_definition.
-
-
Return value
A value of the same data type as expr is returned. If the values of all expressions that are specified by expr are null, null is returned.
-
Examples
-
Example 1: Use the deptno column to define a window and calculate the population standard deviation of the sal column. The ORDER BY clause is not specified. This function returns the cumulative population standard deviation of the current window. The current window includes the rows that have the same deptno value. Sample statement:
select deptno, sal, stddev(sal) over (partition by deptno) from emp;Sample result:
+------------+------------+------------+ | deptno | sal | _c2 | +------------+------------+------------+ | 10 | 1300 | 1546.1421524412158 | -- This row is the first row of this window. The return value is the cumulative population standard deviation of the values from the first row to the sixth row. | 10 | 2450 | 1546.1421524412158 | -- The return value is the cumulative population standard deviation of the values from the first row to the sixth row. | 10 | 5000 | 1546.1421524412158 | | 10 | 1300 | 1546.1421524412158 | | 10 | 5000 | 1546.1421524412158 | | 10 | 2450 | 1546.1421524412158 | | 20 | 3000 | 1004.7387720198718 | | 20 | 3000 | 1004.7387720198718 | | 20 | 800 | 1004.7387720198718 | | 20 | 1100 | 1004.7387720198718 | | 20 | 2975 | 1004.7387720198718 | | 30 | 1500 | 610.1001739241042 | | 30 | 950 | 610.1001739241042 | | 30 | 1600 | 610.1001739241042 | | 30 | 1250 | 610.1001739241042 | | 30 | 1250 | 610.1001739241042 | | 30 | 2850 | 610.1001739241042 | +------------+------------+------------+ -
Example 2: Use the deptno column to define a window and calculate the population standard deviation of the sal column. The ORDER BY clause is specified. This function returns the cumulative population standard deviation of the values from the first row to the row that has the same sal value as the current row in the current window. The population standard deviations for the rows that have the same sal value are the same. The current window includes the rows that have the same deptno value. Sample statement:
-- Execute the following statement: select deptno, sal, stddev(sal) over (partition by deptno order by sal) from emp;Sample result:
+------------+------------+------------+ | deptno | sal | _c2 | +------------+------------+------------+ | 10 | 1300 | 0.0 | -- This row is the first row of this window. The population standard deviation for the first row is the cumulative population standard deviation of the values in the first and second rows because the two rows have the same sal value. | 10 | 1300 | 0.0 | -- The return value is the cumulative population standard deviation of the values in the first and second rows. | 10 | 2450 | 575.0 | -- The population standard deviation for the third row is the cumulative population standard deviation of the values from the first row to the fourth row because the third and fourth rows have the same sal value. | 10 | 2450 | 575.0 | -- The return value is the cumulative population standard deviation of the values from the first row to the fourth row. | 10 | 5000 | 1546.1421524412158 | | 10 | 5000 | 1546.1421524412158 | | 20 | 800 | 0.0 | | 20 | 1100 | 150.0 | | 20 | 2975 | 962.4188277460079 | | 20 | 3000 | 1004.7387720198718 | | 20 | 3000 | 1004.7387720198718 | | 30 | 950 | 0.0 | | 30 | 1250 | 141.4213562373095 | | 30 | 1250 | 141.4213562373095 | | 30 | 1500 | 194.8557158514987 | | 30 | 1600 | 226.71568097509268 | | 30 | 2850 | 610.1001739241042 | +------------+------------+------------+
-
STDDEV_SAMP
-
Syntax
double stddev_samp([distinct] <expr>) over([partition_clause] [orderby_clause] [frame_clause]) decimal stddev_samp([distinct] <expr>) over([partition_clause] [orderby_clause] [frame_clause]) -
Description
This function returns the sample standard deviation of all input values.
-
Description
-
expr: required. This parameter specifies the expression that is used to calculate the sample standard deviation. The input values can be of the DOUBLE or DECIMAL type.
-
If the input value is of the STRING or BIGINT type, it is implicitly converted into a value of the DOUBLE type before calculation. If it is of another data type, an error is returned.
-
If the value for a row is NULL, this row is not used for the calculation.
-
If the distinct keyword is specified, the sample standard deviation of distinct values is calculated.
-
-
partition_clause, orderby_clause, and frame_clause: For more information about these parameters, see windowing_definition.
-
-
Return value
A value of the same data type as expr is returned. If the values of all expressions that are specified by expr are NULL, NULL is returned. If the window has only one row of data whose expr value is not NULL, 0 is returned.
-
Examples
-
Example 1: Use the deptno column to define a window and calculate the sample standard deviation of the sal column. The ORDER BY clause is not specified. This function returns the cumulative sample standard deviation of the current window. The current window includes the rows that have the same deptno value. Sample statement:
select deptno, sal, stddev_samp(sal) over (partition by deptno) from emp;Sample result:
+------------+------------+------------+ | deptno | sal | _c2 | +------------+------------+------------+ | 10 | 1300 | 1693.7138680032904 | -- This row is the first row of this window. The return value is the cumulative sample standard deviation of the values from the first row to the sixth row. | 10 | 2450 | 1693.7138680032904 | -- The return value is the cumulative sample standard deviation of the values from the first row to the sixth row. | 10 | 5000 | 1693.7138680032904 | -- The return value is the cumulative sample standard deviation of the values from the first row to the sixth row. | 10 | 1300 | 1693.7138680032904 | | 10 | 5000 | 1693.7138680032904 | | 10 | 2450 | 1693.7138680032904 | | 20 | 3000 | 1123.3320969330487 | | 20 | 3000 | 1123.3320969330487 | | 20 | 800 | 1123.3320969330487 | | 20 | 1100 | 1123.3320969330487 | | 20 | 2975 | 1123.3320969330487 | | 30 | 1500 | 668.331255192114 | | 30 | 950 | 668.331255192114 | | 30 | 1600 | 668.331255192114 | | 30 | 1250 | 668.331255192114 | | 30 | 1250 | 668.331255192114 | | 30 | 2850 | 668.331255192114 | +------------+------------+------------+ -
Example 2: Use the deptno column to define a window and calculate the sample standard deviation of the sal column. The ORDER BY clause is specified. This function returns the cumulative sample standard deviation of the values from the first row to the current row in the current window. The current window includes the rows that have the same deptno value. Sample statement:
select deptno, sal, stddev_samp(sal) over (partition by deptno order by sal) from emp;Sample result:
+------------+------------+------------+ | deptno | sal | _c2 | +------------+------------+------------+ | 10 | 1300 | 0.0 | -- This row is the first row of this window. | 10 | 1300 | 0.0 | -- The return value is the cumulative sample standard deviation of the values in the first and second rows. | 10 | 2450 | 663.9528095680697 | -- The return value is the cumulative sample standard deviation of the values from the first row to the third row. | 10 | 2450 | 663.9528095680696 | | 10 | 5000 | 1511.2081259707413 | | 10 | 5000 | 1693.7138680032904 | | 20 | 800 | 0.0 | | 20 | 1100 | 212.13203435596427 | | 20 | 2975 | 1178.7175234126282 | | 20 | 3000 | 1182.7536725793752 | | 20 | 3000 | 1123.3320969330487 | | 30 | 950 | 0.0 | | 30 | 1250 | 212.13203435596427 | | 30 | 1250 | 173.20508075688772 | | 30 | 1500 | 225.0 | | 30 | 1600 | 253.4758371127315 | | 30 | 2850 | 668.331255192114 | +------------+------------+------------+
-
SUM
-
Syntax
sum([distinct] <expr>) over ([partition_clause] [orderby_clause] [frame_clause]) -
Description
Returns the sum of expr in a window.
-
Description
-
expr: required. This parameter specifies the column whose sum you want to calculate. The column can be of the DOUBLE, DECIMAL, or BIGINT type.
-
If an input value is of the STRING type, it is implicitly converted into a value of the DOUBLE type before calculation. If it is of another data type, an error is returned.
-
If the input value in a row is null, the row is not used for calculation.
-
If the distinct keyword is specified, the sum of distinct values is calculated.
-
-
partition_clause, orderby_clause, and frame_clause: For more information about these parameters, see windowing_definition.
-
-
Return value
-
If input values are of the BIGINT type, a value of the BIGINT type is returned.
-
If input values are of the DECIMAL type, a value of the DECIMAL type is returned.
-
If input values are of the DOUBLE or STRING type, a value of the DOUBLE type is returned.
-
If input values are null, null is returned.
-
-
Examples
-
Example 1: Use the deptno column to define a window and calculate the sum of the sal column. The ORDER BY clause is not specified. This function returns the cumulative sum of the current window. The current window includes the rows that have the same deptno value. Sample statement:
select deptno, sal, sum(sal) over (partition by deptno) from emp;Sample result:
+------------+------------+------------+ | deptno | sal | _c2 | +------------+------------+------------+ | 10 | 1300 | 17500 | -- This row is the first row of this window. The return value is the cumulative sum of the values from the first row to the sixth row. | 10 | 2450 | 17500 | -- The return value is the cumulative sum of the values from the first row to the sixth row. | 10 | 5000 | 17500 | -- The return value is the cumulative sum of the values from the first row to the sixth row. | 10 | 1300 | 17500 | | 10 | 5000 | 17500 | | 10 | 2450 | 17500 | | 20 | 3000 | 10875 | | 20 | 3000 | 10875 | | 20 | 800 | 10875 | | 20 | 1100 | 10875 | | 20 | 2975 | 10875 | | 30 | 1500 | 9400 | | 30 | 950 | 9400 | | 30 | 1600 | 9400 | | 30 | 1250 | 9400 | | 30 | 1250 | 9400 | | 30 | 2850 | 9400 | +------------+------------+------------+ -
Example 2: Use the deptno column to define a window and calculate the sum of the sal column. The ORDER BY clause is specified. This function returns the cumulative sum of the values from the first row to the row that has the same sal value as the current row in the current window. The sum values for the rows that have the same sal value are the same. The current window includes the rows that have the same deptno value. Sample statement:
-- Execute the following statement: select deptno, sal, sum(sal) over (partition by deptno order by sal) from emp;Sample result:
+------------+------------+------------+ | deptno | sal | _c2 | +------------+------------+------------+ | 10 | 1300 | 2600 | -- This row is the first row of this window. The sum for the first row is the cumulative sum of the values in the first and second rows because the two rows have the same sal value. | 10 | 1300 | 2600 | -- The return value is the cumulative sum of the values in the first and second rows. | 10 | 2450 | 7500 | -- The sum for the third row is the cumulative sum of the values from the first row to the fourth row because the third and fourth rows have the same sal value. | 10 | 2450 | 7500 | -- The return value is the cumulative sum of the values from the first row to the fourth row. | 10 | 5000 | 17500 | | 10 | 5000 | 17500 | | 20 | 800 | 800 | | 20 | 1100 | 1900 | | 20 | 2975 | 4875 | | 20 | 3000 | 10875 | | 20 | 3000 | 10875 | | 30 | 950 | 950 | | 30 | 1250 | 3450 | | 30 | 1250 | 3450 | | 30 | 1500 | 4950 | | 30 | 1600 | 6550 | | 30 | 2850 | 9400 | +------------+------------+------------+
-