The STDDEV_SAMP function calculates the sample standard deviation.
Limits
Before you use window functions, take note of the following limits:
Window functions are supported only in
SELECTstatements.A window function cannot contain nested window functions or nested aggregate functions.
You cannot use window functions together with aggregate functions of the same level.
Syntax
-- Calculate the sample standard deviation.
DOUBLE STDDEV_SAMP(DOUBLE <colname>)
DECIMAL STDDEV_SAMP(DECIMAL <colname>)
-- Calculate the sample standard deviation of expr in a window.
DOUBLE STDDEV_SAMP([DISTINCT] <expr>) OVER([partition_clause] [orderby_clause] [frame_clause])
DECIMAL STDDEV_SAMP([DISTINCT] <expr>) OVER([partition_clause] [orderby_clause] [frame_clause])Parameters
colname: Required. The name of a column. The column can be of the DOUBLE or DECIMAL type. If the input is of the STRING or BIGINT type, the values are implicitly converted to the DOUBLE type for the calculation.
expr: Required. The expression for which you want to calculate the sample standard deviation. The expression must evaluate to a DOUBLE or DECIMAL value.
If an input value is of the STRING or BIGINT type, it is implicitly converted to the DOUBLE type for the calculation. An error is returned for other data types.
If a value is NULL, the corresponding row is ignored in the calculation.
If you specify the DISTINCT keyword, the function calculates the sample standard deviation of unique values.
partition_clause, orderby_clause, and frame_clause: For more information, see windowing_definition.
Return value
If the value of colname is NULL, the row is ignored in the calculation. For other data types, the return type is determined as follows:
Input type
Return type
TINYINT
DOUBLE
SMALLINT
DOUBLE
INT
DOUBLE
BIGINT
DOUBLE
FLOAT
DOUBLE
DOUBLE
DOUBLE
DECIMAL
DECIMAL
The return value is of the same data type as expr. If all values for expr are NULL, the function returns NULL. If the window contains only one non-NULL value for expr, the function returns 0.
NoteIf the window contains only one non-NULL value for expr and you run the
SET odps.sql.bigquery.compatible=true;command to enable BigQuery compatibility mode, the function returns NULL.
Sample data
This section provides sample source data and examples for you to understand how to use the functions. Create a table named emp and insert the sample data into the table. Sample statement:
create table if not exists emp
(empno bigint,
ename string,
job string,
mgr bigint,
hiredate datetime,
sal bigint,
comm bigint,
deptno bigint);
tunnel upload emp.txt emp; -- Replace emp.txt with the actual path (path and name) to which you upload the data file.The emp.txt file contains the following sample data:
7369,SMITH,CLERK,7902,1980-12-17 00:00:00,800,,20
7499,ALLEN,SALESMAN,7698,1981-02-20 00:00:00,1600,300,30
7521,WARD,SALESMAN,7698,1981-02-22 00:00:00,1250,500,30
7566,JONES,MANAGER,7839,1981-04-02 00:00:00,2975,,20
7654,MARTIN,SALESMAN,7698,1981-09-28 00:00:00,1250,1400,30
7698,BLAKE,MANAGER,7839,1981-05-01 00:00:00,2850,,30
7782,CLARK,MANAGER,7839,1981-06-09 00:00:00,2450,,10
7788,SCOTT,ANALYST,7566,1987-04-19 00:00:00,3000,,20
7839,KING,PRESIDENT,,1981-11-17 00:00:00,5000,,10
7844,TURNER,SALESMAN,7698,1981-09-08 00:00:00,1500,0,30
7876,ADAMS,CLERK,7788,1987-05-23 00:00:00,1100,,20
7900,JAMES,CLERK,7698,1981-12-03 00:00:00,950,,30
7902,FORD,ANALYST,7566,1981-12-03 00:00:00,3000,,20
7934,MILLER,CLERK,7782,1982-01-23 00:00:00,1300,,10
7948,JACCKA,CLERK,7782,1981-04-12 00:00:00,5000,,10
7956,WELAN,CLERK,7649,1982-07-20 00:00:00,2450,,10
7956,TEBAGE,CLERK,7748,1982-12-30 00:00:00,1300,,10Examples
Example 1: Calculate the sample standard deviation of salaries (`sal`) for each department (`deptno`). The window is partitioned by `deptno`. Because an `ORDER BY` clause is not specified, the function calculates the standard deviation over all rows in the partition and returns the same value for each row. The following command is an example:
SELECT deptno, sal, STDDEV_SAMP(sal) OVER (PARTITION BY deptno) FROM emp;The following result is returned:
+------------+------------+------------+ | deptno | sal | _c2 | +------------+------------+------------+ | 10 | 1300 | 1693.7138680032904 | -- The first row of the window. The sample standard deviation is calculated over all 6 rows in the partition. | 10 | 2450 | 1693.7138680032904 | -- The sample standard deviation is calculated over all 6 rows in the partition. | 10 | 5000 | 1693.7138680032904 | -- The sample standard deviation is calculated over all 6 rows in the partition. | 10 | 1300 | 1693.7138680032904 | | 10 | 5000 | 1693.7138680032904 | | 10 | 2450 | 1693.7138680032904 | | 20 | 3000 | 1123.3320969330487 | | 20 | 3000 | 1123.3320969330487 | | 20 | 800 | 1123.3320969330487 | | 20 | 1100 | 1123.3320969330487 | | 20 | 2975 | 1123.3320969330487 | | 30 | 1500 | 668.331255192114 | | 30 | 950 | 668.331255192114 | | 30 | 1600 | 668.331255192114 | | 30 | 1250 | 668.331255192114 | | 30 | 1250 | 668.331255192114 | | 30 | 2850 | 668.331255192114 | +------------+------------+------------+Example 2: Calculate the cumulative sample standard deviation of salaries (`sal`) for each department (`deptno`). The window is partitioned by `deptno` and ordered by `sal`. The function calculates the standard deviation for the range from the first row to the current row in the partition. The following command is an example:
SELECT deptno, sal, STDDEV_SAMP(sal) OVER (PARTITION BY deptno ORDER BY sal) FROM emp;The following result is returned:
+------------+------------+------------+ | deptno | sal | _c2 | +------------+------------+------------+ | 10 | 1300 | 0.0 | -- The first row of the window. | 10 | 1300 | 0.0 | -- The cumulative sample standard deviation of rows 1 to 2. | 10 | 2450 | 663.9528095680697 | -- The cumulative sample standard deviation of rows 1 to 3. | 10 | 2450 | 663.9528095680696 | | 10 | 5000 | 1511.2081259707413 | | 10 | 5000 | 1693.7138680032904 | | 20 | 800 | 0.0 | | 20 | 1100 | 212.13203435596427 | | 20 | 2975 | 1178.7175234126282 | | 20 | 3000 | 1182.7536725793752 | | 20 | 3000 | 1123.3320969330487 | | 30 | 950 | 0.0 | | 30 | 1250 | 212.13203435596427 | | 30 | 1250 | 173.20508075688772 | | 30 | 1500 | 225.0 | | 30 | 1600 | 253.4758371127315 | | 30 | 2850 | 668.331255192114 | +------------+------------+------------+Example 3: Calculate the sample standard deviation of salaries (`sal`) for all employees. The following command is an example:
SELECT STDDEV_SAMP(sal) FROM emp;The following result is returned:
+------------+ | _c0 | +------------+ | 1301.6180541247609 | +------------+Example 4: Use this function with
GROUP BYto group employees by department (`deptno`) and calculate the sample standard deviation of salaries (`sal`) for each department. The following command is an example:SELECT deptno, STDDEV_SAMP(sal) FROM emp GROUP BY deptno;The following result is returned:
+------------+------------+ | deptno | _c1 | +------------+------------+ | 10 | 1693.7138680032901 | | 20 | 1123.3320969330487 | | 30 | 668.3312551921141 | +------------+------------+
Related functions
STDDEV_SAMP is an aggregate function or a window function.
For more information about aggregate functions, see Aggregate functions.
For more information about window functions, see Window functions.