All Products
Search
Document Center

MaxCompute:AVG

Last Updated:Jul 27, 2023

Calculates the average value of a column or the average value of expr in a window.

Limits

Before you use window functions, take note of the following limits:

  • Window functions are supported only in SELECT statements.

  • A window function cannot contain nested window functions or nested aggregate functions.

  • You cannot use window functions together with aggregate functions of the same level.

Syntax

-- Calculate the average value of a column.
DECIMAL|DOUBLE avg(<colname>)

-- Calculate the average value of expr in a window. 
double avg([distinct] double <expr>) over ([partition_clause] [orderby_clause] [frame_clause])
decimal avg([distinct] decimal <expr>) over ([partition_clause] [orderby_clause] [frame_clause])

Description

Calculates the average value of a column or the average value of expr in a window.

Parameters

  • colname: required. Column values support all data types and can be converted into the DOUBLE type before calculation.

  • expr: required. This parameter specifies the expression that is used to calculate the returned result. The value is of the DOUBLE or DECIMAL type.

    • If the input value is of the STRING or BIGINT type, it is implicitly converted into a value of the DOUBLE type before calculation. If it is of another data type, an error is returned.

    • If the input value is NULL, the row that contains the value is not used for calculation.

    • If the distinct keyword is specified, the average value of distinct values is calculated.

  • partition_clause, orderby_clause, and frame_clause: For more information about these parameters, see windowing_definition.

Return value

  • If the input value of colname is null, the row that contains this value is not used for calculation. The following table describes the mappings between the data types of input data and return values.

    Data type of input data

    Data type of return values

    TINYINT

    DOUBLE

    SMALLINT

    DOUBLE

    INT

    DOUBLE

    BIGINT

    DOUBLE

    FLOAT

    DOUBLE

    DOUBLE

    DOUBLE

    DECIMAL

    DECIMAL

  • If the input value of expr is of the DECIMAL type, a value of the DECIMAL type is returned. If the input value is of another data type, a value of the DOUBLE type is returned. If the input value of expr is NULL, NULL is returned.

Sample data

This section provides sample source data and examples for you to understand how to use the functions. Create a table named emp and insert the sample data into the table. Sample statement:

create table if not exists emp
   (empno bigint,
    ename string,
    job string,
    mgr bigint,
    hiredate datetime,
    sal bigint,
    comm bigint,
    deptno bigint);
tunnel upload emp.txt emp;

The emp.txt file contains the following sample data:

7369,SMITH,CLERK,7902,1980-12-17 00:00:00,800,,20
7499,ALLEN,SALESMAN,7698,1981-02-20 00:00:00,1600,300,30
7521,WARD,SALESMAN,7698,1981-02-22 00:00:00,1250,500,30
7566,JONES,MANAGER,7839,1981-04-02 00:00:00,2975,,20
7654,MARTIN,SALESMAN,7698,1981-09-28 00:00:00,1250,1400,30
7698,BLAKE,MANAGER,7839,1981-05-01 00:00:00,2850,,30
7782,CLARK,MANAGER,7839,1981-06-09 00:00:00,2450,,10
7788,SCOTT,ANALYST,7566,1987-04-19 00:00:00,3000,,20
7839,KING,PRESIDENT,,1981-11-17 00:00:00,5000,,10
7844,TURNER,SALESMAN,7698,1981-09-08 00:00:00,1500,0,30
7876,ADAMS,CLERK,7788,1987-05-23 00:00:00,1100,,20
7900,JAMES,CLERK,7698,1981-12-03 00:00:00,950,,30
7902,FORD,ANALYST,7566,1981-12-03 00:00:00,3000,,20
7934,MILLER,CLERK,7782,1982-01-23 00:00:00,1300,,10
7948,JACCKA,CLERK,7782,1981-04-12 00:00:00,5000,,10
7956,WELAN,CLERK,7649,1982-07-20 00:00:00,2450,,10
7956,TEBAGE,CLERK,7748,1982-12-30 00:00:00,1300,,10

Examples

  • Example 1: Use the deptno column to define a window and calculate the average value of the sal column. ORDER BY is not specified. This function returns the cumulative average value of the values from the first row to the last row in the current window. The current window includes the rows that have the same deptno column value. Sample statement:

    select deptno, sal, avg(sal) over (partition by deptno) from emp;

    The following result is returned:

    +------------+------------+------------+
    | deptno     | sal        | _c2        |
    +------------+------------+------------+
    | 10         | 1300       | 2916.6666666666665 |   -- This row is the first row of this window. The return value is the cumulative average value of the values from the first row to the sixth row. 
    | 10         | 2450       | 2916.6666666666665 |   -- The return value is the cumulative average value of the values from the first row to the sixth row. 
    | 10         | 5000       | 2916.6666666666665 |   -- The return value is the cumulative average value of the values from the first row to the sixth row. 
    | 10         | 1300       | 2916.6666666666665 |
    | 10         | 5000       | 2916.6666666666665 |
    | 10         | 2450       | 2916.6666666666665 |
    | 20         | 3000       | 2175.0     |
    | 20         | 3000       | 2175.0     |
    | 20         | 800        | 2175.0     |
    | 20         | 1100       | 2175.0     |
    | 20         | 2975       | 2175.0     |
    | 30         | 1500       | 1566.6666666666667 |
    | 30         | 950        | 1566.6666666666667 |
    | 30         | 1600       | 1566.6666666666667 |
    | 30         | 1250       | 1566.6666666666667 |
    | 30         | 1250       | 1566.6666666666667 |
    | 30         | 2850       | 1566.6666666666667 |
    +------------+------------+------------+
  • Example 2: Disable the Hive-compatible data type edition and use the deptno column to define a window and calculate the average value of the sal column. ORDER BY is specified. This function returns the cumulative average value of the values from the first row to the current row in the current window. The current window includes the rows that have the same deptno value. Sample statement:

    -- Disable the Hive-compatible data type edition. 
    set odps.sql.hive.compatible=false;
    -- Execute the following statement: 
    select deptno, sal, avg(sal) over (partition by deptno order by sal) from emp;

    The following result is returned:

    +------------+------------+------------+
    | deptno     | sal        | _c2        |
    +------------+------------+------------+
    | 10         | 1300       | 1300.0     |           -- This row is the first row of this window. 
    | 10         | 1300       | 1300.0     |           -- The return value is the cumulative average value of the values in the first and second rows. 
    | 10         | 2450       | 1683.3333333333333 |   -- The return value is the cumulative average value of the values from the first row to the third row. 
    | 10         | 2450       | 1875.0     |           -- The return value is the cumulative average value of the values from the first row to the fourth row. 
    | 10         | 5000       | 2500.0     |           -- The return value is the cumulative average value of the values from the first row to the fifth row. 
    | 10         | 5000       | 2916.6666666666665 |   -- The return value is the cumulative average value of the values from the first row to the sixth row. 
    | 20         | 800        | 800.0      |
    | 20         | 1100       | 950.0      |
    | 20         | 2975       | 1625.0     |
    | 20         | 3000       | 1968.75    |
    | 20         | 3000       | 2175.0     |
    | 30         | 950        | 950.0      |
    | 30         | 1250       | 1100.0     |
    | 30         | 1250       | 1150.0     |
    | 30         | 1500       | 1237.5     |
    | 30         | 1600       | 1310.0     |
    | 30         | 2850       | 1566.6666666666667 |
    +------------+------------+------------+
  • Example 3: Enable the Hive-compatible data type edition and use the deptno column to define a window and calculate the average value of the sal column. ORDER BY is specified. This function returns the cumulative average value of the values from the first row to the row that has the same sal value as the current row in the current window. The average values for the rows that have the same sal value are the same. The current window includes the rows that have the same deptno value. Sample statement:

    -- Enable the Hive-compatible data type edition. 
    set odps.sql.hive.compatible=true;
    -- Execute the following statement: 
    select deptno, sal, avg(sal) over (partition by deptno order by sal) from emp;

    The following result is returned:

    +------------+------------+------------+
    | deptno     | sal        | _c2        |
    +------------+------------+------------+
    | 10         | 1300       | 1300.0     |          -- This row is the first row of this window. The average value for the first row is the cumulative average value of the values in the first and second rows because the two rows have the same sal value. 
    | 10         | 1300       | 1300.0     |          -- The return value is the cumulative average value of the values in the first and second rows. 
    | 10         | 2450       | 1875.0     |          -- The average value for the third row is the cumulative average value of the values from the first row to the fourth row because the third and fourth rows have the same sal value. 
    | 10         | 2450       | 1875.0     |          -- The return value is the cumulative average value of the values from the first row to the fourth row. 
    | 10         | 5000       | 2916.6666666666665 |
    | 10         | 5000       | 2916.6666666666665 |
    | 20         | 800        | 800.0      |
    | 20         | 1100       | 950.0      |
    | 20         | 2975       | 1625.0     |
    | 20         | 3000       | 2175.0     |
    | 20         | 3000       | 2175.0     |
    | 30         | 950        | 950.0      |
    | 30         | 1250       | 1150.0     |
    | 30         | 1250       | 1150.0     |
    | 30         | 1500       | 1237.5     |
    | 30         | 1600       | 1310.0     |
    | 30         | 2850       | 1566.6666666666667 |
    +------------+------------+------------+
  • Example 4: Calculate the average salary (sal) of all employees. Sample statement:

    select avg(sal) from emp;

    The following result is returned:

    +------------+
    | _c0        |
    +------------+
    | 2222.0588235294117 |
    +------------+
  • Example 5: Use this function with GROUP BY to group all employees by department (deptno) and calculate the average salary (sal) values of employees in each department. Sample statement:

    select deptno, avg(sal) from emp group by deptno;

    The following result is returned:

    +------------+------------+
    | deptno     | _c1        |
    +------------+------------+
    | 10         | 2916.6666666666665 |
    | 20         | 2175.0     |
    | 30         | 1566.6666666666667 |
    +------------+------------+

Related functions

AVG is an aggregate function or a window function.

  • For more information about the functions that are used to calculate the average value of multiple input records and to aggregate parameters, see Aggregate functions.

  • For more information about the functions that are used to calculate the sum of data of columns in a window and to sort data, see Window functions.