MaxCompute: SUM

Last Updated: Jul 26, 2023

Calculates the sum of all input values.

Limits

Before you use window functions, take note of the following limits:

  • Window functions are supported only in SELECT statements.

  • A window function cannot contain nested window functions or nested aggregate functions.

  • You cannot use window functions together with aggregate functions of the same level.

Syntax

-- Calculate the sum of a column.
DECIMAL|DOUBLE|BIGINT sum(<colname>)

-- Calculate the sum of expr in a window.
sum([distinct] <expr>) over ([partition_clause] [orderby_clause] [frame_clause])

Description

Calculates the sum of a column or the sum of expr in a window.

Parameters

  • colname: required. The column whose values you want to sum. Column values can be of any data type that can be converted into the DOUBLE type before calculation. If the specified column is of the STRING type, its values are implicitly converted into the DOUBLE type before calculation.

  • expr: required. This parameter specifies the column whose sum you want to calculate. The column can be of the DOUBLE, DECIMAL, or BIGINT type.

    • If an input value is of the STRING type, it is implicitly converted into a value of the DOUBLE type before calculation. If it is of another data type, an error is returned.

    • If the input value in a row is null, the row is not used for calculation.

    • If the distinct keyword is specified, the sum of distinct values is calculated.

  • partition_clause, orderby_clause, and frame_clause: For more information about these parameters, see windowing_definition.
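
The input rules above (null rows are skipped, STRING values are converted to DOUBLE, and distinct sums each value once) can be modeled outside MaxCompute. The following Python sketch is an illustration only, not MaxCompute code: sum_like is a hypothetical helper, None stands in for NULL, and returning None when every input is null is an assumption based on common SQL behavior rather than something this page states.

```python
def sum_like(values, distinct=False):
    """Model of the documented SUM input rules (illustration, not MaxCompute code)."""
    converted = []
    for v in values:
        if v is None:
            continue  # a null row is not used for calculation
        if isinstance(v, str):
            v = float(v)  # STRING input is implicitly converted to DOUBLE
        converted.append(v)
    if distinct:
        converted = set(converted)  # each distinct value is summed once
    if not converted:
        return None  # assumption: no non-null input yields NULL
    return sum(converted)


# comm-like values with nulls: only 300, 500, 1400, and 0 are summed.
print(sum_like([None, 300, 500, None, 1400, None, 0]))  # 2200
# distinct: the repeated 1250 is counted once.
print(sum_like([1250, 1250, 1500], distinct=True))  # 2750
# STRING inputs are converted to floating-point values first.
print(sum_like(["1.5", "2.5", None]))  # 4.0
```

The sketch does not model the error that is returned for unsupported input types.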

Return value

If the value of the column specified by colname in a row is null, the row is not used for calculation. The following table describes the mappings between data types of input data and return values.

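+------------+-------------------+
| Input type | Return value type |
+------------+-------------------+
| TINYINT    | BIGINT            |
| SMALLINT   | BIGINT            |
| INT        | BIGINT            |
| BIGINT     | BIGINT            |
| FLOAT      | DOUBLE            |
| DOUBLE     | DOUBLE            |
| DECIMAL    | DECIMAL           |
+------------+-------------------+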

Sample data

This section provides sample source data and examples that show how to use the function. Create a table named emp and insert the sample data into the table. Sample statement:

create table if not exists emp
   (empno bigint,
    ename string,
    job string,
    mgr bigint,
    hiredate datetime,
    sal bigint,
    comm bigint,
    deptno bigint);
tunnel upload emp.txt emp;

The emp.txt file contains the following sample data:

7369,SMITH,CLERK,7902,1980-12-17 00:00:00,800,,20
7499,ALLEN,SALESMAN,7698,1981-02-20 00:00:00,1600,300,30
7521,WARD,SALESMAN,7698,1981-02-22 00:00:00,1250,500,30
7566,JONES,MANAGER,7839,1981-04-02 00:00:00,2975,,20
7654,MARTIN,SALESMAN,7698,1981-09-28 00:00:00,1250,1400,30
7698,BLAKE,MANAGER,7839,1981-05-01 00:00:00,2850,,30
7782,CLARK,MANAGER,7839,1981-06-09 00:00:00,2450,,10
7788,SCOTT,ANALYST,7566,1987-04-19 00:00:00,3000,,20
7839,KING,PRESIDENT,,1981-11-17 00:00:00,5000,,10
7844,TURNER,SALESMAN,7698,1981-09-08 00:00:00,1500,0,30
7876,ADAMS,CLERK,7788,1987-05-23 00:00:00,1100,,20
7900,JAMES,CLERK,7698,1981-12-03 00:00:00,950,,30
7902,FORD,ANALYST,7566,1981-12-03 00:00:00,3000,,20
7934,MILLER,CLERK,7782,1982-01-23 00:00:00,1300,,10
7948,JACCKA,CLERK,7782,1981-04-12 00:00:00,5000,,10
7956,WELAN,CLERK,7649,1982-07-20 00:00:00,2450,,10
7956,TEBAGE,CLERK,7748,1982-12-30 00:00:00,1300,,10

Examples

  • Example 1: Use the deptno column to define a window and calculate the sum of the sal column. The ORDER BY clause is not specified, so the function returns the sum of all rows in the current window for every row in that window. The current window includes the rows that have the same deptno value. Sample statement:

    select deptno, sal, sum(sal) over (partition by deptno) from emp;

    The following result is returned:

    +------------+------------+------------+
    | deptno     | sal        | _c2        |
    +------------+------------+------------+
    | 10         | 1300       | 17500      |   -- This row is the first row of this window. The return value is the cumulative sum of the values from the first row to the sixth row. 
    | 10         | 2450       | 17500      |   -- The return value is the cumulative sum of the values from the first row to the sixth row. 
    | 10         | 5000       | 17500      |   -- The return value is the cumulative sum of the values from the first row to the sixth row. 
    | 10         | 1300       | 17500      |
    | 10         | 5000       | 17500      |
    | 10         | 2450       | 17500      |
    | 20         | 3000       | 10875      |
    | 20         | 3000       | 10875      |
    | 20         | 800        | 10875      |
    | 20         | 1100       | 10875      |
    | 20         | 2975       | 10875      |
    | 30         | 1500       | 9400       |
    | 30         | 950        | 9400       |
    | 30         | 1600       | 9400       |
    | 30         | 1250       | 9400       |
    | 30         | 1250       | 9400       |
    | 30         | 2850       | 9400       |
    +------------+------------+------------+
  • Example 2: In non-Hive-compatible mode, use the deptno column to define a window and calculate the sum of the sal column. The ORDER BY clause is specified. This function returns the cumulative sum of the values from the first row to the current row in the current window. The current window includes the rows that have the same deptno value. Sample statements:

    -- Disable the Hive-compatible mode. 
    set odps.sql.hive.compatible=false;
    -- Execute the following statement: 
    select deptno, sal, sum(sal) over (partition by deptno order by sal) from emp;

    The following result is returned:

    +------------+------------+------------+
    | deptno     | sal        | _c2        |
    +------------+------------+------------+
    | 10         | 1300       | 1300       |   -- This row is the starting row of this window. 
    | 10         | 1300       | 2600       |   -- The return value is the cumulative sum of the values in the first and second rows. 
    | 10         | 2450       | 5050       |   -- The return value is the cumulative sum of the values from the first row to the third row. 
    | 10         | 2450       | 7500       |
    | 10         | 5000       | 12500      |
    | 10         | 5000       | 17500      |
    | 20         | 800        | 800        |
    | 20         | 1100       | 1900       |
    | 20         | 2975       | 4875       |
    | 20         | 3000       | 7875       |
    | 20         | 3000       | 10875      |
    | 30         | 950        | 950        |
    | 30         | 1250       | 2200       |
    | 30         | 1250       | 3450       |
    | 30         | 1500       | 4950       |
    | 30         | 1600       | 6550       |
    | 30         | 2850       | 9400       |
    +------------+------------+------------+
  • Example 3: In Hive-compatible mode, use the deptno column to define a window and calculate the sum of the sal column. The ORDER BY clause is specified. This function returns the cumulative sum of the values from the first row to the row that has the same sal value as the current row in the current window. The sum values for the rows that have the same sal value are the same. The current window includes the rows that have the same deptno value. Sample statements:

    -- Enable the Hive-compatible mode. 
    set odps.sql.hive.compatible=true;
    -- Execute the following statement: 
    select deptno, sal, sum(sal) over (partition by deptno order by sal) from emp;

    The following result is returned:

    +------------+------------+------------+
    | deptno     | sal        | _c2        |
    +------------+------------+------------+
    | 10         | 1300       | 2600       |   -- This row is the first row of this window. The sum for the first row is the cumulative sum of the values in the first and second rows because the two rows have the same sal value. 
    | 10         | 1300       | 2600       |   -- The return value is the cumulative sum of the values in the first and second rows. 
    | 10         | 2450       | 7500       |   -- The sum for the third row is the cumulative sum of the values from the first row to the fourth row because the third and fourth rows have the same sal value. 
    | 10         | 2450       | 7500       |   -- The return value is the cumulative sum of the values from the first row to the fourth row. 
    | 10         | 5000       | 17500      |
    | 10         | 5000       | 17500      |
    | 20         | 800        | 800        |
    | 20         | 1100       | 1900       |
    | 20         | 2975       | 4875       |
    | 20         | 3000       | 10875      |
    | 20         | 3000       | 10875      |
    | 30         | 950        | 950        |
    | 30         | 1250       | 3450       |
    | 30         | 1250       | 3450       |
    | 30         | 1500       | 4950       |
    | 30         | 1600       | 6550       |
    | 30         | 2850       | 9400       |
    +------------+------------+------------+
  • Example 4: Calculate the sum of salary (sal) values of all employees. Sample statement:

    select sum(sal) from emp;

    The following result is returned:

    +------------+
    | _c0        |
    +------------+
    | 37775      |
    +------------+
  • Example 5: Use this function with GROUP BY to group all employees by department (deptno) and calculate the sum of salary values of employees in each department. Sample statement:

    select deptno, sum(sal) from emp group by deptno;

    The following result is returned:

    +------------+------------+
    | deptno     | _c1        |
    +------------+------------+
    | 10         | 17500      |
    | 20         | 10875      |
    | 30         | 9400       |
    +------------+------------+
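
The three window results in Examples 1 to 3 differ only in which rows contribute to each sum. The following Python sketch reproduces them from the (deptno, sal) pairs of the sample data. It is an illustration, not MaxCompute code: window_sums and its mode names are hypothetical, and the output is sorted by deptno and sal, so the row order can differ from the Example 1 result.

```python
from itertools import groupby

# (deptno, sal) pairs taken from the emp sample data.
EMP = [
    (20, 800), (30, 1600), (30, 1250), (20, 2975), (30, 1250), (30, 2850),
    (10, 2450), (20, 3000), (10, 5000), (30, 1500), (20, 1100), (30, 950),
    (20, 3000), (10, 1300), (10, 5000), (10, 2450), (10, 1300),
]


def window_sums(rows, mode):
    """Return (deptno, sal, sum) tuples sorted by deptno, then sal.

    mode="partition": no ORDER BY, every row gets the partition total (Example 1).
    mode="rows":      running sum up to the current row (Example 2).
    mode="peers":     running sum that also includes rows tied on sal (Example 3).
    """
    out = []
    for deptno, part in groupby(sorted(rows), key=lambda r: r[0]):
        sals = [sal for _, sal in part]
        total = sum(sals)
        running = 0
        for sal in sals:
            running += sal
            if mode == "partition":
                value = total
            elif mode == "rows":
                value = running
            else:
                # Rows with the same sal are peers and share one sum.
                value = sum(s for s in sals if s <= sal)
            out.append((deptno, sal, value))
    return out


for mode in ("partition", "rows", "peers"):
    print(mode, [s for d, _, s in window_sums(EMP, mode) if d == 10])
# partition [17500, 17500, 17500, 17500, 17500, 17500]
# rows [1300, 2600, 5050, 7500, 12500, 17500]
# peers [2600, 2600, 7500, 7500, 17500, 17500]
```

The last value of each partition is the same in all three modes and matches the per-department totals that GROUP BY returns in Example 5.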

Related functions

SUM can be used as an aggregate function or as a window function.

  • For more information about functions that aggregate multiple input records, such as functions that calculate average values, see Aggregate functions.

  • For more information about functions that calculate sums or sort data within a window, see Window functions.