AnalyticDB for PostgreSQL supports reusable aggregate functions. When you execute multiple reusable aggregate functions on the same columns, AnalyticDB for PostgreSQL uses the first aggregate process and returns the results of each column. This feature significantly reduces the amount of time required to compute reusable aggregate functions.

Note Aggregate functions can be reused only for the same columns.

Determine whether aggregate functions can be reused

Aggregate functions work in three steps: value initialization, aggregate computation, and result computation. Aggregate functions can be reused if they have the same initialized value and the same aggregate computation method. You can execute the following statement to query reusable aggregation functions:

SELECT rank() over (ORDER BY aggtransfn,agginitval), aggfnoid,aggtransfn,agginitval FROM pg_aggregate;

For more information about the query results, see the "List of reusable aggregate functions" section of this topic.

Examples

Create a test table and insert data into the table.

CREATE TABLE agg_test(x float8, y float8);
INSERT INTO agg_test SELECT 10000*random(), 10000*random() FROM generate_series(1,50000000);

The following section shows the effect of aggregate function reuse in V7.0 and V6.0:

  • V7.0
    • A single aggregate function
      EXPLAIN (analyze,verbose,timing,costs,buffers) SELECT corr(y,x) FROM agg_test;

      The following execution plan of a single aggregate function is returned. It can be used for comparison with the execution plan of 11 aggregate functions.

                                                                        QUERY PLAN
      -----------------------------------------------------------------------------------------------------------------------------------------------
       Finalize Aggregate  (cost=0.00..1408.39 rows=1 width=8) (actual time=3972.170..3972.170 rows=1 loops=1)
         Output: corr(y, x)
         ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..1408.39 rows=1 width=8) (actual time=3748.570..3972.148 rows=3 loops=1)
               Output: (PARTIAL corr(y, x))
               ->  Partial Aggregate  (cost=0.00..1408.39 rows=1 width=8) (actual time=3748.602..3748.602 rows=1 loops=1)
                     Output: PARTIAL corr(y, x)
                     ->  Seq Scan on public.agg_test  (cost=0.00..852.66 rows=16666510 width=16) (actual time=0.249..1982.787 rows=16668088 loops=1)
                           Output: x, y
       Optimizer: Pivotal Optimizer (GPORCA)
       Planning Time: 8.248 ms
         (slice0)    Executor memory: 39K bytes.
         (slice1)    Executor memory: 37K bytes avg x 3 workers, 37K bytes max (seg0).
       Memory used:  128000kB
       Execution Time: 3973.034 ms
      (14 rows)
    • 11 reusable aggregate functions
      EXPLAIN (analyze,verbose,timing,costs,buffers) SELECT corr(y,x), covar_pop(y,x), covar_samp(y,x), regr_avgx(y,x), regr_avgy(y,x), regr_intercept(y,x), regr_r2(y,x), regr_slope(y,x), regr_sxx(y,x), regr_sxy(y,x), regr_syy(y,x) FROM agg_test;

      The following execution plan is returned. Compared with the execution plan of a single aggregate function, the execution process is the same and the execution duration is similar.

                                                                                                                                                            QUERY PLAN
      ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
       Finalize Aggregate  (cost=0.00..2005.72 rows=1 width=88) (actual time=4060.219..4060.219 rows=1 loops=1)
         Output: corr(y, x), covar_pop(y, x), covar_samp(y, x), regr_avgx(y, x), regr_avgy(y, x), regr_intercept(y, x), regr_r2(y, x), regr_slope(y, x), regr_sxx(y, x), regr_sxy(y, x), regr_syy(y, x)
         ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..2005.72 rows=1 width=88) (actual time=3682.296..4060.182 rows=3 loops=1)
               Output: (PARTIAL corr(y, x)), (PARTIAL covar_pop(y, x)), (PARTIAL covar_samp(y, x)), (PARTIAL regr_avgx(y, x)), (PARTIAL regr_avgy(y, x)), (PARTIAL regr_intercept(y, x)), (PARTIAL regr_r2(y, x)), (PARTIAL regr_slope(y, x)), (PARTIAL regr_sxx(y, x)), (PARTIAL regr_sxy(y, x)), (PARTIAL regr_syy(y, x))
               ->  Partial Aggregate  (cost=0.00..2005.72 rows=1 width=88) (actual time=3681.908..3681.909 rows=1 loops=1)
                     Output: PARTIAL corr(y, x), PARTIAL covar_pop(y, x), PARTIAL covar_samp(y, x), PARTIAL regr_avgx(y, x), PARTIAL regr_avgy(y, x), PARTIAL regr_intercept(y, x), PARTIAL regr_r2(y, x), PARTIAL regr_slope(y, x), PARTIAL regr_sxx(y, x), PARTIAL regr_sxy(y, x), PARTIAL regr_syy(y, x)
                     ->  Seq Scan on public.agg_test  (cost=0.00..852.66 rows=16666510 width=16) (actual time=0.075..1909.142 rows=16668088 loops=1)
                           Output: x, y
       Optimizer: Pivotal Optimizer (GPORCA)
       Planning Time: 14.660 ms
         (slice0)    Executor memory: 84K bytes.
         (slice1)    Executor memory: 37K bytes avg x 3 workers, 37K bytes max (seg0).
       Memory used:  128000kB
       Execution Time: 4061.236 ms
      (14 rows)
  • V6.0
    • A single aggregate function
      EXPLAIN (analyze,verbose,timing,costs,buffers) SELECT corr(y,x) FROM agg_test;

      The following execution plan of a single aggregate function is returned. It can be used for comparison with the execution plan of 11 aggregate functions.

                                                                          QUERY PLAN
      --------------------------------------------------------------------------------------------------------------------------------------------------
       Aggregate  (cost=692190.42..692190.43 rows=1 width=8) (actual time=10206.860..10206.861 rows=1 loops=1)
         Output: pg_catalog.corr((corr(agg_test.y, agg_test.x)))
         Executor Memory: 8kB  Segments: 1  Max: 8kB (segment -1)
         ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=692190.35..692190.40 rows=1 width=32) (actual time=10206.810..10206.812 rows=3 loops=1)
               Output: (corr(agg_test.y, agg_test.x))
               ->  Aggregate  (cost=692190.35..692190.36 rows=1 width=32) (actual time=10143.501..10143.502 rows=1 loops=1)
                     Output: corr(agg_test.y, agg_test.x)
                     Executor Memory: 24kB  Segments: 3  Max: 8kB (segment 0)
                     ->  Seq Scan on public.agg_test  (cost=0.00..567193.48 rows=16666250 width=16) (actual time=0.103..5710.533 rows=16667260 loops=1)
                           Output: agg_test.y, agg_test.x
       Planning time: 1.394 ms
         (slice0)    Executor memory: 372K bytes.
         (slice1)    Executor memory: 73K bytes avg x 3 workers, 73K bytes max (seg0).
       Memory used:  128000kB
       Optimizer: Postgres query optimizer
       Execution time: 10222.988 ms
      (16 rows)
    • 11 reusable aggregate functions
      EXPLAIN (analyze,verbose,timing,costs,buffers) SELECT corr(y,x), covar_pop(y,x), covar_samp(y,x), regr_avgx(y,x), regr_avgy(y,x), regr_intercept(y,x), regr_r2(y,x), regr_slope(y,x), regr_sxx(y,x), regr_sxy(y,x), regr_syy(y,x) FROM agg_test;

      The following execution plan is returned. Compared with the execution plan of a single aggregate function, the execution process is significantly different and the execution duration is much longer.

      
                                                            QUERY PLAN
      
      ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      -----------------------------------------------------------------------------------------------------------------------
       Aggregate  (cost=1942159.19..1942159.20 rows=1 width=88) (actual time=39555.642..39555.643 rows=1 loops=1)
         Output: pg_catalog.corr((corr(agg_test.y, agg_test.x))), pg_catalog.covar_pop((covar_pop(agg_test.y, agg_test.x))), pg_catalog.covar_samp((covar_samp(agg_test.y, agg_test.x))), pg_catalog.regr_avgx((regr_avgx(agg_test.y, agg_test.x))), pg_catalog.regr_avgy((regr_a
      vgy(agg_test.y, agg_test.x))), pg_catalog.regr_intercept((regr_intercept(agg_test.y, agg_test.x))), pg_catalog.regr_r2((regr_r2(agg_test.y, agg_test.x))), pg_catalog.regr_slope((regr_slope(agg_test.y, agg_test.x))), pg_catalog.regr_sxx((regr_sxx(agg_test.y, agg_test.
      x))), pg_catalog.regr_sxy((regr_sxy(agg_test.y, agg_test.x))), pg_catalog.regr_syy((regr_syy(agg_test.y, agg_test.x)))
         Executor Memory: 8kB  Segments: 1  Max: 8kB (segment -1)
         ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=1942159.08..1942159.13 rows=1 width=352) (actual time=39555.594..39555.596 rows=3 loops=1)
               Output: (corr(agg_test.y, agg_test.x)), (covar_pop(agg_test.y, agg_test.x)), (covar_samp(agg_test.y, agg_test.x)), (regr_avgx(agg_test.y, agg_test.x)), (regr_avgy(agg_test.y, agg_test.x)), (regr_intercept(agg_test.y, agg_test.x)), (regr_r2(agg_test.y, agg_te
      st.x)), (regr_slope(agg_test.y, agg_test.x)), (regr_sxx(agg_test.y, agg_test.x)), (regr_sxy(agg_test.y, agg_test.x)), (regr_syy(agg_test.y, agg_test.x))
               ->  Aggregate  (cost=1942159.08..1942159.09 rows=1 width=352) (actual time=39253.556..39253.556 rows=1 loops=1)
                     Output: corr(agg_test.y, agg_test.x), covar_pop(agg_test.y, agg_test.x), covar_samp(agg_test.y, agg_test.x), regr_avgx(agg_test.y, agg_test.x), regr_avgy(agg_test.y, agg_test.x), regr_intercept(agg_test.y, agg_test.x), regr_r2(agg_test.y, agg_test.x),
      regr_slope(agg_test.y, agg_test.x), regr_sxx(agg_test.y, agg_test.x), regr_sxy(agg_test.y, agg_test.x), regr_syy(agg_test.y, agg_test.x)
                     Executor Memory: 24kB  Segments: 3  Max: 8kB (segment 0)
                     ->  Seq Scan on public.agg_test  (cost=0.00..567193.48 rows=16666250 width=16) (actual time=0.109..6214.809 rows=16667260 loops=1)
                           Output: agg_test.y, agg_test.x
       Planning time: 1.427 ms
         (slice0)    Executor memory: 781K bytes.
         (slice1)    Executor memory: 156K bytes avg x 3 workers, 156K bytes max (seg0).
       Memory used:  128000kB
       Optimizer: Postgres query optimizer
       Execution time: 39557.902 ms
      (16 rows)

Test summary: In V6.0, the execution duration of 11 aggregate functions is three to four times that of a single aggregate function. In V7.0, the execution duration of 11 aggregate functions is basically the same as that of a single aggregate function. As such, aggregate function reuse significantly reduces the computation time.

List of reusable aggregate functions

Aggregate functions whose values are the same in the rank column can be reused.

 rank |           aggfnoid            |                aggtransfn                 |     agginitval
------+-------------------------------+-------------------------------------------+---------------------
    1 | pg_catalog.sum                | float4pl                                  |
    2 | pg_catalog.var_samp           | float4_accum                              | {0,0,0}
    2 | pg_catalog.stddev_pop         | float4_accum                              | {0,0,0}
    2 | pg_catalog.avg                | float4_accum                              | {0,0,0}
    2 | pg_catalog.var_pop            | float4_accum                              | {0,0,0}
    2 | pg_catalog.stddev             | float4_accum                              | {0,0,0}
    2 | pg_catalog.variance           | float4_accum                              | {0,0,0}
    2 | pg_catalog.stddev_samp        | float4_accum                              | {0,0,0}
    9 | pg_catalog.max                | float4larger                              |
   10 | pg_catalog.min                | float4smaller                             |
   11 | pg_catalog.sum                | float8pl                                  |
   12 | pg_catalog.stddev_pop         | float8_accum                              | {0,0,0}
   12 | pg_catalog.stddev_samp        | float8_accum                              | {0,0,0}
   12 | pg_catalog.avg                | float8_accum                              | {0,0,0}
   12 | pg_catalog.stddev             | float8_accum                              | {0,0,0}
   12 | pg_catalog.var_pop            | float8_accum                              | {0,0,0}
   12 | pg_catalog.var_samp           | float8_accum                              | {0,0,0}
   12 | pg_catalog.variance           | float8_accum                              | {0,0,0}
   19 | pg_catalog.max                | float8larger                              |
   20 | pg_catalog.min                | float8smaller                             |
   21 | pg_catalog.max                | text_larger                               |
   22 | pg_catalog.min                | text_smaller                              |
   23 | pg_catalog.max                | array_larger                              |
   24 | pg_catalog.min                | array_smaller                             |
   25 | pg_catalog.max                | int4larger                                |
   26 | pg_catalog.min                | int4smaller                               |
   27 | pg_catalog.max                | int2larger                                |
   28 | pg_catalog.min                | int2smaller                               |
   29 | pg_catalog.sum                | cash_pl                                   |
   30 | pg_catalog.max                | cashlarger                                |
   31 | pg_catalog.min                | cashsmaller                               |
   32 | pg_catalog.max                | bpchar_larger                             |
   33 | pg_catalog.min                | bpchar_smaller                            |
   34 | pg_catalog.max                | date_larger                               |
   35 | pg_catalog.min                | date_smaller                              |
   36 | pg_catalog.sum                | interval_pl                               |
   37 | pg_catalog.min                | timestamptz_smaller                       |
   38 | pg_catalog.max                | timestamptz_larger                        |
   39 | pg_catalog.min                | interval_smaller                          |
   40 | pg_catalog.max                | interval_larger                           |
   41 | pg_catalog.count              | int8inc                                   | 0
   42 | pg_catalog.max                | int8larger                                |
   43 | pg_catalog.min                | int8smaller                               |
   44 | pg_catalog.max                | time_larger                               |
   45 | pg_catalog.min                | time_smaller                              |
   46 | pg_catalog.max                | timetz_larger                             |
   47 | pg_catalog.min                | timetz_smaller                            |
   48 | pg_catalog.bit_and            | bitand                                    |
   49 | pg_catalog.bit_or             | bitor                                     |
   50 | pg_catalog.min                | numeric_smaller                           |
   51 | pg_catalog.max                | numeric_larger                            |
   52 | pg_catalog.stddev_pop         | numeric_accum                             |
   52 | pg_catalog.var_pop            | numeric_accum                             |
   52 | pg_catalog.var_samp           | numeric_accum                             |
   52 | pg_catalog.variance           | numeric_accum                             |
   52 | pg_catalog.stddev_samp        | numeric_accum                             |
   52 | pg_catalog.stddev             | numeric_accum                             |
   58 | pg_catalog.stddev             | int2_accum                                |
   58 | pg_catalog.stddev_pop         | int2_accum                                |
   58 | pg_catalog.stddev_samp        | int2_accum                                |
   58 | pg_catalog.variance           | int2_accum                                |
   58 | pg_catalog.var_pop            | int2_accum                                |
   58 | pg_catalog.var_samp           | int2_accum                                |
   64 | pg_catalog.stddev_pop         | int4_accum                                |
   64 | pg_catalog.var_pop            | int4_accum                                |
   64 | pg_catalog.stddev             | int4_accum                                |
   64 | pg_catalog.var_samp           | int4_accum                                |
   64 | pg_catalog.stddev_samp        | int4_accum                                |
   64 | pg_catalog.variance           | int4_accum                                |
   70 | pg_catalog.stddev_samp        | int8_accum                                |
   70 | pg_catalog.stddev_pop         | int8_accum                                |
   70 | pg_catalog.var_pop            | int8_accum                                |
   70 | pg_catalog.variance           | int8_accum                                |
   70 | pg_catalog.var_samp           | int8_accum                                |
   70 | pg_catalog.stddev             | int8_accum                                |
   76 | pg_catalog.sum                | int2_sum                                  |
   77 | pg_catalog.sum                | int4_sum                                  |
   78 | pg_catalog.avg                | interval_accum                            | {0 second,0 second}
   79 | pg_catalog.bit_and            | int2and                                   |
   80 | pg_catalog.bit_or             | int2or                                    |
   81 | pg_catalog.bit_and            | int4and                                   |
   82 | pg_catalog.bit_or             | int4or                                    |
   83 | pg_catalog.bit_and            | int8and                                   |
   84 | pg_catalog.bit_or             | int8or                                    |
   85 | pg_catalog.avg                | int2_avg_accum                            | {0,0}
   86 | pg_catalog.avg                | int4_avg_accum                            | {0,0}
   87 | pg_catalog.max                | oidlarger                                 |
   88 | pg_catalog.min                | oidsmaller                                |
   89 | pg_catalog.min                | timestamp_smaller                         |
   90 | pg_catalog.max                | timestamp_larger                          |
   91 | pg_catalog.array_agg          | array_agg_transfn                         |
   91 | pg_catalog.gp_array_agg       | array_agg_transfn                         |
   93 | every                         | booland_statefunc                         |
   93 | bool_and                      | booland_statefunc                         |
   95 | bool_or                       | boolor_statefunc                          |
   96 | pg_catalog.sum                | int8_avg_accum                            |
   96 | pg_catalog.avg                | int8_avg_accum                            |
   98 | pg_catalog.max                | tidlarger                                 |
   99 | pg_catalog.min                | tidsmaller                                |
  100 | pg_catalog.count              | int8inc_any                               | 0
  101 | regr_count                    | int8inc_float8_float8                     | 0
  102 | regr_sxy                      | float8_regr_accum                         | {0,0,0,0,0,0}
  102 | regr_sxx                      | float8_regr_accum                         | {0,0,0,0,0,0}
  102 | regr_syy                      | float8_regr_accum                         | {0,0,0,0,0,0}
  102 | regr_avgx                     | float8_regr_accum                         | {0,0,0,0,0,0}
  102 | regr_avgy                     | float8_regr_accum                         | {0,0,0,0,0,0}
  102 | regr_r2                       | float8_regr_accum                         | {0,0,0,0,0,0}
  102 | regr_slope                    | float8_regr_accum                         | {0,0,0,0,0,0}
  102 | regr_intercept                | float8_regr_accum                         | {0,0,0,0,0,0}
  102 | covar_pop                     | float8_regr_accum                         | {0,0,0,0,0,0}
  102 | covar_samp                    | float8_regr_accum                         | {0,0,0,0,0,0}
  102 | corr                          | float8_regr_accum                         | {0,0,0,0,0,0}
  113 | pg_catalog.sum                | numeric_avg_accum                         |
  113 | pg_catalog.avg                | numeric_avg_accum                         |
  115 | xmlagg                        | xmlconcat2                                |
  116 | json_agg                      | json_agg_transfn                          |
  117 | json_object_agg               | json_object_agg_transfn                   |
  118 | jsonb_agg                     | jsonb_agg_transfn                         |
  119 | jsonb_object_agg              | jsonb_object_agg_transfn                  |
  120 | pg_catalog.min                | enum_smaller                              |
  121 | pg_catalog.max                | enum_larger                               |
  122 | pg_catalog.string_agg         | string_agg_transfn                        |
  123 | pg_catalog.string_agg         | bytea_string_agg_transfn                  |
  124 | pg_catalog.max                | network_larger                            |
  125 | pg_catalog.min                | network_smaller                           |
  126 | pg_catalog."median"           | ordered_set_transition                    |
  126 | pg_catalog.percentile_cont    | ordered_set_transition                    |
  126 | pg_catalog."median"           | ordered_set_transition                    |
  126 | pg_catalog.percentile_cont    | ordered_set_transition                    |
  126 | pg_catalog.percentile_cont    | ordered_set_transition                    |
  126 | pg_catalog.percentile_cont    | ordered_set_transition                    |
  126 | pg_catalog."median"           | ordered_set_transition                    |
  126 | pg_catalog.percentile_disc    | ordered_set_transition                    |
  126 | pg_catalog.percentile_cont    | ordered_set_transition                    |
  126 | pg_catalog.percentile_cont    | ordered_set_transition                    |
  126 | pg_catalog.percentile_disc    | ordered_set_transition                    |
  126 | pg_catalog.percentile_cont    | ordered_set_transition                    |
  126 | pg_catalog.percentile_cont    | ordered_set_transition                    |
  126 | mode                          | ordered_set_transition                    |
  126 | pg_catalog."median"           | ordered_set_transition                    |
  141 | pg_catalog.cume_dist          | ordered_set_transition_multi              |
  141 | pg_catalog.rank               | ordered_set_transition_multi              |
  141 | pg_catalog.percent_rank       | ordered_set_transition_multi              |
  141 | pg_catalog.dense_rank         | ordered_set_transition_multi              |
  145 | pg_catalog.array_agg          | array_agg_array_transfn                   |
  145 | pg_catalog.gp_array_agg       | array_agg_array_transfn                   |
  147 | array_sum                     | array_add                                 | {}
  148 | pg_catalog.sum                | int2_matrix_accum                         |
  149 | pg_catalog.sum                | int4_matrix_accum                         |
  150 | pg_catalog.sum                | int8_matrix_accum                         |
  151 | pg_catalog.sum                | float8_matrix_accum                       |
  152 | pg_catalog.pivot_sum          | int4_pivot_accum                          |
  153 | pg_catalog.pivot_sum          | int8_pivot_accum                          |
  154 | pg_catalog.pivot_sum          | float8_pivot_accum                        |
  155 | gp_hyperloglog_accum          | gp_hyperloglog_add_item_agg_default       |
  156 | pg_catalog.gp_percentile_cont | gp_percentile_cont_float8_transition      |
  157 | pg_catalog.gp_percentile_cont | gp_percentile_cont_interval_transition    |
  158 | pg_catalog.gp_percentile_cont | gp_percentile_cont_timestamp_transition   |
  159 | pg_catalog.gp_percentile_cont | gp_percentile_cont_timestamptz_transition |
  160 | gp_percentile_disc            | gp_percentile_disc_transition             |