Hologres supports a subset of the PostgreSQL aggregate functions. This topic lists the supported functions with their syntax, parameters, and examples.
For the full PostgreSQL aggregate function specification, see Aggregate Functions.
Supported functions
The following table lists all supported functions, grouped by category.
General aggregation
| Function | Argument types | Return type | Description |
|---|---|---|---|
| AVG | BIGINT, FLOAT4, FLOAT8, INT | Same as input | Average of non-NULL values |
| COUNT | * or any expression | BIGINT | Row count or non-NULL input count |
| MAX | BIGINT, FLOAT4, FLOAT8, INT, NUMERIC | Same as input | Maximum value |
| MAX_BY | Any (x), any comparable (y) | Same as x | Value of x in the row with the maximum y |
| MIN | BIGINT, FLOAT4, FLOAT8, INT, NUMERIC | Same as input | Minimum value |
| MIN_BY | Any (x), any comparable (y) | Same as x | Value of x in the row with the minimum y |
| SUM | BIGINT, FLOAT4, FLOAT8, INT, NUMERIC | Same as input | Sum of all values |
| ARRAY_AGG | Any (except JSON, JSONB, TIMETZ, INTERVAL, INET, OID, UUID, arrays) | ARRAY | Values aggregated into an array |
| STRING_AGG | TEXT, TEXT | TEXT | Non-NULL values joined with a separator |
Boolean aggregation
| Function | Argument types | Return type | Description |
|---|---|---|---|
| BOOL_AND | BOOLEAN | BOOLEAN | TRUE if all values are true |
| BOOL_OR | BOOLEAN | BOOLEAN | TRUE if any value is true |
| EVERY | BOOLEAN | BOOLEAN | TRUE if all values are true |
Bitwise aggregation
| Function | Argument types | Return type | Description |
|---|---|---|---|
| BIT_AND | BIGINT, INT | Same as input | Bitwise AND of non-NULL values |
| BIT_OR | BIGINT, INT | Same as input | Bitwise OR of non-NULL values |
Cardinality estimation (using HyperLogLog)
| Function | Argument types | Return type | Description |
|---|---|---|---|
| APPROX_COUNT_DISTINCT | Any | BIGINT | Approximate distinct count (HyperLogLog) |
| UNIQ | SMALLINT, INTEGER, BIGINT, REAL, DOUBLE PRECISION, TEXT, VARCHAR, TIMESTAMP, TIMESTAMPTZ, DATE, TIMETZ, UUID | BIGINT | Distinct count of values in a column, optimized for high-cardinality GROUP BY |
Percentile estimation
| Function | Argument types | Return type | Description |
|---|---|---|---|
| APPROX_PERCENTILE | NUMERIC | NUMERIC | Approximate percentile (reservoir sampling; V3.1+) |
Statistics
| Function | Argument types | Return type | Description |
|---|---|---|---|
| CORR | DOUBLE PRECISION, FLOAT, FLOAT8 | Same as input | Correlation coefficient |
| COVAR_POP | DOUBLE PRECISION, FLOAT, FLOAT8 | Same as input | Population covariance |
| COVAR_SAMP | DOUBLE PRECISION, FLOAT, FLOAT8 | Same as input | Sample covariance |
| STDDEV | INT, NUMERIC, FLOAT8 | NUMERIC or FLOAT8 | Sample standard deviation |
| STDDEV_POP | INT, NUMERIC, FLOAT8 | NUMERIC or FLOAT8 | Population standard deviation |
| STDDEV_SAMP | INT, NUMERIC, FLOAT8 | NUMERIC or FLOAT8 | Sample standard deviation |
| VARIANCE | INT, NUMERIC | NUMERIC | Sample variance |
| VAR_POP | FLOAT8, INT, NUMERIC | NUMERIC or FLOAT8 | Population variance |
| VAR_SAMP | INT, FLOAT8, NUMERIC | NUMERIC or FLOAT8 | Sample variance |
Linear regression
| Function | Argument types | Return type | Description |
|---|---|---|---|
| REGR_AVGX | DOUBLE PRECISION, FLOAT, FLOAT8 | Same as input | Average of the independent variable (X) |
| REGR_AVGY | DOUBLE PRECISION, FLOAT, FLOAT8 | Same as input | Average of the dependent variable (Y) |
| REGR_COUNT | DOUBLE PRECISION, FLOAT, FLOAT8 | BIGINT | Rows where neither input is NULL |
| REGR_INTERCEPT | DOUBLE PRECISION, FLOAT, FLOAT8 | Same as input | Y-intercept of the least-squares-fit line |
| REGR_R2 | DOUBLE PRECISION, FLOAT, FLOAT8 | Same as input | Square of the correlation coefficient |
| REGR_SLOPE | DOUBLE PRECISION, FLOAT, FLOAT8 | Same as input | Slope of the least-squares-fit line |
| REGR_SXX | DOUBLE PRECISION, FLOAT, FLOAT8 | Same as input | Sum of squares of X |
| REGR_SXY | DOUBLE PRECISION, FLOAT, FLOAT8 | Same as input | Sum of products of X and Y |
Sample data
The following examples use example_table. Run this setup script to create the table and insert sample data.
-- Create a table.
CREATE TABLE example_table(
c1 INT,
c2 BOOLEAN,
c3 VARCHAR(10),
c4 DECIMAL(3, 1),
c5 TEXT,
c6 FLOAT8,
c7 FLOAT8
);
-- Insert data.
INSERT INTO example_table (c1, c2, c3, c4, c5, c6, C7) VALUES
(1, true, 'a', 1.1, 3, 6, 7),
(2, false, 'b', 2.2, 4, 6, 2.6);APPROX_COUNT_DISTINCT
Uses HyperLogLog cardinality estimation to return an approximate distinct count. The average error rate is between 0.1% and 1%, which is much lower overhead than an exact COUNT DISTINCT ( column ).
APPROX_COUNT_DISTINCT ( <column> )Parameters
| Parameter | Description |
|---|---|
column | Required. The column for which to calculate the approximate distinct count. |
Usage notes
The precision of the HyperLogLog algorithm is controlled by a bucketing parameter:
SET hg_experimental_approx_count_distinct_precision = 20;| Setting | Details |
|---|---|
| Valid range | 12–20 |
| Default | 17 |
| Effect | Larger value = more buckets = higher precision, but more compute and memory |
| Precision > 17 | Switches to the HyperLogLog++ algorithm, which further reduces and stabilizes the error |
| Precision = 20 | Reduces error to 0.01%–0.2% in most cases |
Set this parameter at the session level with SET or at the database level with ALTER DATABASE:
-- Session level
SET hg_experimental_approx_count_distinct_precision = 20;
-- Database level
ALTER DATABASE dbname SET hg_experimental_approx_count_distinct_precision = 20;To perform an exact distinct count, use COUNT DISTINCT ( column ). This consumes more resources.Example
The following examples use the ORDERS table from the TPC-H public dataset provided by MaxCompute. Run the setup script below to create the required tables before running the queries.
-- Create a foreign table.
IMPORT FOREIGN SCHEMA public_data
LIMIT TO (ORDERS_10g)
FROM SERVER odps_server INTO public options (if_table_exist 'update');
-- Create an internal table.
BEGIN;
CREATE TABLE ORDERS (
O_ORDERKEY bigint NOT NULL PRIMARY KEY,
O_CUSTKEY int NOT NULL,
O_ORDERSTATUS text NOT NULL,
O_TOTALPRICE DECIMAL(12, 2) NOT NULL,
O_ORDERDATE timestamptz NOT NULL,
O_ORDERPRIORITY text NOT NULL,
O_CLERK text NOT NULL,
O_SHIPPRIORITY int NOT NULL,
O_COMMENT text NOT NULL
);
CALL set_table_property ('ORDERS', 'segment_key', 'O_ORDERDATE');
CALL set_table_property ('ORDERS', 'distribution_key', 'O_ORDERKEY');
CALL set_table_property ('ORDERS', 'bitmap_columns', 'O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT');
CALL set_table_property ('ORDERS', 'dictionary_encoding_columns', 'O_ORDERSTATUS,O_ORDERPRIORITY,O_CLERK,O_COMMENT');
COMMENT ON COLUMN ORDERS.O_ORDERKEY IS 'Order ID';
COMMENT ON COLUMN ORDERS.O_CUSTKEY IS 'Customer ID';
COMMENT ON COLUMN ORDERS.O_ORDERSTATUS IS 'Order status';
COMMENT ON COLUMN ORDERS.O_TOTALPRICE IS 'Total price';
COMMENT ON COLUMN ORDERS.O_ORDERDATE IS 'Order date';
COMMENT ON COLUMN ORDERS.O_ORDERPRIORITY IS 'Order priority';
COMMENT ON COLUMN ORDERS.O_CLERK IS 'Cashier';
COMMENT ON COLUMN ORDERS.O_SHIPPRIORITY IS 'Shipping priority';
COMMENT ON COLUMN ORDERS.O_COMMENT IS 'Comment';
COMMIT;
-- Import data to the internal table.
INSERT INTO ORDERS SELECT * FROM ORDERS_10g;Calculate the approximate distinct count in the O_CUSTKEY column:
SELECT APPROX_COUNT_DISTINCT ( O_CUSTKEY ) FROM ORDERS;Result:
approx_count_distinct
-----------------------
1000422Calculate with global precision set to 20:
ALTER DATABASE dbname SET hg_experimental_approx_count_distinct_precision = 20;
SELECT APPROX_COUNT_DISTINCT ( O_CUSTKEY ) FROM ORDERS;Result:
approx_count_distinct
-----------------------
1000422Calculate with session-level precision set to 20:
SET hg_experimental_approx_count_distinct_precision = 20;
SELECT APPROX_COUNT_DISTINCT ( O_CUSTKEY ) FROM ORDERS;Result:
approx_count_distinct
-----------------------
998854APPROX_PERCENTILE
Supported in Hologres V3.1 and later.
Uses reservoir sampling (up to 8,192 elements) to return an approximate percentile of a numeric column.
APPROX_PERCENTILE (fraction) WITHIN GROUP (ORDER BY sort_expression)Parameters
| Parameter | Description |
|---|---|
fraction | Required. A constant between 0 and 1. For example, 0.5 represents the median. |
sort_expression | Required. The column for which to calculate the percentile. Must be sortable. |
Example
Calculate the Q1, median, and Q3 of a salary column:
SELECT
APPROX_PERCENTILE(0.25) WITHIN GROUP (ORDER BY salary) AS q1,
APPROX_PERCENTILE(0.5) WITHIN GROUP (ORDER BY salary) AS median,
APPROX_PERCENTILE(0.75) WITHIN GROUP (ORDER BY salary) AS q3
FROM employees;ARRAY_AGG
Aggregates values of any expression into an array.
ARRAY_AGG(expression)Parameters
| Parameter | Description |
|---|---|
expression | Required. An expression of any data type except JSON, JSONB, TIMETZ, INTERVAL, INET, OID, UUID, and array types. |
Return value
ARRAY type.
Example
SELECT ARRAY_AGG(c1) FROM example_table;Result:
array_agg
-----------
{2,1}AVG
Returns the average of non-NULL values.
AVG(expression)Parameters
| Parameter | Description |
|---|---|
expression | Required. A BIGINT, FLOAT8, FLOAT4, or INT expression. |
Return value
Same data type as the input.
Example
SELECT AVG(c7) FROM example_table;Result:
avg
----------
1.500000BIT_AND
Performs a bitwise AND on all non-NULL values.
BIT_AND(expression)Parameters
| Parameter | Description |
|---|---|
expression | Required. A BIGINT or INT expression. |
Return value
Same data type as the input.
Example
SELECT BIT_AND(c1) FROM example_table;Result:
bit_and
---------
0BIT_OR
Performs a bitwise OR on all non-NULL values.
BIT_OR(expression)Parameters
| Parameter | Description |
|---|---|
expression | Required. A BIGINT or INT expression. |
Return value
Same data type as the input.
Example
SELECT BIT_OR(c1) FROM example_table;Result:
bit_or
--------
3BOOL_AND
Returns TRUE if all values of a Boolean expression are true; returns FALSE otherwise.
BOOL_AND(bool)Parameters
| Parameter | Description |
|---|---|
bool | Required. A BOOLEAN expression. |
Return value
BOOLEAN. Returns t if all values are true, f otherwise.
Example
SELECT BOOL_AND(c2) FROM example_table;Result:
bool_and
----------
fBOOL_OR
Returns TRUE if any value of a Boolean expression is true; returns FALSE otherwise.
BOOL_OR(bool)Parameters
| Parameter | Description |
|---|---|
bool | Required. A BOOLEAN expression. |
Return value
BOOLEAN. Returns t if any value is true, f otherwise.
Example
SELECT BOOL_OR(c2) FROM example_table;Result:
bool_or
---------
tCORR
Calculates the correlation coefficient between two variables. Both parameters must be DOUBLE PRECISION, FLOAT, or FLOAT8.
CORR(Y, X)Parameters
| Parameter | Description |
|---|---|
Y | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression (dependent variable). |
X | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression (independent variable). |
Return value
Same data type as the input.
Example
SELECT CORR(c6, c7) FROM example_table;COUNT
Returns the number of rows or the number of non-NULL values in an expression.
COUNT(*)counts all rows, including rows with NULL values.COUNT(expression)counts only non-NULL values.
-- Count all rows, including rows with NULL values.
COUNT(*)
-- Count only non-NULL input values.
COUNT(expression)Parameters
| Parameter | Description |
|---|---|
* | Counts all rows, including rows where columns contain NULL. |
expression | Required. A non-NULL BIGINT or NUMERIC expression. Rows where the expression is NULL are not counted. |
Return value
BIGINT.
Example
SELECT COUNT(*) FROM example_table;Result:
count
-------
2COVAR_POP
Calculates the population covariance of two variables.
COVAR_POP(Y, X)Parameters
| Parameter | Description |
|---|---|
Y | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
X | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
Return value
Same data type as the input.
Example
SELECT COVAR_POP(c6, c7) FROM example_table;Result:
covar_pop
---------------------
3.5527136788005e-15COVAR_SAMP
Calculates the sample covariance of two variables.
COVAR_SAMP(Y, X)Parameters
| Parameter | Description |
|---|---|
Y | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
X | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
Return value
Same data type as the input.
Example
SELECT COVAR_SAMP(c6, c7) FROM example_table;Result:
covar_samp
--------------------
7.105427357601e-15EVERY
Returns TRUE if all values of a Boolean expression are true; returns FALSE otherwise.
EVERY(bool)Parameters
| Parameter | Description |
|---|---|
bool | Required. A BOOLEAN expression. |
Return value
BOOLEAN. Returns t if all values are true, f otherwise.
Example
SELECT EVERY(c2) FROM example_table;Result:
every
-------
fMAX
Returns the maximum value.
MAX(expression)Parameters
| Parameter | Description |
|---|---|
expression | Required. A BIGINT, FLOAT8, FLOAT4, INT, or NUMERIC expression. |
Return value
Same data type as the input.
Example
SELECT MAX(c1) FROM example_table;Result:
max
-----
2MAX_BY
Supported in Hologres V1.3.36 and later. For earlier versions, see How to obtain more online support?
Finds the row with the maximum value in column y and returns the corresponding value from column x. For example, if a table contains name and cost columns, MAX_BY(name, cost) returns the name of the item with the highest cost.
Numeric types are compared by numeric value. Non-numeric types are compared by dictionary sort order.
MAX_BY(x, y)Parameters
| Parameter | Description |
|---|---|
y | Required. The column from which to find the maximum value. Rows where y is NULL are excluded. |
x | Required. The column from which to return the value corresponding to the maximum y. |
Return value
Returns the value of
xfrom the row whereyis at its maximum.Returns NULL if all values in
yare NULL.If multiple rows share the maximum
yvalue, returns the maximum of the correspondingxvalues.
Example
-- Sample data
DROP TABLE IF EXISTS test;
CREATE TABLE IF NOT EXISTS test (
id INT,
name TEXT,
cost INT
);
INSERT INTO test
VALUES (1, 'a', 100), (1, 'aa', 200), (1, 'aaa', 300), (2, 'b', 150), (2, 'bb', 300), (3, 'c', 150), (3, 'cc', 50);Return the name corresponding to the highest cost. Because multiple rows share the maximum cost (300), the function returns the maximum name value among those rows:
SELECT max_by(name, cost) FROM test;Result:
max_by
--------
aaaMIN
Returns the minimum value.
MIN(expression)Parameters
| Parameter | Description |
|---|---|
expression | Required. A BIGINT, FLOAT8, FLOAT4, INT, or NUMERIC expression. |
Return value
Same data type as the input.
Example
SELECT MIN(c1) FROM example_table;Result:
min
-----
1MIN_BY
Supported in Hologres V1.3.36 and later. For earlier versions, see How to obtain more online support?
Finds the row with the minimum value in column y and returns the corresponding value from column x. For example, if a table contains name and cost columns, MIN_BY(name, cost) returns the name of the item with the lowest cost.
Numeric types are compared by numeric value. Non-numeric types are compared by dictionary sort order.
MIN_BY(x, y)Parameters
| Parameter | Description |
|---|---|
y | Required. The column from which to find the minimum value. Rows where y is NULL are excluded. |
x | Required. The column from which to return the value corresponding to the minimum y. |
Return value
Returns the value of
xfrom the row whereyis at its minimum.Returns NULL if all values in
yare NULL.If multiple rows share the minimum
yvalue, returns the minimum of the correspondingxvalues.
Example
-- Sample data
DROP TABLE IF EXISTS test;
CREATE TABLE IF NOT EXISTS test (
id INT,
name TEXT,
cost INT
);
INSERT INTO test
VALUES (1, 'a', 100), (1, 'aa', 200), (1, 'aaa', 300), (2, 'b', 150), (2, 'bb', 300), (3, 'c', 150), (3, 'cc', 50);Return the name corresponding to the lowest cost:
SELECT MIN_BY(name, cost) FROM test;Result:
min_by
--------
cc
(1 row)REGR_AVGX
Calculates the average of the independent variable (X).
REGR_AVGX(Y, X)Parameters
| Parameter | Description |
|---|---|
Y | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
X | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
Return value
Same data type as the input.
Example
SELECT REGR_AVGX(c6, c7) FROM example_table;Result:
regr_avgx
-----------
4.8REGR_AVGY
Calculates the average of the dependent variable (Y).
REGR_AVGY(Y, X)Parameters
| Parameter | Description |
|---|---|
Y | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
X | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
Return value
Same data type as the input.
Example
SELECT REGR_AVGY(c6, c7) FROM example_table;Result:
regr_avgy
-----------
6REGR_COUNT
Returns the number of rows where neither input is NULL.
REGR_COUNT(Y, X)Parameters
| Parameter | Description |
|---|---|
Y | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
X | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
Return value
BIGINT.
Example
SELECT REGR_COUNT(c6, c7) FROM example_table;Result:
regr_count
------------
2REGR_R2
Calculates the square of the correlation coefficient (R²).
REGR_R2(Y, X)Parameters
| Parameter | Description |
|---|---|
Y | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
X | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
Return value
Same data type as the input.
Example
SELECT REGR_R2(c6, c7) FROM example_table;Result:
regr_r2
---------
1REGR_INTERCEPT
Calculates the y-intercept of the least-squares-fit linear equation determined by the (Y, X) pairs.
REGR_INTERCEPT(Y, X)Parameters
| Parameter | Description |
|---|---|
Y | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
X | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
Return value
Same data type as the input.
Example
SELECT REGR_INTERCEPT(c6, c7) FROM example_table;Result:
regr_intercept
----------------
6REGR_SLOPE
Calculates the slope of the least-squares-fit linear equation.
REGR_SLOPE(Y, X)Parameters
| Parameter | Description |
|---|---|
Y | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
X | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
Return value
Same data type as the input.
Example
SELECT REGR_SLOPE(c6, c7) FROM example_table;Result:
regr_slope
----------------------
7.34031751818285e-16REGR_SXX
Calculates the sum of squares of the independent variable (X), as sum(X^2) - sum(X)^2/N.
REGR_SXX(Y, X)Parameters
| Parameter | Description |
|---|---|
Y | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
X | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
Return value
Same data type as the input.
Example
SELECT REGR_SXX(c6, c7) FROM example_table;Result:
regr_sxx
----------
9.68REGR_SXY
Calculates the sum of products of the independent variable (X) and dependent variable (Y), as sum(X*Y) - sum(X) * sum(Y)/N.
REGR_SXY(Y, X)Parameters
| Parameter | Description |
|---|---|
Y | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
X | Required. A DOUBLE PRECISION, FLOAT, or FLOAT8 expression. |
Return value
Same data type as the input.
Example
SELECT REGR_SXY(c6, c7) FROM example_table;Result:
regr_sxy
--------------------
7.105427357601e-15SUM
Returns the sum of all values.
SUM(expression)Parameters
| Parameter | Description |
|---|---|
expression | Required. A BIGINT, FLOAT8, FLOAT4, INT, or NUMERIC expression. |
Return value
Same data type as the input.
Example
SELECT SUM(c1) FROM example_table;Result:
sum
-----
3STDDEV
Returns the sample standard deviation.
STDDEV(expression)Parameters
| Parameter | Description |
|---|---|
expression | Required. An INT, NUMERIC, or FLOAT8 expression. |
Return value
NUMERIC or FLOAT8.
Example
SELECT STDDEV(c1) FROM example_table;Result:
stddev
--------------
0.7071067811STDDEV_POP
Returns the population standard deviation.
STDDEV_POP(expression)Parameters
| Parameter | Description |
|---|---|
expression | Required. An INT, NUMERIC, or FLOAT8 expression. |
Return value
NUMERIC or FLOAT8.
Example
SELECT STDDEV_POP(c1) FROM example_table;Result:
stddev_pop
--------------
0.5000000000STDDEV_SAMP
Returns the sample standard deviation. Equivalent to STDDEV.
STDDEV_SAMP(expression)Parameters
| Parameter | Description |
|---|---|
expression | Required. An INT, NUMERIC, or FLOAT8 expression. |
Return value
NUMERIC or FLOAT8.
Example
SELECT STDDEV_SAMP(c1) FROM example_table;Result:
stddev_samp
--------------
0.7071067812STRING_AGG
Concatenates non-NULL values into a single string, separated by a specified delimiter.
STRING_AGG(<expression> TEXT, <delimiter> TEXT)Parameters
| Parameter | Description |
|---|---|
expression | Required. The source string (TEXT). |
delimiter | Required. The separator string (TEXT). |
Return value
TEXT.
Example
SELECT STRING_AGG(c5, '-') FROM example_table;Result:
string_agg
------------
3-4UNIQ
Supported in Hologres V1.3 and later. For earlier versions, see Common errors when preparing for an upgrade or How to obtain more online support?.
Returns the distinct count for a column. UNIQ outperforms COUNT DISTINCT specifically when the query includes a GROUP BY clause and the GROUP BY field values are evenly distributed without data skew.
UNIQ ( <column> )Parameters
| Parameter | Description |
|---|---|
column | Required. The column for which to calculate the distinct count. Supported types: SMALLINT, INTEGER, BIGINT, REAL, DOUBLE PRECISION, TEXT, VARCHAR, TIMESTAMP, TIMESTAMPTZ, DATE, TIMETZ, and UUID. |
Usage notes
UNIQ uses less memory than COUNT DISTINCT and is a useful alternative when COUNT DISTINCT causes out-of-memory errors.
UNIQ performs better than COUNT DISTINCT when the GROUP BY key has high cardinality.
UNIQ only outperforms COUNT DISTINCT when the query has a
GROUP BYclause and values are evenly distributed. Without a GROUP BY clause, or with data skew, there is no performance advantage.Since Hologres V2.1, COUNT DISTINCT has been significantly optimized across scenarios including single and multiple COUNT DISTINCT functions, data skew, and queries without a GROUP BY clause. Manually rewriting queries to use UNIQ is no longer necessary in most cases. For details, see Count Distinct optimization.
Example
The following examples use the ORDERS table from the TPC-H public dataset. See APPROX_COUNT_DISTINCT for the table setup script.
Calculate the distinct count of O_CUSTKEY in the ORDERS table:
SELECT UNIQ ( O_CUSTKEY ) FROM ORDERS;Result:
uniq
--------
999982Calculate the distinct O_CUSTKEY count for each O_ORDERSTATUS:
SELECT O_ORDERSTATUS, UNIQ ( O_CUSTKEY ) FROM ORDERS GROUP BY O_ORDERSTATUS;Result:
o_orderstatus | uniq
---------------+--------
P | 313478
F | 996258
O | 996400VARIANCE
Returns the sample variance.
VARIANCE(expression)Parameters
| Parameter | Description |
|---|---|
expression | Required. An INT or NUMERIC expression. |
Return value
NUMERIC.
Example
SELECT VARIANCE(c1) FROM example_table;Result:
variance
--------------
0.5000000000VAR_POP
Returns the population variance.
VAR_POP(expression)Parameters
| Parameter | Description |
|---|---|
expression | Required. A FLOAT8, INT, or NUMERIC expression. |
Return value
NUMERIC or FLOAT8.
Example
SELECT VAR_POP(c1) FROM example_table;Result:
var_pop
--------------
0.2500000000VAR_SAMP
Returns the sample variance.
VAR_SAMP(expression)Parameters
| Parameter | Description |
|---|---|
expression | Required. An INT, FLOAT8, or NUMERIC expression. |
Return value
NUMERIC or FLOAT8.
Example
SELECT VAR_SAMP(c1) FROM example_table;Result:
var_samp
--------------
0.5000000000