Hologres is compatible with PostgreSQL and supports the standard PostgreSQL syntax for data development. This topic describes the general-purpose aggregate functions supported by Hologres.
Hologres supports a subset of the functions available in PostgreSQL. For more information about how to use these functions, see Aggregate Functions. The following table lists the general-purpose aggregate functions that Hologres supports.
Function | Description |
Calculates the number of rows that have distinct values in a column. The result of this function is an approximate value. | |
Calculates the approximate percentile of a numeric column. Note Hologres supports APPROX_PERCENTILE from V3.1 and later. | |
Aggregates the values of an expression into an array. | |
Calculates the average of the non-NULL values returned by a BIGINT, FLOAT8, FLOAT4, or INT expression. | |
Performs bitwise AND operations on the non-NULL values returned by a BIGINT or an INT expression. | |
Performs bitwise OR operations on the non-NULL values returned by a BIGINT or an INT expression. | |
Checks whether all values returned by a BOOLEAN expression are TRUE. | |
Checks whether any value returned by a BOOLEAN expression is TRUE. | |
Calculates the correlation coefficient between two variables. | |
Gets the number of rows in a specified table or the number of input rows from a BIGINT or NUMERIC expression. | |
Calculates the population covariance. | |
Calculates the sample covariance. | |
Checks whether all values of a BOOLEAN expression are TRUE. | |
Gets the maximum value from a BIGINT, FLOAT8, FLOAT4, INT, or NUMERIC expression. | |
Compares the values in a column (column y) to find the maximum value, and returns the value of another specified column (column x) from the same row. | |
Gets the minimum value from a BIGINT, FLOAT8, FLOAT4, INT, or NUMERIC expression. | |
Compares the values in a column (column y) to find the minimum value, and returns the value of another specified column (column x) from the same row. | |
Calculates the average of an independent variable (X). | |
Calculates the average of a dependent variable (Y). | |
Calculates the number of rows where neither input parameter is NULL. | |
Calculates the square of the correlation coefficient. | |
Calculates the y-intercept of the least-squares-fit linear equation determined by the | |
Calculates the slope of the least-squares-fit linear equation. | |
Calculates the sum of squares of the independent variable (X), as | |
Calculates the sum of products of the independent variable (X) and dependent variable (Y), as | |
Gets the sum of all values from a BIGINT, FLOAT8, FLOAT4, INT, or NUMERIC expression. | |
Calculates the sample standard deviation of the values returned by an INT, NUMERIC, or FLOAT8 expression. | |
Calculates the population standard deviation of the values returned by an INT, NUMERIC, or FLOAT8 expression. | |
Calculates the sample standard deviation of the values returned by an INT, NUMERIC, or FLOAT8 expression. | |
Concatenates the non-NULL values returned by a TEXT expression into a string using a specified separator. | |
Calculates the number of rows that have distinct values in a column. | |
Calculates the sample variance of the values returned by an INT or NUMERIC expression. | |
Calculates the population variance of the values returned by a FLOAT8, INT, or NUMERIC expression. | |
Calculates the sample variance of the values returned by an INT, FLOAT8, or NUMERIC expression. |
Sample data
-- Create a table.
CREATE TABLE example_table(
c1 INT,
c2 BOOLEAN,
c3 VARCHAR(10),
c4 DECIMAL(3, 1),
c5 TEXT,
c6 FLOAT8,
c7 FLOAT8
);
-- Insert data.
INSERT INTO example_table (c1, c2, c3, c4, c5, c6, C7) VALUES
(1, true, 'a', 1.1, 3, 6, 7),
(2, false, 'b', 2.2, 4, 6, 2.6);APPROX_COUNT_DISTINCT
Description: Calculates the number of rows that have distinct values in a column. This function returns a single, approximate value.
APPROX_COUNT_DISTINCT ( <column> )Parameters
column: Required. Specifies the column for which you want to calculate the approximate number of distinct rows.
Usage notes
The
APPROX_COUNT_DISTINCTfunction uses HyperLogLog cardinality estimation to perform an inexact COUNT DISTINCT calculation. An inexact COUNT DISTINCT calculation improves query performance, especially when the column has many discrete values. The average error rate is typically between0.1% and 1%. This function is suitable for performance-sensitive scenarios where some degree of error is acceptable. You can adjust the error rate using the following parameter.SET hg_experimental_approx_count_distinct_precision = 20;The value ranges from 12 to 20. The default value is 17.
The precision parameter specifies the number of bits for bucketing in the HyperLogLog algorithm. A larger value indicates more buckets and higher theoretical precision.
A higher precision value also increases the computation time and memory overhead. However, the overhead is much lower than that of an exact
COUNT DISTINCT ( column )statement. Therefore, we recommend that you useAPPROX_COUNT_DISTINCTinstead ofCOUNT DISTINCT ( column ).When the precision parameter is set to a value greater than 17, Hologres uses the HyperLogLog++ algorithm. This algorithm corrects the return value to further reduce and stabilize the error.
For example, if you set hg_experimental_approx_count_distinct_precision to 20, the error rate can be reduced to a range of 0.01% to 0.2% in most cases.
NoteYou can also perform an exact COUNT DISTINCT calculation using the
COUNT DISTINCT ( column )function. This function consumes more resources.Example
The following example uses the ORDERS table from the TPC-H public dataset provided by MaxCompute. You can run the following SQL statements directly without any additional preparation.
-- Create a foreign table. IMPORT FOREIGN SCHEMA public_data LIMIT TO (ORDERS_10g) FROM SERVER odps_server INTO public options (if_table_exist 'update'); -- Create an internal table. BEGIN; CREATE TABLE ORDERS ( O_ORDERKEY bigint NOT NULL PRIMARY KEY, O_CUSTKEY int NOT NULL, O_ORDERSTATUS text NOT NULL, O_TOTALPRICE DECIMAL(12, 2) NOT NULL, O_ORDERDATE timestamptz NOT NULL, O_ORDERPRIORITY text NOT NULL, O_CLERK text NOT NULL, O_SHIPPRIORITY int NOT NULL, O_COMMENT text NOT NULL ); CALL set_table_property ('ORDERS', 'segment_key', 'O_ORDERDATE'); CALL set_table_property ('ORDERS', 'distribution_key', 'O_ORDERKEY'); CALL set_table_property ('ORDERS', 'bitmap_columns', 'O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT'); CALL set_table_property ('ORDERS', 'dictionary_encoding_columns', 'O_ORDERSTATUS,O_ORDERPRIORITY,O_CLERK,O_COMMENT'); COMMENT ON COLUMN ORDERS.O_ORDERKEY IS 'Order ID'; COMMENT ON COLUMN ORDERS.O_CUSTKEY IS 'Customer ID'; COMMENT ON COLUMN ORDERS.O_ORDERSTATUS IS 'Order status'; COMMENT ON COLUMN ORDERS.O_TOTALPRICE IS 'Total price'; COMMENT ON COLUMN ORDERS.O_ORDERDATE IS 'Order date'; COMMENT ON COLUMN ORDERS.O_ORDERPRIORITY IS 'Order priority'; COMMENT ON COLUMN ORDERS.O_CLERK IS 'Cashier'; COMMENT ON COLUMN ORDERS.O_SHIPPRIORITY IS 'Shipping priority'; COMMENT ON COLUMN ORDERS.O_COMMENT IS 'Comment'; COMMIT; -- Import data to the internal table. INSERT INTO ORDERS SELECT * FROM ORDERS_10g;Calculate the approximate number of distinct rows in the O_CUSTKEY column.
SELECT APPROX_COUNT_DISTINCT ( O_CUSTKEY ) FROM ORDERS;The following result is returned.
approx_count_distinct ----------------------- 1000422Set the global precision to 20 and calculate the approximate number of distinct rows in the O_CUSTKEY column.
ALTER DATABASE dbname SET hg_experimental_approx_count_distinct_precision = 20; SELECT APPROX_COUNT_DISTINCT ( O_CUSTKEY ) FROM ORDERS;The following result is returned.
approx_count_distinct ----------------------- 1000422Set the session-level precision to 20 and calculate the approximate number of distinct rows in the O_CUSTKEY column.
--Set the session-level precision to 20. SET hg_experimental_approx_count_distinct_precision = 20; SELECT APPROX_COUNT_DISTINCT ( O_CUSTKEY ) FROM ORDERS;The following result is returned.
approx_count_distinct ----------------------- 998854
APPROX_PERCENTILE
Hologres supports APPROX_PERCENTILE from V3.1 and later.
Description: Calculates the approximate percentile of a numeric column. This function returns a single, approximate value.
APPROX_PERCENTILE (fraction) WITHIN GROUP (ORDER BY sort_expression)Parameters
fraction: A constant between 0 and 1. For example, 0.5 represents the median.
sort_expression: Specifies the column for which to calculate the percentile. This column must be sortable.
Usage notes
The
APPROX_PERCENTILEfunction uses a reservoir sampling algorithm. It samples a maximum of 8,192 elements and then calculates the percentile from these samples.Example
The following example calculates the approximate count of distinct values in the O_CUSTKEY column.
SELECT APPROX_PERCENTILE(0.25) WITHIN GROUP (ORDER BY salary) AS q1, APPROX_PERCENTILE(0.5) WITHIN GROUP (ORDER BY salary) AS median, APPROX_PERCENTILE(0.75) WITHIN GROUP (ORDER BY salary) AS q3 FROM employees;
ARRAY_AGG
Description: Concatenates the values of an expression into an array.
ARRAY_AGG(expression)Parameters
expression: Required. An expression of any data type except JSON, JSONB, TIMETZ, INTERVAL, INET, OID, UUID, and array types.
Return value
Returns a value of the ARRAY type.
Example
SELECT ARRAY_AGG(c1) FROM example_table;The following result is returned.
array_agg ----------- {2,1}
AVG
Description: Calculates the average of the non-NULL values returned by a BIGINT, FLOAT8, FLOAT4, or INT expression.
AVG(expression)Parameters
expression: Required. A BIGINT, FLOAT8, FLOAT4, or INT expression.
Return value
The return value is the same data type as the parameter.
Example
SELECT AVG(c7) FROM example_table;The following result is returned.
avg ---------- 1.500000
BIT_AND
Description: Performs bitwise AND operations on the non-NULL values returned by a BIGINT or INT expression.
BIT_AND(expression)Parameters
expression: Required. A BIGINT or INT expression.
Return value
The return value is the same data type as the parameter.
Example
SELECT BIT_AND(c1) FROM example_table;The result is as follows.
bit_and --------- 0
BIT_OR
Description: Performs bitwise OR operations on the non-NULL values returned by a BIGINT or INT expression.
BIT_OR(expression)Parameters
expression: Required. A BIGINT or INT expression.
Return value
The return value is the same data type as the parameter.
Example
SELECT BIT_OR(c1) FROM example_table;The following result is returned.
bit_or -------- 3
BOOL_AND
Description: Checks whether all values returned by a BOOLEAN expression are TRUE.
BOOL_AND(bool)Parameters
bool: Required. A BOOLEAN expression.
Return value
Returns a value of the BOOLEAN type. This function returns TRUE (t) if all values returned by the expression are true, and FALSE (f) otherwise.
Example
SELECT BOOL_AND(c2) FROM example_table;The following result is returned.
bool_and ---------- f
BOOL_OR
Description: Checks whether any value returned by a BOOLEAN expression is TRUE.
BOOL_OR(bool)Parameters
bool: Required. A BOOLEAN expression.
Return value
Returns a value of the BOOLEAN type. This function returns TRUE (t) if any value returned by the expression is true, and FALSE (f) otherwise.
Example
SELECT BOOL_OR(c2) FROM example_table;The following result is returned.
bool_or --------- t
CORR
Description: Calculates the correlation coefficient between two variables.
CORR(Y, X)NoteThe parameters must be of the DOUBLE PRECISION, FLOAT, or FLOAT8 data type.
Parameters
Y and X: Required. DOUBLE PRECISION, FLOAT, or FLOAT8 expressions.
Return value
The return value is the same data type as the parameters.
Example
SELECT CORR(c6, c7) FROM example_table;The following result is returned.
corr ------
COUNT
Description: Retrieves the number of rows in a specified table or the number of input rows from a BIGINT or NUMERIC expression.
Retrieves the number of rows in a specified table.
COUNT(*)Retrieves the number of input rows from a BIGINT or NUMERIC expression.
COUNT(expression)expression: Required. A non-NULL expression of the BIGINT or NUMERIC data type.
Return value
Returns a value of the BIGINT type.
Example
SELECT COUNT(*) FROM example_table;The following result is returned.
count ------- 2
COVAR_POP
Description: Calculates the population covariance.
COVAR_POP(Y, X)Parameters
Y and X: Required. DOUBLE PRECISION, FLOAT, or FLOAT8 expressions.
Return value
The return value is the same data type as the parameters.
Example
SELECT COVAR_POP(c6, c7) FROM example_table;The following result is returned.
covar_pop --------------------- 3.5527136788005e-15
COVAR_SAMP
Description: Calculates the sample covariance.
COVAR_SAMP(Y, X)Parameters
Y and X: Required. DOUBLE PRECISION, FLOAT, or FLOAT8 expressions.
Return value
The return value is the same data type as the parameters.
Example
SELECT COVAR_SAMP(c6, c7) FROM example_table;The following result is returned.
covar_samp -------------------- 7.105427357601e-15
EVERY
Description: Checks whether all values of a BOOLEAN expression are TRUE.
EVERY(bool)Parameters
bool: Required. A BOOLEAN expression.
Return value
Returns a value of the BOOLEAN type. This function returns TRUE (t) if all values of the expression are true, and FALSE (f) otherwise.
Example
SELECT EVERY(c2) FROM example_table;The following result is returned.
every ------- f
MAX
Description: Retrieves the maximum value from a BIGINT, FLOAT8, FLOAT4, INT, or NUMERIC expression.
MAX(expression)Parameters
expression: Required. A BIGINT, FLOAT8, FLOAT4, INT, or NUMERIC expression.
Return value
The return value is the same data type as the parameter.
Example
SELECT MAX(c1) FROM example_table;The following result is returned.
max ----- 2
MAX_BY
Description: Compares the values in a column (column y) to find the maximum value and returns the value of another specified column (column x) from the corresponding row.
MAX_BY(x, y);NoteNumeric types are compared based on their numeric value. Non-numeric types are compared based on their dictionary sort order.
Parameters
y: Required. Specifies the column from which to retrieve the maximum value. Rows where column y is NULL are not included in the calculation.
x: Required. Specifies the column from which to retrieve a value that corresponds to the maximum value in column y.
Usage notes
The MAX_BY function is supported only in Hologres V1.3.36 and later.
NoteIf your instance version is earlier than V1.3.36, you can join the Hologres DingTalk group to request an instance upgrade. For more information, see How to obtain more online support?
Return value
The function returns the value from column x that corresponds to the maximum value in column y.
If all values in column y are NULL, the function returns NULL.
If the MAX_BY function finds multiple rows with the maximum value in column y, it returns the maximum value from the corresponding values in column x.
Example
--Sample data DROP TABLE IF EXISTS test; CREATE TABLE IF NOT EXISTS test ( id INT, name TEXT, cost INT ); INSERT INTO test VALUES (1, 'a', 100), (1, 'aa', 200), (1, 'aaa', 300), (2, 'b', 150), (2, 'bb', 300), (3, 'c', 150), (3, 'cc', 50);Query for the value in the name column that corresponds to the maximum value in the cost column. In this example, multiple maximum values exist.
SELECT max_by(name, cost) FROM test;The following result is returned.
max_by -------- aaa
MIN
Description: Retrieves the minimum value from a BIGINT, FLOAT8, FLOAT4, INT, or NUMERIC expression.
MIN(expression)Parameters
expression: Required. A BIGINT, FLOAT8, FLOAT4, INT, or NUMERIC type.
Return value
The return value is the same data type as the parameter.
Example
SELECT MIN(c1) FROM example_table;The following result is returned.
min ----- 1
MIN_BY
Description: Compares the values in a column (column y) to find the minimum value and returns the value of another specified column (column x) from the corresponding row.
MIN_BY(x, y);NoteNumeric types are compared based on their numeric value. Non-numeric types are compared based on their dictionary sort order.
Parameters
y: Required. Specifies the column from which to retrieve the minimum value. Rows where column y is NULL are not included in the calculation.
x: Required. Specifies the column from which to retrieve a value that corresponds to the minimum value in column y.
Usage notes
The MAX_BY and MIN_BY functions are supported only in Hologres V1.3.36 and later.
NoteIf your instance version is earlier than V1.3.36, you can join the Hologres DingTalk group to request an instance upgrade. For more information, see How to obtain more online support?
Return value
The function returns the value from column x that corresponds to the minimum value in column y.
If all values in column y are NULL, the function returns NULL.
If the MIN_BY function finds multiple rows with the minimum value in column y, it returns the minimum value from the corresponding values in column x.
Example
--Sample data DROP TABLE IF EXISTS test; CREATE TABLE IF NOT EXISTS test ( id INT, name TEXT, cost INT ); INSERT INTO test VALUES (1, 'a', 100), (1, 'aa', 200), (1, 'aaa', 300), (2, 'b', 150), (2, 'bb', 300), (3, 'c', 150), (3, 'cc', 50);Query for the value in the name column that corresponds to the minimum value in the cost column.
SELECT MIN_BY(name, cost) FROM test;The following result is returned.
min_by -------- cc (1 row)
REGR_AVGX
Description: Calculates the average of an independent variable (X).
REGR_AVGX(Y, X)Parameters
Y and X: Required. DOUBLE PRECISION, FLOAT, or FLOAT8 expressions.
Return value
The return value is the same data type as the parameters.
Example
SELECT REGR_AVGX(c6, c7) FROM example_table;The following result is returned.
regr_avgx ----------- 4.8
REGR_AVGY
Description: Calculates the average of a dependent variable (Y).
REGR_AVGY(Y, X)Parameters
Y and X: Required. DOUBLE PRECISION, FLOAT, or FLOAT8 expressions.
Return value
The return value is the same data type as the parameters.
Example
SELECT REGR_AVGY(c6, c7) FROM example_table;The following result is returned.
regr_avgy ----------- 6
REGR_COUNT
Description: Calculates the number of rows where neither input parameter is NULL.
REGR_COUNT(Y, X)Parameters
Y and X: Required. DOUBLE PRECISION, FLOAT, or FLOAT8 expressions.
Return value
Returns a value of the BIGINT type.
Example
SELECT REGR_COUNT(c6, c7) FROM example_table;The following result is returned.
regr_count ------------ 2
REGR_R2
Description: Calculates the square of the correlation coefficient.
REGR_R2(Y, X)Parameters
Y and X: Required. DOUBLE PRECISION, FLOAT, or FLOAT8 expressions.
Return value
The return value is the same data type as the parameters.
Example
SELECT REGR_R2(c6, c7) FROM example_table;The following result is returned.
regr_r2 --------- 1
REGR_INTERCEPT
Description: Calculates the y-intercept of the least-squares-fit linear equation determined by the
(Y, X)pairs.REGR_INTERCEPT(Y, X)Parameters
Y and X: Required. DOUBLE PRECISION, FLOAT, or FLOAT8 expressions.
Return value
The return value is the same data type as the parameters.
Example
SELECT REGR_INTERCEPT(c6, c7) FROM example_table;The following result is returned.
regr_intercept ---------------- 6
REGR_SLOPE
Description: Calculates the slope of the least-squares-fit linear equation.
REGR_SLOPE(Y, X)Parameters
Y and X: Required. DOUBLE PRECISION, FLOAT, or FLOAT8 expressions.
Return value
The return value is the same data type as the parameters.
Example
SELECT REGR_SLOPE(c6, c7) FROM example_table;The following result is returned.
regr_slope ---------------------- 7.34031751818285e-16
REGR_SXX
Description: Calculates the sum of squares of the independent variable (X), as
sum(X^2) - sum(X)^2/N.REGR_SXX(Y, X)Parameters
Y and X: Required. DOUBLE PRECISION, FLOAT, or FLOAT8 expressions.
Return value
The return value is the same data type as the parameters.
Example
SELECT REGR_SXX(c6, c7) FROM example_table;The following result is returned.
regr_sxx ---------- 9.68
REGR_SXY
Description: Calculates the sum of products of the independent variable (X) and dependent variable (Y), as
sum(X × Y) - sum(X) × sum(Y)/N.REGR_SXY(Y, X)Parameters
Y and X: Required. DOUBLE PRECISION, FLOAT, or FLOAT8 expressions.
Return value
The return value is the same data type as the parameters.
Example
SELECT REGR_SXY(c6, c7) FROM example_table;The following result is returned.
regr_sxy -------------------- 7.105427357601e-15
SUM
Description: Calculates the sum of all values from a BIGINT, FLOAT8, FLOAT4, INT, or NUMERIC expression.
SUM(expression)Parameters
expression: Required. A BIGINT, FLOAT8, FLOAT4, INT, or NUMERIC expression.
Return value
The return value is the same data type as the parameter.
Example
SELECT SUM(c1) FROM example_table;The following result is returned.
sum ----- 3
STDDEV
Description: Calculates the sample standard deviation of the values returned by an INT, NUMERIC, or FLOAT8 expression.
STDDEV(expression)Parameters
expression: Required. An INT, NUMERIC, or FLOAT8 expression.
Return value
Returns a value of the NUMERIC or FLOAT8 type.
Example
SELECT STDDEV(c1) FROM example_table;The following result is returned.
stddev -------------- 0.7071067811
STDDEV_POP
Description: Calculates the population standard deviation of the values returned by an INT, NUMERIC, or FLOAT8 expression.
STDDEV_POP(expression)Parameters
expression: Required. An INT, NUMERIC, or FLOAT8 expression.
Return value
Returns a value of the NUMERIC or FLOAT8 type.
Example
SELECT STDDEV_POP(c1) FROM example_table;The following result is returned.
stddev_pop -------------- 0.5000000000
STDDEV_SAMP
Description: Calculates the sample standard deviation of the values returned by an INT, NUMERIC, or FLOAT8 expression.
STDDEV_SAMP(expression)Parameters
expression: Required. An INT, NUMERIC, or FLOAT8 expression.
Return value
Returns a value of the NUMERIC or FLOAT8 type.
Example
SELECT STDDEV_SAMP(c1) FROM example_table;The following result is returned.
stddev_samp -------------- 0.7071067812
STRING_AGG
Description: Concatenates the non-NULL values returned by a TEXT expression into a string using a specified separator.
STRING_AGG(<expression> TEXT, <delimiter> TEXT)Parameters
expression: Required. The source string.
delimiter: Required. The separator.
Return value
Returns a value of the TEXT type.
Example
SELECT STRING_AGG(c5, '-') FROM example_table;The following result is returned.
string_agg ------------ 3-4
UNIQ
Description: Calculates the number of rows that have distinct values in a column.
UNIQ ( < column > );Parameters
column: Required. Specifies the column for which you want to calculate the number of rows that have distinct values. This column can be of the following data types: SMALLINT, INTEGER, BIGINT, REAL, DOUBLE PRECISION, TEXT, VARCHAR, TIMESTAMP, TIMESTAMPTZ, DATE, TIMETZ, and UUID.
Usage notes
The UNIQ function is supported only in Hologres V1.3 and later.
NoteIf your instance version is earlier than V1.3, see Common errors when preparing for an upgrade or join the Hologres DingTalk group to provide feedback. For more information, see How to obtain more online support?.
The UNIQ function performs better than the COUNT DISTINCT function only when the SQL statement contains a GROUP BY clause and the values of the GROUP BY field are evenly distributed without data skew.
Typically, UNIQ performs better than COUNT DISTINCT when the GROUP BY key has high cardinality. UNIQ also consumes less memory than COUNT DISTINCT. If you encounter an out-of-memory error when you use COUNT DISTINCT, you can use UNIQ as an alternative.
Since Hologres V2.1, the performance of COUNT DISTINCT has been significantly optimized for various scenarios, including those with single or multiple COUNT DISTINCT functions, data skew, or SQL statements without a GROUP BY clause. You no longer need to manually rewrite queries to use UNIQ for better performance. For more information, see Count Distinct optimization.
Example
The following example uses the ORDERS table from the TPC-H public dataset provided by MaxCompute. You can run the following SQL statements directly without any additional preparation.
-- Create a foreign table. IMPORT FOREIGN SCHEMA public_data LIMIT TO (ORDERS_10g) FROM SERVER odps_server INTO public options (if_table_exist 'update'); -- Create an internal table. BEGIN; CREATE TABLE ORDERS ( O_ORDERKEY bigint NOT NULL PRIMARY KEY, O_CUSTKEY int NOT NULL, O_ORDERSTATUS text NOT NULL, O_TOTALPRICE DECIMAL(12, 2) NOT NULL, O_ORDERDATE timestamptz NOT NULL, O_ORDERPRIORITY text NOT NULL, O_CLERK text NOT NULL, O_SHIPPRIORITY int NOT NULL, O_COMMENT text NOT NULL ); CALL set_table_property ('ORDERS', 'segment_key', 'O_ORDERDATE'); CALL set_table_property ('ORDERS', 'distribution_key', 'O_ORDERKEY'); CALL set_table_property ('ORDERS', 'bitmap_columns', 'O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT'); CALL set_table_property ('ORDERS', 'dictionary_encoding_columns', 'O_ORDERSTATUS,O_ORDERPRIORITY,O_CLERK,O_COMMENT'); COMMENT ON COLUMN ORDERS.O_ORDERKEY IS 'Order ID'; COMMENT ON COLUMN ORDERS.O_CUSTKEY IS 'Customer ID'; COMMENT ON COLUMN ORDERS.O_ORDERSTATUS IS 'Order status'; COMMENT ON COLUMN ORDERS.O_TOTALPRICE IS 'Total price'; COMMENT ON COLUMN ORDERS.O_ORDERDATE IS 'Order date'; COMMENT ON COLUMN ORDERS.O_ORDERPRIORITY IS 'Order priority'; COMMENT ON COLUMN ORDERS.O_CLERK IS 'Cashier'; COMMENT ON COLUMN ORDERS.O_SHIPPRIORITY IS 'Shipping priority'; COMMENT ON COLUMN ORDERS.O_COMMENT IS 'Comment'; COMMIT; -- Import data to the internal table. INSERT INTO ORDERS SELECT * FROM ORDERS_10g;Calculate the number of distinct rows in the O_CUSTKEY column of the ORDERS table.
SELECT UNIQ ( O_CUSTKEY ) FROM ORDERS;The following result is returned.
uniq -------- 999982Calculate the number of distinct O_CUSTKEY values for each O_ORDERSTATUS in the ORDERS table.
SELECT O_ORDERSTATUS, UNIQ ( O_CUSTKEY ) FROM ORDERS GROUP BY O_ORDERSTATUS;The following result is returned.
o_orderstatus | uniq ---------------+-------- P | 313478 F | 996258 O | 996400
VARIANCE
Description: Calculates the sample variance of the values returned by an INT or NUMERIC expression.
VARIANCE(expression)Parameters
expression: Required. An INT or NUMERIC expression.
Return value
Returns a value of the NUMERIC type.
Example
SELECT VARIANCE(c1) FROM example_table;The following result is returned.
variance -------------- 0.5000000000
VAR_POP
Description: Calculates the population variance of the values returned by a FLOAT8, INT, or NUMERIC expression.
VAR_POP(expression)Parameters
expression: Required. A FLOAT8, INT, or NUMERIC expression.
Return value
Returns a value of the NUMERIC or FLOAT8 type.
Example
SELECT VAR_POP(c1) FROM example_table;The following result is returned.
var_pop -------------- 0.2500000000
VAR_SAMP
Description: Calculates the sample variance of the values returned by an INT, FLOAT8, or NUMERIC expression.
VAR_SAMP(expression)Parameters
expression: Required. An INT, FLOAT8, or NUMERIC expression.
Return value
Returns a value of the NUMERIC or FLOAT8 type.
Example
SELECT VAR_SAMP(c1) FROM example_table;The following result is returned.
var_samp -------------- 0.5000000000