DISTINCT removes duplicate rows from a query result. ORDER BY sorts that result. Used together, they have specific interaction rules—particularly around which columns ORDER BY can reference. This topic explains the syntax, execution order, usage constraints, and performance considerations for combining DISTINCT and ORDER BY in PolarDB for PostgreSQL (Compatible with Oracle).
How it works
DISTINCT runs before ORDER BY. The database first eliminates duplicate rows, then sorts the deduplicated result set.
This order has a direct consequence: because DISTINCT collapses rows based on the selected columns, ORDER BY can only sort by columns that appear in the SELECT list. Sorting by a column that is not in the output would be meaningless—after deduplication, there is no unambiguous value for that column to sort by.
Syntax
SELECT DISTINCT column1, column2, ...
FROM table_name
WHERE condition
ORDER BY column1 [ASC|DESC], column2 [ASC|DESC], ...;| Clause | Description |
|---|---|
SELECT DISTINCT | Returns only unique records, eliminating duplicates. |
FROM | Specifies the table to query. |
WHERE | (Optional) Filters rows before deduplication. |
ORDER BY | Sorts the deduplicated result set. Accepts one or more columns in ascending (ASC) or descending (DESC) order. |
Sample dataset
All examples in this topic use the following employees table.
CREATE TABLE employees (
employee_id INT PRIMARY KEY, -- unique employee ID
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
department VARCHAR(50),
salary DECIMAL(10,2) CHECK (salary > 0)
);
INSERT INTO employees VALUES (1, 'John', 'Doe', 'Sales', 5000);
INSERT INTO employees VALUES (2, 'Jane', 'Smith', 'Marketing', 6000);
INSERT INTO employees VALUES (3, 'John', 'Doe', 'Sales', 5000);
INSERT INTO employees VALUES (4, 'Alice', 'Johnson', 'IT', 7000);
INSERT INTO employees VALUES (5, 'Bob', 'Brown', 'Sales', 5000);Examples
Unique departments in ascending order
SELECT DISTINCT DEPARTMENT
FROM employees
ORDER BY DEPARTMENT ASC;Output:
department
------------
IT
Marketing
Sales
(3 rows)Unique salaries in descending order
SELECT DISTINCT SALARY
FROM employees
ORDER BY SALARY DESC;Output:
salary
---------
7000.00
6000.00
5000.00
(3 rows)Unique full names in ascending order
SELECT DISTINCT FIRST_NAME, LAST_NAME
FROM employees
ORDER BY FIRST_NAME ASC, LAST_NAME ASC;Output:
first_name | last_name
------------+-----------
Alice | Johnson
Bob | Brown
Jane | Smith
John | Doe
(4 rows)DISTINCTapplies to the combination of all selected columns. Rows 1 and 3 share the sameFIRST_NAMEandLAST_NAME, so only one appears in the result.
Usage notes
Columns in ORDER BY must appear in SELECT
When DISTINCT is present, every column referenced in ORDER BY must exist in the SELECT list. The constraint exists because DISTINCT collapses rows based on selected columns—a sort column not in the output has no unambiguous value to sort by after deduplication.
Referencing a column that is not selected returns an error:
-- Incorrect: LAST_NAME is not in the SELECT list
SELECT DISTINCT FIRST_NAME
FROM employees
ORDER BY LAST_NAME ASC;ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select listOracle compatibility exception
To match Oracle behavior, set the polar_enable_distinct_orderby_new_column parameter to ON. This allows ORDER BY to reference columns not in the SELECT list, but requires a FETCH FIRST X ROWS ONLY or LIMIT clause at the end of the statement.
polar_enable_distinct_orderby_new_column was introduced in PolarDB for PostgreSQL (Compatible with Oracle) 2.0, revision version 2.0.14.26.0. If your cluster is on an earlier version, upgrade it first. For upgrade instructions, see Version management.Example after enabling the parameter:
SELECT DISTINCT FIRST_NAME
FROM employees
ORDER BY LAST_NAME ASC
FETCH FIRST 2 ROWS ONLY;Using aliases in ORDER BY
Define an alias in SELECT and reference it in ORDER BY:
SELECT DISTINCT FIRST_NAME AS FN
FROM employees
ORDER BY FN ASC;Using column positions in ORDER BY
Reference a column by its position in the SELECT list:
SELECT DISTINCT FIRST_NAME, LAST_NAME
FROM employees
ORDER BY 1 ASC;Positional references are fragile. If the column order in SELECT changes, the sort behavior changes silently without an error. Prefer explicit column names in production queries.DISTINCT applies to all selected columns
DISTINCT deduplicates based on the full combination of selected columns, not just the first one.
SELECT DISTINCT FIRST_NAME, DEPARTMENT
FROM employees
ORDER BY FIRST_NAME ASC;This query deduplicates by the (FIRST_NAME, DEPARTMENT) pair, not by FIRST_NAME alone. A user named "John" in two different departments would appear twice.
Advanced usage
Filtering with WHERE
WHERE filters rows before DISTINCT removes duplicates:
SELECT DISTINCT DEPARTMENT
FROM employees
WHERE SALARY > 5000
ORDER BY DEPARTMENT ASC;Output:
department
------------
IT
Marketing
(2 rows)Finding the maximum value per group
Use GROUP BY and MAX() to get the highest salary in each department:
SELECT DEPARTMENT, MAX(SALARY) AS MAX_SALARY
FROM employees
GROUP BY DEPARTMENT
ORDER BY DEPARTMENT ASC;Output:
department | max_salary
------------+------------
IT | 7000.00
Marketing | 6000.00
Sales | 5000.00
(3 rows)Ranking with window functions
Use ROW_NUMBER() to identify the top earner in each department, then sort by salary:
SELECT DISTINCT DEPARTMENT, FIRST_NAME, LAST_NAME, SALARY
FROM (
SELECT DEPARTMENT, FIRST_NAME, LAST_NAME, SALARY,
ROW_NUMBER() OVER (PARTITION BY DEPARTMENT ORDER BY SALARY DESC) AS rn
FROM employees
)
WHERE rn = 1
ORDER BY SALARY DESC;Output:
department | first_name | last_name | salary
------------+------------+-----------+--------
IT | Alice | Johnson | 7000
Marketing | Jane | Smith | 6000
Sales | John | Doe | 5000
(3 rows)Performance considerations
DISTINCT and ORDER BY together require the database to deduplicate and then sort the result, which can be expensive on large datasets. Apply the following optimizations as needed:
Create indexes on columns used in
WHERE,DISTINCT, andORDER BYto reduce scan costs.-- Index to speed up queries on the DEPARTMENT column CREATE INDEX idx_employees_department ON employees(DEPARTMENT);Select only necessary columns. Each additional column in
SELECT DISTINCTincreases the data volume the database must process and compare.Partition large tables. Partitioning limits the rows each query touches, improving query efficiency.
Analyze execution plans. Use
EXPLAIN PLANto identify bottlenecks and verify that indexes are being used.