All Products
Search
Document Center

MaxCompute:Query the first N data records of each group

Last Updated:Mar 26, 2026

Use ROW_NUMBER() to rank rows within each group, then filter by row number to retrieve the top N records per group — for example, the lowest-paid employees in each job category.

Sample data

The examples in this topic use the following emp table.

empnoenamejobsal
7369SMITHCLERK800.0
7876SMITHCLERK1100.0
7900JAMESCLERK950.0
7934MILLERCLERK1300.0
7499ALLENSALESMAN1600.0
7654MARTINSALESMAN1250.0
7844TURNERSALESMAN1500.0
7521WARDSALESMAN1250.0

Implementation

Query the row ID of each record and use the WHERE clause to filter the records.

SELECT * FROM (
  SELECT empno
  , ename
  , sal
  , job
  , ROW_NUMBER() OVER (PARTITION BY job ORDER BY sal) AS rn
  FROM emp
) tmp
WHERE rn < 10;

How it works

Assign a row number to each record within its group using ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...), then wrap the query in a subquery and apply a WHERE filter on the row number.

  • PARTITION BY job — resets the row number counter for each distinct job value

  • ORDER BY sal — determines the ranking order within each partition (ascending by default, so the lowest salary gets row number 1)

  • WHERE rn < N + 1 — keeps only the first N rows from each group; replace N with your target count (for example, WHERE rn < 4 returns the top 3 records per group)

Query

SELECT * FROM (
  SELECT empno
  , ename
  , sal
  , job
  , ROW_NUMBER() OVER (PARTITION BY job ORDER BY sal) AS rn
  FROM emp
) tmp
WHERE rn < 10;

Expected output

empnoenamejobsalrn
7369SMITHCLERK800.01
7900JAMESCLERK950.02
7876SMITHCLERK1100.03
7934MILLERCLERK1300.04
7654MARTINSALESMAN1250.01
7521WARDSALESMAN1250.02
7844TURNERSALESMAN1500.03
7499ALLENSALESMAN1600.04
When two rows have the same sal value (for example, MARTIN and WARD both have 1250.0), ROW_NUMBER() assigns them distinct row numbers in an arbitrary order. If you need consistent tie-breaking, add a tiebreaker column to the ORDER BY clause (for example, ORDER BY sal, empno). To include all tied records rather than break ties, use RANK() or DENSE_RANK() instead.