Updated on 2026-05-15 GMT+08:00

Top-N

Function

Top-N queries are used to identify the N smallest or largest values based on a specific column's sorting order. Both maximum and minimum value sets are categorized as Top-N queries. These queries are highly effective in both batch and stream processing tables when you need to display only the N top-tier or bottom-tier records that meet specific criteria.

Syntax Format

1
2
3
4
5
6
7
SELECT [column_list]
FROM (
   SELECT [column_list],
     ROW_NUMBER() OVER ([PARTITION BY col1[, col2...]]
       ORDER BY col1 [asc|desc][, col2 [asc|desc]...]) AS rownum
   FROM table_name)
WHERE rownum <= N [AND conditions]

Syntax Description

  • ROW_NUMBER(): Allocates a unique and consecutive number to each line starting from the first line in the current partition. Currently, ROW_NUMBER can be used in the over window function only. RANK() and DENSE_RANK() will be supported soon.
  • PARTITION BY col1[, col2...]: Specifies the partition columns. Each partition will have a Top-N result.
  • ORDER BY col1 [asc|desc][, col2 [asc|desc]...]: Specifies the ordering columns. The ordering directions can be different for different columns.
  • WHERE rownum <= N: Specifies a Top-N query in Flink. Where N represents the N smallest or largest records that need to be returned.
  • [AND conditions]: You can add other query conditions in the where clause. Note that the other conditions can only be combined with rownum <= N using the AND logic.

Constraints

  • Top-N queries return updated results.
  • Flink SQL will sort the input data stream according to the order key.
  • If the top N records change, the changed ones are sent as retractions or record updates to downstream systems.
  • If the top N records need to be stored in external storage, the result table must have the same unique key as the Top-N query.

Example

This is an example to get the top five products per category that have the maximum sales in real time.

1
2
3
4
5
6
SELECT * 
  FROM ( 
     SELECT *,
         ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) as row_num
     FROM ShopSales)
  WHERE row_num <= 5;