Updated on 2024-10-25 GMT+08:00

Common Flink SQL Syntax

SELECT and WHERE

Filter using the WHERE clause.

  • Syntax:

    SELECT select_list FROM table_expression [ WHERE boolean_expression ]

  • Example:
    SELECT price + tax FROM Orders WHERE id = 10

WITH

WITH provides a way to write auxiliary statements for use in a larger query. These statements, which are often referred to as Common Table Expression (CTE), can be thought of as defining temporary views that exist just for one query.

  • Syntax:

    WITH <with_item_definition> [ , ... ]

    SELECT ... FROM ...;

    <with_item_defintion>:

    with_item_name (column_name[, ...n]) AS ( <select_query> )

  • Example:

    Define a common table expression orders_with_total and use it in a GROUP BY query.

    WITH orders_with_total AS (
        SELECT order_id, price + tax AS total
        FROM Orders
    )
    SELECT order_id, SUM(total)
    FROM orders_with_total
    GROUP BY order_id;

Window Aggregation

Queries with a group by window aggregation will compute a single result row per group.

  • Syntax:

    SELECT ...

    FROM <windowed_table> -- relation applied windowing TVF

    GROUP BY window_start, window_end, …

  • Example:
    SELECT window_start, window_end, SUM(price)
    FROM TABLE(
    TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES))
    GROUP BY window_start, window_end;

Window Deduplication

Window Deduplication is a special deduplication which removes rows that duplicate over a set of columns, keeping the first one or the last one for each window and partitioned keys.

  • Syntax:

    SELECT [column_list]

    FROM (

    SELECT [column_list],

    ROW_NUMBER() OVER (PARTITION BY window_start, window_end [, col_key1...]

    ORDER BY time_attr [asc|desc]) AS rownum

    FROM table_name) -- relation applied windowing TVF

    WHERE (rownum = 1 | rownum <=1 | rownum < 2) [AND conditions]

  • Example:
    SELECT *
    FROM (
    SELECT bidtime, price, item, supplier_id, window_start, window_end,
    ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY bidtime DESC) AS rownum
    FROM TABLE(
    TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES))
    ) WHERE rownum <= 1;

Top-N

Top-N queries ask for the N smallest or largest values ordered by columns.

  • Syntax:

    SELECT [column_list]

    FROM (

    SELECT [column_list],

    ROW_NUMBER() OVER ([PARTITION BY col1[, col2...]]

    ORDER BY col1 [asc|desc][, col2 [asc|desc]...]) AS rownum

    FROM table_name)

    WHERE rownum <= N [AND conditions]

  • Example:
    CREATE TABLE ShopSales (
    product_id   STRING,
    category     STRING,
    product_name STRING,
    sales        BIGINT
    ) WITH (...);
    
    SELECT *
    FROM (
    SELECT *,
    ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS row_num
    FROM ShopSales)
    WHERE row_num <= 5