Help Center/ Data Lake Insight/ Spark SQL Syntax Reference/ Data/ Reusing Results of Subqueries
Updated on 2025-03-17 GMT+08:00

Reusing Results of Subqueries

Function

To improve query performance, caching the results of a subquery and reusing them in different parts of the query can be done to avoid redundant calculations of the same subquery during the execution of a SELECT statement.

Syntax

1
2
3
4
5
6
7
8
WITH cte_name AS (
    SELECT ...
    FROM table_name
	WHERE ...
)
SELECT ...
FROM cte_name
WHERE ...

Keywords

Table 1 Keywords

Parameter

Description

cte_name

Custom subquery name

table_name

Name of the table where subquery commands are executed

Example

  • Example 1: Define the sales_data subquery based on the sales table, query all items with sales amounts greater than 100, and retrieve all data from the subquery.
    1
    2
    3
    4
    5
    6
    WITH sales_data AS (
      SELECT product_name, sales_amount
      FROM sales
      WHERE sales_amount > 100
    )
    SELECT * FROM sales_data;
    
  • Example 2: Define two subqueries, sales_data1 and sales_data2, based on the sales table, where sales_data1 retrieves all items with sales amounts greater than 100 and sales_data2 retrieves all items with sales amounts less than 20. Finally, merge the data from both subqueries.
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    WITH sales_data1 AS (
      SELECT product_name, sales_amount
      FROM sales
      WHERE sales_amount > 100
    ),
    sales_data2 AS (
      SELECT product_name, sales_amount
      FROM sales
      WHERE sales_amount < 20
    )
    SELECT * FROM sales_data1
    UNION ALL
    SELECT * FROM sales_data2;