Updated on 2025-05-29 GMT+08:00

Materialize

Description

The Materialize operator is used to cache the results returned by subnodes and save the subquery results. For subnodes that need to be scanned repeatedly for multiple times (especially when the scanning results are the same each time), the execution cost can be reduced.

Typical Scenarios

  • If a query statement involves subqueries and the same batch of data needs to be queried for multiple times, the optimizer selects the Materialize operator to cache the subquery results, greatly reducing the scan execution time. Subqueries often exist in clauses such as IN, ANY, ALL, and EXISTS.
  • Nested loop is used as the join operator.

Examples

Perform subqueries with ALL.

-- Prepare data.
gaussdb=# CREATE TABLE student(id integer, class_id integer, grade number); 
CREATE TABLE 
gaussdb=# CREATE TABLE class_table(class_id integer); 
CREATE TABLE 
gaussdb=# INSERT INTO student VALUES(generate_series(1,50), 1, floor(100 * random())); 
INSERT 0 50 
gaussdb=# INSERT INTO student VALUES(generate_series(51,100), 2, floor(100 * random())); 
INSERT 0 50 
gaussdb=# INSERT INTO student VALUES(generate_series(101,150), 3, floor(100 * random())); 
INSERT 0 50 
gaussdb=# INSERT INTO student VALUES(generate_series(151,200), 3, floor(100 * random())); 
INSERT 0 50 
gaussdb=# INSERT INTO class_table VALUES(1),(2),(3),(4); 
INSERT 0 4
-- Execution result.
gaussdb=# EXPLAIN SELECT * FROM student WHERE class_id >= all (SELECT class_id FROM class_table); 
                                 QUERY PLAN                                   
----------------------------------------------------------------------------- 
 Seq Scan on student  (cost=0.00..5206.50 rows=100 width=12) 
   Filter: (SubPlan 1) 
   SubPlan 1 
     ->  Materialize  (cost=0.00..46.03 rows=2402 width=4) 
           ->  Seq Scan on class_table  (cost=0.00..34.02 rows=2402 width=4) 
(5 rows) 

-- Drop.
gaussdb=# DROP TABLE IF EXISTS student,class_table;

In the preceding example, the output of the Materialize operator is as follows.

Item

Description

Materialize

Operator name.

Seq Scan

Operator name.

Example: Nested loop is used as the join operator.

-- Prepare data.
gaussdb=# CREATE TABLE employee(id int, deptid int); 
CREATE TABLE 
gaussdb=# INSERT INTO employee VALUES(1, 1), (2,1),(3,2),(4, 1), (5,2); 
INSERT 0 5 
gaussdb=# CREATE TABLE manager(id int, deptid int); 
CREATE TABLE 
gaussdb=# INSERT INTO manager VALUES(1,1), (2,2),(3,1),(4,2); 
INSERT 0 4
-- Execution result.
gaussdb=# EXPLAIN SELECT * FROM employee e JOIN manager m ON e.deptid < m.deptid; 
                               QUERY PLAN                                 
------------------------------------------------------------------------- 
 Nested Loop  (cost=0.00..69341.37 rows=1539400 width=16) 
   Join Filter: (e.deptid < m.deptid) 
   ->  Seq Scan on employee e  (cost=0.00..31.49 rows=2149 width=8) 
   ->  Materialize  (cost=0.00..42.23 rows=2149 width=8) 
         ->  Seq Scan on manager m  (cost=0.00..31.49 rows=2149 width=8) 
(5 rows)

-- Drop.
gaussdb=# DROP TABLE employee,manager;