Materialize
Description
The Materialize operator is used to cache the results returned by subnodes and save the subquery results. For subnodes that need to be scanned repeatedly for multiple times (especially when the scanning results are the same each time), the execution cost can be reduced.
Typical Scenarios
- If a query statement involves subqueries and the same batch of data needs to be queried for multiple times, the optimizer selects the Materialize operator to cache the subquery results, greatly reducing the scan execution time. Subqueries often exist in clauses such as IN, ANY, ALL, and EXISTS.
- Nested loop is used as the join operator.
Examples
Perform subqueries with ALL.
-- Prepare data. gaussdb=#DROP TABLE IF EXISTS student; gaussdb=#CREATE TABLE student(id integer, class_id integer, grade number); CREATE TABLE gaussdb=#CREATE TABLE class_table(class_id integer); CREATE TABLE gaussdb=#INSERT INTO student VALUES(generate_series(1,50), 1, floor(100 * random())); INSERT 0 50 gaussdb=#INSERT INTO student VALUES(generate_series(51,100), 2, floor(100 * random())); INSERT 0 50 gaussdb=#INSERT INTO student VALUES(generate_series(101,150), 3, floor(100 * random())); INSERT 0 50 gaussdb=#INSERT INTO student VALUES(generate_series(151,200), 3, floor(100 * random())); INSERT 0 50 gaussdb=#INSERT INTO class_table VALUES(1),(2),(3),(4); INSERT 0 4 -- Execution result. gaussdb=#EXPLAIN SELECT * FROM student WHERE class_id >= all (SELECT class_id FROM class_table); QUERY PLAN --------------------------------------------------------------------------------------- Streaming (type: GATHER) (cost=0.38..100.67 rows=10 width=40) Node/s: All datanodes -> Seq Scan on student (cost=0.00..100.11 rows=10 width=40) Filter: (SubPlan 1) SubPlan 1 -> Materialize (cost=0.00..13.28 rows=80 width=4) -> Streaming(type: BROADCAST) (cost=0.00..13.18 rows=40 width=4) Spawn on: All datanodes -> Seq Scan on class_table (cost=0.00..13.13 rows=20 width=4) (9 rows) -- Drop. gaussdb=#DROP TABLE IF EXISTS student,class_table;
In the preceding example, the output of the Materialize operator is as follows.
Item |
Description |
---|---|
Materialize |
Operator name. |
Streaming |
Operator name. |
Spawn on: datanode1 |
In the example, the data is distributed to all DNs. |
Example: Nested loop is used as the join operator.
-- Prepare data. gaussdb=# CREATE TABLE t1(a int,b int); CREATE TABLE gaussdb=# create index key_a on t1(a); CREATE INDEX gaussdb=# CREATE TABLE t2(a int,b int); CREATE TABLE -- Collect statistics. gaussdb=# ANALYZE t1; gaussdb=# ANALYZE t2; -- Execution result. gaussdb=# explain select t1.*,t2.* from t1 inner join t2 on t1.a < t2.b; QUERY PLAN ---------------------------------------------------------------------------- Streaming (type: GATHER) (cost=4.00..38.43 rows=133 width=16) Node/s: All datanodes -> Nested Loop (cost=0.00..32.24 rows=133 width=16) Join Filter: (t1.a < t2.b) -> Streaming(type: BROADCAST) (cost=0.00..15.18 rows=40 width=8) Spawn on: All datanodes -> Seq Scan on t1 (cost=0.00..13.13 rows=20 width=8) -> Materialize (cost=0.00..13.20 rows=20 width=8) -> Seq Scan on t2 (cost=0.00..13.13 rows=20 width=8) (9 rows) -- Drop. gaussdb=# DROP TABLE t1,t2;
In the preceding example, the output of the Materialize operator is as follows.
Item |
Description |
---|---|
Materialize |
Operator name. |
Seq Scan |
Operator name. |
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot