Nested Loop Join

Description

A nested-loop join is the simplest join method, which is implemented in all relational database systems. It compares the data in two tables to check whether the join conditions are met.

In GaussDB, a nested-loop join scans the inner table for each row in the outer table searching for rows that meet the join conditions. This is similar to two nested loops. The outer loop traverses the outer table, and the inner loop traverses the inner table.

The time complexity of a nested-loop join is O(n x m), where n and m indicate the numbers of rows in the two tables. If the inner table can be scanned by using indexes, the time complexity can be reduced to O(n log m).

Typical Scenarios

Any of the inner or outer table is very small.
An inner table quickly locates the row that meets the condition (for example, an index has been created for the join column of the inner table) based on the join condition.
A nested-loop join can be executed under any join conditions if it has no restrictions.

Examples

Example: Join two small tables.

-- Prepare data.
gaussdb=# CREATE TABLE employee(id int, deptid int); 
CREATE TABLE 
gaussdb=# INSERT INTO employee VALUES(1, 1), (2,1),(3,2),(4, 1), (5,2); 
INSERT 0 5 
gaussdb=# CREATE TABLE manager(id int, deptid int); 
CREATE TABLE 
gaussdb=# INSERT INTO manager VALUES(1,1), (2,2),(3,1),(4,2); 
INSERT 0 4

-- Execution result.
gaussdb=# EXPLAIN SELECT * FROM employee e JOIN manager m ON e.deptid < m.deptid; 
                               QUERY PLAN                                 
------------------------------------------------------------------------- 
 Nested Loop  (cost=0.00..69341.37 rows=1539400 width=16) 
   Join Filter: (e.deptid < m.deptid) 
   ->  Seq Scan on employee e  (cost=0.00..31.49 rows=2149 width=8) 
   ->  Materialize  (cost=0.00..42.23 rows=2149 width=8) 
         ->  Seq Scan on manager m  (cost=0.00..31.49 rows=2149 width=8) 
(5 rows)

-- Drop.
gaussdb=# DROP TABLE employee,manager;

In the preceding example, the output of nested-loop join is as follows.

Item	Description
Nested Loop	Operator name.
Join Filter	Join predicate of the operator join. In the example, the condition is that e.deptid is less than m.deptid. During query execution, rows that meet these conditions are included in the final result set.

Example: An index has been created for join columns in an inner table.

-- Prepare data.
gaussdb=# CREATE TABLE t1(a int,b int);
CREATE TABLE 
gaussdb=#  CREATE INDEX key_a on t1(a);
CREATE INDEX
gaussdb=# CREATE TABLE t2(a int,b int); 
CREATE TABLE 

-- Collect statistics.
gaussdb=# ANALYZE t1;
ANALYZE
gaussdb=# ANALYZE t2;
ANALYZE

-- Execution result.
gaussdb=# explain select t1.*,t2.* from t1 inner join t2 on t1.a < t2.b;
                               QUERY PLAN
-------------------------------------------------------------------------
 Nested Loop  (cost=0.00..42926.55 rows=1539400 width=16)
   ->  Seq Scan on t2  (cost=0.00..31.49 rows=2149 width=8)
   ->  Index Scan using key_a on t1  (cost=0.00..12.80 rows=716 width=8)
         Index Cond: (a < t2.b)
(4 rows)


-- Drop.
gaussdb=# DROP TABLE t1,t2;

Example: any join condition

-- Prepare data.
gaussdb=# CREATE TABLE t1(a int,b int);
CREATE TABLE 
gaussdb=#  create index key_a on t1(a);
CREATE INDEX
gaussdb=# CREATE TABLE t2(a int,b int); 
CREATE TABLE 

-- Collect statistics.
gaussdb=# ANALYZE t1;
gaussdb=# ANALYZE t2;

-- Execution result.
gaussdb=# explain select t1.*,t2.* from t1 inner join t2 on t1.a != t2.b and t1.a < t2.a;
                               QUERY PLAN
-------------------------------------------------------------------------
 Nested Loop  (cost=0.00..46708.79 rows=1531703 width=16)
   ->  Seq Scan on t2  (cost=0.00..31.49 rows=2149 width=8)
   ->  Index Scan using key_a on t1  (cost=0.00..14.59 rows=713 width=8)
         Index Cond: (a < t2.a)
         Filter: (a <> t2.b)
(5 rows)


-- Drop.
gaussdb=# DROP TABLE t1,t2;

Parent topic: Table Join Modes

Previous topic: Table Join Modes

Next topic: Hash Join