Updated on 2025-05-29 GMT+08:00

Streaming

Description

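If an SQL statement cannot be pushed down, a distributed execution plan is generated first. In such a plan, Streaming operators move data between the coordinator node (CN) and the data nodes (DNs). Streaming has three forms, each corresponding to a different data shuffle function in the distributed architecture: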
  • Streaming (type: GATHER): The CN collects data from DNs.
  • Streaming (type: REDISTRIBUTE): Data is redistributed to all the DNs based on selected columns.
  • Streaming (type: BROADCAST): Data on the current DN is broadcast to other DNs.
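
The three forms can be pictured with a small in-memory model. The following is a minimal sketch, not GaussDB's implementation: each DN is assumed to be a plain list of (c1, c2) rows, the function names gather, redistribute, and broadcast are hypothetical, and Python's built-in hash() stands in for whatever hash function the database actually uses.

```python
# Hypothetical model: a cluster is a list of DNs, each DN a list of (c1, c2) rows.

def gather(dns):
    """GATHER: the CN collects the rows held by every DN into one result."""
    return [row for dn in dns for row in dn]

def redistribute(dns, key_index):
    """REDISTRIBUTE: send each row to the DN chosen by hashing the selected column."""
    n = len(dns)
    out = [[] for _ in range(n)]
    for dn in dns:
        for row in dn:
            # hash() is an assumption standing in for the real distribution hash.
            out[hash(row[key_index]) % n].append(row)
    return out

def broadcast(dns):
    """BROADCAST: each DN sends its rows to the others, so every DN holds all rows."""
    all_rows = [row for dn in dns for row in dn]
    return [list(all_rows) for _ in dns]

# Two DNs with two rows each.
dns = [[(1, 10), (2, 21)], [(3, 10), (4, 31)]]

collected = gather(dns)
by_c2 = redistribute(dns, key_index=1)  # rows with equal c2 land on the same DN
copied = broadcast(dns)
```

After redistribution on c2, rows that share a c2 value sit on the same DN, which is what allows a join on that column to run locally. Broadcast reaches the same goal by copying one side of the join in full to every DN.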

Typical Scenarios

Distributed Execution Plan

Examples

-- Prepare data.
gaussdb=# DROP TABLE IF EXISTS t1;
gaussdb=# DROP TABLE IF EXISTS t2;
gaussdb=# CREATE TABLE t1(c1 int, c2 int);
gaussdb=# CREATE TABLE t2(c1 int, c2 int);

-- View the execution plan.
gaussdb=# EXPLAIN SELECT * FROM t1,t2 WHERE t1.c2 = t2.c2;
                                 QUERY PLAN
----------------------------------------------------------------------------
 Streaming (type: GATHER)  (cost=13.92..29.57 rows=20 width=16)
   Node/s: All datanodes
   ->  Hash Join  (cost=13.29..28.64 rows=20 width=16)
         Hash Cond: (t1.c2 = t2.c2)
         ->  Streaming(type: BROADCAST)  (cost=0.00..15.18 rows=40 width=8)
               Spawn on: All datanodes
               ->  Seq Scan on t1  (cost=0.00..13.13 rows=20 width=8)
         ->  Hash  (cost=13.13..13.13 rows=21 width=8)
               ->  Seq Scan on t2  (cost=0.00..13.13 rows=20 width=8)
(9 rows)

-- Drop the tables.
gaussdb=# DROP TABLE IF EXISTS t1;
gaussdb=# DROP TABLE IF EXISTS t2;
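
The data flow of the plan above can be traced with a short simulation. This is a sketch under stated assumptions rather than a description of the engine internals: the two-DN layout, the sample rows, and the helper local_hash_join are all hypothetical. It follows the plan's shape: t1 is broadcast, each DN builds a hash table on its local slice of t2 and probes it with the broadcast rows, and the CN gathers the per-DN results.

```python
from collections import defaultdict

def local_hash_join(probe_rows, build_rows):
    """Join on c2 (index 1): hash the build side, then probe it row by row."""
    table = defaultdict(list)
    for row in build_rows:
        table[row[1]].append(row)
    return [p + b for p in probe_rows for b in table.get(p[1], [])]

# Hypothetical data: two DNs, each holding a slice of t1 and t2 as (c1, c2) rows.
t1_slices = [[(1, 10)], [(2, 20)]]
t2_slices = [[(7, 20)], [(8, 10), (9, 20)]]

# Streaming (type: BROADCAST): every DN receives all rows of t1.
t1_all = [row for s in t1_slices for row in s]

# Hash Join: each DN joins the broadcast t1 rows with its local t2 slice.
per_dn = [local_hash_join(t1_all, t2_local) for t2_local in t2_slices]

# Streaming (type: GATHER): the CN collects the per-DN join results.
result = [row for dn in per_dn for row in dn]
```

Because each t2 row lives on exactly one DN while every DN sees all of t1, each matching pair is produced exactly once, so the gathered result equals the join computed on a single node.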

Item                      Description
------------------------  ------------------------------
Streaming                 Specifies the operator name.
Spawn on: All datanodes   Distributes data to all DNs.