Partial Partition-Wise Join in the SMP Scenario
Partial partition-wise join means that one of the two joined tables is a partitioned table, and the other table can be of any type. A Stream Redistribute operator is added to the upper layer of the table of any type to distribute data and match data with the partitioned table.
The condition for generating a partial partition-wise join path is that the partition key of the partitioned table is a join key.
Examples
-- Create a hash partitioned table. gaussdb=# CREATE TABLE hash_part ( a INTEGER, b INTEGER, c INTEGER ) PARTITION BY HASH(a) ( PARTITION p1, PARTITION p2, PARTITION p3, PARTITION p4, PARTITION p5 ); CREATE TABLE -- Create a non-partitioned table. gaussdb=# CREATE TABLE normal_table ( a INTEGER, b INTEGER, c INTEGER ); CREATE TABLE -- Set query_dop to 5 to enable SMP. gaussdb=# SET query_dop = 5; SET gaussdb=# SET enable_force_smp = on; SET -- Disable the partition-wise join function in the SMP scenario. gaussdb=# SET enable_smp_partitionwise = off; SET -- View a non-partition-wise join plan. According to the plan, after data scan is completed using the Partition Iterator and Partitioned Seq Scan operators at two layers, data is redistributed using the Streaming(type: LOCAL REDISTRIBUTE) operator to ensure that data in the Join operator matches each other. gaussdb=# EXPLAIN (costs off) SELECT * FROM normal_table t1, hash_part t2 WHERE t1.a=t2.a; QUERY PLAN -------------------------------------------------------------------- Streaming(type: LOCAL GATHER dop: 1/5) -> Hash Join Hash Cond: (t1.a = t2.a) -> Streaming(type: LOCAL REDISTRIBUTE dop: 5/5) -> Seq Scan on normal_table t1 -> Hash -> Streaming(type: LOCAL REDISTRIBUTE dop: 5/5) -> Partition Iterator Iterations: 5 -> Partitioned Seq Scan on hash_part t2 Selected Partitions: 1..5 (11 rows) -- Enable the partition-wise join function in the SMP scenario. gaussdb=# SET enable_smp_partitionwise = on; SET -- View the execution plan of partition-wise join. According to the plan, the partial partition-wise join plan eliminates the Streaming operator of table hash_part, that is, data of the partitioned table does not need to be redistributed between threads, thereby reducing overheads of data transfer and improving performance of the Join operation. gaussdb=# EXPLAIN (costs off) SELECT * FROM normal_table t1, hash_part t2 WHERE t1.a=t2.a; QUERY PLAN -------------------------------------------------------------- Streaming(type: LOCAL GATHER dop: 1/5) -> Hash Join (Partition-wise Join) Hash Cond: (t1.a = t2.a) -> Streaming(type: LOCAL REDISTRIBUTE dop: 5/5) -> Seq Scan on normal_table t1 -> Hash -> Partition Iterator Iterations: 5 -> Partitioned Seq Scan on hash_part t2 Selected Partitions: 1..5 (10 rows) -- Drop the partitioned table. gaussdb=# DROP TABLE hash_part; DROP TABLE gaussdb=# DROP TABLE normal_table; DROP TABLE
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.