Partial Partition-Wise Join in the SMP Scenario
Partial partition-wise join means that one of the two joined tables is a partitioned table, and the other table can be of any type. A Stream Redistribute operator is added to the upper layer of the table of any type to distribute data and match data with the partitioned table.
The condition for generating a partial partition-wise join path is that the partition key of the partitioned table is a join key.
Examples
-- Create a hash partitioned table. gaussdb=# CREATE TABLE hash_part ( a INTEGER, b INTEGER, c INTEGER ) PARTITION BY HASH(a) ( PARTITION p1, PARTITION p2, PARTITION p3, PARTITION p4, PARTITION p5 ); CREATE TABLE -- Create a non-partitioned table. gaussdb=# CREATE TABLE normal_table ( a INTEGER, b INTEGER, c INTEGER ); CREATE TABLE -- Set query_dop to 5 to enable SMP. gaussdb=# SET query_dop = 5; SET gaussdb=# SET enable_force_smp = on; SET -- Disable the partition-wise join function in the SMP scenario. gaussdb=# SET enable_smp_partitionwise = off; SET -- View a non-partition-wise join plan. According to the plan, after data scan is completed using the Partition Iterator and Partitioned Seq Scan operators at two layers, data is redistributed using the Streaming(type: LOCAL REDISTRIBUTE) operator to ensure that data in the Join operator matches each other. gaussdb=# EXPLAIN (costs off) SELECT * FROM normal_table t1, hash_part t2 WHERE t1.a=t2.a; QUERY PLAN -------------------------------------------------------------------- Streaming(type: LOCAL GATHER dop: 1/5) -> Hash Join Hash Cond: (t1.a = t2.a) -> Streaming(type: LOCAL REDISTRIBUTE dop: 5/5) -> Seq Scan on normal_table t1 -> Hash -> Streaming(type: LOCAL REDISTRIBUTE dop: 5/5) -> Partition Iterator Iterations: 5 -> Partitioned Seq Scan on hash_part t2 Selected Partitions: 1..5 (11 rows) -- Enable the partition-wise join function in the SMP scenario. gaussdb=# SET enable_smp_partitionwise = on; SET -- View the execution plan of partition-wise join. According to the plan, the partial partition-wise join plan eliminates the Streaming operator of table hash_part, that is, data of the partitioned table does not need to be redistributed between threads, thereby reducing overheads of data transfer and improving performance of the Join operation. gaussdb=# EXPLAIN (costs off) SELECT * FROM normal_table t1, hash_part t2 WHERE t1.a=t2.a; QUERY PLAN -------------------------------------------------------------- Streaming(type: LOCAL GATHER dop: 1/5) -> Hash Join (Partition-wise Join) Hash Cond: (t1.a = t2.a) -> Streaming(type: LOCAL REDISTRIBUTE dop: 5/5) -> Seq Scan on normal_table t1 -> Hash -> Partition Iterator Iterations: 5 -> Partitioned Seq Scan on hash_part t2 Selected Partitions: 1..5 (10 rows) -- Drop the partitioned table. gaussdb=# DROP TABLE hash_part; DROP TABLE gaussdb=# DROP TABLE normal_table; DROP TABLE
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot