SMP Application Scenarios and Restrictions
The SMP feature improves performance through operator parallelism, at the cost of more system resources, including CPU, memory, network, and I/O. In effect, SMP trades resources for time: it improves system performance in suitable scenarios when resources are sufficient, and may degrade performance otherwise. The scenarios in which SMP applies, and its restrictions, are described as follows:
- Operators supporting parallel processing are used.
- Scan: Row-store and column-store ordinary and partitioned tables, and ORC-formatted OBS foreign tables can be sequentially scanned. Foreign tables imported using GDS can be scanned in parallel. None of these scans supports replication tables.
- Join: HashJoin, NestLoop
- Agg: HashAgg, SortAgg, PlainAgg, and WindowAgg (WindowAgg supports only partition by, not order by)
- Stream: Redistribute, Broadcast
- Other: Result, Subqueryscan, Unique, Material, Setop, Append, VectoRow, RowToVec
- SMP-unique operators
- Local Gather aggregates data of parallel threads within a DN.
- Local Redistribute redistributes data across threads within a DN based on the distribution key.
- Local Broadcast broadcasts data to each thread within a DN.
- Local RoundRobin distributes data in polling mode across threads within a DN.
- Split Redistribute redistributes data across parallel threads on different DNs.
- Split Broadcast broadcasts data to all parallel DN threads in the cluster.
Among these operators, Local operators exchange data between parallel threads within a DN, and non-Local operators exchange data across DNs.
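The behavior of the Local operators can be illustrated with a toy model. The following is a minimal sketch, not the engine's implementation: Python lists stand in for the parallel threads of one DN, and the function names (`local_redistribute`, `local_round_robin`, `local_broadcast`, `local_gather`) are hypothetical names chosen for this illustration. Hashing the key modulo the thread count is also an assumption about how rows could be assigned.

```python
def local_redistribute(rows, key, dop):
    """Route each row to one thread based on its distribution key (assumed: hash modulo dop)."""
    threads = [[] for _ in range(dop)]
    for row in rows:
        threads[hash(row[key]) % dop].append(row)
    return threads


def local_round_robin(rows, dop):
    """Hand rows to threads in turn, regardless of their content."""
    threads = [[] for _ in range(dop)]
    for i, row in enumerate(rows):
        threads[i % dop].append(row)
    return threads


def local_broadcast(rows, dop):
    """Give every thread a full copy of the rows."""
    return [list(rows) for _ in range(dop)]


def local_gather(threads):
    """Collect the rows of all parallel threads into a single stream."""
    return [row for thread in threads for row in thread]


# Six rows with integer keys, processed with a DOP of 2.
rows = [{"id": i, "value": i * 10} for i in range(6)]
redistributed = local_redistribute(rows, "id", 2)
round_robin = local_round_robin(rows, 2)
broadcast = local_broadcast(rows, 2)
gathered = local_gather(redistributed)
```

In this sketch, redistribution guarantees that rows with the same key land on the same thread, which is what a parallel join or aggregation needs, whereas round-robin only balances the row count. The Split operators would follow the same patterns, with the target threads located on different DNs.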
- Example
Figure 1 TPCH Q1 parallel execution plan
In this plan, the Partitioned CStore Scan and HashAgg operators are processed in parallel, and the Local Gather and Split Redistribute operators are added.
In this example, operator No. 6 is Split Redistribute, and the dop: 2/2 annotation next to it indicates that both its sender and its receiver run with a degree of parallelism (DOP) of 2. Operator No. 4 is Local Gather and is marked with dop: 1/2: the DOP of its sender is 2 and that of its receiver is 1. That is, operator No. 5 (Hash Aggregate) at the lower layer is executed with a DOP of 2, while operators No. 1 to 3 at the upper layers are executed serially. In this way, operator No. 4 gathers the data of the parallel threads within a DN.
The dop information shows the degree of parallelism of each operator.
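As a reading aid for the dop annotation, the following minimal sketch extracts the receiver and sender DOP from a plan line. It assumes the `dop: receiver/sender` order implied by the example above (dop: 1/2 means a receiver DOP of 1 and a sender DOP of 2). The function name `parse_dop` and the sample plan lines are hypothetical; the exact text of real plan output may differ.

```python
import re

# Matches annotations such as "dop: 1/2"; the first number is taken as the
# receiver DOP and the second as the sender DOP, following the example above.
DOP_PATTERN = re.compile(r"dop:\s*(\d+)/(\d+)")


def parse_dop(plan_line):
    """Return (receiver_dop, sender_dop), or None if the line has no dop annotation."""
    match = DOP_PATTERN.search(plan_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Hypothetical plan lines modeled on the operators discussed above.
local_gather_dop = parse_dop("Streaming(type: LOCAL GATHER dop: 1/2)")
split_redistribute_dop = parse_dop("Streaming(type: SPLIT REDISTRIBUTE dop: 2/2)")
serial_operator_dop = parse_dop("Sort")
```

A receiver DOP lower than the sender DOP, as for Local Gather here, marks the point in the plan where parallel threads are merged back into fewer threads.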
- Operators are processed on CNs.
- Statements that cannot be pushed down are executed.
- The subplan and initplan of a query and operators containing a subquery are executed.