Impact of Partitioned Tables on Import Performance
In the GaussDB kernel implementation, inserting data into a partitioned table incurs partition routing overhead that a non-partitioned table does not. The overall insertion cost therefore consists of two parts, as shown in Figure 1: (1) heap base table insertion, which writes the tuple into the corresponding heap table and is shared by ordinary and partitioned tables, and (2) partition routing, which determines the target partRel for the tuple. The routing algorithm is shared by partitions and level-2 partitions; the only difference is that a level-2 partition performs one more routing step, calling the routing algorithm twice.
- Heap base table insertion in a partitioned table:
  - The operator noise floor is optimized.
  - Heap data insertion is optimized.
  - Index insertion build (with indexes) is optimized.
- Partition routing in a partitioned table:
  - The logic of the routing search algorithm is optimized.
  - The routing noise floor is optimized, including the overhead of opening the partRel handle of the partitioned table and the logic overhead of function calls.
The performance impact of partition routing is most visible in a single INSERT statement that inserts a large amount of data. In the UPDATE scenario, the system first searches for the tuple to be updated, deletes it, and then inserts a new tuple, so performance is not as good as that of a plain INSERT statement.
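The two-stage insertion flow described above can be sketched as follows. This is an illustrative model only: the function names (`heap_insert`, `route`, `insert_level2`) and the dict-based partition maps are hypothetical, not the actual GaussDB kernel symbols or structures.

```python
def heap_insert(rel, tup):
    """Stage 1: heap base table insertion (shared by all table kinds).
    Modeled here as appending to a list standing in for the heap."""
    rel.append(tup)

def route(part_map, key):
    """Stage 2: partition routing - resolve the target partRel by key.
    The lookup strategy (binary search vs. hash) depends on the
    partitioning type; a plain dict lookup stands in for it here."""
    return part_map[key]

def insert_ordinary(rel, tup):
    # Ordinary table: heap insertion only, no routing overhead.
    heap_insert(rel, tup)

def insert_partitioned(part_map, tup, key):
    # Partitioned table: the routing algorithm is called once.
    heap_insert(route(part_map, key), tup)

def insert_level2(part_map, tup, key1, key2):
    # Level-2 partition: one more routing step, so the routing
    # algorithm is called twice before the heap insertion.
    sub_map = route(part_map, key1)
    heap_insert(route(sub_map, key2), tup)
```

The extra `route` call in `insert_level2` is exactly the overhead that makes level-2 partition import slower than single-level partition import.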
| Partitioning Type | Routing Algorithm Complexity | Implementation Description |
|---|---|---|
| Range partitioning | O(logN) | Implemented based on binary search |
| Interval partitioning | O(logN) | Implemented based on binary search |
| Hash partitioning | O(1) | Implemented based on the key-partOid hash table |
| List partitioning | O(1) | Implemented based on the key-partOid hash table |
| List-list partitioning | O(1) + O(1) | Implemented based on a hash table and another hash table |
| List-range partitioning | O(1) + O(logN) | Implemented based on a hash table and binary search |
| List-hash partitioning | O(1) + O(1) | Implemented based on a hash table and another hash table |
| Range-list partitioning | O(logN) + O(1) | Implemented based on binary search and a hash table |
| Range-range partitioning | O(logN) + O(logN) | Implemented based on binary search and another binary search |
| Range-hash partitioning | O(logN) + O(1) | Implemented based on binary search and a hash table |
| Hash-list partitioning | O(1) + O(1) | Implemented based on a hash table and another hash table |
| Hash-range partitioning | O(1) + O(logN) | Implemented based on a hash table and binary search |
| Hash-hash partitioning | O(1) + O(1) | Implemented based on a hash table and another hash table |
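The two routing strategies in the table can be sketched as below. This is a simplified illustration, not the kernel implementation: the bound list, partOid list, and modulo-based hash mapping are assumptions made for the example.

```python
import bisect

def route_range(upper_bounds, part_oids, key):
    """Range routing, O(logN): binary search over the sorted list of
    exclusive partition upper bounds; partition i covers keys below
    upper_bounds[i] and at or above upper_bounds[i-1]."""
    idx = bisect.bisect_right(upper_bounds, key)
    if idx == len(part_oids):
        raise ValueError("key exceeds the highest partition bound")
    return part_oids[idx]

def route_list(key_to_oid, key):
    """List routing, O(1): direct lookup in a key-partOid hash table."""
    return key_to_oid[key]

def route_hash(part_oids, key):
    """Hash routing, O(1): hash the key and map it onto a partition
    slot (modulo mapping is an assumption for this sketch)."""
    return part_oids[hash(key) % len(part_oids)]
```

For a composite scheme such as range-hash partitioning, `route_range` would pick the level-1 partition and `route_hash` the level-2 partition within it, which is why the total cost is the sum of the two individual complexities.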
- In the x86 server scenario, the import performance of a partitioned table is 10% lower than that of an ordinary table, and the import performance of a level-2 partitioned table is 20% lower than that of an ordinary table.
- In the ARM server scenario, the performance decreases by 20% and 30%, respectively. The main reason is that routing is an in-memory computation-intensive operation, and the single-core instruction processing capability of mainstream x86 CPUs is slightly better than that of ARM CPUs.