Collecting Partition-Level Statistics
Collecting Statistics on a Specified Partition
You can collect statistics on a single partition of a partitioned table. Partitions whose statistics have already been collected are automatically updated and maintained when statistics are collected again. This function applies to list, hash, and range partitioning.
gaussdb=# CREATE TABLE only_first_part(id int, name varchar)
PARTITION BY RANGE (id)
(
PARTITION id11 VALUES LESS THAN (1000000),
PARTITION id22 VALUES LESS THAN (2000000),
PARTITION max_id1 VALUES LESS THAN (MAXVALUE)
);
gaussdb=# INSERT INTO only_first_part SELECT generate_series(1,5000),'test';
gaussdb=# ANALYZE only_first_part PARTITION (id11);
gaussdb=# ANALYZE only_first_part PARTITION (id22);
gaussdb=# ANALYZE only_first_part PARTITION (max_id1);
gaussdb=# SELECT relname, relpages, reltuples FROM pg_partition WHERE relname IN ('id11', 'id22', 'max_id1');
 relname | relpages | reltuples
---------+----------+-----------
 id11    |     3400 |      5000
 id22    |        0 |         0
 max_id1 |        0 |         0
(3 rows)

gaussdb=# \x
gaussdb=# SELECT * FROM pg_stats WHERE tablename = 'only_first_part' AND partitionname = 'id11';
-[ RECORD 1 ]----------+--------------------------------------------------------------
schemaname             | public
tablename              | only_first_part
attname                | name
inherited              | f
null_frac              | 0
avg_width              | 5
n_distinct             | 1
n_dndistinct           | 0
most_common_vals       | {test}
most_common_freqs      | {1}
histogram_bounds       |
correlation            | 1
most_common_elems      |
most_common_elem_freqs |
elem_count_histogram   |
partitionname          | id11
subpartitionname       |
-[ RECORD 2 ]----------+--------------------------------------------------------------
schemaname             | public
tablename              | only_first_part
attname                | id
inherited              | f
null_frac              | 0
avg_width              | 4
n_distinct             | -1
n_dndistinct           | 0
most_common_vals       |
most_common_freqs      |
histogram_bounds       | {1,50,100,150,200,250,300,350,400,450,500,550,600,650,700,750,800,850,900,950,1000,1050,1100,1150,1200,1250,1300,1350,1400,1450,1500,1550,1600,1650,1700,1750,1800,1850,1900,1950,2000,2050,2100,2150,2200,2250,2300,2350,2400,2450,2500,2550,2600,2650,2700,2750,2800,2850,2900,2950,3000,3050,3100,3150,3200,3250,3300,3350,3400,3450,3500,3550,3600,3650,3700,3750,3800,3850,3900,3950,4000,4050,4100,4150,4200,4250,4300,4350,4400,4450,4500,4550,4600,4650,4700,4750,4800,4850,4900,4950,5000}
correlation            | 1
most_common_elems      |
most_common_elem_freqs |
elem_count_histogram   |
partitionname          | id11
subpartitionname       |

gaussdb=# \x
-- Delete the partitioned table.
gaussdb=# DROP TABLE only_first_part;
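The example above uses range partitioning. As described, the same ANALYZE ... PARTITION syntax also applies to list-partitioned and hash-partitioned tables. The following is a minimal sketch for a hash-partitioned table; the table and partition names are illustrative and not part of the original example:
gaussdb=# CREATE TABLE stats_hash_demo(id int, name varchar)
PARTITION BY HASH (id)
(
PARTITION p_h1,
PARTITION p_h2
);
gaussdb=# INSERT INTO stats_hash_demo SELECT generate_series(1,5000), 'test';
-- Collect statistics on a single hash partition only.
gaussdb=# ANALYZE stats_hash_demo PARTITION (p_h1);
-- relpages and reltuples should now be populated for p_h1 only.
gaussdb=# SELECT relname, relpages, reltuples FROM pg_partition WHERE relname IN ('p_h1', 'p_h2');
gaussdb=# DROP TABLE stats_hash_demo;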
Optimizer Using the Statistics of a Specified Partition
The optimizer preferentially uses the statistics of the specified partition. If statistics are not collected for the specified partition, the optimizer rewrites the partition clause for pruning optimization. For details, see Rewriting a Partition Clause for Pruning Optimization.
gaussdb=# SET enable_fast_query_shipping = off;
gaussdb=# CREATE TABLE ONLY_FIRST_PART_TWO
(
C1 INT,
C2 BIGINT
)
PARTITION BY RANGE(C1)
(
PARTITION P_1 VALUES LESS THAN (1000),
PARTITION P_2 VALUES LESS THAN (3000),
PARTITION P_3 VALUES LESS THAN (MAXVALUE)
);
gaussdb=# INSERT INTO only_first_part_two SELECT generate_series(1,5000), 0;
gaussdb=# EXPLAIN SELECT * FROM only_first_part_two PARTITION (p_2);
                                    QUERY PLAN
---------------------------------------------------------------------------------
 Streaming (type: GATHER)  (cost=0.88..2.89 rows=30 width=12)
   Node/s: All datanodes
   ->  Partition Iterator  (cost=0.00..1.14 rows=30 width=12)
         Iterations: 1
         ->  Partitioned Seq Scan on only_first_part_two  (cost=0.00..1.14 rows=30 width=12)
               Selected Partitions:  2
(6 rows)

gaussdb=# EXPLAIN SELECT * FROM only_first_part_two PARTITION (p_1) where c2 = 2;
                                     QUERY PLAN
-----------------------------------------------------------------------------------
 Streaming (type: GATHER)  (cost=0.06..1.30 rows=1 width=12)
   Node/s: All datanodes
   ->  Partition Iterator  (cost=0.00..1.18 rows=1 width=12)
         Iterations: 1
         ->  Partitioned Seq Scan on only_first_part_two  (cost=0.00..1.18 rows=1 width=12)
               Filter: (c2 = 2)
               Selected Partitions:  1
(7 rows)

gaussdb=# DROP TABLE only_first_part_two;
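If partition-level statistics were collected before the queries above (using ANALYZE ... PARTITION as in the previous section), the optimizer would use them directly instead of relying on the rewrite. A minimal sketch, assuming the table has not yet been dropped; these statements are not part of the original example:
gaussdb=# ANALYZE only_first_part_two PARTITION (p_2);
-- Confirm that partition-level statistics now exist for p_2.
gaussdb=# SELECT tablename, partitionname, attname, n_distinct FROM pg_stats
          WHERE tablename = 'only_first_part_two' AND partitionname = 'p_2';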
Rewriting a Partition Clause for Pruning Optimization
If no partition-level statistics are available, the optimizer's row estimation module logically rewrites the partition clause into a pseudo-predicate on the partition key. The rewritten pseudo-predicate then feeds into the selectivity calculation together with the table-level statistics, yielding a relatively accurate row estimate (see the sketch after the constraint list below).
- This feature applies only to selectivity calculation.
- The feature does not support level-2 partitions.
- This feature supports only range partitioning and list partitioning.
- For range partitioning, only single-column partition keys can be rewritten; multi-column partition keys are not supported.
- For list partitioning, to ensure performance, a specified list partition can define at most 40 enumerated values:
  - If a specified list partition has more than 40 enumerated values, this feature no longer applies.
  - For the default partition, the number of enumerated values is counted as the total number of enumerated values across all non-default partitions.
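Conceptually, the rewrite treats the selected partition's boundaries as a predicate on the partition key. For instance, with the test_int4_maxvalue table created in Example 1 below, selecting PARTITION (id1) (boundary VALUES LESS THAN (1000)) is estimated roughly as if the query carried the pseudo-predicate id < 1000, and selecting PARTITION (max_id) as if it carried id >= 2000. The comparison below is illustrative only; the optimizer derives such predicates internally, and these statements are not part of the original example:
-- Illustrative comparison: the partition clause on id1 is estimated roughly
-- like the explicit range predicate in the second statement.
gaussdb=# EXPLAIN SELECT * FROM test_int4_maxvalue PARTITION (id1);
gaussdb=# EXPLAIN SELECT * FROM test_int4_maxvalue WHERE id < 1000;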
Example 1: Rewriting a range partition
gaussdb=# CREATE TABLE test_int4_maxvalue(id INT, name VARCHAR)
PARTITION BY RANGE(id)
(
PARTITION id1 VALUES LESS THAN(1000),
PARTITION id2 VALUES LESS THAN(2000),
PARTITION max_id VALUES LESS THAN(MAXVALUE)
);
gaussdb=# INSERT INTO test_int4_maxvalue SELECT GENERATE_SERIES(1,5000),'test';
gaussdb=# ANALYZE test_int4_maxvalue with global;
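-- Note (added for clarity): ANALYZE is run on the whole table here and no
-- ANALYZE ... PARTITION statement is executed, so no partition-level
-- statistics exist and the row estimates for the PARTITION(...) queries below
-- come from the pseudo-predicate rewrite described above.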
-- Query the specified partition id1.
gaussdb=# EXPLAIN SELECT * FROM test_int4_maxvalue PARTITION(id1);
QUERY PLAN
----------------------------------------------------------------------------------------
 Data Node Scan  (cost=0.00..0.00 rows=0 width=0)
   Node/s: All datanodes
   Remote SQL: SELECT id, name FROM public.test_int4_maxvalue PARTITION (id1)

 Datanode Name: d1_datanode1
   Partition Iterator  (cost=0.00..7.91 rows=491 width=9)
     Iterations: 1
     ->  Partitioned Seq Scan on test_int4_maxvalue  (cost=0.00..7.91 rows=491 width=9)
           Selected Partitions:  1

 Datanode Name: d1_datanode2
   Partition Iterator  (cost=0.00..8.08 rows=508 width=9)
     Iterations: 1
     ->  Partitioned Seq Scan on test_int4_maxvalue  (cost=0.00..8.08 rows=508 width=9)
           Selected Partitions:  1

(16 rows)
-- Query the specified partition max_id.
gaussdb=# EXPLAIN SELECT * FROM test_int4_maxvalue PARTITION(max_id);
QUERY PLAN
------------------------------------------------------------------------------------------
 Data Node Scan  (cost=0.00..0.00 rows=0 width=0)
   Node/s: All datanodes
   Remote SQL: SELECT id, name FROM public.test_int4_maxvalue PARTITION (max_id)

 Datanode Name: d1_datanode1
   Partition Iterator  (cost=0.00..24.46 rows=1546 width=9)
     Iterations: 1
     ->  Partitioned Seq Scan on test_int4_maxvalue  (cost=0.00..24.46 rows=1546 width=9)
           Selected Partitions:  3

 Datanode Name: d1_datanode2
   Partition Iterator  (cost=0.00..23.55 rows=1455 width=9)
     Iterations: 1
     ->  Partitioned Seq Scan on test_int4_maxvalue  (cost=0.00..23.55 rows=1455 width=9)
           Selected Partitions:  3

(16 rows)
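-- Verification added for illustration (not part of the original example): the
-- estimates above line up with the row counts that the rewritten
-- pseudo-predicates would select, i.e. 491 + 508 = 999 rows for id1 and
-- 1546 + 1455 = 3001 rows for max_id.
gaussdb=# SELECT count(*) FROM test_int4_maxvalue WHERE id < 1000;    -- 999 rows, partition id1
gaussdb=# SELECT count(*) FROM test_int4_maxvalue WHERE id >= 2000;   -- 3001 rows, partition max_id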
-- Delete the partitioned table.
gaussdb=# DROP TABLE test_int4_maxvalue;
Example 2: Rewriting a list partition
gaussdb=# CREATE TABLE test_default
(
c1 INT,
c2 BIGINT
)
PARTITION BY LIST(c2)
(
PARTITION p_1 VALUES (10000, 20000),
PARTITION p_2 VALUES (300000, 400000, 500000),
PARTITION p_3 VALUES (DEFAULT)
);
gaussdb=# INSERT INTO test_default SELECT GENERATE_SERIES(1, 1000), 10000;
gaussdb=# INSERT INTO test_default SELECT GENERATE_SERIES(1001, 2000), 600000;
gaussdb=# ANALYZE test_default with global;
-- Query the specified partition p_1.
gaussdb=# EXPLAIN SELECT * FROM test_default PARTITION(p_1);
                                     QUERY PLAN
-----------------------------------------------------------------------------------
 Data Node Scan  (cost=0.00..0.00 rows=0 width=0)
   Node/s: All datanodes
   Remote SQL: SELECT c1, c2 FROM public.test_default PARTITION (p_1)

 Datanode Name: d1_datanode1
   Partition Iterator  (cost=0.00..7.92 rows=492 width=12)
     Iterations: 1
     ->  Partitioned Seq Scan on test_default  (cost=0.00..7.92 rows=492 width=12)
           Selected Partitions:  1

 Datanode Name: d1_datanode2
   Partition Iterator  (cost=0.00..8.08 rows=508 width=12)
     Iterations: 1
     ->  Partitioned Seq Scan on test_default  (cost=0.00..8.08 rows=508 width=12)
           Selected Partitions:  1

(16 rows)

-- Query the specified partition p_3.
gaussdb=# EXPLAIN SELECT * FROM test_default PARTITION(p_3);
                                     QUERY PLAN
-----------------------------------------------------------------------------------
 Data Node Scan  (cost=0.00..0.00 rows=0 width=0)
   Node/s: All datanodes
   Remote SQL: SELECT c1, c2 FROM public.test_default PARTITION (p_3)

 Datanode Name: d1_datanode1
   Partition Iterator  (cost=0.00..8.24 rows=524 width=12)
     Iterations: 1
     ->  Partitioned Seq Scan on test_default  (cost=0.00..8.24 rows=524 width=12)
           Selected Partitions:  3

 Datanode Name: d1_datanode2
   Partition Iterator  (cost=0.00..7.76 rows=476 width=12)
     Iterations: 1
     ->  Partitioned Seq Scan on test_default  (cost=0.00..7.76 rows=476 width=12)
           Selected Partitions:  3

(16 rows)

-- Delete the partitioned table.
gaussdb=# DROP TABLE test_default;
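In this example, p_1 defines 2 enumerated values and p_2 defines 3, so the default partition p_3 is counted as 2 + 3 = 5 enumerated values; every partition stays well under the 40-value limit, so the rewrite applies to each of them. Before the table is dropped, the boundary values per partition could be inspected with the query below; this is a hedged sketch that assumes pg_partition exposes each partition's boundary values in its boundaries column:
gaussdb=# SELECT relname, boundaries FROM pg_partition WHERE relname IN ('p_1', 'p_2', 'p_3');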