Updated on 2024-08-20 GMT+08:00

Other Optimizer Options

cost_model_version

Parameter description: Specifies the version of the optimizer cost model. It acts as a safeguard: setting it to an earlier version disables the latest cost model so that plans stay consistent with those of the earlier version. Changing this parameter may change the plans of many SQL statements, so exercise caution when modifying it.

Parameter type: integer.

Unit: none

Value range: 0, 1, 2, 3, or 4

0: indicates that the latest cost estimation model is used. In the current version, this is equivalent to 4.
1: indicates that the original cost estimation model is used.
2: indicates that, on the basis of 1, the enhanced COALESCE expression estimation, hash join cost, and semi/anti join cost are used.
3: indicates that, on the basis of 2, the boundary correction estimator is used to estimate the number of distinct values (NDV), and that the indexscan hint can also be applied to indexonlyscan.
4: indicates that, on the basis of 3, partition-level statistics are used for cost estimation.

Default value: 0

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: When upgrading the database, you are advised to keep this parameter at the value used in the source version. For a newly installed environment, you are advised to use the default value.
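
As a minimal sketch of how a session might pin the cost model and later return to the default, the statements below use the standard SET, SHOW, and RESET commands for a USERSET parameter. The value 3 is only an illustration; use the value that matches your source version.

```sql
-- Pin this session to cost model version 3 (illustrative value).
SET cost_model_version = 3;

-- Confirm the value now in effect.
SHOW cost_model_version;

-- Return to the configured default.
RESET cost_model_version;
```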

explain_dna_file

Parameter description: Specifies the CSV file to which plan information is exported when explain_perf_mode is set to run.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

The value of this parameter must be an absolute path plus a file name with the extension .csv.

Value range: a string.

Default value: empty
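
A minimal sketch of the two settings working together is shown below. The file path is hypothetical and chosen only to satisfy the absolute-path and .csv requirements; which statement then triggers the export is not stated in this section, so none is shown.

```sql
-- Hypothetical absolute path; it must end with the .csv extension.
SET explain_dna_file = '/tmp/plan_export.csv';

-- Switch the explain output to CSV export mode.
SET explain_perf_mode = run;
```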

explain_perf_mode

Parameter description: Specifies the display format of the explain command.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: normal, pretty, summary, and run

normal indicates that the default printing format is used.
pretty indicates the improved format provided by GaussDB. It includes a plan node ID for each operator, which makes performance analysis more direct.
summary indicates that, in addition to the information printed in the pretty format, an analysis of that information is printed.
run indicates that the system also exports the information printed in the summary format as a CSV file for further analysis.

The order in which plan information is displayed differs considerably between display formats. Examples of the normal and pretty formats are as follows:

Example of the normal format:

                                                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=21.23..21.23 rows=1 width=306)
   Sort Key: supplier.s_suppkey
   CTE revenue
     ->  HashAggregate  (cost=12.88..12.88 rows=1 width=76)
           Group By Key: lineitem.l_suppkey
           ->  Partition Iterator  (cost=0.00..12.87 rows=1 width=44)
                 Iterations: 7
                 ->  Partitioned Seq Scan on lineitem  (cost=0.00..12.87 rows=1 width=44)
                       Filter: ((l_shipdate >= '1996-01-01 00:00:00'::timestamp(0) without time zone) AND (l_shipdate < '1996-04-01 00:00:00'::timestamp without time zone))
                       Selected Partitions:  1..7
   InitPlan 2 (returns $3)
     ->  Aggregate  (cost=0.02..0.03 rows=1 width=64)
           ->  CTE Scan on revenue  (cost=0.00..0.02 rows=1 width=32)
   ->  Nested Loop  (cost=0.00..8.30 rows=1 width=306)
         ->  CTE Scan on revenue  (cost=0.00..0.02 rows=1 width=40)
               Filter: (total_revenue = $3)
         ->  Partition Iterator  (cost=0.00..8.27 rows=1 width=274)
               Iterations: 7
               ->  Partitioned Index Scan using supplier_s_suppkey_idx on supplier  (cost=0.00..8.27 rows=1 width=274)
                     Index Cond: (s_suppkey = revenue.supplier_no)
                     Selected Partitions:  1..7
(21 rows)

Example of the pretty format:

 id |                                  operation                                   | E-rows | E-width |    E-costs
----+------------------------------------------------------------------------------+--------+---------+----------------
  1 | ->  Sort                                                                     |      1 |     306 | 21.230..21.235
  2 |    ->  Nested Loop (3,9)                                                     |      1 |     306 | 0.000..8.303
  3 |       ->  CTE Scan on revenue                                                |      1 |      40 | 0.000..0.022
  4 |    ->  HashAggregate  [3, CTE revenue]                                       |      1 |      76 | 12.875..12.885
  5 |       ->  Partition Iterator                                                 |      1 |      44 | 0.000..12.865
  6 |          ->  Partitioned Seq Scan on lineitem                                |      1 |      44 | 0.000..12.865
  7 |    ->  Aggregate  [4, InitPlan 2 (returns $3)]                               |      1 |      64 | 0.022..0.033
  8 |       ->  CTE Scan on revenue                                                |      1 |      32 | 0.000..0.020
  9 |       ->  Partition Iterator                                                 |      1 |     274 | 0.000..8.270
 10 |          ->  Partitioned Index Scan using supplier_s_suppkey_idx on supplier |      1 |     274 | 0.000..8.270
(10 rows)

                                                         Predicate Information (identified by plan id)
---------------------------------------------------------------------------------------------------------------------------------------------------------------
   5 --Partition Iterator
         Iterations: 7
   6 --Partitioned Seq Scan on lineitem
         Filter: ((l_shipdate >= '1996-01-01 00:00:00'::timestamp(0) without time zone) AND (l_shipdate < '1996-04-01 00:00:00'::timestamp without time zone))
         Selected Partitions:  1..7
   3 --CTE Scan on revenue
         Filter: (total_revenue = $3)
   9 --Partition Iterator
         Iterations: 7
  10 --Partitioned Index Scan using supplier_s_suppkey_idx on supplier
         Index Cond: (s_suppkey = revenue.supplier_no)
         Selected Partitions:  1..7
(12 rows)

Note: The two examples above show the same plan in different display formats. In the pretty format, the CTE and InitPlan plan blocks (rows 4 to 8 in the example above) may be inserted in the middle of the join block. When reading a join block, skip the CTE and InitPlan blocks to find the inner table of the corresponding join.

Default value: pretty
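
To compare the two formats on your own query, a session can switch the parameter between runs. This is only a sketch: t1 and c1 are hypothetical names used for illustration.

```sql
-- t1 is a hypothetical table used only for illustration.
SET explain_perf_mode = normal;
EXPLAIN SELECT * FROM t1 WHERE c1 = 10;

-- Same query, displayed with plan node IDs in tabular form.
SET explain_perf_mode = pretty;
EXPLAIN SELECT * FROM t1 WHERE c1 = 10;
```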

analysis_options

Parameter description: Specifies which fault-locating functions are enabled, including data verification and performance statistics. For details, see the options in the value range.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: a string.

LLVM_COMPILE indicates that the codegen compilation time of each thread is displayed in the explain performance output.
HASH_CONFLICT indicates that the log in the gs_log directory of the database node process displays the statistics of the hash table, including the hash table size, hash link length, and hash conflict.
STREAM_DATA_CHECK indicates that a CRC check is performed on data before and after network data transmission.

Default value: ALL,on(),off(LLVM_COMPILE,HASH_CONFLICT,STREAM_DATA_CHECK), which indicates that no fault-locating function is enabled.

cost_param

Parameter description: Selects alternative estimation methods for specific customer scenarios so that estimated values are closer to the actual values. The parameter is a bitmask: each method is tested by performing a bitwise AND (&) between cost_param and the bit for that method, and the method is selected if the result is not 0.

When cost_param & 1 is not 0, an improved mechanism is used to combine the selectivity of non-equi-joins. This method is more accurate for estimating the selectivity of joins between two identical tables. Currently, the cost_param & 1 = 0 path is no longer used; that is, the improved formula is always selected.

When cost_param & 2 is not 0, the selectivity of multiple filter criteria is estimated as the lowest selectivity among all the filter criteria, rather than as the product of their selectivities. This method is more accurate when the filtered columns are closely correlated.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: an integer ranging from 0 to INT_MAX.

Default value: 0
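
Because each method corresponds to one bit, methods are combined by adding their bit values. A brief sketch at session level:

```sql
-- Bit value 2 only: use the lowest selectivity among multiple filters.
SET cost_param = 2;

-- Bit values 1 and 2 together (1 | 2 = 3).
SET cost_param = 3;

-- Restore the configured default.
RESET cost_param;
```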

var_eq_const_selectivity

Parameter description: Determines whether to use the new selectivity model to estimate the integer const selectivity.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: Boolean

on indicates that the new selectivity model is used to calculate the selectivity of integer constants.
- If the integer is not in the most common values (MCV), is not NULL, and falls within the histogram, the left and right boundaries of the histogram are used for estimation. If the integer does not fall within the histogram, the number of rows in the table is used for estimation.
- If the integer is NULL or is in the MCV, the original logic is used to calculate the selectivity.
off indicates that the original selectivity calculation model is used.

Default value: off

enable_partitionwise

Parameter description: Specifies whether to select an intelligent algorithm for joining partitioned tables.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: Boolean

on indicates that an intelligent algorithm is selected.
off indicates that an intelligent algorithm is not selected.

Default value: off

partition_page_estimation

Parameter description: Determines whether to optimize the estimation of partitioned table pages based on the pruning result.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: Boolean

on indicates that the pruning result is used to optimize the page estimation.
off indicates that the pruning result is not used to optimize the page estimation.

Default value: off

partition_iterator_elimination

Parameter description: Determines whether to eliminate the partition iteration operator to improve execution efficiency when partition pruning leaves only a single partition of a partitioned table.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: Boolean

on indicates that the partition iteration operator is eliminated.
off indicates that the partition iteration operator is not eliminated.

Default value: off

enable_partition_pseudo_predicate

Parameter description: Specifies whether to rewrite pseudo-predicates to calculate the selectivity of a query on a specified partition.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: Boolean

on indicates that pseudo-predicate rewriting is used.
off indicates that pseudo-predicate rewriting is not used.

Default value: off

enable_functional_dependency

Parameter description: Determines whether the multi-column statistics generated by ANALYZE contain functional dependency statistics and whether those statistics are used to calculate the selectivity.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: Boolean

on indicates that: 1. The multi-column statistics generated by ANALYZE contain functional dependency statistics. 2. Functional dependency statistics are used to calculate the selectivity.
off indicates that: 1. The multi-column statistics generated by ANALYZE do not contain functional dependency statistics. 2. Functional dependency statistics are not used to calculate the selectivity.

Default value: off

rewrite_rule

Parameter description: Specifies the optional query rewriting rules that are enabled. Some query rewriting rules are optional because enabling them does not always improve query efficiency. In a specific customer scenario, you can select the query rewriting rules through this GUC parameter to achieve optimal query efficiency.

This parameter controls the combination of query rewriting rules. For example, if there are four rewriting rules named rule1, rule2, rule3, and rule4, you can perform the following settings:

set rewrite_rule=rule1;          -- Enable query rewriting rule rule1
set rewrite_rule=rule2,rule3;    -- Enable query rewriting rules rule2 and rule3
set rewrite_rule=none;           -- Disable all optional query rewriting rules

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: a string

none: No optional query rewriting rules are used.
lazyagg: The Lazy Agg query rewriting rules are used to eliminate aggregation operations in subqueries.
magicset: The Magic Set query rewriting rules are used to associate subqueries which have aggregation operators with the main query in advance to reduce repeated scanning of sublinks.
uniquecheck: The Unique Check query rewriting rules are used to optimize subquery statements in target columns that contain no aggregate functions, and to check whether the number of returned rows is 1.
intargetlist: The In Target List query rewriting rules are used to improve subqueries in the target column.
predpushnormal: The Predicate Push query rewriting rules are used to push the predicate condition to the subquery.
predpushforce: The Predicate Push query rewriting rules are used to push down predicate conditions to subqueries and use indexes as much as possible for acceleration.
predpush: The optimal plan is selected based on the cost in predpushnormal and predpushforce.
disable_pullup_expr_sublink: The optimizer is not allowed to pull up sublinks of the expr_sublink type. For details about sublink classification and pull-up principles, see "SQL Optimization > Typical SQL Optimization Methods > Optimizing Subqueries" in Developer Guide.
enable_sublink_pullup_enhanced: Enhanced sublink query rewriting rules are used, including unrelated sublink pull-up of the WHERE and HAVING clauses and WinMagic rewriting optimization.
disable_pullup_not_in_sublink: The optimizer is not allowed to pull up sublinks related to NOT IN. For details about sublink classification and pull-up principles, see "SQL Optimization > Typical SQL Optimization Methods > Optimizing Subqueries" in Developer Guide.
disable_rownum_pushdown: The filter criterion ROWNUM in the parent query cannot be pushed down to the subquery.
disable_windowagg_pushdown: The filter criterion of the window function in the parent query cannot be pushed down to the subquery.

Default value: magicset

The values partialpush and disablerep can be set but do not take effect.
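
A sketch using real rule names from the value range is shown below. It assumes, as the rule1/rule2 example above suggests, that each SET replaces the whole rule list rather than adding to it.

```sql
-- Enable Lazy Agg while keeping the default Magic Set rule.
SET rewrite_rule = magicset,lazyagg;

-- Check the rule list now in effect.
SHOW rewrite_rule;

-- Disable all optional rewriting rules for this session.
SET rewrite_rule = none;
```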

enable_pbe_optimization

Parameter description: Specifies whether the optimizer optimizes the query plan for statements executed in Parse Bind Execute (PBE) mode.

Parameter type: Boolean.

Unit: none

Value range:

on indicates that the optimizer optimizes the query plan for statements executed in PBE mode.
off indicates that the optimization is not performed.

Default value: off

Setting method: This is a SUSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value.

enable_global_plancache

Parameter description: Specifies whether to share the cache for the execution plans of statements in PBE queries and stored procedures. Enabling this function can reduce the memory usage of database nodes in high concurrency scenarios.

When enable_global_plancache is enabled, local_syscache_threshold must be at least 16 MB for the global plan cache (GPC) to take effect. If local_syscache_threshold is set to less than 16 MB, 16 MB is used instead. If it is set to more than 16 MB, the configured value is used.

Parameter type: Boolean.

Unit: none

Value range:

on indicates that cache sharing is enabled for the execution plans of statements in PBE queries and stored procedures.
off indicates no sharing.

Default value: off

Setting method: This is a POSTMASTER parameter. Set it based on instructions provided in Table 1.

gpc_clean_timeout

Parameter description: When enable_global_plancache is set to on, if a plan in the shared plan list is not used within the period specified by gpc_clean_timeout, the plan will be deleted. This parameter is used to control the retention period of a shared plan that is not used.

This is a SIGHUP parameter. Set it based on instructions provided in Table 1.

Value range: an integer ranging from 300 to 86400

The unit is second.

Default value: 1800, that is, 30 minutes

enable_global_stats

This parameter has been discarded in the current version. Do not set it.

enable_opfusion

Parameter description: Specifies whether to optimize simple INSERT, DELETE, UPDATE, and SELECT statements.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

The restrictions on simple queries are as follows:

Only index scan and index-only scan are supported, and all filter criteria in the WHERE clause must be on index columns.
Only single-table INSERT, DELETE, UPDATE, and SELECT statements are supported. JOIN and USING operations are not supported.
Only row-store tables are supported. Partitioned tables and tables with triggers are not supported.
Information statistics features of active SQL statements and queries per second (QPS) are not supported.
Tables that are being scaled out or in are not supported.
System columns cannot be queried or modified.
Only simple SELECT statements are supported. For example:
```
SELECT c3 FROM t1 WHERE c1 = ? and c2 =10; 
```
Only columns in the target table can be queried. Columns c1 and c2 are index columns, which can be compared with constants or parameters. FOR UPDATE is supported.

Only simple INSERT statements are supported. For example:
```
INSERT INTO t1 VALUES (?,10,?); 
```
Only one VALUES list is supported. Each value in VALUES can be a constant or a parameter. RETURNING is not supported.

Only simple DELETE statements are supported. For example:
```
DELETE FROM t1 WHERE c1 = ? and c2 = 10;  
```
Columns c1 and c2 are index columns, which can be followed by constants or parameters.

Only simple UPDATE statements are supported. For example:
```
UPDATE t1 SET c3 = c3+? WHERE c1 = ? and c2 = 10; 
```
The values modified in column c3 can be constants, parameters, or a simple expression. Columns c1 and c2 are index columns, which can be followed by constants or parameters.

Value range: Boolean

on indicates that the optimization is enabled.
off indicates that the optimization is disabled.

Default value: on

enable_plsql_opfusion

Parameter description: Specifies whether to optimize simple INSERT, DELETE, UPDATE, and SELECT statements in stored procedures to improve SQL execution performance.

For details about the restrictions on these simple statements, see enable_opfusion.

This parameter takes effect only when enable_opfusion is enabled.

Parameter type: Boolean.

Unit: none

Value range:

on indicates that the optimization is enabled.
off indicates that the optimization is disabled.

Default value: on

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value.

sql_beta_feature

Parameter description: Specifies the SQL engine's optional beta features to be enabled, including optimization of row count estimation and query equivalence estimation.

These optional features provide optimization for specific scenarios, but performance may deteriorate in scenarios that have not been tested. In a specific customer scenario, you can select the beta features through this GUC parameter to achieve optimal query efficiency.

This parameter determines the combination of the SQL engine's beta features, for example, feature1, feature2, feature3, and feature4. You can perform the following settings:

-- Enable beta feature feature1 of the SQL engine.
set sql_beta_feature=feature1;
-- Enable beta features feature2 and feature3 of the SQL engine.
set sql_beta_feature=feature2,feature3;
-- Disable all optional beta features of the SQL engine.
set sql_beta_feature=none;

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: a string

none: uses none of the beta optimizer features.
sel_semi_poisson: uses Poisson distribution to calibrate the equivalent semi-join and anti-join selectivity.
sel_expr_instr: uses the matching row count estimation method to provide more accurate estimation for instr(col, 'const') > 0, = 0, = 1.
param_path_gen: generates more possible parameterized paths.
rand_cost_opt: optimizes the random read cost of tables that have a small amount of data.
param_path_opt: uses the bloating ratio of the table to optimize the analysis information of indexes.
page_est_opt: optimizes the relpages estimation for the analysis information of table indexes.
no_unique_index_first: disables optimization of the primary key index scan path first.
join_sel_with_cast_func: supports type conversion functions when the number of join rows is estimated.
canonical_pathkey: generates canonical pathkeys in advance. (pathkey: a set of ordered key values of data.) After this feature is enabled, the semantics of the output of statements such as ORDER BY may differ from the standard semantics in outer join scenarios. Contact Huawei technical support to determine whether to enable it.
index_cost_with_leaf_pages_only: specifies whether index leaf nodes are included when the index cost is estimated.
a_style_coerce: enables the Decode type conversion rule to be compatible with ORA. For details, see the part related to case processing in ORA compatibility mode in "SQL Reference > Type Conversion > UNION, CASE, and Related Constructs" in Developer Guide.
predpush_same_level: enables the predpush hint to control parameterized paths at the same layer.
enable_plsql_smp: enables parallel execution of queries in stored procedures. Currently, only one query can be executed in parallel at a time, and no parallel execution plan is generated for autonomous transactions and queries in exceptions.
disable_bitmap_cost_with_lossy_pages: disables the computation of the cost of lossy pages in the bitmap path cost.
enable_upsert_execute_gplan: allows execution through gplan in the PBE scenario, if the UPDATE clause in the ON DUPLICATE KEY UPDATE statement contains parameters.
disable_merge_append_partition: disables the generation of the Merge Append path for partitioned tables.
disable_text_expr_flatten: disables the function of automatically inlining expressions during comparison between text and numeric types (numeric, bigint).

Default value: "sel_semi_poisson,sel_expr_instr,rand_cost_opt,param_path_opt,page_est_opt"
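
A sketch that adds one feature on top of the defaults is shown below. It assumes, as the feature1/feature2 example above suggests, that each SET replaces the whole feature list, so the default features are repeated explicitly.

```sql
-- Keep the default features and add param_path_gen.
SET sql_beta_feature = sel_semi_poisson,sel_expr_instr,rand_cost_opt,param_path_opt,page_est_opt,param_path_gen;

-- Check the feature list now in effect.
SHOW sql_beta_feature;
```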

default_statistics_target

Parameter description: Specifies the default statistics target for table columns without a column-specific target set by running ALTER TABLE SET STATISTICS. The number of rows sampled during statistics collection is affected.

If this parameter is set to a positive number, the number of rows sampled for the statistics histogram is default_statistics_target × 300. If it is set to a negative number, the absolute value is the percentage of rows sampled. For example, –5 means that the number of sampled rows is 5% of the total number of rows. This parameter affects only the target number of sampled rows. The actual number of sampled rows is also affected by the memory parameter maintenance_work_mem.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: an integer ranging from –100 to 10000.

A larger positive number than the default value increases the time required to do ANALYZE, but might improve the quality of the optimizer's estimates.
Changing settings of this parameter may result in performance deterioration. If query performance deteriorates, you can:
1. Restore to the default statistics.
2. Use hints to force the optimizer to use the optimal query plan. For details, see "SQL Optimization > Hint-based Tuning" in Developer Guide.

Default value: 100
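
The sampling rule can be illustrated with a short session. This is a sketch only: t1 and c1 are hypothetical names, and the column-level statement uses the ALTER TABLE ... SET STATISTICS form mentioned in the parameter description.

```sql
-- Positive value: target sample size is 200 x 300 = 60000 rows.
SET default_statistics_target = 200;
ANALYZE t1;   -- t1 is a hypothetical table

-- Negative value: target sample size is 5% of the table's rows.
SET default_statistics_target = -5;
ANALYZE t1;

-- A column-specific target takes precedence over the default for c1.
ALTER TABLE t1 ALTER COLUMN c1 SET STATISTICS 500;
```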

auto_statistic_ext_columns

Parameter description: Collects multi-column statistics based on the first K columns of each composite index on a table, where K is the value of this GUC parameter. For example, if a composite index is (a,b,c,d,e) and the parameter is set to 3, multi-column statistics are generated on columns (a,b) and (a,b,c). Multi-column statistics help the optimizer estimate cardinality more accurately for queries with combined conditions.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

This parameter does not take effect for system catalogs.
The statistics take effect only when the types of all columns support the comparison functions '=' and '<'.
System pseudocolumns in indexes, such as tableoid and ctid, are not collected.
By default, distinct values, MCVs without NULL, and MCVs with NULL are collected. If the AI-based cardinality estimation parameter enable_ai_stats is enabled, MCVs are not collected. Instead, models for AI-based cardinality estimation are collected.
If the index for creating multi-column statistics is deleted and no other index contains the multi-column combination, the multi-column statistics will be deleted in the next ANALYZE operation.
If the value of this parameter decreases, the new index generates multi-column statistics based on the value of this parameter. The generated multi-column statistics that exceed the value of this parameter will not be deleted.
If you want to disable the multi-column statistics on a specific combination only, you can retain the value of this parameter and run the DDL command ALTER TABLE tablename disable statistics ((column list)) to disable the statistics on multiple columns in a specific combination.

Value range: an integer ranging from 1 to 4. The value 1 indicates that statistics about multiple columns are not automatically collected.

Default value: 1
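
The (a,b,c,d,e) example from the parameter description can be sketched as follows. The table and index names are hypothetical; the final statement follows the ALTER TABLE ... disable statistics form quoted above.

```sql
-- Hypothetical table and composite index.
CREATE TABLE t_stat (a int, b int, c int, d int, e int);
CREATE INDEX idx_t_stat ON t_stat (a, b, c, d, e);

-- K = 3: multi-column statistics are generated on (a,b) and (a,b,c).
SET auto_statistic_ext_columns = 3;
ANALYZE t_stat;

-- Disable multi-column statistics on one combination only.
ALTER TABLE t_stat DISABLE STATISTICS ((a, b));
```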

constraint_exclusion

Parameter description: Specifies the query optimizer's use of table constraints to optimize queries.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: enumerated values

on indicates that constraints for all tables are examined.
off indicates that no constraints are examined for any table.
partition indicates that only constraints for inheritance child tables and UNION ALL subqueries are examined.

When constraint_exclusion is set to on, the optimizer compares query conditions with the table's CHECK constraints, and omits scanning tables for which the conditions contradict the constraints.

Default value: partition

Currently, constraint_exclusion is enabled by default only for the cases that are commonly used to implement table partitioning. Turning it on for all tables adds planning overhead to simple queries without benefiting them. If you have no partitioned tables, set it to off.
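
A minimal sketch of the on setting is shown below, using a hypothetical table whose CHECK constraint contradicts the query condition.

```sql
-- Hypothetical table with a CHECK constraint.
CREATE TABLE t_orders (id int, amount int CHECK (amount >= 0));

-- Examine constraints for all tables in this session.
SET constraint_exclusion = on;

-- The condition contradicts the CHECK constraint, so the optimizer
-- can omit scanning t_orders.
EXPLAIN SELECT * FROM t_orders WHERE amount < 0;
```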

cursor_tuple_fraction

Parameter description: Specifies the optimizer's estimated fraction of a cursor's rows that are retrieved.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: a floating-point number ranging from 0.0 to 1.0.

Smaller values of this setting bias the optimizer towards using fast-start plans for cursors, which retrieve the first few rows quickly while perhaps taking a long time to fetch all rows. Larger values put more emphasis on the total estimated time. At the maximum setting of 1.0, cursors are planned exactly like regular queries, considering only the total estimated time and not how soon the first rows might be delivered.

Default value: 0.1

from_collapse_limit

Parameter description: Specifies the FROM list size up to which the optimizer merges subqueries into upper queries. The optimizer merges subqueries into upper queries if the resulting FROM list would have no more than this many items.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: an integer ranging from 1 to INT_MAX.

Smaller values reduce planning time but may lead to inferior execution plans.

Default value: 8

join_collapse_limit

Parameter description: Specifies the list size up to which the optimizer rewrites JOIN constructs (except FULL JOIN) into lists of FROM items. The rewrite is performed if the resulting list would have no more than this many items.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: an integer ranging from 1 to INT_MAX.

Setting this parameter to 1 prevents join reordering, so the join order written in the query is the order in which the relations are actually joined. Because the query optimizer does not always choose the optimal join order, advanced users can temporarily set this parameter to 1 and then specify the desired join order explicitly.
Smaller values reduce planning time but may lead to inferior execution plans.

Default value: 8
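
Forcing an explicit join order might look like the sketch below. The tables t_a, t_b, and t_c and their columns are hypothetical.

```sql
-- t_a, t_b, and t_c are hypothetical tables.
SET join_collapse_limit = 1;

-- With the limit at 1, the joins are performed in the written order:
-- t_a with t_b first, then the result with t_c.
EXPLAIN
SELECT *
FROM t_a
JOIN t_b ON t_a.id = t_b.a_id
JOIN t_c ON t_b.id = t_c.b_id;

-- Restore the configured default afterwards.
RESET join_collapse_limit;
```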

plan_mode_seed

Parameter description: This is a debugging parameter. Currently, it supports only OPTIMIZE_PLAN and RANDOM_PLAN. The value 0 selects OPTIMIZE_PLAN, the optimized plan produced by the dynamic programming algorithm. All other values select RANDOM_PLAN, in which the plan is generated randomly. The value –1 indicates that no seed is specified. In this case, the optimizer generates a random integer from 1 to 2147483647 and generates a random execution plan based on that integer. A value from 1 to 2147483647 is used directly as the seed from which the optimizer generates a random execution plan.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: an integer ranging from –1 to 2147483647.

Default value: 0

If this parameter is set to generate random execution plans, the generated plan may not be the optimal one. Therefore, to guarantee query performance, the default value 0 is recommended during upgrade, scale-out, scale-in, and O&M.
If this parameter is not set to 0, specified plan hints are not used.
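
For debugging, the three kinds of value can be tried in a session as sketched below. The seed 12345 is arbitrary, and t1 and c1 are hypothetical names.

```sql
-- Generate a random plan from a fixed seed (12345 is arbitrary).
SET plan_mode_seed = 12345;
EXPLAIN SELECT * FROM t1 WHERE c1 = 10;   -- t1 is a hypothetical table

-- Let the optimizer choose the seed itself.
SET plan_mode_seed = -1;

-- Return to the optimized plan produced by dynamic programming.
SET plan_mode_seed = 0;
```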

hashagg_table_size

Parameter description: Specifies the hash table size during the execution of the HASH AGG operation.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: an integer ranging from 0 to INT_MAX/2.

Default value: 0

enable_codegen

Parameter description: Specifies whether code generation (Codegen) optimization is enabled. Currently, this optimization is implemented with LLVM.

Parameter type: Boolean.

Unit: none

Value range:

on indicates that code optimization is enabled.
off indicates that code optimization is disabled.

Default value: on

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value.

codegen_compile_thread_num

Parameter description: Specifies the number of Codegen compilation threads.

Parameter type: integer.

Unit: none

Value range: 1 to 8

Default value: 1

Setting method: This is a SIGHUP parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value. If the number of threads is too large, the system performance may deteriorate. However, when there are a large number of concurrent services, you can increase the number of threads to improve the throughput performance.

llvm_max_memory

Parameter description: Specifies the maximum memory occupied by IRs (including cached and in-use IRs) generated during Codegen compilation. Memory for Codegen is not reserved in advance. It is allocated from max_dynamic_memory and is capped by the llvm_max_memory parameter.

Parameter type: integer.

Unit: KB

Value range: 0 to 2147483647. If the memory occupied by IRs exceeds the specified value, the original recursive execution logic, instead of the Codegen execution logic, is used. When the upper limit is reached and a downgrade is triggered, decreasing the value of llvm_max_memory does not immediately release the memory occupied by extra IRs. The memory occupied by IRs is released after the corresponding SQL statements are executed.

Default value: 131072kB (128 MB)

Setting method: This is a SIGHUP parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value.

If the parameter is set to an excessively small value, the system does not use the Codegen execution logic, affecting the use of functions.
If the parameter is set to an excessively large value, LLVM compilation may occupy too many resources of other threads. As a result, the overall system performance deteriorates.

enable_codegen_print

Parameter description: Specifies whether the LLVM IR function can be printed in logs.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: Boolean

on indicates that the IR function can be printed in logs.
off indicates that the IR function cannot be printed in logs.

Default value: off

codegen_cost_threshold

Parameter description: LLVM compilation takes some time to generate executable machine code. Therefore, LLVM compilation is beneficial only when the actual execution cost is greater than the sum of the cost of generating machine code and the optimized execution cost. The codegen_cost_threshold parameter specifies a threshold. If the estimated execution cost exceeds the threshold, LLVM optimization is performed. Codegen uses plan_rows of the execution operator as the cost to compare with the value of codegen_cost_threshold. You can run the EXPLAIN command to view the value of plan_rows.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: an integer ranging from 0 to 2147483647.

Default value: 100000
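
The threshold comparison described for codegen_cost_threshold can be sketched as follows. This is a simplified Python model under stated assumptions: the function name should_use_codegen is hypothetical, and the sketch ignores every other condition that affects LLVM optimization (for example, whether enable_codegen is on).

```python
def should_use_codegen(plan_rows, codegen_cost_threshold=100000):
    """Hypothetical helper: True when the threshold check would favor LLVM."""
    # plan_rows of the operator stands in for the estimated execution cost.
    # LLVM optimization is performed only when it exceeds the threshold.
    return plan_rows > codegen_cost_threshold
```

With the default threshold, an operator whose plan_rows is 100000 does not pass the check, while one with 100001 does.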

enable_bloom_filter

Parameter description: Specifies whether the BloomFilter optimization can be used.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: Boolean

on indicates that the BloomFilter optimization can be used.
off indicates that the BloomFilter optimization cannot be used.

Default value: on

enable_extrapolation_stats

Parameter description: Specifies whether the extrapolation logic is used for data of DATE type based on historical statistics. The logic can improve estimation accuracy for tables whose statistics are not collected in time, but incorrect extrapolation may lead to overestimation. Enable the logic only in scenarios where data of DATE type is inserted periodically.

This is a SUSET parameter. Set it based on instructions provided in Table 1.

Value range: Boolean

on indicates that the extrapolation logic is used.
off indicates that the extrapolation logic is not used.

Default value: off

autoanalyze

Parameter description: Specifies whether to automatically collect statistics on tables that have no statistics when a plan is generated. autoanalyze cannot be used for foreign or temporary tables. To collect statistics on such tables, manually perform the ANALYZE operation. If an exception occurs in the database while autoanalyze is running on a table, the system may still prompt you to collect statistics on that table when you run the statement again after the database is recovered. In this case, manually perform the ANALYZE operation on the table to synchronize statistics.

Parameter type: Boolean.

Unit: none

Value range:

on indicates that the table statistics are automatically collected.
off indicates that the table statistics are not automatically collected.

Default value: off

Setting method: This is a SUSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value.

This parameter does not take effect in centralized mode.

enable_analyze_check

Parameter description: Specifies whether, during plan generation, to check whether statistics have been collected for tables whose reltuples and relpages are displayed as 0 in pg_class.

Parameter type: Boolean.

Unit: none

Value range:

on indicates that the tables will be checked.
off indicates that the tables will not be checked.

Default value: off

Setting method: This is a SUSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value.

enable_sonic_hashagg

Parameter description: Specifies whether to use the hash aggregation operator designed for column-oriented hash tables when certain constraints are met.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: Boolean

on indicates that the hash aggregation operator designed for column-oriented hash tables is used when certain constraints are met.
off indicates that the hash aggregation operator designed for column-oriented hash tables is not used.

If enable_sonic_hashagg is enabled and the query meets the constraints, the hash aggregation operator designed for column-oriented hash tables is used, which reduces the memory usage of the operator. However, in scenarios where enabling enable_codegen significantly improves performance, the performance of this operator may deteriorate.
If enable_sonic_hashagg is enabled and the query meets the constraints, the operator is displayed as Sonic Hash Aggregation in the execution plan and in the execution information of EXPLAIN ANALYZE/PERFORMANCE. If the query does not meet the constraints, the operator is displayed as Hash Aggregation. For details, see "SQL Optimization > Introduction to the SQL Execution Plan > Description" in Developer Guide.

Default value: on

enable_sonic_hashjoin

Parameter description: Specifies whether to use the hash join operator designed for column-oriented hash tables when certain constraints are met.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: Boolean

on indicates that the hash join operator designed for column-oriented hash tables is used when certain constraints are met.
off indicates that the hash join operator designed for column-oriented hash tables is not used.

Currently, the parameter can be used only for Inner Join.
If enable_sonic_hashjoin is enabled, the memory usage of queries that use the Hash Inner operator can be reduced. However, in scenarios where the code generation technology can significantly improve performance, the performance of the operator may deteriorate.
If enable_sonic_hashjoin is enabled and the query meets the constraints, the operator is displayed as Sonic Hash Join in the execution plan and in the execution information of EXPLAIN ANALYZE/PERFORMANCE. If the query does not meet the constraints, the operator is displayed as Hash Join. For details, see "SQL Optimization > Introduction to the SQL Execution Plan > Description" in Developer Guide.

Default value: on

enable_sonic_optspill

Parameter description: Specifies whether to optimize the number of files to be written to disks for the hash join operator designed for column-oriented hash tables. If this parameter is enabled, the number of files written to disks does not increase significantly when the hash join operator writes a large number of files to disks.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: Boolean

on indicates that the number of files to be written to disks for the hash join operator designed for column-oriented hash tables is optimized.
off indicates that the number of files to be written to disks for the hash join operator designed for column-oriented hash tables is not optimized.

Default value: on

plan_cache_mode

Parameter description: Specifies the policy for generating an execution plan in the prepared statement.

Parameter type: enumerated type

Unit: none

Value range:

auto indicates that the system chooses between the custom plan and the generic plan. This is the default behavior.
force_generic_plan indicates that the generic plan is forcibly used (soft parsing). A generic plan is generated for the prepared statement without specific parameter values, and parameters are bound to the plan when you run the EXECUTE statement. The advantage is that repeated optimizer overhead is avoided in each execution. The disadvantage is that the plan may not be optimal when the bound parameters have data skew, which can result in poor execution performance. Parameter types are fixed by the values passed the first time. If a value of a different type is later passed to the same placeholder, an error is reported.
force_custom_plan indicates that the custom plan is forcibly used (hard parsing). A custom plan is generated each time the prepared statement is executed, with the parameter values of the EXECUTE statement embedded in it. Because the plan is based on the specific parameter values, it is a preferred plan for those values and has good execution performance. The disadvantage is that the plan needs to be regenerated before each execution, resulting in a large amount of repeated optimizer overhead.

This parameter is valid only for prepared statements. It is used when the parameterized field in a prepared statement has severe data skew.

Default value: auto

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Set this parameter based on the actual service scenario.

enable_hypo_index

Parameter description: Determines whether the optimizer creates virtual indexes when executing the EXPLAIN command.

Parameter type: Boolean.

Unit: none

Value range:

on: A virtual index is created when the EXPLAIN command is executed.
off: No virtual index is created when the EXPLAIN command is executed.

Default value: off

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value.

enable_auto_explain

Parameter description: Specifies whether to enable the function of automatically printing execution plans. This parameter is used to locate slow stored procedures or slow queries. It is valid for the currently connected database primary node and directly connected standby node.

Parameter type: Boolean.

Unit: none

Value range:

true: enabled.
false: disabled.

Default value: false

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value. If you want to view the execution plan, enable this parameter. However, this causes the current system performance to deteriorate.

auto_explain_level

Parameter description: Specifies the log level for automatically printing execution plans.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: enumerated type. The value can be LOG or NOTICE.

LOG: Execution plans are printed as logs.
NOTICE: Execution plans are printed as notices.

Default value: LOG

auto_explain_log_min_duration

Parameter description: Specifies the minimum statement duration for which execution plans are automatically printed. Only the plans of statements whose execution time is greater than the value of auto_explain_log_min_duration are printed. For example, if this parameter is set to 0, all executed plans are printed. If this parameter is set to 3000, the plan of a statement is printed only if the execution of the statement takes more than 3000 ms.

Parameter type: integer.

Unit: millisecond

Value range: 0 to 2147483647

Default value: 0

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value.
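
The interaction between enable_auto_explain and auto_explain_log_min_duration can be sketched as follows. This is an illustrative Python model only; the function name should_print_plan is hypothetical, and the handling of the value 0 follows the statement above that all executed plans are printed in that case.

```python
def should_print_plan(duration_ms, enable_auto_explain,
                      auto_explain_log_min_duration=0):
    """Hypothetical helper: decide whether a statement's plan is printed."""
    if not enable_auto_explain:
        # Automatic plan printing is switched off entirely.
        return False
    if auto_explain_log_min_duration == 0:
        # A threshold of 0 means every executed plan is printed.
        return True
    # Otherwise only statements slower than the threshold qualify.
    return duration_ms > auto_explain_log_min_duration
```

For example, with a threshold of 3000, a statement that runs for exactly 3000 ms is not printed under this model, while one that runs for 3001 ms is.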

query_dop

Parameter description: Specifies the user-defined degree of parallelism (DOP). If the SMP function is enabled, the system uses the specified degree of parallelism.

Parameter type: integer.

Unit: none

Value range: 1 to 64. 1 indicates that parallel query is disabled.

Default value: 1

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value.

After enabling parallel queries, ensure that sufficient CPU, memory, and network resources are available to achieve optimal performance.

enable_startwith_debug

Parameter description: Specifies whether to enable debugging for the START WITH and CONNECT BY feature. If this parameter is enabled, information about all tail columns related to the START WITH feature is displayed.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: Boolean. The value true indicates that the function is enabled, and the value false indicates that the function is disabled.

Default value: false

enable_inner_unique_opt

Parameter description: Specifies whether to use the Inner Unique optimization for nested loop join, hash join, and sort merge join. The optimization reduces the number of match attempts when the inner-table attribute in the join condition meets a uniqueness constraint.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: Boolean

on: used.
off: not used.

Default value: on
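
The idea behind the Inner Unique optimization can be illustrated with a toy nested loop join. This Python sketch is not GaussDB's implementation; the function nested_loop_join and its inner_unique flag are hypothetical and only show why a uniqueness guarantee on the inner side reduces the number of comparisons.

```python
def nested_loop_join(outer_rows, inner_rows, inner_unique):
    """Equi-join two lists of keys; return (matches, comparisons)."""
    matches = []
    comparisons = 0
    for outer in outer_rows:
        for inner in inner_rows:
            comparisons += 1
            if outer == inner:
                matches.append((outer, inner))
                if inner_unique:
                    # The inner key is unique, so no further inner row
                    # can match this outer row: stop scanning early.
                    break
    return matches, comparisons


plain = nested_loop_join([1, 2], [1, 2, 3], inner_unique=False)
optimized = nested_loop_join([1, 2], [1, 2, 3], inner_unique=True)
```

Both calls return the same matches, but the plain join performs 6 comparisons while the early-exit variant performs 3.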

enable_indexscan_optimization

Parameter description: Specifies whether to optimize B-tree index scan (IndexScan and IndexOnlyScan) in Astore.

This is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: Boolean

on: used.
off: not used.

Default value: on

immediate_analyze_threshold

Parameter description: Specifies the threshold for automatically analyzing inserted data. When the amount of data inserted at a time reaches the original data amount multiplied by the value of immediate_analyze_threshold and the total number of rows exceeds 100, ANALYZE is automatically triggered.

Parameter type: integer.

Unit: none

Value range: 0 to 1000. If this parameter is set to 0, this function is disabled.

Default value: 0

Setting method: This is a SIGHUP parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Set this parameter to a small value for tables whose data changes rapidly and whose statistics need to be updated continuously. Set this parameter to a large value for tables whose statistics fluctuate greatly only after a certain amount of data is reached.

This function supports only permanent and unlogged tables. Temporary tables are not supported.
ANALYZE is not automatically triggered twice within 10 seconds for the same table.
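
The trigger conditions above can be combined into a single check, sketched below in Python. This is a simplified model, not GaussDB code. The function name should_trigger_analyze is hypothetical, and the sketch assumes that "the total number of rows" means the row count after the insert.

```python
import time

MIN_TOTAL_ROWS = 100        # the total number of rows must exceed 100
MIN_INTERVAL_SECONDS = 10   # no second ANALYZE on a table within 10 seconds


def should_trigger_analyze(original_rows, inserted_rows, threshold,
                           last_analyze_time=None, now=None):
    """Hypothetical helper modeling immediate_analyze_threshold."""
    if threshold == 0:
        # 0 disables the function.
        return False
    now = time.time() if now is None else now
    if (last_analyze_time is not None
            and now - last_analyze_time < MIN_INTERVAL_SECONDS):
        return False
    # Assumption: total rows are counted after the insert completes.
    total_rows = original_rows + inserted_rows
    return (inserted_rows >= original_rows * threshold
            and total_rows > MIN_TOTAL_ROWS)
```

For example, with a threshold of 2 and a table of 100 rows, a single insert of 200 rows passes the check, while an insert of 199 rows does not.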

enable_invisible_indexes

Parameter description: Specifies whether the optimizer can use invisible indexes.

After an index is set to invisible, the performance of query statements may be affected. If you do not want to change the index visibility status and want to use invisible indexes, set enable_invisible_indexes to on.

Parameter type: Boolean.

Unit: none

Value range:

on: The optimizer can use invisible indexes.
off: The optimizer cannot use invisible indexes.

Default value: off

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value.

enable_dynamic_samplesize

Parameter description: Specifies whether to dynamically adjust the number of sampled rows. For a large table with more than one million rows, the number of sampled rows is dynamically adjusted during statistics collection to improve statistics accuracy.

Parameter type: Boolean.

Unit: none

Value range:

on: indicates that the function is enabled.

off: indicates that the function is disabled.

Default value: on

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value.

The function of dynamically adjusting the number of sampled rows supports only absolute sampling.

STATS_HISTORY_RECORD_LIMIT

Parameter description: Specifies the maximum number of historical statistics that can be retained for each object (including tables, columns, partitions, and indexes). When collecting statistics about an object, the system saves the statistics to the historical statistics table. When the number of statistics about the object in the historical statistics table reaches the threshold and new statistics are collected, the earlier statistics are deleted.

Parameter type: integer.

Unit: none

Value range: 0 to 100

Default value: 10

Setting method: This is a SIGHUP parameter. Set it based on instructions provided in Table 1.

Setting suggestion: The default value is recommended. If you need to record statistics of more historical versions, increase the value of this parameter. However, the performance of ANALYZE may be affected.
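
The eviction behavior described above resembles a bounded queue per object. The following Python sketch illustrates it under stated assumptions: the class StatsHistory is hypothetical, it models a single object, and it does not claim to reflect how the historical statistics table is actually stored.

```python
from collections import deque


class StatsHistory:
    """Hypothetical per-object history bounded by a record limit."""

    def __init__(self, record_limit=10):
        # deque(maxlen=n) silently drops the oldest entry once it is full.
        self._records = deque(maxlen=record_limit)

    def save(self, stats):
        """Store newly collected statistics, evicting the oldest if needed."""
        self._records.append(stats)

    def versions(self):
        """Return the retained statistics, oldest first."""
        return list(self._records)


history = StatsHistory(record_limit=3)
for version in range(5):
    history.save(version)
```

After five collections with a limit of 3, only the three most recent versions (2, 3, and 4) remain.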

STATS_HISTORY_RETENTION_TIME

Parameter description: Specifies the retention period of historical statistics about each object (including tables, columns, partitions, and indexes). When collecting statistics about an object, the system saves the statistics to the historical statistics table. If the retention period of the statistics about the object in the historical statistics table exceeds the threshold, the system deletes the statistics that exceed the retention period when collecting new statistics.

Parameter type: floating point

Unit: day

Value range: –1 or a value ranging from 0 to 365000. –1 indicates that the history statistics are not cleared over time.

Default value: 31

Setting method: This is a SIGHUP parameter. Set it based on instructions provided in Table 1.

Setting suggestion: The default value is recommended. If you need to record statistics of earlier versions, increase the value of this parameter. However, the performance of ANALYZE may be affected.
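
Time-based cleanup can be sketched in the same spirit. The Python function below is a hypothetical model, not GaussDB code: prune_history and the (collected_at, stats) record shape are assumptions made for illustration, and the cleanup is assumed to run when new statistics are collected.

```python
from datetime import datetime, timedelta


def prune_history(records, now, retention_days=31.0):
    """Keep only records that are still inside the retention period.

    records is a list of (collected_at, stats) tuples.
    """
    if retention_days == -1:
        # -1 means historical statistics are never cleared over time.
        return list(records)
    cutoff = now - timedelta(days=retention_days)
    return [record for record in records if record[0] >= cutoff]


now = datetime(2024, 8, 20)
records = [
    (now - timedelta(days=40), "old"),
    (now - timedelta(days=5), "recent"),
]
kept = prune_history(records, now)
```

With the default retention of 31 days, the 40-day-old record is dropped and the 5-day-old record is kept. Because the parameter is a floating-point value, fractional days such as 0.5 are handled the same way.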

default_statistic_granularity

Parameter description: Specifies which partition-level statistics of a partitioned table are collected by default when PARTITION_MODE is not specified. This parameter does not take effect for non-partitioned tables.

Parameter type: enumerated type

Unit: none

Value range: enumerated values

ALL: collects statistics about the entire table, level-1 partitions, and level-2 partitions.
GLOBAL: collects statistics about the entire table.
PARTITION: collects statistics about level-1 partitions.
GLOBAL_AND_PARTITION: collects statistics about the entire table and level-1 partitions.
SUBPARTITION: collects statistics about level-2 partitions.
ALL_COMPLETE: collects statistics about the entire table, level-1 partitions, and level-2 partitions.

Default value: ALL

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value. If partition-level statistics need to be collected, you can set this parameter as required. However, the ANALYZE performance may be affected.
