Updated on 2025-05-29 GMT+08:00

Genetic Query Optimizer

This section describes parameters related to the genetic query optimizer. The genetic query optimizer (GEQO) is an algorithm that plans queries by using heuristic searching. It reduces planning time for complex queries, but the plans it produces are sometimes inferior to those found by the normal exhaustive-search algorithm.

geqo

Parameter description: Specifies whether to enable genetic query optimization.

If this parameter is modified by running the gs_guc reload command, the new value does not take effect immediately on a session whose connection comes from another node in the same cluster rather than from a client. For such a session, the setting takes effect only after the connection is closed and re-established.

Parameter type: Boolean.

Unit: none

Value range:

  • on: GEQO is enabled.
  • off: GEQO is disabled.

Default value: on

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Generally, keep this parameter enabled. geqo_threshold provides finer-grained control over when GEQO is used.

Risks and impacts of improper settings: If geqo is disabled, dynamic programming is still used to enumerate all join combinations, even when a query involves a large number of tables. As a result, generating the plan can be very costly.

geqo_threshold

Parameter description: Specifies the FROM-item threshold for GEQO. Genetic query optimization is used to plan a query when the number of FROM items in the query is greater than this value.

  • For simpler queries, it is best to use the regular, exhaustive-search planner. For queries with many tables, it is better to let GEQO plan the query.
  • A FULL OUTER JOIN construct counts as only one FROM item.
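The reason for switching planners at a threshold can be illustrated with a short Python sketch. It counts the left-deep join orderings of n tables (n!) to show how quickly the search space grows, and applies the threshold rule as described in this section. The helper names are hypothetical and are not part of GaussDB. The boundary comparison follows the wording above and should be verified on your version.

```python
import math


def left_deep_join_orders(n_tables: int) -> int:
    """Number of distinct left-deep join orderings of n tables (n!)."""
    return math.factorial(n_tables)


def uses_geqo(n_from_items: int, geqo: bool = True, geqo_threshold: int = 12) -> bool:
    """Hypothetical helper: GEQO is chosen only if it is enabled and the
    number of FROM items is greater than geqo_threshold (rule as described
    in this section, not taken from GaussDB source code)."""
    return geqo and n_from_items > geqo_threshold


for n in (4, 8, 12, 16):
    # The ordering count is only an illustration of search-space growth;
    # a dynamic programming planner does not visit every ordering one by one.
    print(n, left_deep_join_orders(n), uses_geqo(n))
```

The factorial growth is why an exhaustive search that is acceptable for a handful of tables becomes impractical for large joins.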

Parameter type: integer.

Unit: none

Value range: 2 to 2147483647

Default value: 12

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value. If the performance overhead of dynamic programming is acceptable, increase the value of this parameter. If the overhead is unacceptable, decrease the value.

Risks and impacts of improper settings: If the value is too large, the cost of enumerating all plans with dynamic programming may be unacceptable. If the value is too small, plan quality may suffer.

geqo_effort

Parameter description: Specifies the trade-off between planning time and query plan quality in GEQO.

Parameter type: integer.

Unit: none

Value range: 1 to 10

Default value: 5

  • geqo_effort does not do anything directly. It is only used to compute the default values of the other parameters that influence GEQO behavior. You can also set those parameters manually.
  • Larger values increase the time spent in query planning, but also increase the probability that an efficient query plan is chosen.

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value.

Risks and impacts of improper settings: If the value is too large, the query planning cost may be high. If the value is too small, the quality of the generated query plan may be poor.

geqo_pool_size

Parameter description: Specifies the pool size used by GEQO, that is, the number of individuals in the genetic population.

Parameter type: integer.

Unit: none

Value range: 0 to 2147483647

Default value: 0

If this parameter is set to 0, GaussDB selects a proper value based on geqo_effort and the number of tables. Otherwise, the value must be at least 2. Useful values typically range from 100 to 1000.
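As an illustration of how a default could be derived from geqo_effort and the number of tables, the following Python sketch mirrors the heuristic used by upstream PostgreSQL. Whether GaussDB applies exactly the same formula is an assumption, and the function name is hypothetical.

```python
import math


def default_pool_size(n_relations: int, geqo_effort: int = 5,
                      geqo_pool_size: int = 0) -> int:
    """Sketch of the upstream PostgreSQL heuristic (assumed, not confirmed
    for GaussDB): 2^(n+1) individuals, clamped to a range set by geqo_effort."""
    # An explicit setting of 2 or more is used as is.
    if geqo_pool_size >= 2:
        return geqo_pool_size

    size = 2.0 ** (n_relations + 1)
    max_size = 50 * geqo_effort  # upper clamp: 50 to 500 individuals
    min_size = 10 * geqo_effort  # lower clamp: 10 to 100 individuals
    if size > max_size:
        return max_size
    if size < min_size:
        return min_size
    return math.ceil(size)


for n in (3, 5, 12):
    print(n, default_pool_size(n))
```

Under this assumed formula, the pool size for large joins is capped by geqo_effort alone, which is why raising geqo_effort increases planning time.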

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value.

Risks and impacts of improper settings: If the value is too large, the query planning cost may be high. If the value is too small, the quality of the generated query plan may be poor.

geqo_generations

Parameter description: Specifies the number of iterations of the GEQO algorithm.

Parameter type: integer.

Unit: none

Value range: 0 to 2147483647

Default value: 0

If this parameter is set to 0, a suitable value is chosen based on geqo_pool_size. Otherwise, the value must be at least 1. Useful values typically range from 100 to 1000.
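A minimal Python sketch of this defaulting rule follows. It assumes the upstream PostgreSQL behavior, where the number of generations defaults to the effective pool size. GaussDB may choose differently, and the function name is hypothetical.

```python
def effective_generations(pool_size: int, geqo_generations: int = 0) -> int:
    """Assumed rule (from upstream PostgreSQL, not confirmed for GaussDB):
    an explicit positive setting wins, otherwise run one generation per
    individual in the pool."""
    if geqo_generations > 0:
        return geqo_generations
    return pool_size


print(effective_generations(250))        # default derived from the pool size
print(effective_generations(250, 1000))  # explicit setting
```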

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value.

Risks and impacts of improper settings: If the value is too large, the query planning cost may be high. If the value is too small, the quality of the generated query plan may be poor.

geqo_selection_bias

Parameter description: Specifies the selection bias used by GEQO. The selection bias is the selective pressure within the population.

Parameter type: floating point.

Unit: none

Value range: 1.5 to 2

Default value: 2

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value.

Risks and impacts of improper settings: If the parameter is set to a large value, GEQO concentrates on the best plans found so far more quickly and explores less of the diversity among feasible plans. This can speed up convergence to a good plan, but reduces the probability of finding the globally optimal plan.
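The effect of selective pressure can be sketched in Python with a linear-bias rank selection, modeled on the selection function in upstream PostgreSQL. Treating it as GaussDB's exact method is an assumption, and all names here are hypothetical. The pool is assumed to be sorted best-first, so a lower index means a better plan.

```python
import math
import random


def linear_rank_pick(rng: random.Random, pool_size: int, bias: float) -> int:
    """Pick an index into a best-first pool; a larger bias favors low indexes.
    Valid for bias in the documented range 1.5 to 2."""
    while True:
        r = rng.random()
        root = math.sqrt(bias * bias - 4.0 * (bias - 1.0) * r)
        index = pool_size * (bias - root) / 2.0 / (bias - 1.0)
        if 0.0 <= index < pool_size:
            return int(index)


def mean_rank(bias: float, pool_size: int = 100, draws: int = 20000,
              seed: int = 1) -> float:
    """Average selected index over many draws (seed fixed for repeatability)."""
    rng = random.Random(seed)
    total = sum(linear_rank_pick(rng, pool_size, bias) for _ in range(draws))
    return total / draws


print(mean_rank(1.5), mean_rank(2.0))
```

In this sketch, a bias of 2 selects parents closer to the top of the pool on average than a bias of 1.5, which is the stronger selective pressure described above.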

geqo_seed

Parameter description: Specifies the initial value of the random number generator used by GEQO to select random paths through the join order search space.

Parameter type: floating point.

Unit: none

Value range: 0 to 1

Default value: 0

Varying the value changes the set of join paths explored, and may result in a better or worse best path being found.
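The role of the seed can be illustrated with a generic seeded random generator in Python. This is not the generator GEQO uses, and the function and table names are hypothetical. It only shows that a fixed seed makes the random choices repeatable, while a different seed can lead the search along a different path.

```python
import random


def initial_join_order(tables, geqo_seed=0.0):
    """Hypothetical illustration: derive a random join order from a seed.
    Python's generator stands in for GEQO's internal one."""
    rng = random.Random(geqo_seed)
    return rng.sample(tables, len(tables))


tables = ["t1", "t2", "t3", "t4", "t5", "t6"]
print(initial_join_order(tables, 0.0))
print(initial_join_order(tables, 0.0))  # same seed, same order
print(initial_join_order(tables, 0.5))  # another seed may give another order
```

Keeping the seed fixed therefore keeps plans for the same query repeatable, which helps when comparing plans across runs.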

Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.

Setting suggestion: Retain the default value.

Risks and impacts of improper settings: Different values produce different random sequences, which may affect the quality of the join paths that are generated.