Updated on 2024-09-06 GMT+08:00

Function

To execute complex queries efficiently, the GaussDB(for MySQL) optimizer can convert some big IN predicates into IN subqueries. The conversion happens if the following conditions are met:

  • The GaussDB(for MySQL) kernel version is 2.0.42.230600 or later.
  • The number of elements in the IN list exceeds the value of rds_in_predicate_conversion_threshold.

Overview

In MySQL Community Edition, if column IN (const1, const2, ..... ) is executed and there is an index on the column, the optimizer usually performs a range scan. The parameter range_optimizer_max_mem_size controls the memory available to the range optimizer. If there are many elements in the IN list and the used memory exceeds the parameter value, the range scan will fail and the query performance deteriorates. To solve this problem, you can increase the parameter value to expand the memory that can be used. However, the memory is at the session level. It means that each session occupies the same memory, so the instance may be out of memory. Even if the range optimizer can be used, if the number of elements in the IN list exceeds the eq_range_index_dive_limit value, index statistics, instead of index dive is used. This may cause inaccurate estimation and performance rollback. After IN predicates into subqueries, the optimizer will continue to consider whether to convert the IN clause into a semijoin to improve performance. A specific conversion process is as follows.

select ... from lineitem where l_partkey in (...)

====>

select ... from lineitem where l_partkey in 
 (select tb._col_1 from (values (9628136),(19958441),...) tb)