文档首页/ 数据仓库服务 DWS/ 更多文档/ 用户指南（巴黎区域）/ 常见问题/ 数据库使用/ DWS查询时索引失效场景解析

更新时间：2025-09-24 GMT+08:00

查看PDF

DWS查询时索引失效场景解析

对表建立索引可提高数据库查询性能，但有时会出现建立了索引，但查询计划中却发现索引没有被使用的情况。针对这种情况，本文将列举几种常见的场景和优化方法。

场景一：返回结果集很大

以行存表的Seq Scan和Index Scan为例：

Seq Scan：按照表的记录的排列顺序从头到尾依次检索扫描，每次扫描要取到所有的记录。这也是最简单最基础的扫表方式，扫描的代价比较大。
Index Scan：对于指定的查询，先扫描一遍索引，从索引中找到符合要求的记录的位置（指针），再定位到表中具体的Page去获取，即先走索引，再读表数据。

因此，根据两种扫描方式的特点可知，多数情况下，Index Scan要比Seq Scan快。但是如果获取的结果集占所有数据的比重很大时（超过70%），这时Index Scan因为要先扫描索引再读表数据反而不如直接全表扫描的速度快。

场景二：未及时ANALYZE

ANALYZE更新表的统计信息，如果表未执行ANALYZE或最近一次执行完ANALYZE后表进行过数据量较大的增删操作，会导致统计信息不准，该场景下也可能导致查询表时没有使用索引。

优化方法：对表执行ANALYZE更新统计信息。

场景三：过滤条件使用了函数或隐式类型转化

如果在过滤条件中使用了计算、函数、隐式类型转化，都可能导致无法选择索引。

例如创建表，并在a，b，c三列上都分别创建了索引。

    
         CREATE TABLE test(a int, b text, c date);

在索引字段进行计算操作。

从下面的执行结果可以看出：where a = 101，where a = 102 - 1都使用了a列上的索引，但是where a + 1 = 102没有使用索引。

     
      
        
        
          explain verbose select * from test where a  = 101;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
  id |                   operation                    | E-rows | E-distinct | E-memory | E-width | E-costs
 ----+------------------------------------------------+--------+------------+----------+---------+---------
   1 | ->  Streaming (type: GATHER)                   |      1 |            |          |      44 | 16.27
   2 |    ->  Index Scan using index_a on public.test |      1 |            | 1MB      |      44 | 8.27

 Predicate Information (identified by plan id)
 ---------------------------------------------
   2 --Index Scan using index_a on public.test
         Index Cond: (test.a = 101)

 Targetlist Information (identified by plan id)
 ----------------------------------------------
   1 --Streaming (type: GATHER)
         Output: a, b, c
         Node/s: dn_6005_6006
   2 --Index Scan using index_a on public.test
         Output: a, b, c
         Distribute Key: a

   ====== Query Summary =====
 -------------------------------
 System available mem: 3358720KB
 Query Max mem: 3358720KB
 Query estimated mem: 1024KB
(24 rows)

         

       

     
    

     
      
        
        
           explain verbose select * from test where a  = 102 - 1;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
  id |                   operation                    | E-rows | E-distinct | E-memory | E-width | E-costs
 ----+------------------------------------------------+--------+------------+----------+---------+---------
   1 | ->  Streaming (type: GATHER)                   |      1 |            |          |      44 | 16.27
   2 |    ->  Index Scan using index_a on public.test |      1 |            | 1MB      |      44 | 8.27

 Predicate Information (identified by plan id)
 ---------------------------------------------
   2 --Index Scan using index_a on public.test
         Index Cond: (test.a = 101)

 Targetlist Information (identified by plan id)
 ----------------------------------------------
   1 --Streaming (type: GATHER)
         Output: a, b, c
         Node/s: dn_6005_6006
   2 --Index Scan using index_a on public.test
         Output: a, b, c
         Distribute Key: a

   ====== Query Summary =====
 -------------------------------
 System available mem: 3358720KB
 Query Max mem: 3358720KB
 Query estimated mem: 1024KB
(24 rows)

         

       

     
    

     
      
        
        
          explain verbose select * from test where a + 1 = 102;
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
  id |           operation            | E-rows | E-distinct | E-memory | E-width | E-costs
 ----+--------------------------------+--------+------------+----------+---------+---------
   1 | ->  Streaming (type: GATHER)   |      1 |            |          |      44 | 22.21
   2 |    ->  Seq Scan on public.test |      1 |            | 1MB      |      44 | 14.21

 Predicate Information (identified by plan id)
 ---------------------------------------------
   2 --Seq Scan on public.test
         Filter: ((test.a + 1) = 102)

 Targetlist Information (identified by plan id)
 ----------------------------------------------
   1 --Streaming (type: GATHER)
         Output: a, b, c
         Node/s: All datanodes
   2 --Seq Scan on public.test
         Output: a, b, c
         Distribute Key: a

   ====== Query Summary =====
 -------------------------------
 System available mem: 3358720KB
 Query Max mem: 3358720KB
 Query estimated mem: 1024KB
(24 rows)

         

       

     
    

优化方式：尽量使用常量代替表达式，或者常量计算尽量写在等号的右侧。

在索引字段上使用函数。

从下面的执行结果可以看出：在索引列上使用函数也会导致无法选择索引。

     
      
        
        
          explain verbose select * from test where to_char(c, 'yyyyMMdd') = to_char(CURRENT_DATE,'yyyyMMdd');
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
  id |           operation            | E-rows | E-distinct | E-memory | E-width | E-costs
 ----+--------------------------------+--------+------------+----------+---------+---------
   1 | ->  Streaming (type: GATHER)   |      1 |            |          |      44 | 22.28
   2 |    ->  Seq Scan on public.test |      1 |            | 1MB      |      44 | 14.28

                                               Predicate Information (identified by plan id)
 ------------------------------------------------------------------------------------------------------------------------------------------
   2 --Seq Scan on public.test
         Filter: (to_char(test.c, 'yyyyMMdd'::text) = to_char(('2022-11-30'::pg_catalog.date)::timestamp with time zone, 'yyyyMMdd'::text))

 Targetlist Information (identified by plan id)
 ----------------------------------------------
   1 --Streaming (type: GATHER)
         Output: a, b, c
         Node/s: All datanodes
   2 --Seq Scan on public.test
         Output: a, b, c
         Distribute Key: a

   ====== Query Summary =====
 -------------------------------
 System available mem: 3358720KB
 Query Max mem: 3358720KB
 Query estimated mem: 1024KB
(24 rows)

         

       

     
    

     
      
        
        
          explain verbose select * from test where c = current_date;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
  id |                   operation                    | E-rows | E-distinct | E-memory | E-width | E-costs
 ----+------------------------------------------------+--------+------------+----------+---------+---------
   1 | ->  Streaming (type: GATHER)                   |      1 |            |          |      44 | 16.27
   2 |    ->  Index Scan using index_c on public.test |      1 |            | 1MB      |      44 | 8.27

        Predicate Information (identified by plan id)
 ------------------------------------------------------------
   2 --Index Scan using index_c on public.test
         Index Cond: (test.c = '2022-11-30'::pg_catalog.date)

 Targetlist Information (identified by plan id)
 ----------------------------------------------
   1 --Streaming (type: GATHER)
         Output: a, b, c
         Node/s: All datanodes
   2 --Index Scan using index_c on public.test
         Output: a, b, c
         Distribute Key: a

   ====== Query Summary =====
 -------------------------------
 System available mem: 3358720KB
 Query Max mem: 3358720KB
 Query estimated mem: 1024KB
(24 rows)

         

       

     
    

优化方法：尽量减少索引列上没有必要的函数调用。

数据类型隐式转化。

此类场景较常见，例如字段b的类型是text，过滤条件是where b = 2，在生成计划时，text类型会隐式转化为bigint类型，实际的过滤条件变成where b::bigint = 2，导致b列上的索引失效。

     
      
        
        
          explain verbose select * from test where b = 2;
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
  id |           operation            | E-rows | E-distinct | E-memory | E-width | E-costs
 ----+--------------------------------+--------+------------+----------+---------+---------
   1 | ->  Streaming (type: GATHER)   |      1 |            |          |      44 | 22.21
   2 |    ->  Seq Scan on public.test |      1 |            | 1MB      |      44 | 14.21

 Predicate Information (identified by plan id)
 ---------------------------------------------
   2 --Seq Scan on public.test
         Filter: ((test.b)::bigint = 2)

 Targetlist Information (identified by plan id)
 ----------------------------------------------
   1 --Streaming (type: GATHER)
         Output: a, b, c
         Node/s: All datanodes
   2 --Seq Scan on public.test
         Output: a, b, c
         Distribute Key: a

   ====== Query Summary =====
 -------------------------------
 System available mem: 3358720KB
 Query Max mem: 3358720KB
 Query estimated mem: 1024KB
(24 rows)

         

       

     
    

     
      
        
        
          explain verbose select * from test where b = '2';
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
  id |                   operation                    | E-rows | E-distinct | E-memory | E-width | E-costs
 ----+------------------------------------------------+--------+------------+----------+---------+---------
   1 | ->  Streaming (type: GATHER)                   |      1 |            |          |      44 | 16.27
   2 |    ->  Index Scan using index_b on public.test |      1 |            | 1MB      |      44 | 8.27

 Predicate Information (identified by plan id)
 ---------------------------------------------
   2 --Index Scan using index_b on public.test
         Index Cond: (test.b = '2'::text)

 Targetlist Information (identified by plan id)
 ----------------------------------------------
   1 --Streaming (type: GATHER)
         Output: a, b, c
         Node/s: All datanodes
   2 --Index Scan using index_b on public.test
         Output: a, b, c
         Distribute Key: a

   ====== Query Summary =====
 -------------------------------
 System available mem: 3358720KB
 Query Max mem: 3358720KB
 Query estimated mem: 1024KB
(24 rows)

         

       

     
    

优化方法：索引条件上的常量尽可能使用和索引列相同类型的常量，避免发生隐式类型转化。

场景四：用nestloop + indexscan代替hashjoin

此类语句的特征是两个表关联的时候，其中一个表上where条件过滤之后的结果集行数很小，同时，最终满足条件的结果集行数也很小。此时，使用nestloop+indexscan的效果往往要优于hashjoin。较优的执行计划如下：

可以看到第5层的Index Cond: (t1.b = t2.b)已经把join条件下推到了基表扫描上。

    
     
       
       
         explain verbose select t1.a,t1.b from t1,t2 where t1.b=t2.b and t2.a=4;
 id |                    operation                     | E-rows | E-distinct | E-memory | E-width | E-costs 
----+--------------------------------------------------+--------+------------+----------+---------+---------
  1 | ->  Streaming (type: GATHER)                     |     26 |            |          |       8 | 17.97
  2 |    ->  Nested Loop (3,5)                         |     26 |            | 1MB      |       8 | 11.97
  3 |       ->  Streaming(type: BROADCAST)             |      2 |            | 2MB      |       4 | 2.78
  4 |          ->  Seq Scan on public.t2               |      1 |            | 1MB      |       4 | 2.62
  5 |       ->  Index Scan using t1_b_idx on public.t1 |     26 |            | 1MB      |       8 | 9.05
(5 rows)

 Predicate Information (identified by plan id) 
-----------------------------------------------
   4 --Seq Scan on public.t2
         Filter: (t2.a = 4)
   5 --Index Scan using t1_b_idx on public.t1
         Index Cond: (t1.b = t2.b)
(4 rows)

 Targetlist Information (identified by plan id) 
------------------------------------------------
   1 --Streaming (type: GATHER)
         Output: t1.a, t1.b
         Node/s: All datanodes
   2 --Nested Loop (3,5)
         Output: t1.a, t1.b
   3 --Streaming(type: BROADCAST)
         Output: t2.b
         Spawn on: datanode2
         Consumer Nodes: All datanodes
   4 --Seq Scan on public.t2
         Output: t2.b
         Distribute Key: t2.a
   5 --Index Scan using t1_b_idx on public.t1
         Output: t1.a, t1.b
         Distribute Key: t1.a
(15 rows)

   ====== Query Summary =====    
---------------------------------
 System available mem: 9262694KB
 Query Max mem: 9471590KB
 Query estimated mem: 5144KB
(3 rows)

        

      

    
   

如果优化器没有选择这种执行计划，可以通过以下方式优化：

   
        set enable_index_nestloop = on;
set enable_hashjoin = off;
set enable_seqscan = off;

场景五：使用hint指定索引时指定的索引方式不对

DWS的plan hint当前支持指定的Scan方式有三种：tablescan、indexscan和indexonlyscan。

tablescan：全表扫描，比如行存表的Seq Scan，列存表的CStore Scan。
indexscan：先扫索引，再根据索引取表记录。
indexonlyscan：覆盖索引扫描，所需的返回结果能被所扫描的索引全部覆盖。与index scan相比，index only scan所包含的字段集合，涵盖了查询语句中的字段，这样，提取出相应的index就不必再根据索引取表记录了。

因此，对于需要indexonlyscan的场景，如果hint指定了indexscan，该hint是无法生效的：

    
     
       
       
         explain verbose select/*+ indexscan(test)*/ b from test where b = '1';
WARNING:  unused hint: IndexScan(test)
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
  id |                      operation                      | E-rows | E-distinct | E-memory | E-width | E-costs
 ----+-----------------------------------------------------+--------+------------+----------+---------+---------
   1 | ->  Streaming (type: GATHER)                        |      1 |            |          |      32 | 16.27
   2 |    ->  Index Only Scan using index_b on public.test |      1 |            | 1MB      |      32 | 8.27

   Predicate Information (identified by plan id)
 --------------------------------------------------
   2 --Index Only Scan using index_b on public.test
         Index Cond: (test.b = '1'::text)

   Targetlist Information (identified by plan id)
 --------------------------------------------------
   1 --Streaming (type: GATHER)
         Output: b
         Node/s: All datanodes
   2 --Index Only Scan using index_b on public.test
         Output: b
         Distribute Key: a

   ====== Query Summary =====
 -------------------------------
 System available mem: 3358720KB
 Query Max mem: 3358720KB
 Query estimated mem: 1024KB
(24 rows)

        

      

    
   

    
     
       
       
         explain verbose select/*+ indexonlyscan(test)*/ b from test where b = '1';
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
  id |                      operation                      | E-rows | E-distinct | E-memory | E-width | E-costs
 ----+-----------------------------------------------------+--------+------------+----------+---------+---------
   1 | ->  Streaming (type: GATHER)                        |      1 |            |          |      32 | 16.27
   2 |    ->  Index Only Scan using index_b on public.test |      1 |            | 1MB      |      32 | 8.27

   Predicate Information (identified by plan id)
 --------------------------------------------------
   2 --Index Only Scan using index_b on public.test
         Index Cond: (test.b = '1'::text)

   Targetlist Information (identified by plan id)
 --------------------------------------------------
   1 --Streaming (type: GATHER)
         Output: b
         Node/s: All datanodes
   2 --Index Only Scan using index_b on public.test
         Output: b
         Distribute Key: a

   ====== Query Summary =====
 -------------------------------
 System available mem: 3358720KB
 Query Max mem: 3358720KB
 Query estimated mem: 1024KB
(24 rows)

        

      

    
   

优化方法：使用hint时正确指定indexscan和indexonlyscan。

场景六：全文检索GIN索引

为了加速文本搜索，进行全文检索时可以创建GIN索引：

    
         CREATE INDEX idxb ON test using gin(to_tsvector('english',b));

创建GIN索引时，必须使用to_tsvector的两参数版本，并且只有当查询时也使用了两参数版本，且参数值与索引中相同时，才会使用该索引：

to_tsvector()函数有两个版本，只输入一个参数的版本和输入两个参数的版本。输入一个参数时，系统默认采用default_text_search_config所指定的分词器。创建索引时必须使用to_tsvector的两参数版本，否则索引内容可能不一致。

    
     
       
       
         explain verbose select  * from test where to_tsvector(b) @@ to_tsquery('cat') order by 1;
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
  id |             operation             | E-rows | E-distinct | E-memory | E-width | E-costs
 ----+-----------------------------------+--------+------------+----------+---------+---------
   1 | ->  Streaming (type: GATHER)      |      2 |            |          |      44 | 22.23
   2 |    ->  Sort                       |      2 |            | 16MB     |      44 | 14.23
   3 |       ->  Seq Scan on public.test |      1 |            | 1MB      |      44 | 14.21

        Predicate Information (identified by plan id)
 -----------------------------------------------------------
   3 --Seq Scan on public.test
         Filter: (to_tsvector(test.b) @@ '''cat'''::tsquery)

 Targetlist Information (identified by plan id)
 ----------------------------------------------
   1 --Streaming (type: GATHER)
         Output: a, b, c
         Merge Sort Key: test.a
         Node/s: All datanodes
   2 --Sort
         Output: a, b, c
         Sort Key: test.a
   3 --Seq Scan on public.test
         Output: a, b, c
         Distribute Key: a

   ====== Query Summary =====
 -------------------------------
 System available mem: 3358720KB
 Query Max mem: 3358720KB
 Query estimated mem: 1024KB
(29 rows)

        

      

    
   

    
     
       
       
         explain verbose select  * from test where to_tsvector('english',b) @@ to_tsquery('cat') order by 1;
                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------
  id |                 operation                 | E-rows | E-distinct | E-memory | E-width | E-costs
 ----+-------------------------------------------+--------+------------+----------+---------+---------
   1 | ->  Streaming (type: GATHER)              |      2 |            |          |      44 | 20.03
   2 |    ->  Sort                               |      2 |            | 16MB     |      44 | 12.03
   3 |       ->  Bitmap Heap Scan on public.test |      1 |            | 1MB      |      44 | 12.02
   4 |          ->  Bitmap Index Scan            |      1 |            | 1MB      |       0 | 8.00

                      Predicate Information (identified by plan id)
 ---------------------------------------------------------------------------------------
   3 --Bitmap Heap Scan on public.test
         Recheck Cond: (to_tsvector('english'::regconfig, test.b) @@ '''cat'''::tsquery)
   4 --Bitmap Index Scan
         Index Cond: (to_tsvector('english'::regconfig, test.b) @@ '''cat'''::tsquery)

 Targetlist Information (identified by plan id)
 ----------------------------------------------
   1 --Streaming (type: GATHER)
         Output: a, b, c
         Merge Sort Key: test.a
         Node/s: All datanodes
   2 --Sort
         Output: a, b, c
         Sort Key: test.a
   3 --Bitmap Heap Scan on public.test
         Output: a, b, c
         Distribute Key: a

   ====== Query Summary =====
 -------------------------------
 System available mem: 3358720KB
 Query Max mem: 3358720KB
 Query estimated mem: 2048KB
(32 rows)

        

      

    
   

优化方式：查询时使用to_tsvector的两参数版本，且保证参数值与索引中相同。

父主题： 数据库使用

上一篇：DWS列存表的常用信息查询

下一篇：如何使用DWS自定义函数改写CRC32()函数

意见反馈

文档内容是否对您有帮助？

有帮助没帮助

提供反馈

提交成功！非常感谢您的反馈，我们会继续努力做到更好！

系统繁忙，请稍后重试

在使用文档中是否遇到以下问题

内容与产品页面不一致

内容不易理解

缺失示例代码

步骤不可操作

搜不到想要的内容

缺少最佳实践

意见反馈（选填）

0/500

请至少选择一项反馈信息并填写问题反馈

字符长度不能超过500

直接提交取消