Case: Adjusting the Query Rewrite GUC Parameter rewrite_rule

Updated: 2025-03-12 GMT+08:00

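rewrite_rule controls several optional query rewrite rules, including magicset, partialpush, uniquecheck, disablerep, intargetlist, and predpush. The sections below briefly describe the scenarios in which the most important of these rules apply.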
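As a quick orientation, the statements below sketch how the parameter is typically inspected and changed at session level. SHOW, SET, and RESET are used the same way elsewhere in this case. The comma-separated combination in the last SET is an assumption about how several rules are enabled together and should be checked against the GUC reference for your version.

```sql
-- Show the rules currently in effect for this session.
SHOW rewrite_rule;

-- Turn off all optional rewrite rules, as the baseline examples in this case do.
SET rewrite_rule = 'none';

-- Enable a single rule.
SET rewrite_rule = 'partialpush';

-- Assumption: several rules can be enabled together as a comma-separated list.
SET rewrite_rule = 'partialpush,intargetlist';

-- Return to the configured default value.
RESET rewrite_rule;
```

Because each SET replaces the whole value, a rule that was enabled earlier stays enabled only if it is listed again.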

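Preparing the Example Environment

To make the scenarios easy to demonstrate, first create the schema and test tables:

--Clean up the environment
DROP SCHEMA IF EXISTS rewrite_rule_guc_test CASCADE;
CREATE SCHEMA rewrite_rule_guc_test;
SET current_schema=rewrite_rule_guc_test;
--Create the test tables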
CREATE TABLE t(c1 INT, c2 INT, c3 INT, c4 INT);
CREATE TABLE t1(c1 INT, c2 INT, c3 INT, c4 INT);
CREATE TABLE t2(c1 INT, c2 INT, c3 INT, c4 INT);

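Using the Partial Pushdown Rule partialpush

Pushing a query down to the DNs for distributed execution can speed it up considerably. However, if a statement contains even one element that cannot be pushed down, the whole statement cannot be pushed down. No Stream plan is generated for distributed execution on the DNs, and performance is usually poor.

Consider the following query: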

gaussdb=# set rewrite_rule='none'; 
SET
gaussdb=# explain (verbose on, costs off)  select group_concat(tt.c1, tt.c2) from (select t1.c1,t2.c2 from t1,t2 where t1.c1=t2.c2) tt(c1,c2);
                                 QUERY PLAN
----------------------------------------------------------------------------
 Aggregate
   Output: group_concat(t1.c1, t2.c2 SEPARATOR ',')
   ->  Hash Join
         Output: t1.c1, t2.c2
         Hash Cond: (t1.c1 = t2.c2)
         ->  Data Node Scan on t1 "_REMOTE_TABLE_QUERY_"
               Output: t1.c1
               Node/s: All datanodes
               Remote query: SELECT c1 FROM ONLY public.t1 WHERE true
         ->  Hash
               Output: t2.c2
               ->  Data Node Scan on t2 "_REMOTE_TABLE_QUERY_"
                     Output: t2.c2
                     Node/s: All datanodes
                     Remote query: SELECT c2 FROM ONLY public.t2 WHERE true

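The group_concat() function cannot be pushed down, so the optimizer falls back to a RemoteQuery plan:

1. The statement select c1 from t1 where true is sent to the DNs to read all data in t1.
2. The statement select c2 from t2 where true is sent to the DNs to read all data in t2.
3. Once the required data has been fetched, the HASH JOIN is performed on the CN.
4. The join result is fed into group_concat, and the final result is returned.

This plan is slow because a large amount of data is transferred over the network and the HASH JOIN then runs on the CN, so cluster resources are not fully used.

With the partialpush query rewrite rule enabled, steps 1, 2, and 3 are pushed down to the DNs for distributed execution, which greatly improves the performance of the statement: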

gaussdb=#  set rewrite_rule='partialpush'; 
SET
gaussdb=#  explain (verbose on, costs off) select two_sum(tt.c1, tt.c2) from (select t1.c1,t2.c2 from t1,t2 where t1.c1=t2.c2) tt(c1,c2);
                       QUERY PLAN
---------------------------------------------------------
 Subquery Scan on tt
   Output: two_sum(tt.c1, tt.c2)
   ->  Streaming (type: GATHER)  --The plan below Gather runs distributed on the DNs
         Output: t1.c1, t2.c2
         Node/s: All datanodes
         ->  Nested Loop
               Output: t1.c1, t2.c2
               Join Filter: (t1.c1 = t2.c2)
               ->  Seq Scan on public.t1
                     Output: t1.c1, t1.c2, t1.c3
                     Distribute Key: t1.c1
               ->  Materialize
                     Output: t2.c2
                     ->  Streaming(type: REDISTRIBUTE)
                           Output: t2.c2
                           Distribute Key: t2.c2
                           Spawn on: All datanodes
                           Consumer Nodes: All datanodes
                           ->  Seq Scan on public.t2
                                 Output: t2.c2
                                 Distribute Key: t2.c1
(21 rows)
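When trying a rule on a single statement, it can be convenient to keep the change out of the rest of the session. The following is a minimal sketch of that workflow, assuming SET LOCAL is available and limits the setting to the current transaction. It reuses the first query from this section.

```sql
-- Assumption: SET LOCAL scopes the new value to this transaction only.
START TRANSACTION;
SET LOCAL rewrite_rule = 'partialpush';

-- Inspect the plan with the rule enabled.
EXPLAIN (VERBOSE ON, COSTS OFF)
SELECT group_concat(tt.c1, tt.c2)
FROM (SELECT t1.c1, t2.c2 FROM t1, t2 WHERE t1.c1 = t2.c2) tt(c1, c2);

-- Ending the transaction discards the local setting.
ROLLBACK;

-- The session-level value should be unchanged here.
SHOW rewrite_rule;
```

A quick way to tell the two plan shapes apart is to look for Data Node Scan nodes (RemoteQuery plan) versus Streaming nodes (Stream plan) in the output.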

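Target-List Subquery Pull-Up Rule intargetlist

Pulling up a subquery in the target list and converting it into a JOIN can often greatly improve query performance. Consider the following query: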

gaussdb=#  set rewrite_rule='none'; 
SET
gaussdb=#  explain (verbose on, costs off) select c1,(select avg(c2) from t2 where t2.c2=t1.c2) from t1 where t1.c1<100 order by t1.c2;
                              QUERY PLAN
-----------------------------------------------------------------------
 Streaming (type: GATHER)
   Output: t1.c1, ((SubPlan 1)), t1.c2
   Merge Sort Key: t1.c2
   Node/s: All datanodes
   ->  Sort
         Output: t1.c1, ((SubPlan 1)), t1.c2
         Sort Key: t1.c2
         ->  Seq Scan on public.t1
               Output: t1.c1, (SubPlan 1), t1.c2
               Distribute Key: t1.c1
               Filter: (t1.c1 < 100)
               SubPlan 1
                 ->  Aggregate
                       Output: avg(t2.c2)
                       ->  Result
                             Output: t2.c2
                             Filter: (t2.c2 = t1.c2)
                             ->  Materialize
                                   Output: t2.c2
                                   ->  Streaming(type: BROADCAST)
                                         Output: t2.c2
                                         Spawn on: All datanodes
                                         Consumer Nodes: All datanodes
                                         ->  Seq Scan on public.t2
                                               Output: t2.c2
                                               Distribute Key: t2.c1
(26 rows)

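Because the correlated subquery (select avg(c2) from t2 where t2.c2=t1.c2) in the target list cannot be pulled up, it is executed once for every row scanned from t1, which is inefficient. With intargetlist enabled, the subquery is pulled up and converted into a JOIN, which improves query performance: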

gaussdb=#  set rewrite_rule='intargetlist';
SET
gaussdb=#  explain (verbose on, costs off) select c1,(select avg(c2) from t2 where t2.c2=t1.c2) from t1 where t1.c1<100 order by t1.c2;
                          QUERY PLAN
---------------------------------------------------------------
 Streaming (type: GATHER)
   Output: t1.c1, (avg(t2.c2)), t1.c2
   Merge Sort Key: t1.c2
   Node/s: All datanodes
   ->  Sort
         Output: t1.c1, (avg(t2.c2)), t1.c2
         Sort Key: t1.c2
         ->  Hash Right Join
               Output: t1.c1, (avg(t2.c2)), t1.c2
               Hash Cond: (t2.c2 = t1.c2)
               ->  Streaming(type: BROADCAST)
                     Output: (avg(t2.c2)), t2.c2
                     Spawn on: All datanodes
                     Consumer Nodes: All datanodes
                     ->  HashAggregate
                           Output: avg(t2.c2), t2.c2
                           Group By Key: t2.c2
                           ->  Streaming(type: REDISTRIBUTE)
                                 Output: t2.c2
                                 Distribute Key: t2.c2
                                 Spawn on: All datanodes
                                 Consumer Nodes: All datanodes
                                 ->  Seq Scan on public.t2
                                       Output: t2.c2
                                       Distribute Key: t2.c1
               ->  Hash
                     Output: t1.c1, t1.c2
                     ->  Seq Scan on public.t1
                           Output: t1.c1, t1.c2
                           Distribute Key: t1.c1
                           Filter: (t1.c1 < 100)
(31 rows)
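The shape of the rewritten plan (an aggregate grouped by t2.c2, outer-joined to t1) corresponds roughly to the hand-written query below. This is an illustrative sketch of the equivalent join form, not the optimizer's internal rewrite; the alias agg and the column name avg_c2 are made up for the example.

```sql
-- Aggregate t2 once per c2 value, then outer-join the result to t1.
-- The LEFT JOIN keeps t1 rows with no match in t2; they get NULL,
-- which is what the scalar subquery returns for them.
SELECT t1.c1, agg.avg_c2
FROM t1
LEFT JOIN (
    SELECT c2, avg(c2) AS avg_c2
    FROM t2
    GROUP BY c2
) agg ON agg.c2 = t1.c2
WHERE t1.c1 < 100
ORDER BY t1.c2;
```

Writing the join by hand is an option when the rule cannot be enabled for the whole session.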

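Pulling Up Subqueries Without agg: uniquecheck

Pulling up a sublink requires that the subquery output only one row for each condition value. Subqueries that contain agg can be pulled up automatically. A subquery without agg, such as: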

select t1.c1 from t1 where t1.c1 = (select t2.c1 from t2 where t1.c1=t2.c2) ;

is rewritten as:

select t1.c1 from t1 join (select t2.c1 from t2 where t2.c1 is not null group by t2.c1(unique check)) tt(c1) on tt.c1=t1.c1;

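Note that unique check in the SQL above only indicates that t2.c1 must be checked. It is not valid SQL syntax, and the statement cannot be run directly. To keep the semantics equivalent, the subquery tt must output only one row for each group by t2.c1 group. Enabling the uniquecheck query rewrite rule allows the pull-up while preserving equivalence: if more than one row is output at run time, an error is reported.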

gaussdb=#  set rewrite_rule='uniquecheck';
SET
gaussdb=#  explain verbose select t1.c1 from t1 where t1.c1 = (select t2.c1 from t2 where t1.c1=t2.c1) ;
                               QUERY PLAN
------------------------------------------------------------------------
 Streaming (type: GATHER)
   Output: t1.c1
   Node/s: All datanodes
   ->  Nested Loop
         Output: t1.c1
         Join Filter: (t1.c1 = subquery."?column?")
         ->  Seq Scan on public.t1
               Output: t1.c1, t1.c2, t1.c3
               Distribute Key: t1.c1
         ->  Materialize
               Output: subquery."?column?", subquery.c1
               ->  Subquery Scan on subquery
                     Output: subquery."?column?", subquery.c1
                     ->  HashAggregate
                           Output: t2.c1, t2.c1
                           Group By Key: t2.c1
                           Filter: (t2.c1 IS NOT NULL)
                           Unique Check Required   --An error is reported if more than one row is output at run time.
                           ->  Index Only Scan using t2idx on public.t2
                                 Output: t2.c1
                                 Distribute Key: t2.c1
(21 rows)

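Note: Because the grouping step group by t2.c1 unique check runs before the filter condition tt.c1=t1.c1, a query that previously ran without error may fail after the rewrite. For example:

Tables t1 and t2 contain the following data: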

gaussdb=#  select * from t1 order by c2;
 c1 | c2 | c3
----+----+----
  1 |  1 |  1
  2 |  2 |  2
  3 |  3 |  3
  4 |  4 |  4
  5 |  5 |  5
  6 |  6 |  6
  7 |  7 |  7
  8 |  8 |  8
  9 |  9 |  9
 10 | 10 | 10
(10 rows)

gaussdb=#  select * from t2 order by c1;
 c1 | c2 | c3
----+----+----
  1 |  1 |  1
  2 |  2 |  2
  3 |  3 |  3
  4 |  4 |  4
  5 |  5 |  5
  6 |  6 |  6
  7 |  7 |  7
  8 |  8 |  8
  9 |  9 |  9
 10 | 10 | 10
 11 | 11 | 11
 11 | 11 | 11
 12 | 12 | 12
 12 | 12 | 12
 13 | 13 | 13
 13 | 13 | 13
 14 | 14 | 14
 14 | 14 | 14
 15 | 15 | 15
 15 | 15 | 15
 16 | 16 | 16
 16 | 16 | 16
 17 | 17 | 17
 17 | 17 | 17
 18 | 18 | 18
 18 | 18 | 18
 19 | 19 | 19
 19 | 19 | 19
 20 | 20 | 20
 20 | 20 | 20
(30 rows)
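The rows listed above can be reproduced with a sketch like the following. It assumes the test tables from the environment preparation step are empty and that generate_series is available. Only c1, c2, and c3 are filled because those are the columns shown in the output.

```sql
-- t1: values 1 through 10, one row each.
INSERT INTO t1 (c1, c2, c3)
SELECT i, i, i FROM generate_series(1, 10) AS i;

-- t2: values 1 through 20, one row each ...
INSERT INTO t2 (c1, c2, c3)
SELECT i, i, i FROM generate_series(1, 20) AS i;

-- ... plus a second copy of 11 through 20, giving 30 rows in total.
INSERT INTO t2 (c1, c2, c3)
SELECT i, i, i FROM generate_series(11, 20) AS i;
```

The duplicated values 11 through 20 never match any row in t1, which is what makes the original query safe and the rewritten one fail.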

Compare the results with uniquecheck disabled and enabled. The query fails once the rule is enabled.

gaussdb=#   select t1.c1 from t1 where t1.c1 = (select t2.c1 from t2 where t1.c1=t2.c2) ;
 c1
----
  6
  7
  3
  1
  2
  4
  5
  8
  9
 10
(10 rows)

gaussdb=#  set rewrite_rule='uniquecheck';
SET
gaussdb=#   select t1.c1 from t1 where t1.c1 = (select t2.c1 from t2 where t1.c1=t2.c2) ;
ERROR:  more than one row returned by a subquery used as an expression
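Before enabling uniquecheck, it may help to check whether the grouping column is actually unique across the whole table, since the check runs before the join filter. A minimal sketch of such a check, with dup_count as an illustrative column name:

```sql
-- List every non-NULL t2.c1 value that occurs more than once.
-- Any row returned here can trigger the unique-check error,
-- even if that value never matches a row in t1.
SELECT c1, count(*) AS dup_count
FROM t2
WHERE c1 IS NOT NULL
GROUP BY c1
HAVING count(*) > 1
ORDER BY c1;
```

With the sample data above, this returns the values 11 through 20, each occurring twice, which explains why the rewritten query reports an error.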

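Pushing Predicates Down into Subqueries: predpush, predpushnormal, and predpushforce

The optimizer normally optimizes one query block at a time, and each block is optimized independently. When a predicate spans query blocks, it is hard to decide from a global perspective where the predicate should be applied. predpush pushes predicates down into subquery blocks, which improves performance when the parent query block produces little data or when the subquery can use an index. Three rewrite_rule rules relate to predpush:

predpushnormal: Attempts to push predicates down into the subquery. It relies on STREAM operators, such as BROADCAST, to build the distributed plan.
predpushforce: Attempts to push predicates down into the subquery, using parameterized-path index scans wherever possible.
predpush: Uses cost to choose the better distributed plan between predpushnormal and predpushforce, at the expense of longer optimization time.

The following plans show the same query with the rewrite rule disabled and enabled: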

gaussdb=# set enable_fast_query_shipping=off; -- Disable FQS optimization
SET
gaussdb=# show rewrite_rule;
 rewrite_rule
--------------
 magicset
(1 row)

gaussdb=# explain (costs off) select * from t1, (select sum(c2), c1 from t2 group by c1) st2 where st2.c1 = t1.c1;
              QUERY PLAN
--------------------------------------
 Streaming (type: GATHER)
   Node/s: All datanodes
   ->  Nested Loop
         Join Filter: (t1.c1 = t2.c1)
         ->  HashAggregate
               Group By Key: t2.c1
               ->  Seq Scan on t2
         ->  Seq Scan on t1
(8 rows)


gaussdb=# set rewrite_rule='predpushnormal';
SET
gaussdb=# explain (costs off) select * from t1, (select sum(c2), c1 from t2 group by c1) st2 where st2.c1 = t1.c1;
                 QUERY PLAN
---------------------------------------------
 Streaming (type: GATHER)
   Node/s: All datanodes
   ->  Nested Loop
         ->  Seq Scan on t1
         ->  HashAggregate
               Group By Key: t2.c1
               ->  Result
                     Filter: (t1.c1 = t2.c1)
                     ->  Seq Scan on t2
(9 rows)

--The filter condition has been pushed down into the subquery.


gaussdb=# set rewrite_rule='predpushforce';
SET

gaussdb=# explain (costs off) select * from t1, (select sum(c2), c1 from t2 group by c1) st2 where st2.c1 = t1.c1;
                     QUERY PLAN
----------------------------------------------------
 Streaming (type: GATHER)
   Node/s: All datanodes
   ->  Nested Loop
         ->  Seq Scan on t1
         ->  HashAggregate
               Group By Key: t2.c1
               ->  Index Scan using t2_c1_idx on t2
                     Index Cond: (t1.c1 = c1)
(8 rows)

--When used together with the predpush hint, the plan takes a parameterized path.

gaussdb=# set rewrite_rule = 'predpush';
SET
gaussdb=# explain (costs off) select * from t1, (select sum(c2), c1 from t2 group by c1) st2 where st2.c1 = t1.c1;
                     QUERY PLAN
----------------------------------------------------
 Streaming (type: GATHER)
   Node/s: All datanodes
   ->  Nested Loop
         ->  Seq Scan on t1
         ->  HashAggregate
               Group By Key: t2.c1
               ->  Result
                     Filter: (t1.c1 = t2.c1)
                     ->  Seq Scan on t2
(9 rows)
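The predpushforce plan above scans an index named t2_c1_idx, which the environment preparation step does not create. The following is a minimal sketch for reproducing that plan. It assumes the index is an ordinary index on t2(c1); only the index name comes from the plan, and the definition is a guess.

```sql
-- Assumption: t2_c1_idx is a plain index on the join column t2.c1.
CREATE INDEX t2_c1_idx ON t2(c1);

-- Refresh statistics so the optimizer can cost the index path.
ANALYZE t2;

-- Same session settings and query as in the example above.
SET enable_fast_query_shipping = off;
SET rewrite_rule = 'predpushforce';
EXPLAIN (COSTS OFF)
SELECT *
FROM t1, (SELECT sum(c2), c1 FROM t2 GROUP BY c1) st2
WHERE st2.c1 = t1.c1;
```

Whether the index path is chosen still depends on data volume and statistics, so the resulting plan may differ from the one shown.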

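Disabling Subquery Pull-Up for Replicated Tables: disablerep

A replicated table needs to be queried on only one DN, so pulling up a subquery on it can degrade performance. For example: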

gaussdb=# create table t_rep(a int) distribute by replication;
CREATE TABLE
gaussdb=# create table t_dis(a int);
NOTICE:  The 'DISTRIBUTE BY' clause is not specified. Using 'a' as the distribution column by default.
HINT:  Please use 'DISTRIBUTE BY' clause to specify suitable data distribution column.
CREATE TABLE
gaussdb=# set rewrite_rule = '';
SET
gaussdb=# explain (costs off) select * from t_dis where a = any(select a from t_rep) or a > 100;
                          QUERY PLAN
---------------------------------------------------------------
 Streaming (type: GATHER)
   Node/s: All datanodes
   ->  Hash Left Join
         Hash Cond: (t_dis.a = subquery.a)
         Filter: ((subquery.a IS NOT NULL) OR (t_dis.a > 100))
         ->  Seq Scan on t_dis
         ->  Hash
               ->  Subquery Scan on subquery
                     Filter: (Hash By subquery.a)
                     ->  HashAggregate
                           Group By Key: t_rep.a
                           ->  Seq Scan on t_rep
(12 rows)

For a replicated table, every DN stores the same data, so there is no need to scan it on all nodes.

gaussdb=# set rewrite_rule = disablerep;
SET
gaussdb=# explain (costs off) select * from t_dis where a = any(select a from t_rep) or a > 100;
                    QUERY PLAN
---------------------------------------------------
 Streaming (type: GATHER)
   Node/s: All datanodes
   ->  Seq Scan on t_dis
         Filter: ((hashed SubPlan 1) OR (a > 100))
         SubPlan 1
           ->  Seq Scan on t_rep
(6 rows)
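After working through the examples, the session settings and test objects can be cleaned up. The sketch below assumes the objects were created exactly as shown in this case and that nothing else lives in the rewrite_rule_guc_test schema.

```sql
-- Restore the session parameters changed during the examples.
RESET rewrite_rule;
RESET enable_fast_query_shipping;

-- Drop the tables created in the disablerep example.
DROP TABLE IF EXISTS t_rep, t_dis;

-- Drop the test schema together with t, t1, t2 and their indexes.
DROP SCHEMA IF EXISTS rewrite_rule_guc_test CASCADE;

-- Stop pointing the session at the dropped schema.
RESET current_schema;
```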
