更新时间:2024-10-14 GMT+08:00

Stream方式的Hint

功能描述

指明stream使用的方法,可以为broadcast和redistribute,或者直接指定生成gather计划。

语法格式

1
2
[no] broadcast|redistribute(table_list)
gather(REL|JOIN|ALL)

参数说明

  • broadcast和redistribute
    • no表示hint的stream方式不使用。
    • table_list为进行stream操作的单表或多表join结果集,见参数说明
  • gather
    gather hint可以指定三种计划生成方式:
    • REL:只生成基于基表的gather路径,然后再在CN上执行剩余计划。
    • JOIN:尽可能生成基于join的gather路径,在能下推的join子计划上面(join下面不包含重分布节点)添加gather路径,剩余计划在CN上执行。对于需要重分布节点的join计划则生成不出这种基于join的gather路径,会回退生成基于基表的gather路径。

      在指定Hint(JOIN)后,对于分布表和复制表做连接的情况会导致生成不出来Hint(JOIN)期望的计划,因为优化器已经寻找更优的计划进行替代。

    • ALL:基于最优方式选择Gather Rel或Gather Join路径。

示例

示例中原语句使用如下hint:

1
2
explain
select /*+ no redistribute(store_sales store_returns item store) leading(((store_sales store_returns item store) customer)) */ i_product_name product_name ...

原计划中,(store_sales store_returns item store)和customer做join时,前者做了重分布,此hint表示禁止前者混合表做重分布,但仍然保持join顺序,则生成计划如下所示:

对语句进行Gather Hint指定:

  1. 生成基表Gather计划 /* +GATHER(REL)*/。
    openGauss=# explain select /*+ GATHER(REL)*/* from t1, t2, t3 where t1.c2 = t2.c2 and t2.c2 = t3.c2;
    QUERY PLAN
    -------------------------------------------------------------------------------
    Hash Join  (cost=3.29..5.08 rows=10 width=24)
    Hash Cond: (t1.c2 = t3.c2)
    ->  Hash Join  (cost=1.64..3.30 rows=10 width=16)
    Hash Cond: (t1.c2 = t2.c2)
    ->  Streaming (type: GATHER)  (cost=0.31..1.52 rows=10 width=8)
    Node/s: All datanodes
    ->  Seq Scan on t1  (cost=0.00..1.05 rows=10 width=8)
    ->  Hash  (cost=1.52..1.52 rows=10 width=8)
    ->  Streaming (type: GATHER)  (cost=0.31..1.52 rows=10 width=8)
    Node/s: All datanodes
    ->  Seq Scan on t2  (cost=0.00..1.05 rows=10 width=8)
    ->  Hash  (cost=1.52..1.52 rows=10 width=8)
    ->  Streaming (type: GATHER)  (cost=0.31..1.52 rows=10 width=8)
    Node/s: All datanodes
    ->  Seq Scan on t3  (cost=0.00..1.05 rows=10 width=8)
    (15 rows)
  2. 生成可下推计划的Join Gather计划 /*+ GATHER(REL)*/。
    openGauss=# explain select /*+ GATHER(JOIN)*/* from t1, t2, t3 where t1.c1 = t2.c1 and t2.c2 = t3.c2;
    QUERY PLAN
    ---------------------------------------------------------------------------
    Hash Join  (cost=2.76..4.48 rows=10 width=24)
    Hash Cond: (t2.c2 = t3.c2)
    ->  Streaming (type: GATHER)  (cost=1.43..2.70 rows=10 width=16)
    Node/s: All datanodes
    ->  Hash Join  (cost=1.11..2.23 rows=10 width=16)
    Hash Cond: (t1.c1 = t2.c1)
    ->  Seq Scan on t1  (cost=0.00..1.05 rows=10 width=8)
    ->  Hash  (cost=1.05..1.05 rows=10 width=8)
    ->  Seq Scan on t2  (cost=0.00..1.05 rows=10 width=8)
    ->  Hash  (cost=1.52..1.52 rows=10 width=8)
    ->  Streaming (type: GATHER)  (cost=0.31..1.52 rows=10 width=8)
    Node/s: All datanodes
    ->  Seq Scan on t3  (cost=0.00..1.05 rows=10 width=8)
    (13 rows)
  3. 生成最优方式的Gather计划 /*+ GATHER(ALL)*/。
    会基于最优方式及规则选择GATHER(REL)或者GATHER(JOIN)路径。
    openGauss=# explain select /*+ GATHER(ALL)*/* from t1, t2, t3 where t1.c1 = t2.c1 and t2.c2 = t3.c2;
    QUERY PLAN
    ---------------------------------------------------------------------------
    Hash Join  (cost=2.76..4.48 rows=10 width=24)
    Hash Cond: (t2.c2 = t3.c2)
    ->  Streaming (type: GATHER)  (cost=1.43..2.70 rows=10 width=16)
    Node/s: All datanodes
    ->  Hash Join  (cost=1.11..2.23 rows=10 width=16)
    Hash Cond: (t1.c1 = t2.c1)
    ->  Seq Scan on t1  (cost=0.00..1.05 rows=10 width=8)
    ->  Hash  (cost=1.05..1.05 rows=10 width=8)
    ->  Seq Scan on t2  (cost=0.00..1.05 rows=10 width=8)
    ->  Hash  (cost=1.52..1.52 rows=10 width=8)
    ->  Streaming (type: GATHER)  (cost=0.31..1.52 rows=10 width=8)
    Node/s: All datanodes
    ->  Seq Scan on t3  (cost=0.00..1.05 rows=10 width=8)
    (13 rows)