更新时间:2024-12-25 GMT+08:00
分享

MERGE INTO

命令功能

通过MERGE INTO命令,根据一张表或子查询的连接条件对另外一张表进行查询,连接条件匹配上的进行UPDATE或DELETE,无法匹配的执行INSERT。这个语法仅需要一次全表扫描就完成了全部同步工作,执行效率要高于INSERT+UPDATE。

命令格式

MERGE INTO tableIdentifier AS target_alias

USING (sub_query | tableIdentifier) AS source_alias

ON <merge_condition>

[ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ]

[ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ]

[ WHEN NOT MATCHED [ AND <condition> ] THEN <not_matched_action> ]

<merge_condition> =A equal bool condition

<matched_action> =

DELETE |

UPDATE SET * |

UPDATE SET column1 = expression1 [, column2 = expression2 ...]

<not_matched_action> =

INSERT * |

INSERT (column1 [, column2 ...]) VALUES (value1 [, value2 ...])

参数描述

表1 UPDATE参数

参数

描述

tableIdentifier

在其中执行MergeInto操作的Hudi表的名称。

target_alias

目标表的别名。

sub_query

子查询。

source_alias

源表或源表达式的别名。

merge_condition

将源表或表达式和目标表关联起来的条件

condition

过滤条件,可选。

matched_action

当满足条件时进行Delete或Update操作

not_matched_action

当不满足条件时进行Insert操作

注意事项

  1. merge-on condition当前只支持主键列。
  2. 当前仅支持对COW表进行部分字段更新,且更新值必须包含预合并列,MOR表需要在Update语法中给出全部字段。

示例

  • 部分字段更新
    create table h0(id int, comb int, name string, price int) using hudi options(primaryKey = 'id', preCombineField = 'comb') LOCATION 'obs://bucket/path/h0';
    create table s0(id int, comb int, name string, price int) using hudi options(primaryKey = 'id', preCombineField = 'comb') LOCATION 'obs://bucket/path/s0';
    insert into h0 values(1, 1, 1, 1);
    insert into s0 values(1, 1, 1, 1);
    insert into s0 values(2, 2, 2, 2);
    //写法1
    merge into h0 using s0
    on h0.id = s0.id
    when matched then update set h0.id = s0.id, h0.comb = s0.comb, price = s0.price * 2;
    //写法2
    merge into h0 using s0
    on h0.id = s0.id
    when matched then update set id = s0.id,
    name = h0.name,
    comb = s0.comb + h0.comb,
    price = s0.price + h0.price;
  • 缺省字段更新和插入
    create table h0(id int, comb int, name string, price int, flag boolean) using hudi 
      options(primaryKey = 'id', preCombineField = 'comb') LOCATION 'obs://bucket/path/h0';
    create table s0(id int, comb int, name string, price int, flag boolean) using hudi 
      options(primaryKey = 'id', preCombineField = 'comb') LOCATION 'obs://bucket/path/s0';
    insert into h0 values(1, 1, 1, 1, false);
    insert into s0 values(1, 2, 1, 1, true);
    insert into s0 values(2, 2, 2, 2, false);
    
    merge into h0 as target
    using (
    select id, comb, name, price, flag from s0
    ) source
    on target.id = source.id
    when matched then update set *
    when not matched then insert *;
  • 多条件更新和删除
    create table h0(id int, comb int, name string, price int, flag boolean) using hudi 
      options(primaryKey = 'id', preCombineField = 'comb') LOCATION 'obs://bucket/path/h0';
    create table s0(id int, comb int, name string, price int, flag boolean) using hudi
       options(primaryKey = 'id', preCombineField = 'comb') LOCATION 'obs://bucket/path/s0';
    insert into h0 values(1, 1, 1, 1, false);
    insert into h0 values(2, 2, 1, 1, false);
    insert into s0 values(1, 1, 1, 1, true);
    insert into s0 values(2, 2, 2, 2, false);
    insert into s0 values(3, 3, 3, 3, false);
    
    merge into h0
    using (
    select id, comb, name, price, flag from s0
    ) source
    on h0.id = source.id
    when matched and flag = false then update set id = source.id, comb = h0.comb + source.comb, price = source.price * 2
    when matched and flag = true then delete
    when not matched then insert *;

系统响应

可以检查任务状态是否成功,查看任务日志确认有无异常。

相关文档