CLUSTERING

Constraints and Limitations

This section applies only to MRS 3.2.0-LTS and later versions.

Function

Runs clustering on a Hudi table. For details about what clustering does, see the section describing Hudi Clustering operations.

Syntax

Run clustering:
call run_clustering(table=>'[table]', path=>'[path]', predicate=>'[predicate]', order=>'[order]');
View clustering plans:
call show_clustering(table=>'[table]', path=>'[path]', limit=>'[limit]');

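Parameter Description

Table 1 Parameter description
Parameter	Description	Mandatory
table	Name of the Hudi table to operate on or query. The database.tablename format is supported.	No
path	Path of the Hudi table to operate on or query.	No
predicate	Partition filter predicate for clustering. It limits the range of data that the clustering operation covers.	No
order	Names of the fields to sort by during clustering. Separate multiple fields with commas.	No
limit	Number of rows to display in the query result.	No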

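Examples

In the following examples, hudi_table1 is the name of the Hudi table to operate on. Replace it, and the table path used in the last example, with your actual values.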

call show_clustering(table => 'hudi_table1');

call run_clustering(table => 'hudi_table1', predicate => '(ts >= 1006L and ts < 1008L) or ts >= 1009L', order => 'ts');

call run_clustering(path => '/user/hive/warehouse/hudi_test2', predicate => "dt = '2021-08-28'", order => 'id');
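
Because Table 1 marks every parameter as optional, a shorter invocation should also be possible. The following is a minimal sketch under stated assumptions, not output from a verified run. It assumes that omitting predicate and order applies clustering without a partition filter or explicit sort fields, and that limit accepts a bare integer literal (the syntax above shows it only as a quoted placeholder). hudi_table1 is a placeholder table name, as in the examples above.

```sql
-- Assumption: with no predicate, no partition filter restricts the clustering range.
call run_clustering(table => 'hudi_table1');

-- Assumption: limit is passed as an integer; at most 10 rows of clustering plans are shown.
call show_clustering(table => 'hudi_table1', limit => 10);
```

Running show_clustering after run_clustering is one way to confirm that a clustering plan exists for the table.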