使用说明
使用高级压缩的功能,用户必须申请License才能使用。具体情况请联系华为工程师。
- 执行如下命令开启压缩功能:
gaussdb=# ALTER DATABASE SET ilm = on;
检查当前数据库的public schema中是否存在gsilmpolicy_seq和gsilmtask_seq。
gaussdb=# \d List of relations Schema | Name | Type | Owner | Storage --------+-----------------+----------+------------+------------------------------------------------------ public | gsilmpolicy_seq | sequence | omm | | public | gsilmtask_seq | sequence | omm | |
或者:
gaussdb=# SELECT a.oid, a.relname from pg_class a inner join pg_namespace b on a.relnamespace = b.oid WHERE (a.relname = 'gsilmpolicy_seq' OR a.relname = 'gsilmtask_seq') AND b.nspname = 'public'; oid | relname -------+----------------- 17002 | gsilmpolicy_seq 17004 | gsilmtask_seq (2 rows)
生成异常会报warning:
WARNING: ILM sequences are already existed while initializing
- 为表添加压缩策略。
- 新建带策略的表:
gaussdb=# CREATE TABLE ilm_table_1 (col1 int, col2 text) ilm add policy row store compress advanced row after 3 days of no modification on (col1 < 1000);
- 为存量表添加策略:
gaussdb=# CREATE TABLE ilm_table_2 (col1 int, col2 text); gaussdb=# ALTER TABLE ilm_table_2 ilm add policy row store compress advanced row after 3 days of no modification;
- 检查策略视图中是否新增数据:
gaussdb=# SELECT * FROM gs_my_ilmpolicies; policy_name | policy_type | tablespace | enabled | deleted -------------+---------------+------------+---------+--------- p1 | DATA MOVEMENT | | YES | NO p2 | DATA MOVEMENT | | YES | NO (2 rows)
- 检查策略详细信息视图中是否新增了符合刚刚设置的策略:
gaussdb=# SELECT * FROM gs_my_ilmdatamovementpolicies; policy_name | action_type | scope | compression_level | tier_tablespace | tier_status | condition_type | condition_days | custom_function | policy_subtype | action_clause | tier_to -------------+-------------+-------+-------------------+-----------------+-------------+------------------------+----------------+-----------------+----------------+---------------+--------- p1 | COMPRESSION | ROW | ADVANCED | | | LAST MODIFICATION TIME | 3 | | | | p2 | COMPRESSION | ROW | ADVANCED | | | LAST MODIFICATION TIME | 3 | | | | (2 rows)
- 检查策略与目标表是否对应:
gaussdb=# SELECT * FROM gs_my_ilmobjects; policy_name | object_owner | object_name | subobject_name | object_type | inherited_from | tbs_inherited_from | enabled | deleted -------------+--------------+-------------+----------------+-------------+----------------------+--------------------+---------+--------- p1 | public | ilm_table_1 | | TABLE | POLICY NOT INHERITED | | YES | NO p2 | public | ilm_table_2 | | TABLE | POLICY NOT INHERITED | | YES | NO (2 rows)
- 新建带策略的表:
- 执行压缩评估。
- 手动执行压缩评估。
为方便测试,本功能环境参数中提供POLICY_TIME属性,决定时间条件以天为单位还是以秒为单位。通过下面语句调整:
gaussdb=# CALL DBE_ILM_ADMIN.CUSTOMIZE_ILM(11, 1);
插入随机数据用于测试:gaussdb=# INSERT INTO ilm_table_1 select *, 'test_data' FROM generate_series(1, 10000); gaussdb=# DECLARE v_taskid number; gaussdb=# BEGIN DBE_ILM.EXECUTE_ILM(OWNER => 'public', OBJECT_NAME => 'ilm_table_1', TASK_ID => v_taskid, SUBOBJECT_NAME => NULL, POLICY_NAME => 'ALL POLICIES', EXECUTION_MODE => 2); RAISE INFO 'Task ID is:%', v_taskid; gaussdb=# END; /
如入参有误,会报对应的错误信息。无误则无输出(上述代码段添加了RAISE INFO语句打印当前task的id)。INFO: Task ID is:1
检查task信息:gaussdb=# SELECT * FROM gs_my_ilmtasks; task_id | task_owner | state | creation_time | start_time | completion_time ---------+------------+-----------+-------------------------------+-------------------------------+------------------------------- 1 | omm | COMPLETED | 2023-08-29 17:36:38.779555+08 | 2023-08-29 17:36:38.779555+08 | 2023-08-29 17:36:38.879485+08 (1 row)
检查评估结果:gaussdb=# SELECT * FROM gs_my_ilmevaluationdetails; task_id | policy_name | object_owner | object_name | subobject_name | object_type | selected_for_execution | job_name | comments ---------+-------------+--------------+-------------+----------------+-------------+------------------------+-------------------+---------- 1 | p1 | public | ilm_table_1 | | TABLE | SELECTED FOR EXECUTION | ilmjob$_postgres1 | (1 row)
检查压缩job信息:gaussdb=# SELECT * FROM gs_my_ilmresults; task_id | job_name | job_state | start_time | completion_time | comments | statistics ---------+-------------------+------------------------+-------------------------------+-------------------------------+----------+---------------------------------------- 1 | ilmjob$_postgres1 | COMPLETED SUCCESSFULLY | 2023-08-29 17:36:38.779555+08 | 2023-08-29 17:36:38.879485+08 | | SpaceSaving=0,BoundTime=0,LastBlkNum=0 (1 row)
- 触发后台自动调度评估。
gaussdb=# DECLARE V_HOUR INT := 22; V_MINUTE INT := 0; V_SECOND INT := 0; C_ADO_WINDOW_SCHEDULE_NAME TEXT := 'ado_window_schedule'; C_ADO_WINDOW_PROGRAM_NAME TEXT := 'ado_window_program'; C_MAINTENANCE_WINDOW_JOB_NAME TEXT := 'maintenance_window_job'; V_MAINTENCE_WINDOW_REPEAT TEXT; V_MAINTENCE_WINDOW_START TIMESTAMPTZ; V_BE_SCHEDULE_ENABLE BOOL; V_MAINTENANCE_WINDOW_EXIST INT; BEGIN SELECT COUNT(*) INTO V_MAINTENANCE_WINDOW_EXIST FROM PG_CATALOG.PG_JOB WHERE JOB_NAME = 'maintenance_window_job' AND DBNAME = 'template1'; IF CURRENT_DATABASE() != 'template1' THEN RAISE EXCEPTION 'Create maintenance_window FAILED, current database is not tempalte1'; END IF; IF V_MAINTENANCE_WINDOW_EXIST = 0 AND CURRENT_DATABASE() = 'template1' THEN SELECT CASE WHEN NOW() < CURRENT_DATE + INTERVAL '22 HOUR' THEN CURRENT_DATE + INTERVAL '22 HOUR' ELSE CURRENT_DATE + INTERVAL '1 DAY 22 HOUR' END INTO V_MAINTENCE_WINDOW_START; --1. prepare for maintence window schedule SELECT 'freq=daily;interval=1;byhour='||V_HOUR||';byminute='||V_MINUTE||';bysecond='||V_SECOND INTO V_MAINTENCE_WINDOW_REPEAT; BEGIN SELECT CASE WHEN VALUE = 1 THEN TRUE -- DBE_ILM_ADMIN.ILM_ENABLED ELSE FALSE END INTO V_BE_SCHEDULE_ENABLE FROM PG_CATALOG.GS_ILM_PARAM WHERE IDX = 7; -- DBE_ILM_ADMIN.ENABLED EXCEPTION WHEN OTHERS THEN V_BE_SCHEDULE_ENABLE := FALSE; END; --2. Create ado window schedule DBE_SCHEDULER.CREATE_SCHEDULE( SCHEDULE_NAME => C_ADO_WINDOW_SCHEDULE_NAME, START_DATE => '9999-01-01 00:00:01', REPEAT_INTERVAL => NULL, END_DATE => NULL, COMMENTS => 'ado window schedule'); --3. Create ado window program DBE_SCHEDULER.CREATE_PROGRAM( PROGRAM_NAME => C_ADO_WINDOW_PROGRAM_NAME, PROGRAM_TYPE => 'plsql_block', PROGRAM_ACTION => 'call prvt_ilm.be_execute_ilm();', NUMBER_OF_ARGUMENTS => 0, ENABLED => TRUE, COMMENTS => NULL); --4. Create maintenance window master job DBE_SCHEDULER.CREATE_JOB( JOB_NAME => C_MAINTENANCE_WINDOW_JOB_NAME, START_DATE => V_MAINTENCE_WINDOW_START, REPEAT_INTERVAL => V_MAINTENCE_WINDOW_REPEAT, END_DATE => NULL, JOB_TYPE => 'STORED_PROCEDURE'::TEXT, JOB_ACTION => 'prvt_ilm.be_active_ado_window'::TEXT, NUMBER_OF_ARGUMENTS => 0, ENABLED => V_BE_SCHEDULE_ENABLE, AUTO_DROP => FALSE, COMMENTS => 'maintenance window job'); END IF; END;
自动调度提供若干参数用于调整:gaussdb=# SELECT * FROM gs_adm_ilmparameters; name | value ----------------------------------+------- EXECUTION_INTERVAL | 15 RETENTION_TIME | 30 ENABLED | 1 POLICY_TIME | 0 ABS_JOBLIMIT | 10 JOB_SIZELIMIT | 1024 WIND_DURATION | 240 BLOCK_LIMITS | 40 ENABLE_META_COMPRESSION | 0 SAMPLE_MIN | 10 SAMPLE_MAX | 10 CONST_PRIO | 40 CONST_THRESHOLD | 90 EQVALUE_PRIO | 60 EQVALUE_THRESHOLD | 80 ENABLE_DELTA_ENCODE_SWITCH | 1 LZ4_COMPRESSION_LEVEL | 0 ENABLE_LZ4_PARTIAL_DECOMPRESSION | 1 (18 rows)
- EXECUTION_INTERVAL:自动调度任务执行间隔,默认每15分钟执行一次。
- RETENTION_TIME:历史压缩任务记录清理间隔,默认每30天清理一次。
- ENABLED:当前自动调度启用情况,默认为开启。
- POLICY_TIME:策略评估的时间单位,测试使用。默认以天为单位。
- ABS_JOBLIMIT:单次评估生成压缩任务数量上限,默认为10个。
- JOB_SIZELIMIT:单个压缩任务的IO上限,默认为1GB。
- WIND_DURATION:单次维护窗口的持续时间。
- BLOCK_LIMITS:控制实例级的行存压缩速率上限,默认是40,取值范围是0到10000(0表示不限制)。单位是block/ms,表示每毫秒最多压缩多少个block。速率上限计算方法:BLOCK_LIMITS*1000*BLOCKSIZE,以默认值40为例,其速率上限为:40*1000*8KB=320000KB/s。
- ENABLE_META_COMPRESSION:是否开启header压缩,默认为0,取值范围为0(关闭)和1(开启)。
设置此参数为1时,对于单行数据较短的表,压缩率会有一定提升,但是访问压缩行的性能会有较大幅度的下降。若数据库多是单行数据较长的表,不建议开启此参数。
- SAMPLE_MIN:常量编码和等值编码采样步长最小值,默认为10,取值范围[1, 100],支持小数输入,小数会自动向下取整。
- SAMPLE_MAX:常量编码和等值编码采样步长最大值,默认为10,取值范围[1, 100],支持小数输入,小数会自动向下取整。
- CONST_PRIO:常量编码优先级,默认为40,取值范围[0, 100],100表示关闭常量编码,支持小数输入,小数会自动向下取整。
- CONST_THRESHOLD:常量编码阈值,默认为90,取值范围[1, 100],表示一列常量值的占比超过该阈值时进行常量编码,支持小数输入,小数会自动向下取整。
- EQVALUE_PRIO:等值编码优先级,默认为60,取值范围[0, 100],100表示关闭等值编码,支持小数输入,小数会自动向下取整。
- EQVALUE_THRESHOLD:等值编码阈值,默认为80,取值范围[1, 100],表示两列数据的等值比例超过该阈值时进行等值编码,支持小数输入,小数会自动向下取整。
- ENABLE_DELTA_ENCODE_SWITCH:差值编码开关,默认为1,支持小数输入,0表示关闭,1表示开启,小数会自动向下取整。
- LZ4_COMPRESSION_LEVEL:lz4压缩等级,默认为0,取值范围[0, 16],支持小数输入,小数会自动向下取整。
- ENABLE_LZ4_PARTIAL_DECOMPRESSION:部分解压开关,默认为1,支持小数输入,0表示关闭,1表示开启,小数会自动向下取整。
以上参数均可通过DBE_ILM_ADMIN.CUSTOMIZE_ILM()接口调整。
维护窗口默认为每天22:00(北京时间)开启,可通过DBE_SCHEDULER提供的接口SET_ATTRIBUTE进行设置:\c template1 CALL DBE_ILM_ADMIN.DISABLE_ILM(); CALL DBE_ILM_ADMIN.ENABLE_ILM(); DECLARE newtime timestamptz := CLOCK_TIMESTAMP() + to_interval('2 seconds'); BEGIN DBE_SCHEDULER.set_attribute( name => 'maintenance_window_job', attribute => 'start_date', value => TO_CHAR(newtime, 'YYYY-MM-DD HH24:MI:SS') ); END; /
- 手动执行压缩评估。