更新时间:2024-11-12 GMT+08:00
分享

使用说明

使用高级压缩的功能,用户必须申请License才能使用。具体情况请联系华为工程师。

  1. 执行如下命令开启压缩功能:

    gaussdb=# ALTER DATABASE SET ilm = on;

    检查当前数据库的public schema中是否存在gsilmpolicy_seq和gsilmtask_seq。

    gaussdb=# \d
                                                List of relations
     Schema |      Name       |   Type   |   Owner    |                       Storage                        
    --------+-----------------+----------+------------+------------------------------------------------------
     public | gsilmpolicy_seq | sequence | omm |
     | 
     public | gsilmtask_seq   | sequence | omm |
     | 

    或者:

    gaussdb=# SELECT a.oid, a.relname from pg_class a inner join pg_namespace b on a.relnamespace = b.oid
        WHERE (a.relname = 'gsilmpolicy_seq' OR a.relname = 'gsilmtask_seq') AND b.nspname = 'public';
    
      oid  |     relname     
    -------+-----------------
     17002 | gsilmpolicy_seq
     17004 | gsilmtask_seq
    (2 rows)

    生成异常会报warning:

    WARNING:  ILM sequences are already existed while initializing

  2. 为表添加压缩策略。

    • 新建带策略的表:
      gaussdb=# CREATE TABLE ilm_table_1 (col1 int, col2 text)
          ilm add policy row store compress advanced row
          after 3 days of no modification on (col1 < 1000);
    • 为存量表添加策略:
      gaussdb=# CREATE TABLE ilm_table_2 (col1 int, col2 text);
      gaussdb=# ALTER TABLE ilm_table_2 ilm add policy row store
          compress advanced row after 3 days of no modification;
    • 检查策略视图中是否新增数据:
      gaussdb=# SELECT * FROM gs_my_ilmpolicies;
      
       policy_name |  policy_type  | tablespace | enabled | deleted 
      -------------+---------------+------------+---------+---------
       p1          | DATA MOVEMENT |            | YES     | NO
       p2          | DATA MOVEMENT |            | YES     | NO
      (2 rows)
    • 检查策略详细信息视图中是否新增了符合刚刚设置的策略:
      gaussdb=# SELECT * FROM gs_my_ilmdatamovementpolicies;
      
       policy_name | action_type | scope | compression_level | tier_tablespace | tier_status |     condition_type     | condition_days | custom_function | policy_subtype | action_clause | tier_to 
      -------------+-------------+-------+-------------------+-----------------+-------------+------------------------+----------------+-----------------+----------------+---------------+---------
       p1          | COMPRESSION | ROW   | ADVANCED          |                 |             | LAST MODIFICATION TIME |              3 |                 |                |               | 
       p2          | COMPRESSION | ROW   | ADVANCED          |                 |             | LAST MODIFICATION TIME |              3 |                 |                |               | 
      (2 rows)
    • 检查策略与目标表是否对应:
      gaussdb=# SELECT * FROM gs_my_ilmobjects;
       policy_name | object_owner | object_name | subobject_name | object_type |    inherited_from    | tbs_inherited_from | enabled | deleted 
      -------------+--------------+-------------+----------------+-------------+----------------------+--------------------+---------+---------
       p1          | public       | ilm_table_1 |                | TABLE       | POLICY NOT INHERITED |                    | YES     | NO
       p2          | public       | ilm_table_2 |                | TABLE       | POLICY NOT INHERITED |                    | YES     | NO
      (2 rows)
      

  3. 执行压缩评估。

    • 手动执行压缩评估。

      为方便测试,本功能环境参数中提供POLICY_TIME属性,决定时间条件以天为单位还是以秒为单位。通过下面语句调整:

      gaussdb=# CALL DBE_ILM_ADMIN.CUSTOMIZE_ILM(11, 1);
      插入随机数据用于测试:
      gaussdb=# INSERT INTO ilm_table_1 select *, 'test_data' FROM generate_series(1, 10000);
      
      gaussdb=# DECLARE
          v_taskid number;
      gaussdb=# BEGIN
          DBE_ILM.EXECUTE_ILM(OWNER           => 'public',
                              OBJECT_NAME     => 'ilm_table_1',
                              TASK_ID         => v_taskid,
                              SUBOBJECT_NAME  => NULL,
                              POLICY_NAME     => 'ALL POLICIES',
                              EXECUTION_MODE  => 2);
          RAISE INFO 'Task ID is:%', v_taskid;
      gaussdb=# END;
      /
      如入参有误,会报对应的错误信息。无误则无输出(上述代码段添加了RAISE INFO语句打印当前task的id)。
      INFO:  Task ID is:1
      检查task信息:
      gaussdb=# SELECT * FROM gs_my_ilmtasks;
      
      task_id | task_owner |   state   |         creation_time         |          start_time           |        completion_time        
      ---------+------------+-----------+-------------------------------+-------------------------------+-------------------------------
             1 | omm | COMPLETED | 2023-08-29 17:36:38.779555+08 | 2023-08-29 17:36:38.779555+08 | 2023-08-29 17:36:38.879485+08
      (1 row)
      检查评估结果:
      gaussdb=# SELECT * FROM gs_my_ilmevaluationdetails;
      
       task_id | policy_name | object_owner | object_name | subobject_name | object_type | selected_for_execution |      job_name     | comments 
      ---------+-------------+--------------+-------------+----------------+-------------+------------------------+-------------------+----------
             1 | p1          | public       | ilm_table_1 |                | TABLE       | SELECTED FOR EXECUTION | ilmjob$_postgres1 | 
      (1 row)
      检查压缩job信息:
      gaussdb=# SELECT * FROM gs_my_ilmresults;
      
       task_id |     job_name      |       job_state        |          start_time           |        completion_time        | comments |               statistics               
      ---------+-------------------+------------------------+-------------------------------+-------------------------------+----------+----------------------------------------
             1 | ilmjob$_postgres1 | COMPLETED SUCCESSFULLY | 2023-08-29 17:36:38.779555+08 | 2023-08-29 17:36:38.879485+08 |          | SpaceSaving=0,BoundTime=0,LastBlkNum=0
      (1 row)
    • 触发后台自动调度评估。

      使用初始用户登录template1数据库,创建维护窗口:

      gaussdb=# 
      DECLARE
          V_HOUR INT := 22;
          V_MINUTE INT := 0;
          V_SECOND INT := 0;
          C_ADO_WINDOW_SCHEDULE_NAME TEXT := 'ado_window_schedule';
          C_ADO_WINDOW_PROGRAM_NAME TEXT := 'ado_window_program';
          C_MAINTENANCE_WINDOW_JOB_NAME TEXT := 'maintenance_window_job';
          V_MAINTENCE_WINDOW_REPEAT TEXT;
          V_MAINTENCE_WINDOW_START TIMESTAMPTZ;
          V_BE_SCHEDULE_ENABLE BOOL;
          V_MAINTENANCE_WINDOW_EXIST INT;
      BEGIN
          SELECT COUNT(*) INTO V_MAINTENANCE_WINDOW_EXIST FROM PG_CATALOG.PG_JOB WHERE JOB_NAME = 'maintenance_window_job' AND DBNAME = 'template1';
          IF CURRENT_DATABASE() != 'template1' THEN
              RAISE EXCEPTION 'Create maintenance_window FAILED, current database is not tempalte1';
          END IF;
          IF V_MAINTENANCE_WINDOW_EXIST = 0 AND CURRENT_DATABASE() = 'template1' THEN
              SELECT
                  CASE
                      WHEN NOW() < CURRENT_DATE + INTERVAL '22 HOUR' THEN CURRENT_DATE + INTERVAL '22 HOUR'
                      ELSE CURRENT_DATE + INTERVAL '1 DAY 22 HOUR'
                  END INTO V_MAINTENCE_WINDOW_START;
              --1. prepare for maintence window schedule
              SELECT 'freq=daily;interval=1;byhour='||V_HOUR||';byminute='||V_MINUTE||';bysecond='||V_SECOND INTO V_MAINTENCE_WINDOW_REPEAT;
              BEGIN
                  SELECT
                      CASE
                          WHEN VALUE = 1 THEN TRUE -- DBE_ILM_ADMIN.ILM_ENABLED
                          ELSE FALSE
                      END INTO V_BE_SCHEDULE_ENABLE
                      FROM PG_CATALOG.GS_ILM_PARAM WHERE IDX = 7; -- DBE_ILM_ADMIN.ENABLED
              EXCEPTION
                  WHEN OTHERS THEN
                      V_BE_SCHEDULE_ENABLE := FALSE;
              END;
              --2. Create ado window schedule
              DBE_SCHEDULER.CREATE_SCHEDULE(
                  SCHEDULE_NAME => C_ADO_WINDOW_SCHEDULE_NAME,
                  START_DATE => '9999-01-01 00:00:01',
                  REPEAT_INTERVAL => NULL,
                  END_DATE => NULL,
                  COMMENTS => 'ado window schedule');
              --3. Create ado window program
              DBE_SCHEDULER.CREATE_PROGRAM(
                  PROGRAM_NAME => C_ADO_WINDOW_PROGRAM_NAME,
                  PROGRAM_TYPE => 'plsql_block',
                  PROGRAM_ACTION => 'call prvt_ilm.be_execute_ilm();',
                  NUMBER_OF_ARGUMENTS => 0,
                  ENABLED => TRUE,
                  COMMENTS => NULL);
              --4. Create maintenance window master job
              DBE_SCHEDULER.CREATE_JOB(
                  JOB_NAME => C_MAINTENANCE_WINDOW_JOB_NAME,
                  START_DATE => V_MAINTENCE_WINDOW_START,
                  REPEAT_INTERVAL => V_MAINTENCE_WINDOW_REPEAT,
                  END_DATE => NULL,
                  JOB_TYPE => 'STORED_PROCEDURE'::TEXT,
                  JOB_ACTION => 'prvt_ilm.be_active_ado_window'::TEXT,
                  NUMBER_OF_ARGUMENTS => 0,
                  ENABLED => V_BE_SCHEDULE_ENABLE,
                  AUTO_DROP => FALSE,
                  COMMENTS => 'maintenance window job');
          END IF;
      END;
      自动调度提供若干参数用于调整:
      gaussdb=# SELECT * FROM gs_adm_ilmparameters;
                     name               | value 
      ----------------------------------+-------
       EXECUTION_INTERVAL               |    15
       RETENTION_TIME                   |    30
       ENABLED                          |     1
       POLICY_TIME                      |     0
       ABS_JOBLIMIT                     |    10
       JOB_SIZELIMIT                    |  1024
       WIND_DURATION                    |   240
       BLOCK_LIMITS                     |    40
       ENABLE_META_COMPRESSION          |     0
       SAMPLE_MIN                       |    10
       SAMPLE_MAX                       |    10
       CONST_PRIO                       |    40
       CONST_THRESHOLD                  |    90
       EQVALUE_PRIO                     |    60
       EQVALUE_THRESHOLD                |    80
       ENABLE_DELTA_ENCODE_SWITCH       |     1
       LZ4_COMPRESSION_LEVEL            |     0
       ENABLE_LZ4_PARTIAL_DECOMPRESSION |     1
      (18 rows)
      • EXECUTION_INTERVAL:自动调度任务执行间隔,默认每15分钟执行一次。
      • RETENTION_TIME:历史压缩任务记录清理间隔,默认每30天清理一次。
      • ENABLED:当前自动调度启用情况,默认为开启。
      • POLICY_TIME:策略评估的时间单位,测试使用。默认以天为单位。
      • ABS_JOBLIMIT:单次评估生成压缩任务数量上限,默认为10个。
      • JOB_SIZELIMIT:单个压缩任务的IO上限,默认为1GB。
      • WIND_DURATION:单次维护窗口的持续时间。
      • BLOCK_LIMITS:控制实例级的行存压缩速率上限,默认是40,取值范围是0到10000(0表示不限制)。单位是block/ms,表示每毫秒最多压缩多少个block。速率上限计算方法:BLOCK_LIMITS*1000*BLOCKSIZE,以默认值40为例,其速率上限为:40*1000*8KB=320000KB/s。
      • ENABLE_META_COMPRESSION:是否开启header压缩,默认为0,取值范围为0(关闭)和1(开启)。

        设置此参数为1时,对于单行数据较短的表,压缩率会有一定提升,但是访问压缩行的性能会有较大幅度的下降。若数据库多是单行数据较长的表,不建议开启此参数。

      • SAMPLE_MIN:常量编码和等值编码采样步长最小值,默认为10,取值范围[1, 100],支持小数输入,小数会自动向下取整。
      • SAMPLE_MAX:常量编码和等值编码采样步长最大值,默认为10,取值范围[1, 100],支持小数输入,小数会自动向下取整。
      • CONST_PRIO:常量编码优先级,默认为40,取值范围[0, 100],100表示关闭常量编码,支持小数输入,小数会自动向下取整。
      • CONST_THRESHOLD:常量编码阈值,默认为90,取值范围[1, 100],表示一列常量值的占比超过该阈值时进行常量编码,支持小数输入,小数会自动向下取整。
      • EQVALUE_PRIO:等值编码优先级,默认为60,取值范围[0, 100],100表示关闭等值编码,支持小数输入,小数会自动向下取整。
      • EQVALUE_THRESHOLD:等值编码阈值,默认为80,取值范围[1, 100],表示两列数据的等值比例超过该阈值时进行等值编码,支持小数输入,小数会自动向下取整。
      • ENABLE_DELTA_ENCODE_SWITCH:差值编码开关,默认为1,支持小数输入,0表示关闭,1表示开启,小数会自动向下取整。
      • LZ4_COMPRESSION_LEVEL:lz4压缩等级,默认为0,取值范围[0, 16],支持小数输入,小数会自动向下取整。
      • ENABLE_LZ4_PARTIAL_DECOMPRESSION:部分解压开关,默认为1,支持小数输入,0表示关闭,1表示开启,小数会自动向下取整。

      以上参数均可通过DBE_ILM_ADMIN.CUSTOMIZE_ILM()接口调整。

      维护窗口默认为每天22:00(北京时间)开启,可通过DBE_SCHEDULER提供的接口SET_ATTRIBUTE进行设置:
      \c template1
       CALL DBE_ILM_ADMIN.DISABLE_ILM();
       CALL DBE_ILM_ADMIN.ENABLE_ILM();
      DECLARE
          newtime timestamptz := CLOCK_TIMESTAMP() + to_interval('2 seconds');
      BEGIN
          DBE_SCHEDULER.set_attribute(
              name          =>          'maintenance_window_job',
              attribute     =>          'start_date',
              value         =>          TO_CHAR(newtime, 'YYYY-MM-DD HH24:MI:SS')
          );
      END;
      /

相关文档