Updated on 2023-02-10 GMT+08:00

Data Redaction

GaussDB(DWS) provides the column-level dynamic data masking (DDM) function. For sensitive data (such as the ID card number, mobile number, and bank card number), the DDM function is used to redact the original data to protect data security and user privacy.

  • You can create a redaction policy for a specified table and set the effective scope of the policy.

    For details, see section CREATE REDACTION POLICY.

  • You can modify redaction policies using the provided syntax, including modifying expressions for the policy to take effect, renaming the policy, and adding, modifying, and deleting columns for data redaction.

    For details, see section ALTER REDACTION POLICY.

  • You can delete a redaction policy by deleting data redaction function information of the policy from all columns of a table.

    For details, see section DROP REDACTION POLICY.

  • You can use the built-in masking functions MASK_NONE, MASK_FULL, and MASK_PARTIAL, or create your own masking functions by using the C language or PL/pgSQL.

    For details, see section Data Redaction Functions.

  • Redaction policy information is stored in the system catalog PG_REDACTION_POLICY, and redacted column information is stored in the system catalog PG_REDACTION_COLUMN.
  • You can view information about the redaction policy and redacted columns in the system views REDACTION_POLICIES and REDACTION_COLUMNS.
  • Generally, you can run the SELECT statement to view the data redaction result. If a statement has the following features, sensitive data may be deliberately obtained. In this case, an error will be reported during statement execution.
    • The GROUP BY clause references the Target Entry containing redaction columns as the target column.
    • DISTINCT works on the output redaction columns.
    • The statement contains CTE.
    • Operations on sets are involved.
    • The target columns of a subquery are not redaction columns of the base table, but the expressions or function calls for redaction columns of the base table.
  • You can use COPY TO or GDS to export the redacted data. Due to the irreversibility of the data redaction, secondary redaction of the data is meaningless.
  • Do not set target columns of UPDATE, MERGE INTO, and DELETE statements to redaction columns.
  • The UPSERT statement allows you to insert update data through EXCLUDED. If data in the base table is updated by referencing redaction columns, the data may be modified by mistake. As a result, an error will be reported during the execution.

Examples

The following uses the employee table emp, administrator alice, and common users matu and july as examples to describe the data redaction process. The user alice is the owner of the emp table. The emp table contains private data such as the employee name, mobile number, email address, bank card number, and salary.

  1. Create users alice, matu, and july:
    1
    2
    3
    CREATE ROLE alice PASSWORD 'password';
    CREATE ROLE matu PASSWORD 'password';
    CREATE ROLE july PASSWORD 'password';
    
  2. Create the emp table as user alice, and insert three employee records into the table.
    1
    2
    3
    4
    5
    CREATE TABLE emp(id int, name varchar(20), phone_no varchar(11), card_no number, card_string varchar(19), email text, salary numeric(100, 4), birthday date);
    
    INSERT INTO emp VALUES(1, 'anny', '13420002340', 1234123412341234, '1234-1234-1234-1234', 'smithWu@163.com', 10000.00, '1999-10-02');
    INSERT INTO emp VALUES(2, 'bob', '18299023211', 3456345634563456, '3456-3456-3456-3456', '66allen_mm@qq.com', 9999.99, '1989-12-12');
    INSERT INTO emp VALUES(3, 'cici', '15512231233', NULL, NULL, 'jonesishere@sina.com', NULL, '1992-11-06');
    
  3. User alice grants the emp table read permission to users matu and july.
    1
    GRANT SELECT ON emp TO matu, july;
    
  4. Only user alice can view all employee information. Users matu and july cannot view bank card numbers and salary data of the employees. Create a redaction policy for the emp table and bind the redaction function to card_no, card_string, and salary columns, respectively.
    1
    2
    3
    4
    CREATE REDACTION POLICY mask_emp ON emp WHEN (current_user IN ('matu', 'july'))
     ADD COLUMN card_no WITH mask_full(card_no),
     ADD COLUMN card_string WITH mask_partial(card_string, 'VVVVFVVVVFVVVVFVVVV','VVVV-VVVV-VVVV-VVVV','#',1,12),
     ADD COLUMN salary WITH mask_partial(salary, '9', 1, length(salary) - 2);
    
  5. Switch to users matu and july and view the emp table, respectively.
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    SET ROLE matu PASSWORD 'password';
    SELECT * FROM emp;
     id | name |  phone_no   | card_no |     card_string     |        email         |   salary   |      birthday       
    ----+------+-------------+---------+---------------------+----------------------+------------+---------------------
      1 | anny | 13420002340 |       0 | ####-####-####-1234 | smithWu@163.com      | 99999.9990 | 1999-10-02 00:00:00
      2 | bob  | 18299023211 |       0 | ####-####-####-3456 | 66allen_mm@qq.com    |  9999.9990 | 1989-12-12 00:00:00
      3 | cici | 15512231233 |         |                     | jonesishere@sina.com |            | 1992-11-06 00:00:00
    (3 rows)
    
    SET ROLE july PASSWORD 'password';
    SELECT * FROM emp;
     id | name |  phone_no   | card_no |     card_string     |        email         |   salary   |      birthday       
    ----+------+-------------+---------+---------------------+----------------------+------------+---------------------
      1 | anny | 13420002340 |       0 | ####-####-####-1234 | smithWu@163.com      | 99999.9990 | 1999-10-02 00:00:00
      2 | bob  | 18299023211 |       0 | ####-####-####-3456 | 66allen_mm@qq.com    |  9999.9990 | 1989-12-12 00:00:00
      3 | cici | 15512231233 |         |                     | jonesishere@sina.com |            | 1992-11-06 00:00:00
    (3 rows)
    
  6. User matu also has the permission to view all employee information. Only user july is not allowed to view the information. Modify the effective scope of the redaction policy.
    1
    ALTER REDACTION POLICY mask_emp ON emp WHEN(current_user = 'july');
    
  7. Switch to users matu and july and view the emp table again, respectively.
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    SET ROLE matu PASSWORD 'password';
    SELECT * FROM emp;
     id | name |  phone_no   |     card_no      |     card_string     |        email         |   salary   |      birthday       
    ----+------+-------------+------------------+---------------------+----------------------+------------+---------------------
      1 | anny | 13420002340 | 1234123412341234 | 1234-1234-1234-1234 | smithWu@163.com      | 10000.0000 | 1999-10-02 00:00:00
      2 | bob  | 18299023211 | 3456345634563456 | 3456-3456-3456-3456 | 66allen_mm@qq.com    |  9999.9900 | 1989-12-12 00:00:00
      3 | cici | 15512231233 |                  |                     | jonesishere@sina.com |            | 1992-11-06 00:00:00
    (3 rows)
    
    SET ROLE july PASSWORD 'password';
    SELECT * FROM emp;
     id | name |  phone_no   | card_no |     card_string     |        email         |   salary   |      birthday       
    ----+------+-------------+---------+---------------------+----------------------+------------+---------------------
      1 | anny | 13420002340 |       0 | ####-####-####-1234 | smithWu@163.com      | 99999.9990 | 1999-10-02 00:00:00
      2 | bob  | 18299023211 |       0 | ####-####-####-3456 | 66allen_mm@qq.com    |  9999.9990 | 1989-12-12 00:00:00
      3 | cici | 15512231233 |         |                     | jonesishere@sina.com |            | 1992-11-06 00:00:00
    (3 rows)
    
  8. The information in the phone_no, email, and birthday columns is private data. Update redaction policy mask_emp and add three redaction columns.
    1
    2
    3
    ALTER REDACTION POLICY mask_emp ON emp ADD COLUMN phone_no WITH mask_partial(phone_no, '*', 4);
    ALTER REDACTION POLICY mask_emp ON emp ADD COLUMN email WITH mask_partial(email, '*', 1, position('@' in email));
    ALTER REDACTION POLICY mask_emp ON emp ADD COLUMN birthday WITH mask_full(birthday);
    
  9. Switch to user july and view emp table data.
    1
    2
    3
    4
    5
    6
    7
    8
    SET ROLE july PASSWORD 'password';
    SELECT * FROM emp;
     id | name |  phone_no   | card_no |     card_string     |        email         |   salary   |      birthday       
    ----+------+-------------+---------+---------------------+----------------------+------------+---------------------
      1 | anny | 134******** |       0 | ####-####-####-1234 | ********163.com      | 99999.9990 | 1970-01-01 00:00:00
      2 | bob  | 182******** |       0 | ####-####-####-3456 | ***********qq.com    |  9999.9990 | 1970-01-01 00:00:00
      3 | cici | 155******** |         |                     | ************sina.com |            | 1970-01-01 00:00:00
    (3 rows)
    
  10. Query redaction_policies and redaction_columns to view details about the current redaction policy mask_emp.
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    SELECT * FROM redaction_policies;
     object_schema | object_owner | object_name | policy_name |            expression             | enable | policy_description | inherited 
    ---------------+--------------+-------------+-------------+-----------------------------------+--------+--------------------+-----------
     public        | alice        | emp         | mask_emp    | ("current_user"() = 'july'::name) | t      |                    | f       
    (1 row)
    
    SELECT object_name, column_name, function_info FROM redaction_columns;
     object_name | column_name |                                             function_info                                             
    -------------+-------------+-------------------------------------------------------------------------------------------------------
     emp         | card_no     | mask_full(card_no)
     emp         | card_string | mask_partial(card_string, 'VVVVFVVVVFVVVVFVVVV'::text, 'VVVV-VVVV-VVVV-VVVV'::text, '#'::text, 1, 12)
     emp         | email       | mask_partial(email, '*'::text, 1, "position"(email, '@'::text))
     emp         | salary      | mask_partial(salary, '9'::text, 1, (length((salary)::text) - 2))
     emp         | birthday    | mask_full(birthday)
     emp         | phone_no    | mask_partial(phone_no, '*'::text, 4)
    (6 rows)
    
  11. Add the salary_info column. To replace the salary information in text format with *.*, you can create a user-defined redaction function. In this step, you can use the PL/pgSQL to define the redaction function mask_regexp_salary. To create a redaction column, you simply need to customize the function name and parameter list. For details, see User-Defined Functions.
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    ALTER TABLE emp ADD COLUMN salary_info TEXT;
    UPDATE emp SET salary_info = salary::text;
    
    CREATE FUNCTION mask_regexp_salary(salary_info text) RETURNS text AS
    $$
     SELECT regexp_replace($1, '[0-9]+','*','g');
    $$
    LANGUAGE SQL
    STRICT SHIPPABLE;
    
    ALTER REDACTION POLICY mask_emp ON emp ADD COLUMN salary_info WITH mask_regexp_salary(salary_info);
    
    SET ROLE july PASSWORD 'password';
    SELECT id, name, salary_info FROM emp;
     id | name | salary_info 
    ----+------+-------------
      1 | anny | *.*
      2 | bob  | *.*
      3 | cici | 
    (3 rows)
    
  12. If there is no need to set a redaction policy for the emp table, delete redaction policy mask_emp.
    1
    DROP REDACTION POLICY mask_emp ON emp;