Help Center/ GaussDB(DWS)/ Best Practices/ Data Development/ Encrypting and Decrypting Data Columns
Updated on 2024-10-29 GMT+08:00

Encrypting and Decrypting Data Columns

Data encryption is widely used in various information systems as a technology to effectively prevent unauthorized access and prevent data leakage. As the core of the information system, the GaussDB(DWS) data warehouse also provides data encryption functions, including transparent encryption and encryption using SQL functions. This section describes SQL function encryption.

Currently, GaussDB(DWS) does not support decrypting data encrypted in Oracle, Teradata, and MySQL databases. The encryption and decryption of Oracle, Teradata, and MySQL databases are different from those of GaussDB(DWS). GaussDB(DWS) can only decrypt unencrypted data migrated from Oracle, Teradata, and MySQL databases.

Background

  • Hash Functions

    The hash function is also called the digest algorithm. It maps input data of an arbitrary length to an output of fixed length. For example, Hash(data)=result. This process is irreversible. That is, the hash function does not have an inverse function, and data cannot be obtained from the result. In scenarios where plaintext passwords should not be stored (passwords are sensitive) or known by system administrators, hash algorithms should be used to store one-way hash values of passwords.

    In actual use, salt values and iteration are added to prevent same hash values generated by same passwords, hence to prevent rainbow table attacks.

  • Symmetric Encryption Algorithms

    Symmetric encryption algorithms use the same key to encrypt and decrypt data. There are two subcategories of symmetric encryption algorithms: block ciphers and stream ciphers.

    Block ciphers break the plaintext into fixed-length groups of bits known as blocks and Each block then gets encrypted as a unit. And if there's not enough data to completely fill a block, "padding" is then used to ensure that the blocks meet the fixed-length requirements. Due to padding, the length of the ciphertext obtained by block ciphers is greater than that of the plaintext.

    In stream ciphers, encryption and decryption parties use same pseudo-random encrypted data stream as keys, and plaintext data is sequentially encrypted by these keys. In practice, data is encrypted one bit at a time using an XOR operation. Stream cyphers do not need to be padded. Therefore the length of the obtained ciphertext is same as the length of the plaintext.

    Figure 1 Symmetric encryption algorithms

Technical Details

GaussDB(DWS) provides hash functions and symmetric cryptographic algorithms to encrypt and decrypt data columns. Hash functions support sha256, sha384, sha512, and SM3. Symmetric cryptographic algorithms support AES128, AES192, AES256, and SM4.

  • Hash Functions
    • md5(string)

      Use MD5 to encrypt string and return a hexadecimal value. MD5 is insecure and is not recommended.

    • gs_hash(hashstr, hashmethod)

      Obtains the digest string of a hashstr string based on the algorithm specified by hashmethod. hashmethod can be sha256, sha384, sha512, or sm3.

  • Symmetric Encryption Algorithms
    • gs_encrypt(encryptstr, keystr, cryptotype, cryptomode, hashmethod)

      Encrypts an encryptstr string using the keystr key based on the encryption algorithm specified by cryptotype and cryptomode and the HMAC algorithm specified by hashmethod, and returns the encrypted string.

    • gs_decrypt(decryptstr, keystr, cryptotype, cryptomode, hashmethod)

      Decrypts a decryptstr string using the keystr key based on the encryption algorithm specified by cryptotype and cryptomode and the HMAC algorithm specified by hashmethod, and returns the decrypted string. The keystr used for decryption must be consistent with that used for encryption.

    • gs_encrypt_aes128(encryptstr, keystr)

      Encrypts encryptstr strings using keystr as the key and returns encrypted strings. The length of keystr ranges from 1 to 16 bytes.

    • gs_decrypt_aes128(decryptstr, keystr)

      Decrypts a decryptstr string using the keystr key and returns the decrypted string. The keystr used for decryption must be consistent with that used for encryption. keystr cannot be empty.

    For more information about functions, see Using Functions for Encryption and Decryption.

Examples

  1. Connect to the database.

    For details, see Using the CLI to Connect to a GaussDB(DWS) Cluster.

  2. Create the table student with the columns id, name, and score. Then use hash functions to encrypt and save names, and use symmetric cryptographic algorithms to save scores.

    1
    2
    3
    4
    5
    CREATE TABLE student (id int, name text, score text, subject text);
    
    INSERT INTO student VALUES (1, gs_hash('alice', 'sha256'), gs_encrypt('95', '12345', 'aes128', 'cbc', 'sha256'),gs_encrypt_aes128('math', '1234'));
    INSERT INTO student VALUES (2, gs_hash('bob', 'sha256'), gs_encrypt('92', '12345', 'aes128', 'cbc', 'sha256'),gs_encrypt_aes128('english', '1234'));
    INSERT INTO student VALUES (3, gs_hash('peter', 'sha256'), gs_encrypt('98', '12345', 'aes128', 'cbc', 'sha256'),gs_encrypt_aes128('science', '1234'));
    

  3. Query the table student without using keys. The query result shows that the encrypted data in the name and score columns cannot be viewed even if you have the SELECT permission.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    SELECT * FROM student;
     id |                               name                               |                                                          score                                                           |
                                    subject
    ----+------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------+-----------
    -----------------------------------------------------------------------------------
      1 | 2bd806c97f0e00af1a1fc3328fa763a9269723c8db8fac4f93af71db186d6e90 | AAAAAAAAAABAuUC3VQ+MvPCDAaTUySl1e2gGLr4/ATdCUjTEvova3cb/Ba3ZKqIn1yNVGEFBvJnTq/3sLF4//Gm8qG7AyfNbbqdW3aYErLVpbE/QWFX9Ig== | aFEWQR2gkj
    iu6sfsAad+dHzfFDHePZ6xd44zyekh+qVFlh9FODZ0DoaFAJXctwUsiqaiitTxW8cCSEaNjS/E7Ke1ruY=
      2 | 81b637d8fcd2c6da6359e6963113a1170de795e4b725b84d1e0b4cfd9ec58ce9 | AAAAAAAAAABAuUC3VQ+MvPCDAaTUySl1taXxAoDqE793hgyCJvC0ESdAX5Mtgdq2LXI1f5ZxraQ73WIJVtIBX8oe3gTDxoXGlHbHht4kzM4U8dOwr5rjgg== | aFEWQR2gkj
    iu6sfsAad+dM8tPTDo/Pds6ZmqdmjGiKxf39+Wzx5NoQ6c8FrzihnRzgc0fycWSu5YGWNOKYWhRsE84Ac=
      3 | 026ad9b14a7453b7488daa0c6acbc258b1506f52c441c7c465474c1a564394ff | AAAAAAAAAACnyusORPeApqMUgh56ucQu3uso/Llw5MbPFMkOXuspEzhhnc9vErwOFe6cuGtx8muEyHCX7V5yXs+8FxhNh3n5L3419LDWJJLY2O4merHpSg== | zomphRfHV4
    H32hTtgkio1PyrobVO8N+hN7kAKwtygKP2E7Aaf1vsjmtLHcL88jyeJNe1lxe0fAvodzPJAxAuV3UJN4M=
    (3 rows)
    

  4. Query the table student using keys. The query result shows that the data is decrypted by the function gs_decrypt (corresponding to gs_encrypt) and can be viewed.

    1
    2
    3
    4
    5
    6
    7
    SELECT id, gs_decrypt(score, '12345', 'aes128', 'cbc', 'sha256'),gs_decrypt_aes128(subject, '1234') FROM student;
     id | gs_decrypt | gs_decrypt_aes128
    ----+------------+-------------------
      1 | 95         | math
      2 | 92         | english
      3 | 98         | science
    (3 rows)