
Configuring Transparent Encryption for Hive

Scenario

HDFS supports transparent encryption. After an encrypted partition is configured, encryption and decryption are performed by the HDFS client whenever data is written to or read from that partition, so the process is transparent to upper-layer applications. To protect Hive service data stored on HDFS, configure the Hive root directory on HDFS as an encrypted partition. Transparent encryption is supported by default.
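Once the Hive root directory (for example, /hive, as configured in the procedure below) is an encrypted partition, the transparency can be observed directly: a file written through the HDFS client reads back as plaintext, while the raw on-disk bytes remain encrypted. A minimal sketch, assuming a hypothetical local file data.txt and HDFS superuser access for the raw path:

    hdfs dfs -put data.txt /hive/data.txt
    hdfs dfs -cat /hive/data.txt                   # plaintext: the client decrypts on read
    hdfs dfs -cat /.reserved/raw/hive/data.txt     # ciphertext: raw bytes, superuser only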

Prerequisites

  • The KMS, HDFS, and Hive services have been installed in the cluster and are running properly.
  • The HDFS service has been interconnected with KMS. For details, see Interconnecting HDFS with KMS.
  • The key used for encryption has been created. For details, see Key Management. A sample key-creation command is shown after this list.
  • The cluster client has been installed in a directory, for example, /opt/client.
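For reference, a key can also be created from the cluster client with the standard Hadoop key shell, which talks to KMS. A minimal sketch, assuming the key name key1 (use whatever name you created in Key Management):

    hadoop key create key1    # create a key named key1 in KMS
    hadoop key list           # key1 should appear in the list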

Procedure

  1. Log in to the cluster client and go to the client installation directory.

    For example, run the following command:

    cd /opt/client

  2. Import the client environment variables and run the kinit command to authenticate as user hdfs.

    source bigdata_env

    kinit hdfs
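    If the command returns without an error, authentication succeeded. Optionally, confirm the ticket with the standard Kerberos tool (assuming a security-mode cluster with Kerberos tools on the PATH):

    klist    # the ticket cache should show a principal for user hdfs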

  3. Create a new Hive root directory and use the created key to set it as an encrypted partition.

    hdfs dfs -mkdir Hive root directory

    hdfs dfs -chown Hive username:hadoop Hive root directory

    hdfs crypto -createZone -keyName key_name -path Hive root directory

    hdfs crypto -listZones    # Check the encrypted partition.

    For example, run the following commands:

    hdfs dfs -mkdir /hive

    hdfs dfs -chown hiveuser:hadoop /hive

    hdfs crypto -createZone -keyName key1 -path /hive

    hdfs crypto -listZones
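    If the encrypted partition was created successfully, the last command lists it together with its key (illustrative output only; the exact formatting varies by HDFS version):

    /hive  key1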

  4. Create an external table in the preceding directory.

    For example, run the following command:

    create external table test (id int) location '/hive';
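    To confirm that encryption is transparent to Hive, write and read data through the table as usual; the HDFS client handles encryption and decryption underneath. A minimal check, assuming the example table test created above:

    insert into table test values (1);
    select * from test;    -- returns 1; no Hive-side decryption step is needed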