Updated on 2024-09-23 GMT+08:00

Configuring HDFS Data Encryption During Transmission

This section describes how to configure encryption for HDFS security channels.

This topic is available for MRS 3.x or later.

Configuring HDFS Security Channel Encryption

Channels between components are not encrypted by default. You can set the parameters in Table 1 to enable secure channel encryption.

To modify parameters, log in to FusionInsight Manager, choose Cluster > Services > HDFS, and click Configurations and then All Configurations. Enter the parameter name in the search box to locate it.

After changing the configuration, restart the affected service for the settings to take effect.
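After the restart, one way to confirm which value a client will use is to read it from the client's configuration XML. The following is a minimal sketch, assuming the usual Hadoop layout of `<configuration>` containing `<property>` elements with `<name>` and `<value>` children. The sample content and the helper name `read_hadoop_properties` are illustrative assumptions, not part of MRS.

```python
import xml.etree.ElementTree as ET

# Illustrative stand-in for a downloaded client configuration file (assumption);
# a real file contains many more properties.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.rpc.protection</name>
    <value>privacy</value>
  </property>
</configuration>
"""


def read_hadoop_properties(xml_text):
    """Return a {name: value} dict parsed from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            # A missing or empty <value> is stored as an empty string.
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


props = read_hadoop_properties(SAMPLE_CORE_SITE)
print(props.get("hadoop.rpc.protection", "<not set>"))
```

To check a real client, replace the sample string with the text of the downloaded configuration file.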

Table 1 Parameters

Configuration Item

Description

Default Value

hadoop.rpc.protection

NOTICE:
  • The setting takes effect only after the service is restarted. Rolling restart is not supported.
  • After changing this setting, download the client configuration file again. Otherwise, HDFS cannot provide read and write services.
  • After changing this setting, restart the executor. Otherwise, the job management and file management functions on the console become unavailable.

Indicates whether the RPC channels of each module in Hadoop are encrypted. The channels include:

  • RPC channels for clients to access HDFS
  • RPC channels between modules in HDFS, for example, between DataNode and NameNode
  • RPC channels for clients to access Yarn
  • RPC channels between NodeManager and ResourceManager
  • RPC channels for Spark to access Yarn and HDFS
  • RPC channels for MapReduce to access Yarn and HDFS
  • RPC channels for HBase to access HDFS
NOTE:

The setting takes effect globally: it applies to the RPC channels of every Hadoop module.

  • Security mode: privacy
  • Normal mode: authentication
NOTE:
  • authentication: only authentication is performed.
  • integrity: authentication and an integrity check are performed.
  • privacy: authentication, an integrity check, and encryption are performed.

dfs.encrypt.data.transfer

Indicates whether the HDFS data transfer channels and the channels for clients to access HDFS are encrypted. The HDFS data transfer channels include the channels between DataNodes and the Data Transfer (DT) channels that clients use to access DataNodes. Set the value to true to encrypt these channels. They are not encrypted by default.

NOTE:
  • This parameter takes effect only when hadoop.rpc.protection is set to privacy.
  • If a large amount of service data is transmitted, enabling encryption severely affects system performance.
  • If data transmission encryption is configured for one cluster in a cluster trust relationship, the same data transmission encryption must be configured for the peer cluster.

false

dfs.encrypt.data.transfer.algorithm

Specifies the algorithm used to encrypt the HDFS data transfer channels and the channels for clients to access HDFS. This parameter takes effect only when dfs.encrypt.data.transfer is set to true.

NOTE:

The default value is 3des, indicating that the 3DES algorithm is used to encrypt data. The value can also be set to rc4, but this is not recommended because of its security risks.

3des

dfs.encrypt.data.transfer.cipher.suites

Specifies the cipher suite for data encryption. This parameter can be left empty or set to AES/CTR/NoPadding. If it is left empty, the algorithm specified by dfs.encrypt.data.transfer.algorithm is used. The default value is AES/CTR/NoPadding.

AES/CTR/NoPadding
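
The dependencies between the parameters in Table 1 can be summarized as a small consistency check. The following is a minimal sketch that only restates the rules described in the table. The function names `check_transfer_encryption` and `effective_cipher` are hypothetical, the fallback for hadoop.rpc.protection assumes a normal-mode cluster, and the code does not reproduce how HDFS itself evaluates these settings.

```python
RPC_LEVELS = {"authentication", "integrity", "privacy"}
AES_SUITE = "AES/CTR/NoPadding"


def check_transfer_encryption(conf):
    """Return a list of warnings for a dict of parameter names to string values."""
    warnings = []
    # Assumption: fall back to the normal-mode default when the key is absent.
    rpc = conf.get("hadoop.rpc.protection", "authentication")
    encrypt = conf.get("dfs.encrypt.data.transfer", "false").lower() == "true"
    algorithm = conf.get("dfs.encrypt.data.transfer.algorithm", "3des")
    suites = conf.get("dfs.encrypt.data.transfer.cipher.suites", AES_SUITE)

    if rpc not in RPC_LEVELS:
        warnings.append(f"unknown hadoop.rpc.protection value: {rpc}")
    if encrypt and rpc != "privacy":
        warnings.append("dfs.encrypt.data.transfer needs hadoop.rpc.protection=privacy")
    if encrypt and algorithm == "rc4":
        warnings.append("rc4 is discouraged because of security risks")
    if suites not in ("", AES_SUITE):
        warnings.append(f"unexpected cipher suite: {suites}")
    return warnings


def effective_cipher(conf):
    """Return the cipher the table implies, or None if data transfer is unencrypted."""
    if conf.get("dfs.encrypt.data.transfer", "false").lower() != "true":
        return None
    suites = conf.get("dfs.encrypt.data.transfer.cipher.suites", AES_SUITE)
    if suites:
        return suites
    # An empty cipher suite falls back to the configured algorithm.
    return conf.get("dfs.encrypt.data.transfer.algorithm", "3des")


example = {
    "hadoop.rpc.protection": "integrity",
    "dfs.encrypt.data.transfer": "true",
    "dfs.encrypt.data.transfer.cipher.suites": "",
}
for message in check_transfer_encryption(example):
    print("WARNING:", message)
print("Effective cipher:", effective_cipher(example))
```

In this example the check reports that hadoop.rpc.protection is not set to privacy, and the empty cipher suite makes the effective cipher fall back to the 3des default.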