Configuring Security Authentication for Spark Applications
Prerequisites
You have enabled Kerberos authentication for the MRS cluster.
Scenario Description
In a cluster with Kerberos authentication enabled, the components must be mutually authenticated before communicating with each other to ensure communication security.
When you develop Spark applications, Spark may need to communicate with Hadoop and HBase. In that case, add security authentication code to the Spark application so that it can run properly.
Three security authentication modes are available:
- Command authentication:
Before running the Spark applications or using the CLI to connect to Spark SQL, run the following command on the Spark client for authentication:
kinit <Component service user>
- Configuration authentication:
You can specify security authentication information in any of the following ways:
- In the spark-defaults.conf configuration file of the client, set spark.yarn.keytab and spark.yarn.principal to specify the authentication information.
- Pass the same two properties to the bin/spark-submit command with --conf:
--conf spark.yarn.keytab=<keytab file path> --conf spark.yarn.principal=<Principal account>
- Use the dedicated --keytab and --principal options of the bin/spark-submit command:
--keytab <keytab file path> --principal <Principal account>
- Code authentication:
Obtain the principal and keytab file of the client, and use them in the application code to authenticate.
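As an illustration of the configuration authentication options above, the following Java sketch assembles a bin/spark-submit argument list in both forms. The class name SparkSubmitCommand, the main class com.example.App, and the JAR name app.jar are hypothetical; only the --keytab, --principal, and --conf spark.yarn.* options come from the text above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Spark client or the sample projects. */
final class SparkSubmitCommand {

    /** Uses the dedicated --keytab and --principal options. */
    static List<String> withKeytabOptions(String principal, String keytabPath,
                                          String mainClass, String appJar) {
        List<String> cmd = new ArrayList<>(Arrays.asList(
                "bin/spark-submit",
                "--keytab", keytabPath,
                "--principal", principal));
        cmd.addAll(Arrays.asList("--class", mainClass, appJar));
        return cmd;
    }

    /** Passes the same values as spark.yarn.* properties through --conf. */
    static List<String> withConfProperties(String principal, String keytabPath,
                                           String mainClass, String appJar) {
        List<String> cmd = new ArrayList<>(Arrays.asList(
                "bin/spark-submit",
                "--conf", "spark.yarn.keytab=" + keytabPath,
                "--conf", "spark.yarn.principal=" + principal));
        cmd.addAll(Arrays.asList("--class", mainClass, appJar));
        return cmd;
    }

    public static void main(String[] args) {
        // Assumed example values; replace them with the site-specific ones.
        List<String> cmd = withKeytabOptions(
                "sparkuser", "/opt/FIclient/user.keytab", "com.example.App", "app.jar");
        System.out.println(String.join(" ", cmd));
    }
}
```

The resulting list could be handed to java.lang.ProcessBuilder to launch the job from the Spark client directory.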
The following table lists the authentication method used by the sample code in the cluster with Kerberos authentication enabled.
Table 1 Security authentication methods
Sample Code | Mode | Security Authentication Method |
---|---|---|
spark-examples-normal | yarn-client | Command authentication, configuration authentication, or code authentication |
spark-examples-normal | yarn-cluster | Command authentication or configuration authentication |
spark-examples-security (including security authentication code) | yarn-client | Code authentication |
spark-examples-security (including security authentication code) | yarn-cluster | Not supported |
- In the preceding table, the yarn-cluster mode does not support security authentication in the Spark project code, because authentication must be completed before the application is started.
- The security authentication code of the Python sample project is not provided. You are advised to set security authentication parameters in the command for running applications.
Security Authentication Code (Java)
Currently, the sample code performs all security authentication through the LoginUtil class.
In the Spark sample project code, different sample projects use different authentication code: either basic security authentication or ZooKeeper authentication. The following table describes the example authentication parameters used in the sample project. Change the parameter values based on the site requirements.
Parameter | Example Value | Description |
---|---|---|
userPrincipal | sparkuser | Principal account used for authentication. You can obtain the account from the administrator. |
userKeytabPath | /opt/FIclient/user.keytab | Keytab file used for authentication. You can obtain the file from the administrator. |
krb5ConfPath | /opt/FIclient/KrbClient/kerberos/var/krb5kdc/krb5.conf | Path and name of the krb5.conf file |
ZKServerPrincipal | zookeeper/hadoop.hadoop.com | Principal of the ZooKeeper server. Contact the administrator to obtain the account. |
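Because the keytab and krb5.conf paths in the table are site-specific, it can help to check them before attempting a login. The following sketch uses only the JDK; the names KerberosPreflight and findProblems are hypothetical, and the check is not part of LoginUtil.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-login check for illustration; not part of the sample projects. */
final class KerberosPreflight {

    /** Returns one message per file that is missing or unreadable; empty if both look usable. */
    static List<String> findProblems(String userKeytabPath, String krb5ConfPath) {
        List<String> problems = new ArrayList<>();
        checkReadableFile("keytab", userKeytabPath, problems);
        checkReadableFile("krb5.conf", krb5ConfPath, problems);
        return problems;
    }

    private static void checkReadableFile(String label, String location, List<String> problems) {
        Path path = Paths.get(location);
        if (!Files.isRegularFile(path)) {
            problems.add(label + " not found: " + path);
        } else if (!Files.isReadable(path)) {
            problems.add(label + " not readable: " + path);
        }
    }

    public static void main(String[] args) {
        // Example values from the table; replace them with the site-specific ones.
        List<String> problems = findProblems(
                "/opt/FIclient/user.keytab",
                "/opt/FIclient/KrbClient/kerberos/var/krb5kdc/krb5.conf");
        problems.forEach(System.err::println);
    }
}
```

An application could call findProblems first and skip the login attempt when the returned list is not empty.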
- Basic security authentication:
Spark Core and Spark SQL applications do not need to access HBase or ZooKeeper. They need only the basic authentication code. Add the following code to the applications and set security authentication parameters as required:
String userPrincipal = "sparkuser";
String userKeytabPath = "/opt/FIclient/user.keytab";
String krb5ConfPath = "/opt/FIclient/KrbClient/kerberos/var/krb5kdc/krb5.conf";
Configuration hadoopConf = new Configuration();
LoginUtil.login(userPrincipal, userKeytabPath, krb5ConfPath, hadoopConf);
- ZooKeeper authentication:
The sample projects for Spark Streaming, for accessing Spark SQL applications through JDBC, and for Spark on HBase require basic security authentication and, in addition, the principal of the ZooKeeper server. Add the following code to the applications and set security authentication parameters as required:
String userPrincipal = "sparkuser";
String userKeytabPath = "/opt/FIclient/user.keytab";
String krb5ConfPath = "/opt/FIclient/KrbClient/kerberos/var/krb5kdc/krb5.conf";
String ZKServerPrincipal = "zookeeper/hadoop.hadoop.com";
String ZOOKEEPER_DEFAULT_LOGIN_CONTEXT_NAME = "Client";
String ZOOKEEPER_SERVER_PRINCIPAL_KEY = "zookeeper.server.principal";
Configuration hadoopConf = new Configuration();
LoginUtil.setJaasConf(ZOOKEEPER_DEFAULT_LOGIN_CONTEXT_NAME, userPrincipal, userKeytabPath);
LoginUtil.setZookeeperServerPrincipal(ZOOKEEPER_SERVER_PRINCIPAL_KEY, ZKServerPrincipal);
LoginUtil.login(userPrincipal, userKeytabPath, krb5ConfPath, hadoopConf);
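The source of LoginUtil is not shown here, so what setJaasConf and setZookeeperServerPrincipal do internally is an assumption. As a rough illustration of what a keytab-based JAAS login context named Client can look like, the following sketch uses only standard JDK classes. KeytabJaasConfiguration is a hypothetical class, and Configuration here is javax.security.auth.login.Configuration, not the Hadoop class used in the sample code above.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;

/** Hypothetical stand-in for illustration; not the LoginUtil shipped with the samples. */
final class KeytabJaasConfiguration extends Configuration {
    private final String contextName;
    private final AppConfigurationEntry[] entries;

    KeytabJaasConfiguration(String contextName, String principal, String keytabPath) {
        this.contextName = contextName;
        // Options understood by the JDK's Krb5LoginModule for a keytab login.
        Map<String, String> options = new HashMap<>();
        options.put("useKeyTab", "true");
        options.put("keyTab", keytabPath);
        options.put("principal", principal);
        options.put("storeKey", "true");
        options.put("useTicketCache", "false");
        this.entries = new AppConfigurationEntry[] {
            new AppConfigurationEntry(
                    "com.sun.security.auth.module.Krb5LoginModule",
                    LoginModuleControlFlag.REQUIRED,
                    options)
        };
    }

    /** Returns the keytab entry for the configured context name, or null for any other name. */
    @Override
    public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
        return contextName.equals(name) ? entries : null;
    }

    public static void main(String[] args) {
        // Example values from the parameter table; replace them with the site-specific ones.
        Configuration.setConfiguration(
                new KeytabJaasConfiguration("Client", "sparkuser", "/opt/FIclient/user.keytab"));
        // Assumption: the server principal is published as a JVM system property
        // under the key used in the sample code.
        System.setProperty("zookeeper.server.principal", "zookeeper/hadoop.hadoop.com");
    }
}
```

Installing the configuration programmatically avoids a separate jaas.conf file, which is one plausible reason for a helper such as LoginUtil to take the context name, principal, and keytab path as arguments.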
Security Authentication Code (Scala)
Currently, the sample code performs all security authentication through the LoginUtil class.
In the Spark sample project code, different sample projects use different authentication code: either basic security authentication or ZooKeeper authentication. The following table describes the example authentication parameters used in the sample project. Change the parameter values based on the site requirements.
Parameter | Example Value | Description |
---|---|---|
userPrincipal | sparkuser | Principal account used for authentication. You can obtain the account from the administrator. |
userKeytabPath | /opt/FIclient/user.keytab | Keytab file used for authentication. You can obtain the file from the administrator. |
krb5ConfPath | /opt/FIclient/KrbClient/kerberos/var/krb5kdc/krb5.conf | Path and name of the krb5.conf file |
ZKServerPrincipal | zookeeper/hadoop.hadoop.com | Principal of the ZooKeeper server. Contact the administrator to obtain the account. |
- Basic security authentication:
Spark Core and Spark SQL applications do not need to access HBase or ZooKeeper. They need only the basic authentication code. Add the following code to the applications and set security authentication parameters as required:
val userPrincipal = "sparkuser"
val userKeytabPath = "/opt/FIclient/user.keytab"
val krb5ConfPath = "/opt/FIclient/KrbClient/kerberos/var/krb5kdc/krb5.conf"
val hadoopConf: Configuration = new Configuration()
LoginUtil.login(userPrincipal, userKeytabPath, krb5ConfPath, hadoopConf)
- ZooKeeper authentication:
The sample projects for Spark Streaming, for accessing Spark SQL applications through JDBC, and for Spark on HBase require basic security authentication and, in addition, the principal of the ZooKeeper server. Add the following code to the applications and set security authentication parameters as required:
val userPrincipal = "sparkuser"
val userKeytabPath = "/opt/FIclient/user.keytab"
val krb5ConfPath = "/opt/FIclient/KrbClient/kerberos/var/krb5kdc/krb5.conf"
val ZKServerPrincipal = "zookeeper/hadoop.hadoop.com"
val ZOOKEEPER_DEFAULT_LOGIN_CONTEXT_NAME: String = "Client"
val ZOOKEEPER_SERVER_PRINCIPAL_KEY: String = "zookeeper.server.principal"
val hadoopConf: Configuration = new Configuration()
LoginUtil.setJaasConf(ZOOKEEPER_DEFAULT_LOGIN_CONTEXT_NAME, userPrincipal, userKeytabPath)
LoginUtil.setZookeeperServerPrincipal(ZOOKEEPER_SERVER_PRINCIPAL_KEY, ZKServerPrincipal)
LoginUtil.login(userPrincipal, userKeytabPath, krb5ConfPath, hadoopConf)