In a security cluster, sparkbench run with the HiBench tool fails to obtain the Kerberos realm
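Question
When a sparkbench task of HiBench6, such as Wordcount, is run, the task fails.
The "bench.log" file shows that the Yarn task failed.
Log in to the Yarn WebUI and check the failure information of the corresponding application. The following is displayed: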
Exception in thread "main" org.apache.spark.SparkException: Unable to load YARN support
    at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:390)
    at org.apache.spark.deploy.SparkHadoopUtil$.yarn$lzycompute(SparkHadoopUtil.scala:385)
    at org.apache.spark.deploy.SparkHadoopUtil$.yarn(SparkHadoopUtil.scala:385)
    at org.apache.spark.deploy.SparkHadoopUtil$.get(SparkHadoopUtil.scala:410)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:796)
    at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:821)
    at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Caused by: java.lang.IllegalArgumentException: Can't get Kerberos realm
    at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:65)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:288)
    at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:336)
    at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:51)
    at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.<init>(YarnSparkHadoopUtil.scala:49)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.lang.Class.newInstance(Class.java:442)
    at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:387)
    ... 6 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.security.authentication.util.KerberosUtil.getDefaultRealm(KerberosUtil.java:88)
    at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:63)
    ... 16 more
Caused by: KrbException: Cannot locate default realm
    at sun.security.krb5.Config.getDefaultRealm(Config.java:1029)
    ... 22 more
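Answer
The failure occurs because, starting from version C80SPC200, cluster creation no longer replaces the /etc/krb5.conf file. Instead, a configuration parameter points to the krb5.conf path inside the client. HiBench does not reference the client configuration file, so the Spark ApplicationMaster falls back to /etc/krb5.conf and cannot locate a default realm.
Solution:
Copy the client file /opt/client/KrbClient/kerberos/var/krb5kdc/krb5.conf over /etc/krb5.conf on every node in the cluster. Back up the existing /etc/krb5.conf before replacing it.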
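To check whether a node is affected, one option is to look for a default_realm entry in the [libdefaults] section of its /etc/krb5.conf. The following is a minimal Python sketch of such a check. The function name find_default_realm is hypothetical, and the parser handles only simple "key = value" lines, so treat it as a rough diagnostic rather than a full krb5.conf parser.

```python
from pathlib import Path
from typing import Optional


def find_default_realm(krb5_conf_text: str) -> Optional[str]:
    """Return the default_realm value from [libdefaults], or None if absent.

    Hypothetical helper: it understands only section headers, comments and
    simple "key = value" lines.
    """
    section = None
    for raw_line in krb5_conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", ";")):
            continue
        # Track which [section] we are currently in.
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if section == "libdefaults" and "=" in line:
            key, value = line.split("=", 1)
            if key.strip() == "default_realm":
                return value.strip() or None
    return None


if __name__ == "__main__":
    conf_path = Path("/etc/krb5.conf")
    try:
        realm = find_default_realm(conf_path.read_text())
    except OSError as err:
        print(f"Cannot read {conf_path}: {err}")
    else:
        print(f"default_realm in {conf_path}: {realm}")
```

If the script prints None or reports that the file cannot be read, the node's /etc/krb5.conf most likely does not define a default realm, which is consistent with the "Cannot locate default realm" error in the stack trace.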
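The backup-then-overwrite step can be sketched in Python as follows. This is only an illustration: the function name replace_with_backup and the timestamped ".bak" naming are assumptions, not part of the product. The script handles a single node, so it has to be run (typically as root) on each node in turn, and distributing the client krb5.conf to the nodes is not shown.

```python
import shutil
import sys
import time
from pathlib import Path
from typing import Optional


def replace_with_backup(source: Path, target: Path) -> Optional[Path]:
    """Copy source over target, backing up an existing target first.

    Returns the backup path, or None if target did not exist.
    Hypothetical helper; the backup naming scheme is an assumption.
    """
    source, target = Path(source), Path(target)
    if not source.is_file():
        raise FileNotFoundError(f"source file not found: {source}")
    backup = None
    if target.exists():
        # Keep the old file next to the target with a timestamp suffix.
        stamp = time.strftime("%Y%m%d%H%M%S")
        backup = target.with_name(f"{target.name}.bak.{stamp}")
        shutil.copy2(target, backup)
    shutil.copy2(source, target)
    return backup


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("usage: replace_krb5.py <client krb5.conf> <target krb5.conf>")
    else:
        saved = replace_with_backup(Path(sys.argv[1]), Path(sys.argv[2]))
        print(f"replaced {sys.argv[2]}; backup: {saved}")
```

For the paths in this solution, the two arguments would be /opt/client/KrbClient/kerberos/var/krb5kdc/krb5.conf and /etc/krb5.conf.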