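Connecting Cloudera CDH to OBS
Deployment View
Installed versions
Hardware: 1 Master + 3 Core nodes (specification: 8 vCPUs and 32 GB of memory; operating system: CentOS 7.5)
Software: CDH 6.0.1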
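Updating the OBSA-HDFS Tool
- Download the OBSA-HDFS tool that matches your Hadoop version from the download address.
Then upload the OBSA-HDFS JAR package (for example, hadoop-huaweicloud-3.1.1-hw-53.8.jar) to the /opt/obsa-hdfs directory on every CDH node.
- Meaning of the package name hadoop-huaweicloud-x.x.x-hw-y.jar: the first three fields x.x.x are the matching Hadoop version, and the final field y is the OBSA version; the largest y is the latest version. For example, in hadoop-huaweicloud-3.1.1-hw-53.8.jar, 3.1.1 is the matching Hadoop version and 53.8 is the OBSA version.
- If the Hadoop version is 3.1.x, choose hadoop-huaweicloud-3.1.1-hw-53.8.jar.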
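The naming rule above can be checked mechanically before a package is uploaded. The following is a minimal sketch, assuming the file name follows the hadoop-huaweicloud-x.x.x-hw-y.jar pattern exactly; the function name parse_obsa_jar is illustrative and is not part of the OBSA-HDFS tool.

```shell
#!/usr/bin/env bash
# Split an OBSA-HDFS JAR file name of the form
# hadoop-huaweicloud-x.x.x-hw-y.jar into its two version fields.
# parse_obsa_jar is a hypothetical helper name used only for illustration.
parse_obsa_jar() {
  local name base hadoop_version obsa_version
  name="$(basename "$1")"
  base="${name%.jar}"                  # drop the .jar suffix
  base="${base#hadoop-huaweicloud-}"   # drop the fixed prefix
  hadoop_version="${base%%-hw-*}"      # text before "-hw-"
  obsa_version="${base##*-hw-}"        # text after "-hw-"
  echo "hadoop=${hadoop_version} obsa=${obsa_version}"
}

# prints: hadoop=3.1.1 obsa=53.8
parse_obsa_jar /opt/obsa-hdfs/hadoop-huaweicloud-3.1.1-hw-53.8.jar
```

The first field can then be compared with the version reported by the cluster's own Hadoop installation before the JAR is distributed.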
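- Add the hadoop-huaweicloud JAR package.
Run the following commands on every node of the CDH cluster. Adapt them to the actual hadoop-huaweicloud JAR file name and to your CDH version.
- Run the following command to copy the OBSA-HDFS JAR package to the /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/ directory.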
cp /opt/obsa-hdfs/hadoop-huaweicloud-3.1.1-hw-53.8.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/
- Run the following commands to create symbolic links so that the hadoop-huaweicloud JAR package is available in each of the following directories.
ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud-3.1.1-hw-53.8.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar
ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/cloudera-navigator-server/libs/cdh6/hadoop-huaweicloud.jar
ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/common_jars/hadoop-huaweicloud.jar
ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/lib/cdh6/hadoop-huaweicloud.jar
ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/cloudera-scm-telepub/libs/cdh6/hadoop-huaweicloud.jar
ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/hadoop-huaweicloud.jar
ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/client/hadoop-huaweicloud.jar
ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/spark/jars/hadoop-huaweicloud.jar
ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/impala/lib/hadoop-huaweicloud.jar
ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop-mapreduce/hadoop-huaweicloud.jar
ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/lib/cdh5/hadoop-huaweicloud.jar
ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/cloudera-scm-telepub/libs/cdh5/hadoop-huaweicloud.jar
ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/cloudera-navigator-server/libs/cdh5/hadoop-huaweicloud.jar
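The symbolic-link commands above differ only in the target directory, so they can be sketched as a loop. This is an illustrative variant rather than the documented procedure: the variables PARCEL_DIR, CM_DIR, and JAR_NAME and the function link_obsa_jar are names introduced here, ln -sf is used instead of ln -s so that a rerun replaces existing links, and directories that do not exist on a node are skipped with a message instead of producing an error.

```shell
#!/usr/bin/env bash
# Create the hadoop-huaweicloud.jar symbolic links in one pass.
# PARCEL_DIR, CM_DIR and JAR_NAME are illustrative variables; adjust them
# to the CDH version and JAR file name actually installed.
PARCEL_DIR="${PARCEL_DIR:-/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678}"
CM_DIR="${CM_DIR:-/opt/cloudera/cm}"
JAR_NAME="${JAR_NAME:-hadoop-huaweicloud-3.1.1-hw-53.8.jar}"

link_obsa_jar() {
  local generic="${PARCEL_DIR}/jars/hadoop-huaweicloud.jar"
  local dir
  # Version-independent name that points at the versioned JAR.
  ln -sf "${PARCEL_DIR}/jars/${JAR_NAME}" "${generic}"
  for dir in \
    "${CM_DIR}/cloudera-navigator-server/libs/cdh6" \
    "${CM_DIR}/common_jars" \
    "${CM_DIR}/lib/cdh6" \
    "${CM_DIR}/cloudera-scm-telepub/libs/cdh6" \
    "${PARCEL_DIR}/lib/hadoop" \
    "${PARCEL_DIR}/lib/hadoop/client" \
    "${PARCEL_DIR}/lib/spark/jars" \
    "${PARCEL_DIR}/lib/impala/lib" \
    "${PARCEL_DIR}/lib/hadoop-mapreduce" \
    "${CM_DIR}/lib/cdh5" \
    "${CM_DIR}/cloudera-scm-telepub/libs/cdh5" \
    "${CM_DIR}/cloudera-navigator-server/libs/cdh5"
  do
    if [ -d "${dir}" ]; then
      ln -sf "${generic}" "${dir}/hadoop-huaweicloud.jar"
    else
      # Not every node has every directory; report and continue.
      echo "skip (no such directory): ${dir}" >&2
    fi
  done
}
```

After the JAR has been copied into the jars directory, calling link_obsa_jar on a node creates the links. Because every link goes through the version-independent hadoop-huaweicloud.jar, a later upgrade only needs the first link to be repointed.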
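Configuration Items for Connecting HDFS and YARN Clusters to OBS
- In the HDFS cluster configuration, select "Advanced" and, in the Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml, add the OBS AK, SK, endpoint, and impl settings. The corresponding property names are fs.obs.access.key, fs.obs.secret.key, fs.obs.endpoint, and fs.obs.impl.
- Fill in the access keys (AK/SK) and the endpoint according to your environment. To obtain the AK/SK, see Access Keys (AK/SK); to obtain the endpoint, see Endpoints and Domain Names.
- Set fs.obs.impl to org.apache.hadoop.fs.obs.OBSFileSystem.
- After the change, "Restart" or "Rolling Restart" the HDFS cluster, then run "Deploy Client Configuration".
- Go to the YARN cluster and run "Deploy Client Configuration".
- On the nodes, check whether /etc/hadoop/conf/core-site.xml now contains the OBS AK, SK, endpoint, and impl settings.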
<property>
    <name>fs.obs.access.key</name>
    <value>*****</value>
</property>
<property>
    <name>fs.obs.secret.key</name>
    <value>*****************</value>
</property>
<property>
    <name>fs.obs.endpoint</name>
    <value>{Target Endpoint}</value>
</property>
<property>
    <name>fs.obs.impl</name>
    <value>org.apache.hadoop.fs.obs.OBSFileSystem</value>
</property>
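The check of core-site.xml can be scripted so that it is easy to repeat on every node. The following is a minimal sketch that uses a plain text search for the four property names rather than an XML parser, so it assumes each name element is written without extra whitespace; the function name check_obs_props is illustrative.

```shell
#!/usr/bin/env bash
# Report which of the four OBS properties are present in a core-site.xml file.
# check_obs_props is a hypothetical helper name; it does a fixed-string search
# for "<name>KEY</name>" and returns non-zero if any property is missing.
check_obs_props() {
  local conf="${1:-/etc/hadoop/conf/core-site.xml}"
  local missing=0 key
  if [ ! -r "${conf}" ]; then
    echo "cannot read ${conf}" >&2
    return 2
  fi
  for key in fs.obs.access.key fs.obs.secret.key fs.obs.endpoint fs.obs.impl; do
    if grep -qF "<name>${key}</name>" "${conf}"; then
      echo "found:   ${key}"
    else
      echo "missing: ${key}"
      missing=1
    fi
  done
  return "${missing}"
}
```

Running check_obs_props without arguments inspects /etc/hadoop/conf/core-site.xml. Once all four properties are found, a listing such as hadoop fs -ls obs://example-bucket/ (where example-bucket is a placeholder for a bucket you own) is a reasonable follow-up to confirm that the cluster can actually reach OBS.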
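Configuration Items for Connecting a Spark Cluster to OBS
- For Spark applications to access OBS, configure core-site.xml in the YARN cluster, including the AK, SK, endpoint, and impl settings.
- After core-site.xml is configured, "Restart" the YARN cluster, then run "Deploy Client Configuration" for the Spark cluster.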
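Configuration Items for Connecting a Hive Cluster to OBS
- For Hive applications to access OBS, configure core-site.xml in the Hive cluster, including the AK, SK, endpoint, and impl settings.
- After core-site.xml is configured, "Restart" the Hive cluster, then run "Deploy Client Configuration" for the Hive cluster.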
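One way to confirm the Hive configuration is to create an external table whose LOCATION is an obs:// path. The sketch below only prints such a statement; the function name obs_table_ddl, the bucket example-bucket, the directory warehouse/demo, the table name obs_demo, and the two-column schema are all placeholders chosen for illustration.

```shell
#!/usr/bin/env bash
# Print a HiveQL statement for an external table stored under an obs:// path.
# Arguments (all placeholders): bucket name, directory in the bucket, table name.
obs_table_ddl() {
  local bucket="$1" dir="$2" table="$3"
  cat <<EOF
CREATE EXTERNAL TABLE IF NOT EXISTS ${table} (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 'obs://${bucket}/${dir}';
EOF
}

obs_table_ddl example-bucket warehouse/demo obs_demo
```

On a node with the Hive client deployed, the output could be passed to the client, for example with hive -e "$(obs_table_ddl example-bucket warehouse/demo obs_demo)". If the statement succeeds and a subsequent query on the table returns without a file system error, Hive is resolving the obs:// scheme through the configured OBSFileSystem.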