Updated: 2024-10-17 GMT+08:00

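Interconnecting Cloudera CDH with OBS

Deployment View

Installed Versions

Hardware: 1 Master node + 3 Core nodes (each with 8 vCPUs and 32 GB of memory, running CentOS 7.5)

Software: CDH 6.0.1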

Figure: Deployment view

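Updating the OBSA-HDFS Tool

  1. Download the OBSA-HDFS tool that matches your Hadoop version from the download address.

    Then upload the OBSA-HDFS JAR package (for example, hadoop-huaweicloud-3.1.1-hw-53.8.jar) to the /opt/obsa-hdfs directory on every CDH node.
    • In the package name hadoop-huaweicloud-x.x.x-hw-y.jar, x.x.x is the matching Hadoop version and y is the OBSA version; the largest y is the latest version. For example, in hadoop-huaweicloud-3.1.1-hw-53.8.jar, 3.1.1 is the Hadoop version and 53.8 is the OBSA version.
    • If the Hadoop version is 3.1.x, select hadoop-huaweicloud-3.1.1-hw-53.8.jar.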

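  2. Add the hadoop-huaweicloud JAR package.

    Run the following commands on every node of the CDH cluster. Adapt them to the actual name of the hadoop-huaweicloud JAR package and to your CDH version.

    1. Run the following command to copy the OBSA-HDFS JAR package to the /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/ directory.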

      cp /opt/obsa-hdfs/hadoop-huaweicloud-3.1.1-hw-53.8.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/

    2. Run the following commands to create symbolic links so that the hadoop-huaweicloud JAR package is available in each of the directories below.

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud-3.1.1-hw-53.8.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/cloudera-navigator-server/libs/cdh6/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/common_jars/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/lib/cdh6/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/cloudera-scm-telepub/libs/cdh6/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/client/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/spark/jars/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/impala/lib/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop-mapreduce/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/lib/cdh5/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/cloudera-scm-telepub/libs/cdh5/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/cloudera-navigator-server/libs/cdh5/hadoop-huaweicloud.jar
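
The link commands above differ only in their destination directory, so they can be generated in a loop. The following is a minimal sketch, not part of the OBSA-HDFS tool: the function name link_obsa_jar and its three arguments are hypothetical. Unlike the plain ln -s commands, it uses ln -sfn, which replaces an existing link, and it skips destination directories that do not exist on the node.

```shell
#!/usr/bin/env bash
# Sketch only: link_obsa_jar and its arguments are illustrative names,
# not something shipped with CDH or OBSA-HDFS.

link_obsa_jar() {
  local parcel_dir="$1"   # e.g. /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678
  local cm_dir="$2"       # e.g. /opt/cloudera/cm
  local jar_name="$3"     # e.g. hadoop-huaweicloud-3.1.1-hw-53.8.jar
  local generic="$parcel_dir/jars/hadoop-huaweicloud.jar"

  # The versioned JAR must already have been copied into the parcel.
  if [ ! -f "$parcel_dir/jars/$jar_name" ]; then
    echo "JAR not found: $parcel_dir/jars/$jar_name" >&2
    return 1
  fi

  # Version-independent link next to the versioned JAR.
  ln -sfn "$parcel_dir/jars/$jar_name" "$generic" || return 1

  # Destination directories taken from the commands listed above.
  local target
  for target in \
    "$cm_dir/cloudera-navigator-server/libs/cdh6" \
    "$cm_dir/common_jars" \
    "$cm_dir/lib/cdh6" \
    "$cm_dir/cloudera-scm-telepub/libs/cdh6" \
    "$parcel_dir/lib/hadoop" \
    "$parcel_dir/lib/hadoop/client" \
    "$parcel_dir/lib/spark/jars" \
    "$parcel_dir/lib/impala/lib" \
    "$parcel_dir/lib/hadoop-mapreduce" \
    "$cm_dir/lib/cdh5" \
    "$cm_dir/cloudera-scm-telepub/libs/cdh5" \
    "$cm_dir/cloudera-navigator-server/libs/cdh5"
  do
    if [ -d "$target" ]; then
      ln -sfn "$generic" "$target/hadoop-huaweicloud.jar" || return 1
    else
      # Assumption: a missing directory means the component is not installed here.
      echo "skip (directory not found): $target" >&2
    fi
  done
}
```

With the paths used in this guide, the call on each node would look like this:

    link_obsa_jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678 /opt/cloudera/cm hadoop-huaweicloud-3.1.1-hw-53.8.jar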

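Configuring the HDFS and YARN Clusters for OBS

  1. In the HDFS cluster configuration, choose Advanced. In the Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml, add the OBS AK, SK, endpoint, and impl settings. The corresponding property names are fs.obs.access.key, fs.obs.secret.key, fs.obs.endpoint, and fs.obs.impl.

    1. Set the access keys (AK/SK) and the endpoint to your actual values. To obtain the AK/SK, see Access Keys (AK/SK). To obtain the endpoint, see Endpoints and Domain Names.
    2. Set fs.obs.impl to org.apache.hadoop.fs.obs.OBSFileSystem.

  2. After the change, choose Restart or Rolling Restart for the HDFS cluster, then run Deploy Client Configuration.
  3. Go to the YARN cluster and run Deploy Client Configuration.
  4. On each node, check that /etc/hadoop/conf/core-site.xml now contains the OBS AK, SK, endpoint, and impl settings.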

    <property>
      <name>fs.obs.access.key</name>
      <value>*****</value>
    </property>
    <property>
      <name>fs.obs.secret.key</name>
      <value>*****************</value>
    </property>
    <property>
      <name>fs.obs.endpoint</name>
      <value>{Target Endpoint}</value>
    </property>
    <property>
      <name>fs.obs.impl</name>
      <value>org.apache.hadoop.fs.obs.OBSFileSystem</value>
    </property>
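
The check in the last step can be scripted so that it runs the same way on every node. The following is a minimal sketch that only looks for the four property names; it does not validate the values. The function name check_obs_props is hypothetical, and the default path is the one given in this guide.

```shell
#!/usr/bin/env bash
# Sketch only: check_obs_props is an illustrative helper, not a Hadoop or CDH command.

check_obs_props() {
  local conf="${1:-/etc/hadoop/conf/core-site.xml}"
  local missing=0 name

  if [ ! -f "$conf" ]; then
    echo "file not found: $conf" >&2
    return 2
  fi

  # Look for each OBS property name as a literal <name> element.
  for name in fs.obs.access.key fs.obs.secret.key fs.obs.endpoint fs.obs.impl; do
    if grep -qF "<name>$name</name>" "$conf"; then
      echo "found:   $name"
    else
      echo "missing: $name"
      missing=1
    fi
  done

  # Non-zero exit status when at least one property is absent.
  return "$missing"
}
```

Once the check passes, listing a bucket with hadoop fs -ls obs://<bucket-name>/ is a reasonable manual smoke test, assuming the bucket exists and the AK/SK are allowed to access it; <bucket-name> is a placeholder.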
    

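Configuring the Spark Cluster for OBS

  1. For Spark applications to access OBS, configure core-site.xml in the YARN cluster with the AK, SK, endpoint, and impl settings.
  2. After configuring core-site.xml, choose Restart for the YARN cluster, then run Deploy Client Configuration for the Spark cluster.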
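
Because the same four properties have to be entered again in the YARN safety valve, it can help to render the snippet from variables instead of retyping it. The following is a minimal sketch: the environment variables OBS_AK, OBS_SK, and OBS_ENDPOINT and both function names are hypothetical, and the values are written as-is without XML escaping.

```shell
#!/usr/bin/env bash
# Sketch only: obs_prop, render_obs_props and the OBS_* variables are illustrative names.

# Print one <property> element for the given name and value.
obs_prop() {
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# Print the four OBS properties for pasting into the core-site.xml safety valve.
render_obs_props() {
  if [ -z "${OBS_AK:-}" ] || [ -z "${OBS_SK:-}" ] || [ -z "${OBS_ENDPOINT:-}" ]; then
    echo "OBS_AK, OBS_SK and OBS_ENDPOINT must be set" >&2
    return 1
  fi
  obs_prop fs.obs.access.key "$OBS_AK"
  obs_prop fs.obs.secret.key "$OBS_SK"
  obs_prop fs.obs.endpoint "$OBS_ENDPOINT"
  obs_prop fs.obs.impl org.apache.hadoop.fs.obs.OBSFileSystem
}
```

Reading the keys from the environment keeps them out of shell history and script files; the printed snippet is still pasted manually into the configuration page.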

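Configuring the Hive Cluster for OBS

  1. For Hive applications to access OBS, configure core-site.xml in the Hive cluster with the AK, SK, endpoint, and impl settings.
  2. After configuring core-site.xml, choose Restart for the Hive cluster, then run Deploy Client Configuration for the Hive cluster.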
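
One way to confirm that Hive can reach OBS after the restart is to create an external table whose location is an obs:// path. The following is a minimal sketch that only builds the statement: the function name obs_table_ddl, the single-column schema, and the table, bucket, and path values are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: obs_table_ddl is an illustrative helper; the schema is a placeholder.

# Print a HiveQL statement for an external text table stored under obs://<bucket>/<path>.
obs_table_ddl() {
  local table="$1" bucket="$2" path="$3"
  printf "CREATE EXTERNAL TABLE IF NOT EXISTS %s (line STRING) LOCATION 'obs://%s/%s';\n" \
    "$table" "$bucket" "$path"
}
```

The statement can then be passed to the Hive CLI, for example with hive -e "$(obs_table_ddl obs_smoke_test <bucket-name> tmp/smoke)", where <bucket-name> is a placeholder for an existing bucket that the configured AK/SK can write to.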