Updated on 2024-07-03 GMT+08:00

Connecting Cloudera CDH to OBS

Deployment View

Version Information

Hardware: 1 master node + 3 core nodes (flavor: 8U32G; OS: CentOS 7.5)

Software: CDH 6.0.1

Updating OBSA-HDFS

  1. Download the OBSA-HDFS package that matches your Hadoop version.

    Upload the OBSA-HDFS JAR package (for example, hadoop-huaweicloud-3.1.1-hw-53.8.jar) to the /opt/obsa-hdfs directory of each CDH node.
    • In a hadoop-huaweicloud-x.x.x-hw-y.jar package name, x.x.x indicates the Hadoop version number, and y indicates the OBSA version number. For example, in hadoop-huaweicloud-3.1.1-hw-53.8.jar, 3.1.1 is the Hadoop version number, and 53.8 is the OBSA version number.
    • If the Hadoop version is 3.1.x, select hadoop-huaweicloud-3.1.1-hw-53.8.jar.

  2. Add the downloaded hadoop-huaweicloud JAR package to the CDH installation.

    Perform the following operations on each CDH cluster node (replace the JAR package name and CDH version number with the ones actually used).

    1. Copy the OBSA-HDFS JAR package to the /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/ directory:

      cp /opt/obsa-hdfs/hadoop-huaweicloud-3.1.1-hw-53.8.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/

    2. Create a version-independent soft link (hadoop-huaweicloud.jar) to the JAR package, and then link it into each of the following directories:

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud-3.1.1-hw-53.8.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/cloudera-navigator-server/libs/cdh6/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/common_jars/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/lib/cdh6/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/cloudera-scm-telepub/libs/cdh6/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/client/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/spark/jars/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/impala/lib/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop-mapreduce/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/lib/cdh5/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/cloudera-scm-telepub/libs/cdh5/hadoop-huaweicloud.jar

      ln -s /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-huaweicloud.jar /opt/cloudera/cm/cloudera-navigator-server/libs/cdh5/hadoop-huaweicloud.jar
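
The soft-link commands above differ only in the target directory, so they can be sketched as a loop. The following is a minimal sketch, not an official script: the variable names (PARCEL, CM, JAR_NAME) and the helper functions (jar_versions, link_obsa_jar) are hypothetical and introduced here for illustration only. It assumes the JAR package has already been copied to the parcel's jars/ directory, and it skips any target directory that does not exist on the node.

```shell
#!/bin/sh
# Sketch only: repeat the soft-link step for every directory in one loop.
# PARCEL, CM, and JAR_NAME are illustrative variables; adjust them to your cluster.
PARCEL="${PARCEL:-/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678}"
CM="${CM:-/opt/cloudera/cm}"
JAR_NAME="${JAR_NAME:-hadoop-huaweicloud-3.1.1-hw-53.8.jar}"

# Print "<Hadoop version> <OBSA version>" parsed from hadoop-huaweicloud-x.x.x-hw-y.jar.
jar_versions() {
    base="${1%.jar}"                       # drop the .jar suffix
    obsa="${base##*-hw-}"                  # text after "-hw-"
    hadoop="${base%-hw-*}"                 # text before "-hw-"
    hadoop="${hadoop#hadoop-huaweicloud-}" # drop the package prefix
    echo "$hadoop $obsa"
}

# Create the alias in jars/ and link it into every target directory that exists.
link_obsa_jar() {
    if [ ! -f "$PARCEL/jars/$JAR_NAME" ]; then
        echo "JAR not found: $PARCEL/jars/$JAR_NAME" >&2
        return 1
    fi
    alias_jar="$PARCEL/jars/hadoop-huaweicloud.jar"
    ln -sfn "$PARCEL/jars/$JAR_NAME" "$alias_jar"
    for dir in \
        "$CM/cloudera-navigator-server/libs/cdh6" \
        "$CM/common_jars" \
        "$CM/lib/cdh6" \
        "$CM/cloudera-scm-telepub/libs/cdh6" \
        "$PARCEL/lib/hadoop" \
        "$PARCEL/lib/hadoop/client" \
        "$PARCEL/lib/spark/jars" \
        "$PARCEL/lib/impala/lib" \
        "$PARCEL/lib/hadoop-mapreduce" \
        "$CM/lib/cdh5" \
        "$CM/cloudera-scm-telepub/libs/cdh5" \
        "$CM/cloudera-navigator-server/libs/cdh5"
    do
        if [ -d "$dir" ]; then
            ln -sfn "$alias_jar" "$dir/hadoop-huaweicloud.jar"
        else
            echo "skipped, no such directory: $dir" >&2
        fi
    done
}

echo "Hadoop/OBSA versions: $(jar_versions "$JAR_NAME")"
# On a real cluster node, run: link_obsa_jar
```

Unlike the plain ln -s commands in the steps, the sketch uses ln -sfn, which replaces an existing link of the same name. This makes a rerun after a JAR upgrade safe, but it also means an existing hadoop-huaweicloud.jar link is overwritten without a prompt.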

Connecting OBS to HDFS and YARN Clusters

  1. In the advanced configuration area of the HDFS cluster, add fs.obs.access.key, fs.obs.secret.key, fs.obs.endpoint, and fs.obs.impl to core-site.xml. These items correspond to the OBS AK, SK, endpoint, and file system implementation class, respectively.

    1. Enter the AK/SK pair and endpoint that you actually use. To obtain them, see Access Keys (AK/SK) and Endpoints and Domain Names, respectively.
    2. Set fs.obs.impl to org.apache.hadoop.fs.obs.OBSFileSystem.

  2. Restart the HDFS cluster (either a full restart or a rolling restart), and then restart the client.
  3. Go to the YARN cluster and restart the client.
  4. Check that the AK, SK, endpoint, and impl settings have been written to the /etc/hadoop/conf/core-site.xml file on the node:

    <property>
      <name>fs.obs.access.key</name>
      <value>*****</value>
    </property>
    <property>
      <name>fs.obs.secret.key</name>
      <value>*****************</value>
    </property>
    <property>
      <name>fs.obs.endpoint</name>
      <value>{Target Endpoint}</value>
    </property>
    <property>
      <name>fs.obs.impl</name>
      <value>org.apache.hadoop.fs.obs.OBSFileSystem</value>
    </property>
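
The check in the last step can be scripted rather than done by eye. The following is a minimal sketch under stated assumptions: check_obs_props is a hypothetical helper name (it is not part of Hadoop or CDH), CORE_SITE is an illustrative variable, and the check only confirms that each property name is present on a line of its own in the form shown above. It does not validate the values.

```shell
#!/bin/sh
# Sketch only: report which of the four OBS properties appear in core-site.xml.
# check_obs_props is a hypothetical helper, not a Hadoop or CDH command.
check_obs_props() {
    file="$1"
    missing=0
    for key in fs.obs.access.key fs.obs.secret.key fs.obs.endpoint fs.obs.impl; do
        # -F matches the name literally, so the dots are not treated as wildcards.
        if grep -qF "<name>$key</name>" "$file"; then
            echo "found:   $key"
        else
            echo "missing: $key"
            missing=1
        fi
    done
    # Return 0 only when all four properties were found.
    return "$missing"
}

CORE_SITE="${CORE_SITE:-/etc/hadoop/conf/core-site.xml}"
if [ -f "$CORE_SITE" ]; then
    check_obs_props "$CORE_SITE" || echo "core-site.xml is incomplete" >&2
fi
```

If all four properties are found, a common next smoke test is to list a bucket with hadoop fs -ls obs://<bucket>/, where <bucket> is a placeholder for one of your own bucket names.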
    

Connecting OBS to a Spark Cluster

  1. Configure the OBS items (AK, SK, endpoint, and impl) in the core-site.xml file of the YARN cluster.
  2. Restart the YARN cluster and then the Spark cluster client.

Connecting OBS to a Hive Cluster

  1. Configure the OBS items (AK, SK, endpoint, and impl) in the core-site.xml file of the Hive cluster.
  2. Restart the Hive cluster and then the client.