Updated on 2022-07-11 GMT+08:00

Running HCatalog and Viewing Results

Running HCatalog Example Projects

  1. In the lower left corner of the IntelliJ IDEA window, click Terminal to open the terminal. Run the mvn clean install command to compile the package.

    If BUILD SUCCESS is displayed, the compilation is successful. The hcatalog-example-*.jar package is generated in the target directory of the sample project.

    The preceding JAR file names are for reference only. The actual names may vary.

  2. Upload the hcatalog-example-*.jar file generated in the target directory in the previous step to a specified directory on the Linux node, for example, /opt/hive_client (referred to as $HCAT_CLIENT below), and ensure that the Hive and YARN clients have been installed. Run the following command so that the HCAT_CLIENT environment variable takes effect:

    export HCAT_CLIENT=/opt/hive_client 

  3. Run the following command to configure environment parameters (client installation path /opt/client is used as an example):

    export HADOOP_HOME=/opt/client/HDFS/hadoop 
    export HIVE_HOME=/opt/client/Hive/Beeline 
    export HCAT_HOME=$HIVE_HOME/../HCatalog 
    export LIB_JARS=$HCAT_HOME/lib/hive-hcatalog-core-3.1.0.jar,$HCAT_HOME/lib/hive-metastore-3.1.0.jar,$HCAT_HOME/lib/hive-standalone-metastore-3.1.0.jar,$HIVE_HOME/lib/hive-exec-3.1.0.jar,$HCAT_HOME/lib/libfb303-0.9.3.jar,$HCAT_HOME/lib/slf4j-api-1.7.30.jar,$HCAT_HOME/lib/jdo-api-3.0.1.jar,$HCAT_HOME/lib/antlr-runtime-3.5.2.jar,$HCAT_HOME/lib/datanucleus-api-jdo-4.2.4.jar,$HCAT_HOME/lib/datanucleus-core-4.1.17.jar,$HCAT_HOME/lib/datanucleus-rdbms-fi-4.1.19.jar,$HCAT_HOME/lib/log4j-api-2.10.0.jar,$HCAT_HOME/lib/log4j-core-2.10.0.jar
    export HADOOP_CLASSPATH=$HCAT_HOME/lib/hive-hcatalog-core-3.1.0.jar:$HCAT_HOME/lib/hive-metastore-3.1.0.jar:$HCAT_HOME/lib/hive-standalone-metastore-3.1.0.jar:$HIVE_HOME/lib/hive-exec-3.1.0.jar:$HCAT_HOME/lib/libfb303-0.9.3.jar:$HADOOP_HOME/etc/hadoop:$HCAT_HOME/conf:$HCAT_HOME/lib/slf4j-api-1.7.30.jar:$HCAT_HOME/lib/jdo-api-3.0.1.jar:$HCAT_HOME/lib/antlr-runtime-3.5.2.jar:$HCAT_HOME/lib/datanucleus-api-jdo-4.2.4.jar:$HCAT_HOME/lib/datanucleus-core-4.1.17.jar:$HCAT_HOME/lib/datanucleus-rdbms-fi-4.1.19.jar:$HCAT_HOME/lib/log4j-api-2.10.0.jar:$HCAT_HOME/lib/log4j-core-2.10.0.jar
    • Change the version numbers of the JAR files specified in LIB_JARS and HADOOP_CLASSPATH based on the actual environment. For example, if the version number of the hive-hcatalog-core JAR file in $HCAT_HOME/lib is 3.1.0-hw-ei-302001, change $HCAT_HOME/lib/hive-hcatalog-core-3.1.0.jar in LIB_JARS to $HCAT_HOME/lib/hive-hcatalog-core-3.1.0-hw-ei-302001.jar.
    • If the multi-instance function is enabled for Hive, set HIVE_HOME to the corresponding instance. For example, to use Hive 1, ensure that the Hive 1 client has been installed, and change export HIVE_HOME to /opt/client/Hive1/Beeline.
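Since the JAR versions in LIB_JARS must match the actual environment (see the note above), a quick existence check before submission can catch version mismatches early. This is a minimal sketch with placeholder paths; in practice, reuse the LIB_JARS value exported in step 3:

```shell
# Sketch: verify that every JAR listed in LIB_JARS exists, so that a version
# mismatch is caught before the job is submitted.
# The value below is a placeholder; reuse the LIB_JARS exported in step 3.
LIB_JARS="/nonexistent/hive-hcatalog-core-3.1.0.jar,/nonexistent/hive-metastore-3.1.0.jar"
missing=0
for jar in $(echo "$LIB_JARS" | tr ',' ' '); do
  if [ ! -f "$jar" ]; then
    echo "Missing: $jar"
    missing=$((missing + 1))
  fi
done
echo "missing=$missing"
```

If any "Missing:" lines are printed, correct the version numbers in LIB_JARS (and HADOOP_CLASSPATH) before running the yarn command in step 5.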

  4. Prepare for the running:

    1. Use the Hive client to create source table t1 in beeline: create table t1(col1 int);

      Insert the following data into t1 (for example: insert into t1 values (1),(1),(1),(2),(2),(3);):

       
          +----------+--+ 
          | t1.col1  | 
          +----------+--+ 
          | 1        | 
          | 1        | 
          | 1        | 
          | 2        | 
          | 2        | 
          | 3        | 
          +----------+--+ 
    2. Create destination table t2: create table t2(col1 int,col2 int);

  5. Use the YARN client to submit the task:

    yarn --config $HADOOP_HOME/etc/hadoop jar $HCAT_CLIENT/hcatalog-example-1.0-SNAPSHOT.jar com.huawei.bigdata.HCatalogExample -libjars $LIB_JARS t1 t2

  6. View the running result. The data in t2 is as follows:

    0: jdbc:hive2://192.168.1.18:2181,192.168.1.> select * from t2; 
     +----------+----------+--+ 
     | t2.col1  | t2.col2  | 
     +----------+----------+--+ 
     | 1        | 3        | 
     | 2        | 2        | 
     | 3        | 1        | 
     +----------+----------+--+
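As the output shows, the sample job maps t1 to t2 by counting occurrences: each distinct value of t1.col1 becomes a row in t2, with t2.col2 holding the number of times it appeared. The same aggregation over the sample data can be sketched in the shell:

```shell
# Count occurrences of each value in t1.col1 (the sample data above) and
# print (value, count) pairs, mirroring the rows written to t2.
printf '1\n1\n1\n2\n2\n3\n' | sort | uniq -c | awk '{print $2, $1}'
# prints:
# 1 3
# 2 2
# 3 1
```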