Updated on 2024-12-13 GMT+08:00

Using Kudu from Scratch

Kudu is a columnar storage manager developed for the Apache Hadoop platform. Kudu shares the common technical properties of Hadoop ecosystem applications. It is horizontally scalable and supports highly available operations.

Prerequisites

The cluster client has been installed, for example, in the /opt/hadoopclient directory. This directory is used only as an example in the following operations. Replace it with the actual installation directory.

Procedure

  1. Log in to the node where the client is installed as the client installation user.

    Run the su - omm command to switch to user omm.

  2. Run the following command to go to the client installation directory:

    cd /opt/hadoopclient

  3. Run the following command to configure environment variables:

    source bigdata_env

  4. Run the Kudu command line tool.

    Run the command line tool of the Kudu component to view help information.

    kudu -h

    The command output is as follows:

    Usage: kudu <command> [<args>]
     
    <command> can be one of the following:
             cluster   Operate on a Kudu cluster
            diagnose   Diagnostic tools for Kudu servers and clusters
                  fs   Operate on a local Kudu filesystem
                 hms   Operate on remote Hive Metastores
       local_replica   Operate on local tablet replicas via the local filesystem
              master   Operate on a Kudu Master
                 pbc   Operate on PBC (protobuf container) files
                perf   Measure the performance of a Kudu cluster
      remote_replica   Operate on remote tablet replicas on a Kudu Tablet Server
               table   Operate on Kudu tables
              tablet   Operate on remote Kudu tablets
                test   Various test actions
             tserver   Operate on a Kudu Tablet Server
                 wal   Operate on WAL (write-ahead log) files

    The Kudu command line tool does not support DDL and DML operations. Instead, it provides fine-grained query functions through the cluster, master, tserver, fs, and table commands.

    Common operations:

    • Check the tables in the current cluster.

      kudu table list KuduMaster instance IP1:7051,KuduMaster instance IP2:7051,KuduMaster instance IP3:7051

    • Query the configurations of the KuduMaster instance of the Kudu service.

      kudu master get_flags KuduMaster instance IP:7051

    • Query the schema of a table.

      kudu table describe KuduMaster instance IP1:7051,KuduMaster instance IP2:7051,KuduMaster instance IP3:7051 Table name

    • Delete a table.

      kudu table delete KuduMaster instance IP1:7051,KuduMaster instance IP2:7051,KuduMaster instance IP3:7051 Table name

    To obtain the IP addresses of the KuduMaster instances, choose Components > Kudu > Instances on the cluster details page.
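
The common operations above can be combined in a small script. The following is a minimal sketch, not an official tool. It assumes that the environment variables have already been loaded with source bigdata_env, that the KuduMaster instances listen on port 7051 as in the examples above, and that the master addresses must be passed as one comma-separated argument without spaces. The function name build_master_addrs and the IP addresses 192.0.2.11 to 192.0.2.13 are hypothetical placeholders. Replace the IP addresses with those of your KuduMaster instances.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the Kudu client): joins KuduMaster IPs
# into a single comma-separated host:port list with no spaces.
build_master_addrs() {
  local port=7051   # KuduMaster port used in the examples above
  local out="" ip
  for ip in "$@"; do
    # Prepend a comma only when the list already has an entry.
    out="${out:+${out},}${ip}:${port}"
  done
  printf '%s\n' "$out"
}

# Placeholder IPs; replace them with the actual KuduMaster instance IPs.
MASTERS="$(build_master_addrs 192.0.2.11 192.0.2.12 192.0.2.13)"
echo "Master addresses: $MASTERS"

# Query the cluster only when the kudu client is on PATH.
if command -v kudu >/dev/null 2>&1; then
  # List all tables, then print the schema of each one.
  kudu table list "$MASTERS" | while IFS= read -r table; do
    if [ -n "$table" ]; then
      echo "=== $table ==="
      kudu table describe "$MASTERS" "$table"
    fi
  done
else
  echo "kudu not found; run 'source bigdata_env' in the client directory first" >&2
fi
```

Building the address list in one place keeps the same master list in every command and avoids accidental spaces, which the shell would otherwise split into separate arguments.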