Accessing Hive Using Python

Function

Use Python to connect to Hive to execute data analysis tasks.

Example Code

The following example submits a data analysis task using the Python sample program python-examples/pyCLI_sec.py.

Import the HAConnection class.

 from pyhs2.haconnection import HAConnection

Declare the HiveServer IP address list. In this example, hosts lists the HiveServer nodes, and each xxx.xxx.xxx.xxx is a placeholder for a node's service IP address.
```
hosts = ["xxx.xxx.xxx.xxx", "xxx.xxx.xxx.xxx"] 
```
If a HiveServer instance is migrated, the IP addresses in the sample program become invalid. In this case, update the hosts list with the new HiveServer IP addresses.
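
Before editing the list, it can help to check which addresses still accept connections. The following is a minimal sketch and is not part of the sample program: the helper name `reachable_hosts` and the default timeout are assumptions, and only the standard `socket` module is used. A successful TCP connection shows that something is listening on the port, not that HiveServer itself is healthy.

```python
import socket


def reachable_hosts(hosts, port, timeout=3.0):
    """Return the hosts that accept a TCP connection on the given port.

    Hypothetical helper for illustration only; it does not talk to Hive.
    """
    alive = []
    for host in hosts:
        try:
            # Raises socket.error (OSError on Python 3) on refusal,
            # timeout, or an address that cannot be resolved.
            sock = socket.create_connection((host, port), timeout)
        except (socket.error, socket.timeout):
            continue
        sock.close()
        alive.append(host)
    return alive
```

For example, `reachable_hosts(hosts, 21066)` returns the entries of `hosts` that answered on the port used in this sample; any address missing from the result is a candidate for replacement.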

Create the connection and execute HiveQL statements. The sample code only lists all tables; modify the HiveQL statement as needed. The queried column names and results are printed to the console.

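   # conf (the connection configuration) is not defined in this excerpt;
   # it must be defined before this point.
   try: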
       with HAConnection(hosts = hosts, 
                          port = 21066, 
                          authMechanism = "KERBEROS", 
                          configuration = conf) as haConn: 
           with haConn.getConnection() as conn: 
               with conn.cursor() as cur: 
                   # show databases 
                   print cur.getDatabases()
                    
                   # execute query 
                   cur.execute("show tables") 
                    
                   # return column info from query 
                   print cur.getSchema()
                    
                   # fetch table results 
                   for i in cur.fetch(): 
                       print i 
                        
   except Exception, e:
       print e
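
The raw output printed by the sample can be hard to read for wider result sets. The sketch below shows one way to lay out column names and rows as aligned text. It is not part of the sample program: `format_results` is a hypothetical helper, and it assumes that the schema is a list of dictionaries with a `columnName` key and that each fetched row is a list with one value per column. Verify both assumptions against the output of your pyhs2 version. The sample data is made up so that the block runs without a cluster.

```python
from __future__ import print_function


def format_results(schema, rows):
    """Render column names and rows as left-aligned, space-padded lines."""
    headers = [str(col["columnName"]) for col in schema]
    text_rows = [[str(value) for value in row] for row in rows]
    # Each column is as wide as its longest cell, header included.
    widths = [len(h) for h in headers]
    for row in text_rows:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], len(cell))
    lines = ["  ".join(h.ljust(w) for h, w in zip(headers, widths)).rstrip()]
    for row in text_rows:
        lines.append("  ".join(c.ljust(w) for c, w in zip(row, widths)).rstrip())
    return "\n".join(lines)


# Made-up stand-ins for the schema and the fetched rows.
sample_schema = [{"columnName": "tab_name", "type": "STRING_TYPE", "comment": None}]
sample_rows = [["employees_info"], ["t1"]]
print(format_results(sample_schema, sample_rows))
```

In the sample program, the schema and the rows returned by the cursor would take the place of `sample_schema` and `sample_rows`. The helper returns a string rather than printing, so the same text can also be written to a log file.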
