Querying Data
Function
This topic describes how to use Hive Query Language (HQL) to query and analyze data. You can query and analyze data using the following methods:
- Use common features for SELECT query, such as JOIN.
- Load data to a specified partition.
- Use Hive-provided functions.
- Query and analyze data using user-defined functions (UDFs). For details about how to create and define UDFs, see Creating User-Defined Hive Functions.
To perform the following operations on a cluster enabled with the security service, you must have related permissions for tables. For details about permission requirements, see Overview.
Example Codes
-- Query the contact information of employees whose salaries are paid in USD. SELECT a.name, b.tel_phone, b.email FROM employees_info a JOIN employees_contact b ON(a.id = b.id) WHERE usd_flag='D'; -- Query the IDs and names of employees who were hired in 2014, and load query results to the partition with the hiring time of 2014 in the employees_info_extended table. INSERT OVERWRITE TABLE employees_info_extended PARTITION (entrytime = '2014') SELECT a.id, a.name, a.usd_flag, a.salary, a.deductions, a.address, b.tel_phone, b.email FROM employees_info a JOIN employees_contact b ON (a.id = b.id) WHERE a.entrytime = '2014'; -- Use the existing Hive function COUNT() to count the number of records in the employees_info table. SELECT COUNT(*) FROM employees_info; -- Query information about employees whose email addresses end with "cn". SELECT a.name, b.tel_phone FROM employees_info a JOIN employees_contact b ON (a.id = b.id) WHERE b.email like '%cn';
Extensions
- Configure intermediate Hive data encryption.
Set the table format to RCFile (recommended) or SequenceFile, and the encryption algorithm to ARC4Codec. SequenceFile is a unique Hadoop file format, and RCFile is a Hive file format with optimized column storage. When a big table is queried, RCFile provides higher performance than SequenceFile.
set hive.exec.compress.output=true; set hive.exec.compress.intermediate=true; set hive.intermediate.compression.codec=org.apache.hadoop.io.encryption.arc4.ARC4Codec;
- For details about UDFs, see Creating User-Defined Hive Functions.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot