Hive Development Plan
Scenario Description
Assume that you need to develop a Hive data analysis application to manage the employee information of an enterprise. Table 1 provides the employee information and Table 2 provides the employee contact information.
Development Guidelines
- Prepare data.
- Create three tables: employee information table employees_info, employee contact table employees_contact, and extended employee information table employees_info_extended.
- Employee information table employees_info contains fields such as employee ID, name, salary currency, salary, tax category, work place, and hire date. In salary currency, R indicates RMB and D indicates USD.
- Fields in the employees_contact table include the employee ID, phone number, and email address.
- Fields in the employees_info_extended table include the employee ID, name, mobile phone number, email address, salary currency, salary, tax category, and work place. The partition field is the hire date.
For details about the table creation code, see Creating a Hive Table.
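The three tables described above could be declared roughly as follows. This is a minimal HiveQL sketch, not the code from Creating a Hive Table: the column names (id, name, usd_flag, salary, deductions, address, entrytime, tel_phone, email), the comma field delimiter, and the TEXTFILE storage format are all assumptions made for illustration. The tax category is modeled as a map because the sample values have the form "personal income tax&0.05", with "&" separating key and value.

```sql
-- Employee information table. Column names and delimiters are assumed.
CREATE TABLE IF NOT EXISTS employees_info (
  id INT,                          -- employee ID
  name STRING,                     -- employee name
  usd_flag STRING,                 -- salary currency: 'R' = RMB, 'D' = USD
  salary DOUBLE,                   -- salary amount
  deductions MAP<STRING, DOUBLE>,  -- tax category, e.g. personal income tax -> 0.05
  address STRING,                  -- work place, e.g. China:Shenzhen
  entrytime STRING                 -- hire date, e.g. 2014
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  MAP KEYS TERMINATED BY '&'
STORED AS TEXTFILE;

-- Employee contact table.
CREATE TABLE IF NOT EXISTS employees_contact (
  id INT,            -- employee ID
  tel_phone STRING,  -- phone number
  email STRING       -- email address
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Extended employee information table, partitioned by hire date.
-- The partition column is declared in PARTITIONED BY, not in the column list.
CREATE TABLE IF NOT EXISTS employees_info_extended (
  id INT,
  name STRING,
  tel_phone STRING,
  email STRING,
  usd_flag STRING,
  salary DOUBLE,
  deductions MAP<STRING, DOUBLE>,
  address STRING
)
PARTITIONED BY (entrytime STRING)
STORED AS TEXTFILE;
```

Storing the hire date as a partition column means that queries restricted to one hire year only need to read the files of that partition.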
- Load employee information to employees_info.
For details about the data loading code, see Loading Hive Data.
Table 1 provides employee information.
Table 1 Employee information

| Employee ID | Name | Salary Currency | Salary | Tax Category | Work Place | Hire Date |
|---|---|---|---|---|---|---|
| 1 | Wang | R | 8000.01 | personal income tax&0.05 | China:Shenzhen | 2014 |
| 3 | Tom | D | 12000.02 | personal income tax&0.09 | America:NewYork | 2014 |
| 4 | Jack | D | 24000.03 | personal income tax&0.09 | America:Manhattan | 2014 |
| 6 | Linda | D | 36000.04 | personal income tax&0.09 | America:NewYork | 2014 |
| 8 | Zhang | R | 9000.05 | personal income tax&0.05 | China:Shanghai | 2014 |
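As an illustration of the loading step, the rows of Table 1 could be stored in a comma-delimited text file and loaded with LOAD DATA. This is only a sketch: the file path is hypothetical, the comma delimiter is an assumption about how the table was declared, and the column name deductions used in the check query is likewise assumed. See Loading Hive Data for the actual code.

```sql
-- Assumed layout of one line in the data file (fields separated by commas):
--   1,Wang,R,8000.01,personal income tax&0.05,China:Shenzhen,2014

-- Load the file from the local file system of the Hive client.
-- The path below is a placeholder, not a path given in this document.
LOAD DATA LOCAL INPATH '/opt/hive_examples_data/employee_info.txt'
OVERWRITE INTO TABLE employees_info;

-- Quick check that the rows were parsed as expected
-- (assumes the tax category column is a map named deductions).
SELECT id, name, deductions['personal income tax']
FROM employees_info
LIMIT 5;
```

OVERWRITE replaces any data already in the table; omit it to append instead.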
- Load employee contact information to employees_contact.
Table 2 provides employee contact information.
- Analyze data.
For details about the data analysis code, see Querying Hive Data.
- Query contact information of employees whose salaries are paid in USD.
- Query the IDs and names of employees who were hired in 2014, and load the query results to the partition with the hire date of 2014 in the employees_info_extended table.
- Count the number of records in the employees_info table.
- Query information about employees whose email addresses end with "cn".
- Submit a data analysis task to collect the number of records in the employees_info table. For details, see Analyzing Hive Data.
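The analysis steps listed above might be expressed in HiveQL along the following lines. This is a sketch under stated assumptions rather than the code from Querying Hive Data: the column names (id, name, usd_flag, salary, deductions, address, entrytime, tel_phone, email) and the column order of employees_info_extended are hypothetical and must match however the tables were actually created.

```sql
-- 1. Contact information of employees whose salaries are paid in USD ('D').
SELECT a.name, b.tel_phone, b.email
FROM employees_info a
JOIN employees_contact b ON a.id = b.id
WHERE a.usd_flag = 'D';

-- 2. Employees hired in 2014, written to the 2014 partition of
--    employees_info_extended. The SELECT list must supply every
--    non-partition column of the target table, in its declared order.
INSERT OVERWRITE TABLE employees_info_extended PARTITION (entrytime = '2014')
SELECT a.id, a.name, b.tel_phone, b.email,
       a.usd_flag, a.salary, a.deductions, a.address
FROM employees_info a
JOIN employees_contact b ON a.id = b.id
WHERE a.entrytime = '2014';

-- 3. Number of records in employees_info.
SELECT COUNT(*) FROM employees_info;

-- 4. Employees whose email addresses end with "cn".
SELECT a.name, b.tel_phone, b.email
FROM employees_info a
JOIN employees_contact b ON a.id = b.id
WHERE b.email LIKE '%cn';
```

The inner joins return only employees that have a row in both tables; a LEFT OUTER JOIN would be needed to keep employees without contact information.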