Typical Scenario
Description
Assume that you need to develop a Hive data analysis application to manage the employee information described in Table 1 and Table 2.
Development Guidelines
- Prepare data.
- Create three tables: the employee information table employees_info, the employee contact table employees_contact, and the extended employee information table employees_info_extended.
- The employees_info table contains the following fields: employee ID, name, salary currency, salary, tax category, work place, and hire date. In the salary currency field, R indicates RMB and D indicates USD.
- The employees_contact table contains the following fields: employee ID, phone number, and email address.
- The employees_info_extended table contains the following fields: employee ID, name, mobile phone number, email address, salary currency, salary, tax category, and work place. The partition field is the hire date.
For the table creation code, see Creating a Table.
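As a rough illustration of the three tables described above, the following HiveQL is a minimal sketch rather than the code from Creating a Table. The column names (id, name, usd_flag, salary, deductions, address, entrytime, tel_phone, email), the column types, the comma field delimiter, and the TEXTFILE storage format are all assumptions. The only hints taken from this page are the field lists, the sample values (for example, `personal income tax&0.05` suggests a map whose key and value are separated by `&`), and the hire date partition.

```sql
-- Sketch only: column names, types, and delimiters are assumptions.
CREATE TABLE IF NOT EXISTS employees_info (
  id         INT,
  name       STRING,
  usd_flag   STRING,               -- salary currency: 'R' = RMB, 'D' = USD
  salary     DOUBLE,
  deductions MAP<STRING, DOUBLE>,  -- tax category, e.g. personal income tax&0.05
  address    STRING,               -- work place, e.g. Country1:City1
  entrytime  STRING                -- hire date, e.g. 2014
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  MAP KEYS TERMINATED BY '&'
STORED AS TEXTFILE;

CREATE TABLE IF NOT EXISTS employees_contact (
  id        INT,
  tel_phone STRING,
  email     STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- The hire date is the partition column, so it is declared in
-- PARTITIONED BY instead of in the regular column list.
CREATE TABLE IF NOT EXISTS employees_info_extended (
  id         INT,
  name       STRING,
  tel_phone  STRING,
  email      STRING,
  usd_flag   STRING,
  salary     DOUBLE,
  deductions MAP<STRING, DOUBLE>,
  address    STRING
)
PARTITIONED BY (entrytime STRING)
STORED AS TEXTFILE;
```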
- Load employee information to employees_info.
For the data loading code, see Loading Data.
Table 1 describes employee information.
Table 1 Employee information

| ID | Name | Salary Currency | Salary | Tax Category | Work Place | Hiring Date |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Wang | R | 8000.01 | personal income tax&0.05 | Country1:City1 | 2014 |
| 3 | Tom | D | 12000.02 | personal income tax&0.09 | Country2:City2 | 2014 |
| 4 | Jack | D | 24000.03 | personal income tax&0.09 | Country3:City3 | 2014 |
| 6 | Linda | D | 36000.04 | personal income tax&0.09 | Country4:City4 | 2014 |
| 8 | Zhang | R | 9000.05 | personal income tax&0.05 | Country5:City5 | 2014 |
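The employee information in Table 1 could be loaded roughly as follows. This is a sketch under stated assumptions, not the code from Loading Data: it assumes that employees_info is a text-format table with comma-separated fields, and the file path is hypothetical.

```sql
-- Assumed layout of the source text file, one employee per line,
-- with fields separated by commas (values taken from Table 1):
--   1,Wang,R,8000.01,personal income tax&0.05,Country1:City1,2014
--   3,Tom,D,12000.02,personal income tax&0.09,Country2:City2,2014

-- Load a file from the local file system of the client.
-- The path below is a hypothetical placeholder.
-- OVERWRITE replaces any data that is already in the table.
LOAD DATA LOCAL INPATH '/tmp/hive_example_data/employee_info.txt'
OVERWRITE INTO TABLE employees_info;
```

If the file is already in HDFS, omitting the LOCAL keyword makes Hive read the path from HDFS and move the file into the table directory.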
- Load employee contact information to employees_contact.
Table 2 describes employee contact information.
- Load extended employee information to employees_info_extended.
Table 3 describes the extended employee information.
Table 3 Extended employee information

| ID | Name | Mobile Phone Number | E-mail Address | Salary Currency | Salary | Tax Category | Work Place | Hiring Date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Wang | 135 XXXX XXXX | xxxx@xx.com | R | 8000.01 | personal income tax&0.05 | Country1:City1 | 2014 |
| 3 | Tom | 159 XXXX XXXX | xxxxx@xx.com.cn | D | 12000.02 | personal income tax&0.09 | Country2:City2 | 2014 |
| 4 | Jack | 186 XXXX XXXX | xxxx@xx.org | D | 24000.03 | personal income tax&0.09 | Country3:City3 | 2014 |
| 6 | Linda | 189 XXXX XXXX | xxxx@xxx.cn | D | 36000.04 | personal income tax&0.09 | Country4:City4 | 2014 |
| 8 | Zhang | 134 XXXX XXXX | xxxx@xxxx.cn | R | 9000.05 | personal income tax&0.05 | Country5:City5 | 2014 |
- Analyze data.
For the data analysis code, see Querying Data.
- Query contact information of employees whose salaries are paid in USD.
- Query the IDs and names of employees who were hired in 2014, and load the query results to the partition with the hiring date of 2014 in employees_info_extended.
- Count the records in the employees_info table.
- Query information about employees whose email addresses end with "cn".
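The four analysis tasks above might be expressed in HiveQL along the following lines. This is a hedged sketch, not the code from Querying Data. It assumes hypothetical column names: id, name, usd_flag, salary, deductions, address, and entrytime in employees_info; id, tel_phone, and email in employees_contact; and an employees_info_extended table partitioned by entrytime.

```sql
-- 1. Contact information of employees whose salaries are paid in USD
--    ('D' in the salary currency field).
SELECT a.name, b.tel_phone, b.email
FROM employees_info a
JOIN employees_contact b ON a.id = b.id
WHERE a.usd_flag = 'D';

-- 2. Employees hired in 2014, written to the 2014 partition of
--    employees_info_extended. The SELECT list has to supply every
--    non-partition column of the target table, in order.
INSERT OVERWRITE TABLE employees_info_extended PARTITION (entrytime = '2014')
SELECT a.id, a.name, b.tel_phone, b.email,
       a.usd_flag, a.salary, a.deductions, a.address
FROM employees_info a
JOIN employees_contact b ON a.id = b.id
WHERE a.entrytime = '2014';

-- 3. Number of records in employees_info.
SELECT COUNT(*) FROM employees_info;

-- 4. Employees whose email addresses end with "cn".
SELECT a.name, b.tel_phone, b.email
FROM employees_info a
JOIN employees_contact b ON a.id = b.id
WHERE b.email LIKE '%cn';
```

The static partition value in the second statement keeps the target partition explicit; a dynamic partition insert is an alternative when several hire dates need to be loaded at once.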
- Submit a data analysis task to collect the number of records in the employees_info table. For details about the implementation, see Using the JDBC interface to submit a data analysis task.