Updated on 2022-08-12 GMT+08:00

Supporting Traditional Relational Database Syntax in Hive

Overview

Hive supports the following types of traditional relational database syntax:

  • Grouping
  • EXCEPT and INTERSECT

Grouping

Syntax description:

  • Grouping takes effect only when the Group by statement contains ROLLUP or CUBE.
  • The result set generated by CUBE contains all the combinations of values in the selected columns.
  • The result set generated by ROLLUP contains the combinations of a certain layer structure in the selected columns.
  • Grouping: If a row is added by using the CUBE or ROLLUP operator, the output value of the added row is 1. If the row is not added by using the CUBE or ROLLUP operator, the output value of the added row is 0.

For example, the table_test table exists in Hive and the table structure is as follows:

+----------------+-------------------+--+
| table_test.id  | table_test.value  |
+----------------+-------------------+--+
| 1              | 10                |
| 1              | 15                |
| 2              | 20                |
| 2              | 5                 |
| 2              | 13                |
+----------------+-------------------+--+

Run the following statement:

select id,grouping(id),sum(value) from table_test group by id with rollup;

The result is as follows:

+-------+-----------------+------+--+
|  id   | groupingresult  | sum  |
+-------+-----------------+------+--+
| 1     | 0               | 25   |
| NULL  | 1               | 63   |
| 2     | 0               | 38   |
+-------+-----------------+------+--+

EXCEPT and INTERSECT

Syntax description:

  • EXCEPT returns the difference of two result sets (that is, non-duplicated values return only one query).
  • INTERSECT returns the intersection of two result sets (that is, non-duplicated values return by both queries).

For example, two tables test_table1 and test_table2 exist in Hive.

The table structure of test_table1 is as follows:

+-----------------+--+
| test_table1.id  |
+-----------------+--+
| 1               |
| 2               |
| 3               |
| 4               |
+-----------------+--+

The table structure of test_table2 is as follows:

+-----------------+--+
| test_table2.id  |
+-----------------+--+
| 2               |
| 3               |
| 4               |
| 5               |
+-----------------+--+
  • Run the following EXCEPT statement:

    select id from test_table1 except select id from test_table2;

    The result is as follows:

    +--------------+--+
    | _alias_0.id  |
    +--------------+--+
    | 1            |
    +--------------+--+
  • Run the following INTERSECT statement:

    select id from test_table1 intersect select id from test_table2;

    The result is as follows:

    +--------------+--+
    | _alias_0.id  |
    +--------------+--+
    | 2            |
    | 3            |
    | 4            |
    +--------------+--+