GROUP BY
GROUP BY
GROUP BY groups the output rows of a SELECT statement into groups that contain matching values. A simple GROUP BY can contain any expression consisting of input columns, or select the sequence number of the output column by position.
The following queries are equivalent:
SELECT count(*), nationkey FROM customer GROUP BY 2; SELECT count(*), nationkey FROM customer GROUP BY nationkey;
GROUP BY can group the output by the input column names that do not appear in the output of the SELECT statement.
Example:
SELECT count(*) FROM customer GROUP BY mktsegment; GROUPING SETS
You can specify multiple columns for grouping. The result column that does not belong to the grouping column is set to NULL. Queries with complex grouping syntax (GROUPING SETS, CUBE, or ROLLUP) read the underlying data source only once, while queries using UNION ALL read the underlying data three times. This is why queries that use UNION ALL can produce inconsistent results when the data source is not deterministic.
-- Create a shipping table: create table shipping(origin_state varchar(25),origin_zip integer,destination_state varchar(25) ,destination_zip integer,package_weight integer); --Insert data. insert into shipping values ('California',94131,'New Jersey',8648,13), ('California',94131,'New Jersey',8540,42), ('California',90210,'Connecticut',6927,1337), ('California',94131,'Colorado',80302,5), ('New York',10002,'New Jersey',8540,3), ('New Jersey',7081,'Connecticut',6708,225); -- Query the grouping sets: SELECT origin_state, origin_zip, destination_state, sum( package_weight ) FROM shipping GROUP BY GROUPING SETS ( ( origin_state ), ( origin_state, origin_zip ), ( destination_state )); --Logically, this query is equivalent to the union all of multiple group queries. SELECT origin_state, NULL,NULL,sum( package_weight ) FROM shipping GROUP BY origin_state UNION ALL SELECT origin_state,origin_zip,NULL,sum( package_weight ) FROM shipping GROUP BY origin_state,origin_zip UNION ALL SELECT NULL,NULL,destination_state,sum( package_weight ) FROM shipping GROUP BY destination_state; --Result origin_state | origin_zip | destination_state | _col3 --------------|------------|-------------------|------- New Jersey | NULL | NULL | 225 California | 94131 | NULL | 60 California | NULL | NULL | 1397 New York | 10002 | NULL | 3 NULL | NULL | New Jersey | 58 NULL | NULL | Connecticut | 1562 California | 90210 | NULL | 1337 New York | NULL | NULL | 3 NULL | NULL | Colorado | 5 New Jersey | 7081 | NULL | 225 (10 rows)
CUBE
Generate all possible groups for given columns. For example, the possible groups of (origin_state, destination_state) are (origin_state, destination_state), (origin_state), (destination_state), and ().
SELECT origin_state, destination_state, sum( package_weight ) FROM shipping GROUP BY CUBE ( origin_state, destination_state ); -- Equivalent to: SELECT origin_state, destination_state, sum( package_weight ) FROM shipping GROUP BY GROUPING SETS ( ( origin_state, destination_state ), ( origin_state ), ( destination_state ), ());
ROLLUP
Generates partial possible subtotals for a given set of columns.
SELECT origin_state, origin_zip, sum( package_weight ) FROM shipping GROUP BY ROLLUP ( origin_state, origin_zip ); -- Equivalent to: SELECT origin_state, origin_zip, sum( package_weight ) FROM shipping GROUP BY GROUPING SETS ((origin_state,origin_zip ),( origin_state ),());
Currently, GROUP BY does not support column aliases. For example:
select count(userid) as num ,dept as aaa from salary group by aaa having sum(sal)>2000;
The following error is reported:
Query 20210630_084610_00018_wc8n9@default@HetuEngine failed: line 1:63: Column 'aaa' cannot be resolved
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.