Updated on 2024-09-30 GMT+08:00

SELECT

Function

SELECT retrieves data from a table or view.

Serving as an overlaid filter for a database table, SELECT using SQL keywords retrieves required data from data tables.

Precautions

  • Using SELECT can join HDFS and ordinary tables, but cannot join ordinary and GDS foreign tables. That is, a SELECT statement cannot contain both ordinary and GDS foreign tables.
  • The user must have the SELECT permission on every column used in the SELECT command.
  • UPDATE permission is required when using FOR UPDATE or FOR SHARE.

Syntax

  • Querying data
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
[ WITH [ RECURSIVE ] with_query [, ...] ]
SELECT [/*+ plan_hint */] [ ALL | DISTINCT [ ON ( expression [, ...] ) ] ]
{ * | {expression [ [ AS ] output_name ]} [, ...] }
[ FROM from_item [, ...] ]
[ WHERE condition ]
[ GROUP BY grouping_element [, ...] ]
[ HAVING condition [, ...] ]
[ WINDOW {window_name AS ( window_definition )} [, ...] ]
[ { UNION | INTERSECT | EXCEPT | MINUS } [ ALL | DISTINCT ] select ]
[ ORDER BY {expression [ [ ASC | DESC | USING operator ] | nlssort_expression_clause ] [ NULLS { FIRST | LAST } ]} [, ...] ]
[ { [ LIMIT { count | ALL } ] [ OFFSET start [ ROW | ROWS ] ] } | { LIMIT start, { count | ALL } } ]
[ FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY ]
[ {FOR { UPDATE | SHARE } [ OF table_name [, ...] ] [ NOWAIT ]} [...] ];

In condition and expression, you can use the aliases of expressions in targetlist in compliance with the following rules:

  • Reference only in the same level.
  • Only reference aliases in targetlist.
  • Reference a prior expression in a subsequent expression.
  • The volatile function cannot be used.
  • The Window function cannot be used.
  • Do not reference an alias in the join on condition.
  • An error is reported if targetlist contains multiple referenced aliases.
  • The subquery with_query is as follows:
    1
    2
    with_query_name [ ( column_name [, ...] ) ]
        AS [ [ NOT ] MATERIALIZED ] ( {select | values | insert | update | delete} )
    
  • The specified query source from_item is as follows:
    1
    2
    3
    4
    5
    6
    {[ ONLY ] table_name [ * ] [ partition_clause ] [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
    |( select ) [ AS ] alias [ ( column_alias [, ...] ) ]
    |with_query_name [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
    |function_name ( [ argument [, ...] ] ) [ AS ] alias [ ( column_alias [, ...] | column_definition [, ...] ) ]
    |function_name ( [ argument [, ...] ] ) AS ( column_definition [, ...] )
    |from_item [ NATURAL ] join_type from_item [ ON join_condition | USING ( join_column [, ...] ) ]}
    
  • The group clause is as follows:
    1
    2
    3
    4
    5
    6
    ( )
    | expression
    | ( expression [, ...] )
    | ROLLUP ( { expression | ( expression [, ...] ) } [, ...] )
    | CUBE ( { expression | ( expression [, ...] ) } [, ...] )
    | GROUPING SETS ( grouping_element [, ...] )
    
  • The specified partition partition_clause is as follows:
    1
    2
    PARTITION { ( partition_name ) | 
            FOR (  partition_value [, ...] ) }
    

    Partitions can be specified only for ordinary tables.

  • The sorting order nlssort_expression_clause is as follows:
    1
    NLSSORT ( column_name, ' NLS_SORT = { SCHINESE_PINYIN_M | generic_m_ci } ' )
    
  • Simplified query syntax, equivalent to select * from table_name.
    1
    TABLE { ONLY {(table_name)| table_name} | table_name [ * ]};
    

Parameter Description

  • WITH [ RECURSIVE ] with_query [, ...]

    The WITH clause allows you to specify one or more subqueries that can be referenced by name in the primary query, equal to temporary table.

    If RECURSIVE is specified, it allows a SELECT subquery to reference itself by name.

    The syntax of with_query is: with_query_name [ ( column_name [, ...] ) ] AS [ [ NOT ] MATERIALIZED ] ( {select | values | insert | update | delete} )

    • with_query_name specifies the name of the result set generated by a subquery. Such names can be used to access the result sets of subqueries in a query.
    • By default, the with_query that is referenced multiple times by the primary query is executed only once. Its result set is materialized so that the primary query can query its result set multiple times. The with_query referenced once by the primary query will not be executed independently. Instead, its subquery takes the place where the primary query can directly reference it and is executed with the primary query. If [NOT] MATERIALIZED is specified, the default action is changed.
      • If MATERIALIZED is specified, the subquery is executed once and its result set is materialized.
      • If NOT MATERIALIZED is specified, its subquery takes the place where the primary query can directly reference it. NOT MATERIALIZED is ignored in the following cases:
        • The subquery contains volatile functions.
        • The subquery is a SELECT or VALUES statement containing FOR UPDATE or FOR SHARE.
        • The subquery is an INSERT, UPDATE, or DELETE statement.
        • RECURSIVE is specified for with_query.
        • If with_query2 is referenced more than once and it references with_query1, which referenced itself in the outer layer, with_query2 cannot take the place where it can be referenced.
          For example, in the following example, tmp2 is referenced twice. Because tmp2 references tmp1 which referenced itself in the outer layer, tmp2 will be materialized even if NOT MATERIALIZED is specified.
          1
          2
          3
          4
          5
          with recursive tmp1(b) as (values(1)
          union all
          (with tmp2 as not materialized (select * from tmp1)
           select tt1.b + tt2.b from tmp2 tt1, tmp2 tt2))
           select * from tmp1;
          
    • column_name specifies a column name displayed in the subquery result set.
    • Each subquery can be a SELECT, VALUES, INSERT, UPDATE or DELETE statement.
  • plan_hint clause

    Follows the SELECT keyword in the /*+<Plan hint> */ format. It is used to optimize the plan of a SELECT statement block. For details, see section "Hint-based Tuning."

  • ALL

    Specifies that all rows meeting the requirements are returned. This is the default behavior, so you can omit this keyword.

  • DISTINCT [ ON ( expression [, ...] ) ]

    Removes all duplicate rows from the SELECT result so one row is kept from each group of duplicates.

    ON ( expression [, ...] ) is only reserved for the first row among all the rows with the same result calculated using given expressions.

    DISTINCT ON expression is explained with the same rule of ORDER BY. Unless you use ORDER BY to guarantee that the required row appears first, you cannot know what the first row is.

  • SELECT list

    Indicates columns to be queried. Some or all columns (using wildcard character *) can be queried.

    You may use the AS output_name clause to give an alias for an output column. The alias is used for the displaying of the output column.

    Column names may be either of:

    • Manually input column names which are spaced using commas (,).
    • Fields computed in the FROM clause.
  • FROM clause

    Indicates one or more source tables for SELECT.

    The FROM clause can contain the following elements:

    • table_name

      Indicates the name (optionally schema-qualified) of an existing table or view, for example, schema_name.table_name.

    • alias

      Gives a temporary alias to a table to facilitate the quotation by other queries.

      An alias is used for brevity or to eliminate ambiguity for self-joins. When an alias is provided, it completely hides the actual name of the table or function.

    • column_alias

      Specifies the column alias.

    • PARTITION

      Queries data in the specified partition in a partition table.

    • partition_name

      Specifies the name of a partition.

    • partition_value

      Specifies the value of the specified partition key. If there are many partition keys, use the PARTITION FOR clause to specify the value of the only partition key you want to use.

    • subquery

      Performs a subquery in the FROM clause. A temporary table is created to save subquery results.

    • with_query_name

      WITH clause can also be the source of FROM clause and can be referenced with the name queried by executing WITH.

    • function_name

      Function name. Function calls can appear in the FROM clause.

    • join_type

      There are five types below:

      • [ INNER ] JOIN

        A JOIN clause combines two FROM items. Use parentheses if necessary to determine the order of nesting. In the absence of parentheses, JOIN nests left-to-right.

        In any case, JOIN binds more tightly than the commas separating FROM items.

      • LEFT [ OUTER ] JOIN

        Returns all rows in the qualified Cartesian product (all combined rows that pass its join condition), and pluses one copy of each row in the left-hand table for which there was no right-hand row that passed the join condition. This left-hand row is extended to the full width of the joined table by inserting NULL values for the right-hand columns. Note that only the JOIN clause's own condition is considered while deciding which rows have matches. Outer conditions are applied afterwards.

      • RIGHT [ OUTER ] JOIN

        Returns all the joined rows, plus one row for each unmatched right-hand row (extended with NULL on the left).

        This is just a notational convenience, since you could convert it to a LEFT OUTER JOIN by switching the left and right inputs.

      • FULL [ OUTER ] JOIN

        Returns all the joined rows, pluses one row for each unmatched left-hand row (extended with NULL on the right), and pluses one row for each unmatched right-hand row (extended with NULL on the left).

      • CROSS JOIN

        CROSS JOIN is equivalent to INNER JOIN ON (TRUE), which means no rows are removed by qualification. These join types are just a notational convenience, since they do nothing you could not do with plain FROM and WHERE.

        For the INNER and OUTER join types, a join condition must be specified, namely exactly one of NATURAL ON, join_condition, or USING (join_column [, ...]). For CROSS JOIN, none of these clauses can appear.

      CROSS JOIN and INNER JOIN produce a simple Cartesian product, the same result as you get from listing the two items at the top level of FROM.

    • ON join_condition

      A join condition to define which rows have matches in joins. Example: ON left_table.a = right_table.a

    • USING(join_column[, ...])

      ON left_table.a = right_table.a AND left_table.b = right_table.b ... abbreviation. Corresponding columns must have the same name.

    • NATURAL

      NATURAL is a shorthand for a USING list that mentions all columns in the two tables that have the same names.

    • from item

      Specifies the name of the query source object connected.

  • WHERE clause

    The WHERE clause forms an expression for row selection to narrow down the query range of SELECT. The condition is any expression that evaluates to a result of Boolean type. Rows that do not satisfy this condition will be eliminated from the output.

    In the WHERE clause, you can use the operator (+) to convert a table join to an outer join. However, this method is not recommended because it is not the standard SQL syntax and may raise syntax compatibility issues during platform migration. There are many restrictions on using the operator (+):

    1. It can appear only in the WHERE clause.
    2. If a table join has been specified in the FROM clause, the operator (+) cannot be used in the WHERE clause.
    3. The operator (+) can work only on columns of tables or views, instead of on expressions.
    4. If table A and table B have multiple join conditions, the operator (+) must be specified in all the conditions. Otherwise, the operator (+) will not take effect, and the table join will be converted into an inner join without any prompt information.
    5. Tables specified in a join condition where the operator (+) works cannot cross queries or subqueries. If tables where the operator (+) works are not in the FROM clause of the current query or subquery, an error will be reported. If a peer table for the operator (+) does not exist, no error will be reported and the table join will be converted into an inner join.
    6. Expressions where the operator (+) is used cannot be directly connected through OR.
    7. If a column where the operator (+) works is compared with a constant, the expression becomes a part of the join condition.
    8. A table cannot have multiple foreign tables.
    9. The operator (+) can appear only in the following expressions: comparison, NOT, ANY, ALL, IN, NULLIF, IS DISTINCT FROM, and IS OF expressions. It is not allowed in other types of expressions. In addition, these expressions cannot be connected through AND or OR.
    10. The operator (+) can be used to convert a table join only to a left or right outer join, instead of a full join. That is, the operator (+) cannot be specified on both tables of an expression.

    For the WHERE clause, if a special character % _ or \ is queried in LIKE, add the slash (\) before each character.

    Examples:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    create table tt01 (id int,content varchar(50));
    NOTICE:  The 'DISTRIBUTE BY' clause is not specified. Using 'id' as the distribution column by default.
    HINT:  Please use 'DISTRIBUTE BY' clause to specify suitable data distribution column.
    CREATE TABLE
    insert into tt01 values (1,'Jack say ''hello''');
    INSERT 0 1
    insert into tt01 values (2,'Rose do 50%');
    INSERT 0 1
    insert into tt01 values (3,'Lilei say ''world''');
    INSERT 0 1
    insert into tt01 values (4,'Hanmei do 100%');
    INSERT 0 1
    select * from tt01;
     id |      content
    ----+-------------------
      3 | Lilei say 'world'
      4 | Hanmei do 100%
      1 | Jack say 'hello'
      2 | Rose do 50%
    (4 rows)
    
    select * from tt01 where content like '%''he%';
     id |     content
    ----+------------------
      1 | Jack say 'hello'
    (1 row)
    
    select * from tt01 where content like '%50\%%';
     id |   content
    ----+-------------
      2 | Rose do 50%
    (1 row)
    
  • GROUP BY clause

    Condenses query results into a single row or selected rows that share the same values for the grouped expressions.

    • ROLLUP ( { expression | ( expression [, ...] ) } [, ...] )
      ROLLUP calculates the standard aggregation value specified by an ordered grouping column in GROUP BY, creates a high-level partial sum from right to left, and finally creates a cumulative sum. A group can be regarded as a series of grouping sets. Example:
      1
      GROUP BY ROLLUP (a,b,c)
      

      Or

      1
      GROUP BY GROUPING SETS((a,b,c), (a,b), (a), ( )) 
      

      The elements in the ROLLUP clause can be independent fields or expressions, or a list contained in parentheses. If it is a list in parentheses, they must be a whole when the grouping set is generated. Example:

      1
      GROUP BY CUBE ((a,b), (c,d))
      

      Or

      1
      GROUPING SETS ((a,b,c,d), (a,b), (c,d ), ( ))
      
    • CUBE ( { expression | ( expression [, ...] ) } [, ...] )

      A CUBE grouping is an extension to the GROUP BY clause that creates subtotals for all of the possible combinations of the given list of grouping columns (or expressions). In terms of multidimensional analysis, CUBE generates all the subtotals that could be calculated for a data cube with the specified dimensions. For example, given three expressions (n=3) in the CUBE clause, the operation results in 2n = 23 = 8 groupings. Rows grouped on the values of n expressions are called regular rows, and the rest are called superaggregate rows. Example:

      1
      GROUP BY CUBE (a,b,c)
      

      Or

      1
      GROUP BY GROUPING SETS((a,b,c), (a,b), (a,c), (b,c), (a), (b), (c), ( ))
      

      The elements in the CUBE clause can be independent fields or expressions, or a list contained in parentheses. If it is a list in parentheses, they must be a whole when the grouping set is generated. Example:

      1
      GROUP BY ROLLUP (a, (b, c), d)
      

      Or

      GROUP BY GROUPING SETS ((a,b,c,d), (a,b,c), (a), ( ))
    • GROUPING SETS ( grouping_element [, ...] )

      GROUPING SETS is another extension to the GROUP BY clause. It allows users to specify multiple GROUP BY clauses. The option is used to define a grouping set. Each grouping set needs to be included in a separate parenthesis. A blank parenthesis (()) indicates that all data is processed as a group. This improves efficiency by trimming away unnecessary data. You can specify the required data group for query.

    If the SELECT list expression quotes some ungrouped fields and no aggregate function is used, an error is displayed. This is because multiple values may be returned for ungrouped fields.

  • HAVING clause

    Selects special groups by working with the GROUP BY clause. The HAVING clause compares some attributes of groups with a constant. Only groups that matching the logical expression in the HAVING clause are extracted.

  • WINDOW clause

    The general format is WINDOW window_name AS ( window_definition ) [, ...]. window_name is a name can be referenced by window_definition. window_definition can be expressed in the following forms:

    [ existing_window_name ]

    [ PARTITION BY expression [, ...] ]

    [ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ]

    [ frame_clause ]

    frame_clause defines a window frame for the window function. The window function (not all window functions) depends on window frame and window frame is a set of relevant rows of the current query row. frame_clause can be expressed in the following forms:

    [ RANGE | ROWS ] frame_start

    [ RANGE | ROWS ] BETWEEN frame_start AND frame_end

    frame_start and frame_end can be expressed in the following forms:

    UNBOUNDED PRECEDING

    value PRECEDING (not supported for RANGE)

    CURRENT ROW

    value FOLLOWING (not supported for RANGE)

    UNBOUNDED FOLLOWING

    For the query of column storage table, only row_number window function is supported, frame_clause is not supported.

  • UNION clause

    Computes the set union of the rows returned by the involved SELECT statements.

    The UNION clause has the following constraints:

    • By default, the result of UNION does not contain any duplicate rows unless the ALL option is specified.
    • Multiple UNION operators in the same SELECT statement are evaluated left to right, unless otherwise specified by parentheses.
    • FOR UPDATE cannot be specified either for a UNION result or for any input of a UNION.

    General expression:

    select_statement UNION [ALL] select_statement

    • select_statement can be any SELECT statement without an ORDER BY, LIMIT, or FOR UPDATE clause.
    • ORDER BY and LIMIT in parentheses can be attached in a sub-expression.
  • INTERSECT clause

    Computes the set intersection of rows returned by the involved SELECT statements. The result of INTERSECT does not contain any duplicate rows.

    The INTERSECT clause has the following constraints:

    • Multiple INTERSECT operators in the same SELECT statement are evaluated left to right, unless otherwise specified by parentheses.
    • Processing INTERSECT preferentially when UNION and INTERSECT operations are executed for results of multiple SELECT statements.

    General format:

    select_statement INTERSECT select_statement

    select_statement can be any SELECT statement without a FOR UPDATE clause.

  • EXCEPT clause

    EXCEPT clause has the following common form:

    select_statement EXCEPT [ ALL ] select_statement

    select_statement can be any SELECT statement without a FOR UPDATE clause.

    The EXCEPT operator computes the set of rows that are in the result of the left SELECT statement but not in the result of the right one.

    The result of EXCEPT does not contain any duplicate rows unless the ALL option is specified. To execute ALL, a row that has m duplicates in the left table and n duplicates in the right table will appear MAX(mn, 0) times in the result set.

    Multiple EXCEPT operators in the same SELECT statement are evaluated left to right, unless parentheses dictate otherwise. EXCEPT binds at the same level as UNION.

    Currently, FOR UPDATE and FOR SHARE cannot be specified either for an EXCEPT result or for any input of an EXCEPT.

  • MINUS clause

    Has the same function and syntax as EXCEPT clause.

  • ORDER BY clause

    Sorts data retrieved by SELECT in descending or ascending order. If the ORDER BY expression contains multiple columns:

    • If two columns are equal according to the leftmost expression, they are compared according to the next expression and so on.
    • If they are equal according to all specified expressions, they are returned in an implementation-dependent order.
    • Columns sorted by ORDER BY must be contained in the result retrieved by SELECT.
    • If ORDER BY is not specified, the query results are returned following the generation sequence in the database system.
    • You can add the keyword ASC (in ascending order) or DESC (in descending order) next to any expression in the ORDER BY clause. If the keyword is not specified, ASC is used by default.
    • To sort query results by case-insensitive Chinese pinyin, set the encoding mode to UTF-8 or GBK during database initialization. The commands are as follows:

      initdb –E UTF8 –D ../data –locale=zh_CN.UTF-8 or initdb –E GBK –D ../data –locale=zh_CN.GBK

  • [ { [ LIMIT { count | ALL } ] [ OFFSET start [ ROW | ROWS ] ] } | { LIMIT start, { count | ALL } } ]

    The LIMIT clause consists of two independent LIMIT clauses, an OFFSET clause, and a LIMIT clause with multiple parameters.

    LIMIT { count | ALL }

    OFFSET start [ ROW | ROWS ]

    LIMIT start, { count | ALL }

    count in the clauses specifies the maximum number of rows to return, while start specifies the number of rows to skip before starting to return rows. When both are specified, start rows are skipped before starting to count the count rows to be returned. A multi-parameter LIMIT clause cannot be used together with a single-parameter LIMIT or OFFSET clause.

  • FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY

    If count is omitted in a FETCH clause, it defaults to 1.

  • FOR UPDATE clause

    Locks rows retrieved by SELECT. This ensures that the rows cannot be modified or deleted by other transactions until the current transaction ends. That is, other transactions that attempt UPDATE, DELETE, or SELECT FOR UPDATE of these rows will be blocked until the current transaction ends.

    To avoid waiting for the committing of other transactions, you can apply NOWAIT. Rows to which NOWAIT applies cannot be immediately locked. After SELECT FOR UPDATE NOWAIT is executed, an error is reported.

    FOR SHARE behaves similarly, except that it acquires a shared rather than exclusive lock on each retrieved row. A share lock blocks other transaction from performing UPDATE, DELETE, or SELECT FOR UPDATE on these rows, but it does not prevent them from performing SELECT FOR SHARE.

    If specified tables are named in FOR UPDATE or FOR SHARE, then only rows coming from those tables are locked; any other tables used in SELECT are simply read as usual. Otherwise, locking all tables in the command.

    If FOR UPDATE or FOR SHARE is applied to a view or sub-query, it affects all tables used in the view or sub-query.

    Multiple FOR UPDATE and FOR SHARE clauses can be written if it is necessary to specify different locking behaviors for different tables.

    If the same table is mentioned (or implicitly affected) by both FOR UPDATE and FOR SHARE clauses, it is processed as FOR UPDATE. Similarly, a table is processed as NOWAIT if that is specified in any of the clauses affecting it.

    • For SQL statements containing FOR UPDATE or FOR SHARE, their execution plans will be pushed down to DNs. If the pushdown fails, an error will be reported.
    • The query of column storage table does not support for update/share.
  • NLS_SORT

    Indicates a field to be ordered in a special mode. Currently, only the Chinese Pinyin order and case insensitive order are supported.

    Valid value:

    • SCHINESE_PINYIN_M, sorted by Pinyin order. To use this sort method, specify GBK as the encoding format when you create the database. If you do not do so, this value is invalid.
    • generic_m_ci, case-insensitive order.
  • PARTITION clause

    Queries data in the specified partition of a partitioned table.

Examples

  • WITH clause

    Obtain the temp_t temporary table by a subquery and query all records in this table.

    1
    WITH temp_t(name,isdba) AS (SELECT usename,usesuper FROM pg_user) SELECT * FROM temp_t;
    

    Explicitly specify MATERIALIZED for the with_query named temp_t, and then query all data in the temp_t table.

    1
    WITH temp_t(name,isdba) AS MATERIALIZED (SELECT usename,usesuper FROM pg_user) SELECT * FROM temp_t;
    

    Explicitly specify NOT MATERIALIZED for the with_query named temp_t, and then query all data in the temp_t table.

    1
    2
    3
    4
    WITH temp_t(name,isdba) AS NOT MATERIALIZED (SELECT usename,usesuper FROM pg_user)
     SELECT * FROM temp_t t1 WHERE name LIKE 'A%'
     UNION ALL
     SELECT * FROM temp_t t2 WHERE name LIKE 'B%';
    
  • Example of the DISTINCT clause

    Query all the r_reason_sk records in the tpcds.reason table and de-duplicate them.

    1
    SELECT DISTINCT(r_reason_sk) FROM tpcds.reason;
    
  • Example of the LIMIT clause

    Obtain the first record in the table.

    1
    SELECT * FROM tpcds.reason LIMIT 1;
    

    Obtain the third record in the table.

    1
    SELECT * FROM tpcds.reason LIMIT 1 OFFSET 2;
    

    Obtain the first two records in the table.

    1
    SELECT * FROM tpcds.reason LIMIT 2;
    
  • Example of the ORDER BY clause

    Query all records and sort them in alphabetic order.

    1
    SELECT r_reason_desc FROM tpcds.reason ORDER BY r_reason_desc;
    
  • Example of a SELECT list

    Use table aliases to obtain data from the pg_user and pg_user_status tables.

    1
    SELECT a.usename,b.locktime FROM pg_user a,pg_user_status b WHERE a.usesysid=b.roloid;
    
  • Example of the FULL JOIN clause

    Fully join the data in the pg_user and pg_user_status tables.

    1
    SELECT a.usename,b.locktime,a.usesuper FROM pg_user a FULL JOIN pg_user_status b on a.usesysid=b.roloid;
    
  • Example of the GROUP BY clause

    Create the sales table sale.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    CREATE TABLE sales (
        item VARCHAR(10),
        year VARCHAR(4),
        quantity INT
    );
    
    INSERT INTO sales VALUES('apple', '2018', 800);
    INSERT INTO sales VALUES('apple', '2018', 1000);
    INSERT INTO sales VALUES('banana', '2018', 500);
    INSERT INTO sales VALUES('banana', '2018', 600);
    INSERT INTO sales VALUES('apple', '2019', 1200);
    INSERT INTO sales VALUES('banana', '2019', 1800);
    

    Group by the combination of product and year:

    1
    2
    3
    4
    5
    6
    7
    8
    SELECT item, year, SUM(quantity) FROM sales GROUP BY item, year;
      item  | year | sum
    --------+------+------
     banana | 2019 | 1800
     apple  | 2018 | 1800
     banana | 2018 | 1100
     apple  | 2019 | 1200
    (4 rows)
    

    Use GROUPING SETS to specify customized grouping sets and query the results. The following query returns the total sales calculated by product and year, plus the total sales calculated by product, plus the total sales calculated by year.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    SELECT coalesce (item, 'all products') AS "Product",
    coalesce (year,'all years') AS "Year",
    SUM (quantity) as "Sales"
      FROM sales
     GROUP BY GROUPING SETS (
           (item, year),
           (item),
           (year),
           ()
          )
      ORDER BY item,year;
    Product | Year | Sales
    ----------+----------+------
     apple    | 2018     | 1800
     apple    | 2019     | 1200
    apple | all years | 3000
     banana   | 2018     | 1100
     banana   | 2019     | 1800
    banana | all years | 2900
    all products | 2018 | 2900
    all products | 2019 | 3000
    all products | all years | 5900
    (9 rows)
    

    As the number of grouping columns increase, it is difficult to use GROUPING SETS to list all possible groups. In this case, you can use GROUPING SETS CUBE.

    1
    2
    3
    4
    5
    6
    SELECT coalesce (item, 'all products') AS "Product",
    coalesce (year,'all years') AS "Year",
    SUM (quantity) as "Sales"
      FROM sales
     GROUP BY CUBE (item,year)
     ORDER BY item,year;
    

    GROUPING SETS ROLLUP is used to summarize the results at each level. The following query returns the total sales calculated by the combination of product and year, plus the total sales calculated by product, and plus the total sales calculated by year.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    SELECT coalesce (item, 'all products') AS "Product",
    coalesce (year,'all years') AS "Year",
    SUM (quantity) as "Sales"
      FROM sales
     GROUP BY ROLLUP (item,year)
     ORDER BY item,year;
    
    Product | Year | Sales
    ----------+----------+------
     apple    | 2018     | 1800
     apple    | 2019     | 1200
    apple | all years | 3000
     banana   | 2018     | 1100
     banana   | 2019     | 1800
    banana | all years | 2900
    all products | all years | 5900
    (7 rows)
    
  • Example of the UNION clause

    Combine the content starting with W and N in the r_reason_desc column in the tpcds.reason table.

    1
    2
    3
    4
    5
    6
    7
    SELECT r_reason_sk, tpcds.reason.r_reason_desc
        FROM tpcds.reason
        WHERE tpcds.reason.r_reason_desc LIKE 'W%'
    UNION
    SELECT r_reason_sk, tpcds.reason.r_reason_desc
        FROM tpcds.reason
        WHERE tpcds.reason.r_reason_desc LIKE 'N%';
    
  • Example of the NLS_SORT clause
    1
    SELECT * FROM stu_pinyin_info ORDER BY NLSSORT (name, 'NLS_SORT = SCHINESE_PINYIN_M' );
    

    Case-insensitive order:

    1
    2
    3
    4
    5
    6
    7
    8
    CREATE TABLE stu_icase_info (id bigint, name text) DISTRIBUTE BY REPLICATION;
    INSERT INTO stu_icase_info VALUES (1, 'aaaa'),(2, 'AAAA');
    SELECT * FROM stu_icase_info ORDER BY NLSSORT (name, 'NLS_SORT = generic_m_ci');
     id | name
    ----+------
      1 | aaaa
      2 | AAAA
    (2 rows)
    
  • Example of querying a partitioned table

    Create the partitioned table tpcds.reason_p, insert data, and obtain data from the P_05_BEFORE partition of the table.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    CREATE TABLE tpcds.reason_p
    (
      r_reason_sk integer,
      r_reason_id character(16),
      r_reason_desc character(100)
    )
    PARTITION BY RANGE (r_reason_sk)
    (
      partition P_05_BEFORE values less than (05),
      partition P_15 values less than (15),
      partition P_25 values less than (25),
      partition P_35 values less than (35),
      partition P_45_AFTER values less than (MAXVALUE)
    );
    
    INSERT INTO tpcds.reason_p values(3,'AAAAAAAABAAAAAAA','reason 1'),(10,'AAAAAAAABAAAAAAA','reason 2'),(4,'AAAAAAAABAAAAAAA','reason 3'),(10,'AAAAAAAABAAAAAAA','reason 4'),(10,'AAAAAAAABAAAAAAA','reason 5'),(20,'AAAAAAAACAAAAAAA','reason 6'),(30,'AAAAAAAACAAAAAAA','reason 7');
    
    SELECT * FROM tpcds.reason_p PARTITION (P_05_BEFORE);
     r_reason_sk |   r_reason_id    |   r_reason_desc                   
    -------------+------------------+------------------------------------
               4 | AAAAAAAABAAAAAAA | reason 3                          
               3 | AAAAAAAABAAAAAAA | reason 1                          
    (2 rows)
    ——Query the number of rows in partition P_15:
    SELECT count(*) FROM tpcds.reason_p PARTITION (P_15);
     count  
    --------
         3
    (1 row)
    
  • Example of the HAVING clause

    Collect statistics on records in the tpcds.reason_p table by r_reason_id group and display only records of which the number of r_reason_id values is greater than 2.

    1
    2
    3
    4
    5
    SELECT COUNT(*) c,r_reason_id FROM tpcds.reason_p GROUP BY r_reason_id HAVING c>2;
     c |   r_reason_id    
    ---+------------------
     5 | AAAAAAAABAAAAAAA
    (1 row)
    
  • Example of the IN clause

    Collect statistics on the number of r_reason_id values in the tpcds.reason_p table by r_reason_id group and display only the number of records whose r_reason_id values are AAAAAAAABAAAAAAA or AAAAAAAADAAAAAAA.

    1
    2
    3
    4
    5
    SELECT COUNT(*),r_reason_id FROM tpcds.reason_p GROUP BY r_reason_id HAVING r_reason_id IN('AAAAAAAABAAAAAAA','AAAAAAAADAAAAAAA'); 
    count |   r_reason_id    
    -------+------------------
         5 | AAAAAAAABAAAAAAA
    (1 row)
    
  • Example of the INTERSECT clause

    Query records whose r_reason_id is AAAAAAAABAAAAAAA and r_reason_sk is less than 5.

    1
    2
    3
    4
    5
    6
    SELECT * FROM tpcds.reason_p WHERE r_reason_id='AAAAAAAABAAAAAAA' INTERSECT SELECT * FROM tpcds.reason_p WHERE r_reason_sk<5;
     r_reason_sk |   r_reason_id    |     r_reason_desc                 
    -------------+------------------+------------------------------------
               4 | AAAAAAAABAAAAAAA | reason 3                           
               3 | AAAAAAAABAAAAAAA | reason 1                           
    (2 rows)
    
  • Example of the EXCEPT clause

    Query records whose r_reason_id is AAAAAAAABAAAAAAA and except the records whose r_reason_sk is less than 4.

    1
    2
    3
    4
    5
    6
    7
    8
    SELECT * FROM tpcds.reason_p WHERE r_reason_id='AAAAAAAABAAAAAAA' EXCEPT SELECT * FROM tpcds.reason_p WHERE r_reason_sk<4;
    r_reason_sk |   r_reason_id    |      r_reason_desc                  
    -------------+------------------+------------------------------------
              10 | AAAAAAAABAAAAAAA | reason 2                          
              10 | AAAAAAAABAAAAAAA | reason 5                          
              10 | AAAAAAAABAAAAAAA | reason 4                          
               4 | AAAAAAAABAAAAAAA | reason 3                          
    (4 rows)
    
  • Example of the WHERE clause

    Specify the operator (+) in the WHERE clause to indicate a left join.

    1
    2
    3
    4
    5
    6
    select t1.sr_item_sk ,t2.c_customer_id from store_returns t1, customer t2 where t1.sr_customer_sk  = t2.c_customer_sk(+) 
    order by 1 desc limit 1;
     sr_item_sk | c_customer_id
    ------------+---------------
          18000 |
    (1 row)
    

    Specify the operator (+) in the WHERE clause to indicate a right join.

    1
    2
    3
    4
    5
    6
    select t1.sr_item_sk ,t2.c_customer_id from store_returns t1, customer t2 where t1.sr_customer_sk(+)  = t2.c_customer_sk 
    order by 1 desc limit 1;
     sr_item_sk |  c_customer_id
    ------------+------------------
                | AAAAAAAAJNGEBAAA
    (1 row)
    

    Specify the operator (+) in the WHERE clause to indicate a left join and add a join condition.

    1
    2
    3
    4
    5
    select t1.sr_item_sk ,t2.c_customer_id from store_returns t1, customer t2 where t1.sr_customer_sk  = t2.c_customer_sk(+) and t2.c_customer_sk(+) < 1 order by 1  limit 1;
     sr_item_sk | c_customer_id
    ------------+---------------
              1 |
    (1 row)
    

    If the operator (+) is specified in the WHERE clause, do not use expressions connected through AND/OR.

    1
    2
    3
    select t1.sr_item_sk ,t2.c_customer_id from store_returns t1, customer t2 where not(t1.sr_customer_sk  = t2.c_customer_sk(+) and t2.c_customer_sk(+) < 1);
    ERROR:  Operator "(+)" can not be used in nesting expression.
    LINE 1: ...tomer_id from store_returns t1, customer t2 where not(t1.sr_...                                                           
    

    If the operator (+) is specified in the WHERE clause which does not support expression macros, an error will be reported.

    1
    2
    select t1.sr_item_sk ,t2.c_customer_id from store_returns t1, customer t2 where (t1.sr_customer_sk  = t2.c_customer_sk(+))::bool;
    ERROR:  Operator "(+)" can only be used in common expression.
    

    If the operator (+) is specified on both sides of an expression in the WHERE clause, an error will be reported.

    1
    2
    3
    select t1.sr_item_sk ,t2.c_customer_id from store_returns t1, customer t2 where t1.sr_customer_sk(+)  = t2.c_customer_sk(+);
    ERROR:  Operator "(+)" can't be specified on more than one relation in one join condition
    HINT:  "t1", "t2"...are specified Operator "(+)" in one condition.