Help Center/ GaussDB/ Developer Guide(Centralized_3.x)/ SQL Reference/ SQL Syntax/ SELECT

Updated on 2024-05-07 GMT+08:00

View PDF

SELECT

Description

Retrieves data from a table or view.

Serving as an overlaid filter for a database table, SELECT filters required data from the table using SQL keywords.

Precautions

The owner of a table, users granted with the SELECT permission on the table, or users granted with the SELECT ANY TABLE permission can read data in the table or view. The system administrator has the permission to read data in the table or view by default.
You must have the SELECT permission on each field used in the SELECT statement.

Using FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, or FOR KEY SHARE also requires the UPDATE permission.

Syntax

Query data.

[ WITH [ RECURSIVE ] with_query [, ...] ]
SELECT [/*+ plan_hint */] [ ALL | DISTINCT [ ON ( expression [, ...] ) ] ]
{ * | {expression [ [ AS ] output_name ]} [, ...] }
[ into_option ]
[ FROM from_item [, ...] ]
[ WHERE condition ]
[ [ START WITH condition ] CONNECT BY [NOCYCLE] condition [ ORDER SIBLINGS BY expression ] ]
[ GROUP BY grouping_element [, ...] ]
[ HAVING condition [, ...] ]
[ WINDOW {window_name AS ( window_definition )} [, ...] ]
[ { UNION | INTERSECT | EXCEPT | MINUS } [ ALL | DISTINCT ] select ]
[ ORDER BY {expression [ [ ASC | DESC | USING operator ] | nlssort_expression_clause ] [ NULLS { FIRST | LAST } ]} [, ...] ]
[ LIMIT { [offset,] count | ALL } ]
[ OFFSET start [ ROW | ROWS ] ]
[ FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY ]
[ into_option ]
[ {FOR { UPDATE | NO KEY UPDATE | SHARE | KEY SHARE } [ OF table_name [, ...] ] [ NOWAIT | WAIT N ]} [...] ]
[into_option];
TABLE { ONLY {(table_name)| table_name} | table_name [ * ]};

In condition and expression, you can use the aliases of expressions in targetlist in compliance with the following rules:

Reference only within the same level.
Only reference aliases in targetlist.
Reference a prior expression in a subsequent expression.
The volatile function cannot be used.
The Window function cannot be used.
Aliases cannot be referenced in the join on condition.
An error is reported if the target list contains multiple aliases to be referenced.

In the scenario where the SELECT statement plan is cached, the WHERE IN candidate subset cannot be too large. It is recommended that the number of conditions be less than or equal to 100 to prevent high dynamic memory.

If the WHERE IN candidate subset is too large, the memory usage of the generated plan increases.
If the WHERE IN subsets constructed by concatenated SQL statements are different, the SQL template of the cache plan cannot be reused. A large number of different plans are generated, and the plans cannot be shared. As a result, a large amount of memory is occupied.

The subquery with_query is as follows:

with_query_name [ ( column_name [, ...] ) ]
    AS [ [ NOT ] MATERIALIZED ] ( {select | values | insert | update | delete} )

The INTO clause is as follows:

into_option: {
        INTO var_name [, var_name] ...
	| INTO OUTFILE 'file_name'
		[CHARACTER SET charset_name]
		export_options
	| INTO DUMPFILE 'file_name'
}
export_options: {
[FIELDS
 [TERMINATED BY 'string']
 [[OPTIONALLY] ENCLOSED BY 'char']
 [ESCAPED BY 'char' ]
    ][LINES
 [STARTING BY 'string']
 [TERMINATED BY 'string']
    ]
}

The specified query source from_item is as follows:

{[ ONLY ] table_name [ * ] [ partition_clause ] [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
[ TABLESAMPLE sampling_method ( argument [, ...] ) [ REPEATABLE ( seed ) ] ]
[ TIMECAPSULE {TIMESTAMP | CSN} expression ]
|( select ) [ AS ] alias [ ( column_alias [, ...] ) ]
|with_query_name [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
|function_name ( [ argument [, ...] ] ) [ AS ] alias [ ( column_alias [, ...] | column_definition [, ...] ) ]
|function_name ( [ argument [, ...] ] ) AS ( column_definition [, ...] )
|from_item unpivot_clause
|from_item pivot_clause
|from_item [ NATURAL ] join_type from_item [ ON join_condition | USING ( join_column [, ...] ) ]}

The group clause is as follows:

( )
| expression
| ( expression [, ...] )
| ROLLUP ( { expression | ( expression [, ...] ) } [, ...] )
| CUBE ( { expression | ( expression [, ...] ) } [, ...] )
| GROUPING SETS ( grouping_element [, ...] )

The specified partition partition_clause is as follows:

PARTITION { ( partition_name ) | FOR (  partition_value [, ...] ) } |
SUBPARTITION { ( subpartition_name ) | FOR (  subpartition_value [, ...] )}

The specified partition applies only to partitioned tables.

The sorting order nlssort_expression_clause is as follows:

NLSSORT ( column_name, ' NLS_SORT = { SCHINESE_PINYIN_M | generic_m_ci } ' )
The second parameter can be generic_m_ci, which supports only the case-insensitive order for English characters.

Simplified query syntax, equivalent to select * from table_name.
```
TABLE { ONLY {(table_name)| table_name} | table_name [ * ]};
```

Parameters

WITH [ RECURSIVE ] with_query [, ...]
Specifies one or more subqueries that can be referenced by name in the main query, which is equivalent to a temporary table. This subquery statement structure is called the common table expression (CTE) structure. When this structure is used, the execution plan contains the CTE SCAN content.

If RECURSIVE is specified, it allows a SELECT subquery to reference itself by name.

The detailed format of with_query is as follows: with_query_name [ ( column_name [, ...] ) ] AS [ [ NOT ] MATERIALIZED ] ( {select | values | insert | update | delete} )
- with_query_name specifies the name of the result set generated by a subquery. Such names can be used to access the result sets of subqueries in a query.
- column_name specifies the column name displayed in the subquery result set.
- Each subquery can be a SELECT, VALUES, INSERT, UPDATE or DELETE statement.
- RECURSIVE can appear only after WITH. In the case of multiple CTEs, you only need to declare RECURSIVE at the first CTE.
- You can use MATERIALIZED or NOT MATERIALIZED to modify the CTE.
  - If MATERIALIZED is specified, the WITH query will be materialized, and a copy of the subquery result set is generated. The copy is directly queried at the reference point. Therefore, the WITH subquery cannot be jointly optimized with the SELECT statement trunk (for example, predicate pushdown and equivalence class transfer). In this scenario, you can use NOT MATERIALIZED for modification. If the WITH query can be executed as a subquery inline, the preceding optimization can be performed.
  - If the user does not explicitly declare the materialized attribute, comply with the following rules: If the CTE is referenced only once in the SELECT statement trunk to which it belongs and semantically supports inline execution, it will be rewritten as subquery inline execution. Otherwise, the materialized execution will be performed in CTE Scan mode.
plan_hint clause
Follows the SELECT keyword in the /*+<Plan hint> */ format. It is used to optimize the plan of a SELECT statement block. For details, see Hint-based Tuning. In each statement, only the first /*+ plan_hint */ comment block takes effect as a hint. Multiple hints can be written.
ALL
Specifies that all rows that meet the conditions are returned. This is the default behavior and can be omitted.
DISTINCT [ ON ( expression [, ...] ) ]
Removes all duplicate rows from the SELECT result set so one row is kept from each group of duplicates.

Retains only the first row in the set of rows that have the same result calculated on the given expression.

DISTINCT ON expression is explained with the same rule of ORDER BY. Unless you use ORDER BY to guarantee that the required row appears first, you cannot know what the first row is.
SELECT list
Specifies the name of a column in the table to be queried. The value can be a part of the column name or all of the column names. The wildcard (*) is used to represent the column name.

You may use the AS output_name clause to give an alias for an output column. The alias is used for the displaying of the output column. The name, value, and type keywords can be used as column aliases.

Column names can be expressed in the following formats:
- Manually input column names which are spaced using commas (,).
- Columns computed in the FROM clause.

INTO clause

Exports the SELECT result to a specified user-defined variable or file.

var_name
User-defined variable name. For details, see var_name in SET.
OUTFILE
- CHARACTER SET specifies the encoding format.
- FIELDS specifies the attribute of each column.
  TERMINATED specifies the separator.
  
  [OPTIONALLY] ENCLOSED specifies the quotation mark. When OPTIONALLY is specified, it takes effect only for the string data type.
  
  ESCAPED specifies the escape character.
- LINES specifies the row attribute.
  STARTING specifies the beginning of a row.
  
  TERMINATED specifies the end of a row.
DUMPFILE
Exports a single row of data without separators or newline characters to a file.

file_name

Specifies the absolute path of a file.

Three positions of into_option are as follows:
-- Before the FROM clause
gaussdb=#  select * into @my_var from t;
-- Before the LOCK clause
gaussdb=#  select * from t into @my_var for update;
-- At the end of the SELECT statement
gaussdb=#  select * from t for update into @my_var;

Export to a file.
gaussdb=#  select * from t;
 a | b
---+---
 1 | a
(1 row)
-- Export data to an outfile file.
gaussdb=#  select * from t into outfile '/home/gaussdb/t.txt'FIELDS TERMINATED BY '~' ENCLOSED BY 't' ESCAPED BY '^' LINES STARTING BY '$' TERMINATED BY '&\n';
File content: $t1t~tat&, where LINES STARTING BY($),FIELDS TERMINATED BY(~),ENCLOSED BY(t),LINES TERMINATED BY(&\n)
-- Export data to the dumpfile file.
gaussdb=#  select * from t into dumpfile '/home/gaussdb/t.txt';
File content: 1a

FROM clause
Specifies one or more source tables for SELECT.

The FROM clause can contain the following elements:
- table_name
  Specifies the name of a table or view. The schema name can be added before the table name or view name, for example, schema_name.table_name.
  
  You can use database link to perform operations on remote tables and synonyms. For details, see DATABASE LINK.
- alias
  Gives a temporary alias to a table to facilitate the quotation by other queries.
  
  An alias is used for brevity or to eliminate ambiguity for self-joins. If an alias is provided, it completely hides the actual name of the table.
  
  If an alias is specified for the joined_table table created by JOIN and joined_table is wrapped with "()", that is, (joined_table), non-reserved keywords UNPIVOT and PIVOT cannot be used as aliases.
- TABLESAMPLE sampling_method ( argument [, ...] ) [ REPEATABLE ( seed ) ]
  The TABLESAMPLE clause following table_name specifies that the specified sampling_method should be used to retrieve the subset of rows in the table.
  
  The optional REPEATABLE clause specifies the number of seeds used to generate random numbers in the sampling method. The seed value can be any non-null constant value. If the table was not changed during the query, the two queries having the same seed and argument values will select the same sampling in this table. However, different seed values usually generate different samples. If REPEATABLE is not specified, a new random sample will be selected for each query based on the seed generated by the system.
- TIMECAPSULE { TIMESTAMP | CSN } expression
  Queries the table data of a specified CSN or at a specified time point.
  
  Currently, the following tables do not support flashback query: system catalogs, DFS tables, global temporary tables, local temporary tables, unlogged tables, views, sequence tables, hash bucket tables, shared tables, and inherited tables.
  - TIMECAPSULE TIMESTAMP
    Searches for the result set of a specified time point based on the date as the flashback query flag. date must be a valid past timestamp
  - TIMECAPSULE CSN
    Searches for the result set of a specified CSN point based on the CSN flashback of the table as the flashback query flag. The CSN can be obtained from snpcsn recorded in gs_txn_snapshot.
    A flashback query cannot span statements that affect the table structure or physical storage. Otherwise, an error is reported. Between the flashback point and the current point, if a statement (TRUNCATE, DDL, DCL, or VACUUM FULL) has been executed to modify the table structure or affect physical storage, the flashback fails.
    
    Flashback query does not support index query. Flashback query supports only seqScan for full table scanning.
    
    When the flashback point is too old, the old version cannot be obtained because the flashback version is recycled. As a result, the flashback fails and the error message "Restore point too old" is displayed.
    
    The flashback point is specified by time. The maximum difference between the flashback point and the actual time is 3 seconds.
    
    After truncating a table, perform a flashback query or flashback on the table. The error message "Snapshot too old" is displayed when a flashback is performed at a specified time point. Data cannot be found or the error message "Snapshot too old" is reported during the CSN-based flashback.
- column_alias
  Specifies the column alias.
- PARTITION
  Queries data in the specified partition in a partitioned table.
- partition_name
  Specifies the name of a partition.
- partition_value
  Specifies the value of the specified partition key. If there are many partition keys, use the PARTITION FOR clause to specify the value of the only partition key you want to use.
- SUBPARTITION
  Queries data in the specified level-2 partition in a partitioned table.
- subpartition_name
  Specifies the name of a level-2 partition name.
- subpartition_value
  Specifies the key values of specified level-1 and level-2 partitions. The values of the two partition keys specified by the SUBPARTITION FOR clause uniquely identify a level-2 partition.
- subquery
  Performs a subquery in the FROM clause. A temporary table is created to save subquery results.
- with_query_name
  Specifies that the WITH clause can also be used as the source of the FROM clause and can be referenced by the name of the WITH query.
- function_name
  Function name. Function calls can appear in the FROM clause.
- join_type
  The options are as follows:
  - [ INNER ] JOIN
    A JOIN clause combines two FROM items. You can use parentheses to determine the order of nesting. In the absence of parentheses, JOIN nests left-to-right.
    
    In any case, JOIN binds more tightly than the commas separating FROM items.
  - LEFT [ OUTER ] JOIN
    Returns all rows that meet join conditions in the Cartesian product, plus those rows that do not match the right table rows in the left table by join conditions. This left-hand row is extended to the full width of the joined table by inserting NULL values for the right-hand columns. Note that only the JOIN clause's own condition is considered while the system decides which rows have matches. Outer conditions are applied afterward.
  - RIGHT [ OUTER ] JOIN
    Returns all the joined rows, plus one row for each unmatched right-hand row (extended with NULL on the left).
    
    This is just a notational convenience, since you could convert it to a LEFT OUTER JOIN by switching the left and right inputs.
  - FULL [ OUTER ] JOIN
    Returns all the joined rows, pluses one row for each unmatched left-hand row (extended with NULL on the right), and pluses one row for each unmatched right-hand row (extended with NULL on the left).
  - CROSS JOIN
    Is equivalent to INNER JOIN ON (TRUE), which means no rows are removed by qualification. These join types are just a notational convenience, since they do nothing you could not do with plain FROM and WHERE.
    
    For the INNER and OUTER join types, a join condition must be specified, namely exactly one of NATURAL ON, join_condition, or USING (join_column [, ...]). For CROSS JOIN, none of these clauses can appear.
  CROSS JOIN and INNER JOIN produce a simple Cartesian product, the same result as you get from listing the two items at the top level of FROM.
- ON join_condition
  Defines which rows have matches in joins. Example: ON left_table.a = right_table.a You are advised not to use numeric types such as int for join_condition, because such types can be implicitly converted to bool values (non-zero values are implicitly converted to true and 0 is implicitly converted to false), which may cause unexpected results.
- USING(join_column[, ...])
  ON left_table.a = right_table.a AND left_table.b = right_table.b ... abbreviation. Corresponding columns must have the same name.
- NATURAL
  Is a shorthand for a USING list that mentions all columns in the two tables that have the same names.
- from item
  Specifies the name of the query source object connected.
- unpivot_clause
  Converts a column to a row. The syntax format is as follows:
```
 UNPIVOT [ {INCLUDE | EXCLUDE} NULLS ]
    (   
        unpiovt_col_clause
        unpivot_for_clause
        unpivot_in_clause
    )
```
  - {INCLUDE | EXCLUDE} NULLS
    Controls whether the converted result contains NULL rows. INCLUDE NULLS indicates that the converted results contain NULL rows. EXCLUDE NULLS indicates that the converted results do not contain NULL rows. If this clause is ignored, the unpivot operation removes NULL rows from the converted results by default.
  - unpiovt_col_clause
```
unpivot_col_element 
```
    unpivot_col_element specifies the output column names. These columns store the values to be converted.
  - unpivot_col_element
```
{ column | ( column [, column]... ) }
```
    unpivot_col_element has two forms: column and ( column [, column]... ).
  - unpivot_for_clause
```
FOR { unpivot_col_element }
```
    unpivot_col_element contained in the unpivot_for_clause is used to specify the output column names. These columns store the alias and names of the column to be converted.
  - unpivot_in_clause
```
IN ( unpivot_in_element [,unpivot_in_element...] )
```
    unpivot_in_clause specifies the columns to be converted. The column names and column values are saved in the previously specified output columns.
    unpivot_in_element
```
{ unpivot_col_element }[  AS { unpivot_alias_element } ]
```
    unpivot_col_element specifies the column to be converted. If ( column [, column]...) is used to specify the column to be converted, the column names are connected by underscores (_) and saved in the output columns. For example, IN ((col1, col2)) generates the column name col1_col2 and saves it in the output column specified by unpivot_for_clause. In addition, the AS keyword can be used to specify an alias for the column to be converted. Once an alias is specified, the alias is saved in the output column instead of the name of column to be converted.
  - unpivot_alias_element
```
{ alias | ( alias [, alias]... )}
```
    Similar to unpivot_col_element, unpivot_alias_element has two forms. alias indicates the specified alias.
  Currently, unpivot_clause has the following restrictions:
  - This parameter can be used only in A compatibility mode.
  - The unpivot_clause clause cannot be used with hints.
  - For unpiovt_col_clause, the number of output columns specified by unpivot_col_element must be the same as that of unpivot_col_element contained in unpivot_in_clause.
  - For unpivot_for_clause, the number of output columns specified by unpivot_col_element must be the same as the number of aliases specified by unpivot_alias_element contained in unpivot_in_clause.
  - For unpivot_in_clause, the alias must be a constant or an expression that can be converted to a constant.
  - For unpivot_in_clause, constant expressions support only IMMUTABLE functions.
  - For all unpivot_col_element parameters contained in unpivot_in_clause, if the column types in the same position of these unpivot_col_element parameters are different, UNPIVOT attempts to convert the column types in order to convert the values of columns to be converted to a public type. Similarly, for all unpivot_alias_element parameters, if the alias types of these unpivot_alias_element parameters in the same position are different, UNPIVOT attempts to do the similar type conversion.
    For example, assume that there is an unpivot_in_clause in the form of IN (col1, col2), where col1 is of the int type and col2 is of the float type, UNPIVOT attempts to convert the column value of col1 to the public type float during the calculation.
- pivot_clause
  Converts a row into column. The syntax format is as follows:
```
PIVOT [ XML ]
( aggregate_function ( expr ) [[AS] alias ]
    [, aggregate_function ( expr ) [[AS] alias ] ]...
    pivot_for_clause
    pivot_in_clause
)
```
  - aggregate_function ( expr ) [[AS] alias ]
    Aggregates calculation on a given expression. The calculation result is saved in the output column specified by pivot_in_clause. [AS] alias (The AS keyword can be omitted.) can be used to specify an alias for aggregate_function. The alias is appended to the output column name specified by pivot_in_clause in the format of "_alias".
  - pivot_for_clause
```
FOR { column
    | ( column [, column]... )
    }
```
    Specifies the row to be converted. The column indicates a column of the row to be converted.
  - pivot_in_clause
```
IN ( { { { expr
         | ( expr [, expr]... )
         } [ [AS] alias]
       }...
     }
  )
```
    Specifies the name of the output column. The column name can consist of one or more expressions, for example, (expr1, expr2). When a column name consists of multiple expressions, these expressions are connected by underscores (_) in sequence. That is, the output column name corresponding to (expr1, expr2) is "expr1_expr2". These expressions not only generate output column names, but also determine the time when the aggregation function is triggered. If the values of row to be converted is the same as the value of these expressions, the results calculated by aggregate_function are saved in the output column names that consist of these expressions. Assume that expr1 is "1" and expr2 is "2". For row "1 2", aggregate_function is used for calculation. For row "1 1", the calculation is not triggered.
  Currently, pivot_clause has the following restrictions:
  - This parameter can be used only in A compatibility mode.
  - The pivot_clause clause cannot be used with hints.
  - If more than one aggregate_function is specified, at most one aggregate_function is allowed to have no alias, the rest of the aggregate_function functions are required to specify an alias.
  - XML supports only syntax but does not support functions.
  - The expression in pivot_in_clause can be a constant or an expression that can be converted to a constant. If the expression is not a unary expression, specify an alias for the expression.
  - For pivot_in_clause, constant expressions support only IMMUTABLE functions.
  - For the expression in pivot_in_clause, when the keyword AS is used to specify an alias for the expression, only the non-reserved keywords can be used as aliases.
  - If the length of an output column name exceeds 63 characters, subsequent characters will not be printed.
WHERE clause
Forms an expression for row selection to narrow down the query range of SELECT. condition indicates any expression that returns a value of Boolean type. Rows that do not meet this condition will not be retrieved. You are advised not to use numeric types such as int for condition, because such types can be implicitly converted to bool values (non-zero values are implicitly converted to true and 0 is implicitly converted to false), which may cause unexpected results.

In the WHERE clause, you can use the operator (+) to convert a table join to an outer join. However, this method is not recommended because it is not the standard SQL syntax and may raise syntax compatibility issues during platform migration. There are many restrictions on using the operator (+):
1. It can appear only in the WHERE clause.
2. If a table join has been specified in the FROM clause, the operator (+) cannot be used in the WHERE clause.
3. The operator (+) can work only on columns of tables or views, instead of on expressions.
4. If table A and table B have multiple join conditions, the operator (+) must be specified in all the conditions. Otherwise, the operator (+) will not take effect, and the table join will be converted into an inner join without any prompt information.
5. Tables specified in a join condition where the operator (+) works cannot cross queries or subqueries. If tables where the operator (+) works are not in the FROM clause of the current query or subquery, an error will be reported. If a peer table for the operator (+) does not exist, no error will be reported and the table join will be converted into an inner join.
6. Expressions where the operator (+) is used cannot be directly connected through OR.
7. If a column where the operator (+) works is compared with a constant, the expression becomes a part of the join condition.
8. A table cannot have multiple foreign tables.
9. The operator (+) can appear only in the following expressions: comparison, NOT, ANY, ALL, IN, NULLIF, IS DISTINCT FROM, and IS OF. It is not allowed in other types of expressions. In addition, these expressions cannot be connected through AND or OR.
10. The operator (+) can be used to convert a table join only to a left or right outer join, instead of a full join. That is, the operator (+) cannot be specified on both tables of an expression.
- For the WHERE clause, if special character "%", "_", or "\" is queried in LIKE, add the slash "\" before each character.
- For a hierarchical query, the WHERE expression is processed as follows in a multi-table join query:
  Decompose the WHERE expression based on the disjunctive and conjunctive actions and check whether each subexpression involves multiple tables (potential join conditions) queried at the current layer. If a subexpression is not a sublink and involves only multiple tables queried at the current layer, push it down to the non-recursive START WITH clause or the recursive CONNECT BY clause of the hierarchical query for prior execution as the JOIN condition. Otherwise, place it after the recursive CONNECT BY clause for execution as a filter condition.
  
  During internal implementation, the subexpressions that do not meet pushdown requirements in the parse tree of the original WHERE expression are deleted to obtain the final expression that can be pushed down. Then, the subexpressions that meet the pushdown condition are removed from the parse tree of the original WHERE expression to obtain the final expression that is not pushed down.
  
  To sum up, pay attention to the following points:
  - When the WHERE expression is split, the disjunctive action is also split. Therefore, the two expressions joined by OR are also split.
  - Sublinks must not be pushed down.
  - If an input column is not queried at the current layer, the column will not be pushed down.
  Then, for the following example:
```
SELECT * FROM t1,t2
WHERE t1.a=1 or t1.a=t2.a and t1.b=t2.b or t1.c in (SELECT t3.c FROM t3)
START WITH t1.c=1
CONNECT BY PRIOR t2.c=t1.c;
```
  The following figure shows the pushdown conditions after the WHERE expression is split disjunctively and conjunctively.
  
  It is equivalent to "t1.a=t2.a AND t1.b=t2.b". The following figure shows the conditions that are not pushed down.
  
  It is equivalent to "t1.a=1 OR t1.c in (SELECT t3.c FROM t3)".
  
  For another example:
```
SELECT * FROM t3 WHERE EXISTS(
SELECT * FROM t1,t2
WHERE t1.a+t2.a=t3.a
START WITH t1.c=1
CONNECT BY PRIOR t2.c=t1.c;
)
```
  The WHERE condition contains t3.a inputted by the outer query. Therefore, the WHERE condition is not pushed down.
START WITH clause
The START WITH clause is usually used together with the CONNECT BY clause and indicates the initial condition of recursion. Data is traversed recursively and hierarchically. If this clause is omitted and the CONNECT BY clause is used alone, all rows in the table are used as the initial set. For details, see CONNECT BY.
CONNECT BY clause
CONNECT BY indicates a recursive join condition. It is used together with START WITH to implement data traversal and recursion. The following is an example:
```
gaussdb=#  CREATE TABLE test(name varchar, id int, fatherid int);
gaussdb=#  INSERT INTO test VALUES('A', 1, 0), ('B', 2, 1),('C',3,1),('D',4,1),('E',5,2);
gaussdb=#  SELECT * FROM test START WITH id = 1 CONNECT PRIOR id = fatherid ORDER SIBLINGS BY id DESC;
 name | id | fatherid 
------+----+----------
 A    |  1 |        0
 D    |  4 |        1
 C    |  3 |        1
 B    |  2 |        1
 E    |  5 |        2
(5 rows)
```
In the CONNECT BY condition, the PRIOR keyword can be specified for a column to indicate that the column is recursive. If NOCYCLE is added before the recursive join condition, recursion stops when a loop record is encountered. (Note: A SELECT statement containing the START WITH .. CONNECT BY clause does not support the FOR SHARE or UPDATE lock.)
The process of executing the START WITH statement is as follows:
1. The initial dataset is selected based on the condition in the START WITH clause. In the preceding example, ('A', 1, 0) is selected first. Then, this initial dataset is set as the working set.
2. If the working set is not empty, the data in the working set is used as the input for the next query. The filter criteria are specified in the CONNECT BY clause. The keyword PRIOR indicates the current record. For example, prior id = fatherid in the preceding example indicates that the ID of the current record is fatherid of the next record.
3. Set the dataset filtered in step 2 as the working set and repeat the operation in step 2.
In addition, the database adds the following pseudocolumns to each selected data record so that users can learn about the location of the data in the recursive or tree structure.
- LEVEL: node level.
- CONNECT_BY_ISLEAF: specifies whether a node is a leaf node.
connect_by_isleaf: If a node does not have any subnode, the node is a leaf node and connect_by_isleaf is set to 1. Otherwise, connect_by_isleaf is set to 0.

Assume that table T1 exists and the data in table T1 is shown in Figure 1.
Figure 1 Data structure

Run the following statement:
```
SELECT * FROM T1 CONNECT BY PRIOR B=A AND LEVEL<=3; 
```
The tree structure of the result is shown in Figure 2, where the blue nodes are leaf nodes.
Figure 2 Logical structure of the execution
In addition to pseudocolumns, the following query functions are provided (for details, see Hierarchical Recursion Query Functions):
- sys_connect_by_path(col, separator): returns the connection path from the root node to the current row. The col parameter indicates the name of the column displayed in the path, and the separator parameter indicates the connector.
- connect_by_root(col): displays the top-level node of the node. col indicates the name of the output column.
If a loop exists in the dataset, the database provides loop detection. By default, if a loop is detected, an error is reported and no data is returned. In addition, the NOCYCLE keyword is provided. With it, the query can be executed normally and when the first duplicate data record is found, the query exits directly instead of reporting an error.

Besides, in the hierarchical query, the search is performed strictly according to the depth-first order. If ROWNUM is used as the filtering condition in START WITH or CONNECT BY, the value of ROWNUM is increased by 1 for each record to be returned. Then, the record is verified based on ROWNUM conditions. Records that do not satisfy the conditions are discarded and the value of ROWNUM is decreased by 1.
- The PRIOR keyword can be used only in the CONNECT BY clause instead of the START WITH clause.
- In addition to targetlist, "prior (single column)" is parsed as "prior single column". User-defined functions named prior are not perceived.
- The PRIOR keyword can be specified only for columns in the table instead of expressions, pseudocolumns, or type conversion. For example, PRIOR (a + 1) is not allowed.
- In the CONNECT BY clause, the column using the keyword PRIOR cannot be in the same condition with pseudocolumns such as level and rownum, but they can be in different conditions. For example, (PRIOR a = level) is not allowed, but (PRIOR a = b) and (level = 1) is allowed. Different conditions refer to the conditions connected by AND at the top of the CONNECT BY clause. For example, (PRIOR a = 1 or level = 1) is considered as a condition and is not allowed.
- In the START WITH and CONNECT BY clauses, pseudocolumns cannot be used for sublinks, for example, "rownum = (subquery)" or "rownum in (subquery)".
- You are advised not to use pseudo columns in the CONNECT BY statement. If pseudocolumns need to be used, you need to test the columns to avoid inconsistency between the result and expected result.
- When START WITH and CONNECT BY are called on the CTE defined by WITH AS, if there are multiple CTEs, ensure that the definition of each CTE does not depend on other CTEs.
- If no loop exists in the data but the error message "runs into cycle" is reported, increase the value of max_recursive_times.
- connect_by_isleaf, connect_by_iscycle and level are of the int type.
- connect_by_isleaf, connect_by_iscycle, and level are parsed as pseudo columns in hierarchical queries.
- The alias defined in the projection column cannot be called for START WITH or CONNECT BY.
- Optimization suggestions for START WITH:
  - Create indexes based on the conditions in the CONNECT BY clause to improve the performance of the START WITH clause.
  - Identify bottlenecks based on the plan collected by running EXPLAIN PERFORMANCE or in WDR. If the recursive operator (inner plan) of RECURSIVE UNION is the HASH JOIN operator, but the hash table is created for the temporary table tmp_result or the hash table in plan is materialized (that is, the batch size is greater than 1), the possible cause is that the value of work_mem is too small. As a result, the hash table cannot be created for the outer data table. You can increase the value of work_mem to improve performance.
    Note: GaussDB optimizes tables with a small volume of data and caches table results in hash tables to improve performance. In this case, indexes are not required. However, if the data volume exceeds the limit specified by work_mem, the optimization becomes invalid. In this case, you can create indexes for optimization.
You are advised not to use recursive query when creating a view definition. Otherwise, the obtained view definition is incorrect or an error is reported. This is because the bottom layer of the recursive query statement is changed to the recursive union statement, and the view display definition is obtained by reversely parsing the Recursive Union query tree. Therefore, after a view with recursive query is created, the view definition is the Recursive Union statement, instead of the original START WITH statement, or even the statement may not be valid. Although the functions of the current database view are not affected, the restore operation fails if you use tools such as gs_dump and gs_restore to migrate the data by viewing the view definitions. In this case, you need to manually create the view.
ORDER SIBLINGS BY clause
The output of the START WITH statement is returned level by level. However, there is no sequence guarantee at each level because the database automatically selects the optimal execution path during each round of query. In the preceding example, A is output first, but the sequence of B, C, and D is not fixed. If you have requirements on the final output sequence, you can use ORDER SIBLINGS BY. The usage of ORDER SIBLINGS BY is the same as that of ORDER BY. ORDER SIBLINGS BY is used for sorting at each level during recursion.

The expression following ORDER SIBLING BY only supports sorting by calling non-aggregate and window functions for common columns, column name offsets, and column names. It does not support calling system functions related to START WITH for column names or using pseudocolumns related to START WITH.
GROUP BY clause
Condenses query results into a single row all selected rows that share the same values for the grouped expressions.
- CUBE ( { expression | ( expression [, ...] ) } [, ...] )
  A CUBE grouping is an extension to the GROUP BY clause that creates subtotals for all of the possible combinations of the given list of grouping columns (or expressions). In terms of multidimensional analysis, CUBE generates all the subtotals that could be calculated for a data cube with the specified dimensions. For example, given three expressions (n=3) in the CUBE clause, the operation results in 2ⁿ = 2³ = 8 groupings. Rows grouped on the values of n expressions are called regular rows, and the rest are called superaggregate rows.
- GROUPING SETS ( grouping_element [, ...] )
  Another extension to the GROUP BY clause. It allows users to specify multiple GROUP BY clauses. This improves efficiency by trimming away unnecessary data. After you specify the required data group, the database does not need to compute a whole ROLLUP or CUBE.
- If a SELECT list expression quotes some ungrouped fields and no aggregate function is used, an error is displayed. This is because multiple values may be returned for ungrouped fields.
- If a SELECT list expression references a constant, the GROUP BY clause does not need to group the constant. Otherwise, an error is reported.
HAVING clause
Selects special groups by working with the GROUP BY clause. The HAVING clause compares some attributes of groups with a constant. Only groups that matching the logical expression in the HAVING clause are extracted.
WINDOW clause
The general format is WINDOW window_name AS ( window_definition ) [, ...]. window_name is a name can be referenced by window_definition. window_definition can be expressed in the following forms:
```
[ existing_window_name ]
[ PARTITION BY expression [, ...] ]
[ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ]
[ frame_clause ]
```
frame_clause defines a window frame for the window function. The window function (not all window functions) depends on window frame and window frame is a set of relevant rows of the current query row. frame_clause can be expressed in the following forms:
```
[ RANGE | ROWS ] frame_start
[ RANGE | ROWS ] BETWEEN frame_start AND frame_end
```
frame_start and frame_end can be expressed in the following forms:
```
UNBOUNDED PRECEDING
value PRECEDING
CURRENT ROW
value FOLLOWING
UNBOUNDED FOLLOWING
```
UNION clause
Computes the set union of the rows returned by the involved SELECT statements.

The UNION clause has the following constraints:
- By default, the result of UNION does not contain any duplicate rows unless the ALL clause is declared.
- Multiple UNION operators in the same SELECT statement are evaluated left to right, unless otherwise specified by parentheses.
- FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, and FOR KEY SHARE cannot be specified in the result or input of UNION.
- The GUC parameter enable_union_all_order determines whether to preserve the order of UNION ALL subqueries.
  - Enabling: set enable_union_all_order to on
  - Disabling: set enable_union_all_order to off
  - Querying: show enable_union_all_order
  When the enable_union_all_order parameter is enabled, subquery order preserving is supported if UNION is set to ALL and the main query is not sorted. In other cases, subquery order preserving is not supported.
General expression:

select_statement UNION [ALL] select_statement
- select_statement can be any SELECT statement without the ORDER BY, LIMIT, FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, or FOR KEY SHARE clause.
- ORDER BY and LIMIT can be attached to the subexpression if it is enclosed in parentheses.
INTERSECT clause
Computes the set intersection of rows returned by the involved SELECT statements. The result of INTERSECT does not contain any duplicate rows.

The INTERSECT clause has the following constraints:
- Multiple INTERSECT operators in the same SELECT statement are evaluated left to right, unless otherwise specified by parentheses.
- Processing INTERSECT preferentially when UNION and INTERSECT operations are executed for results of multiple SELECT statements.
General format:

select_statement INTERSECT select_statement

select_statement can be any SELECT statement without the FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, or FOR KEY SHARE clause.
EXCEPT clause
Has the following common form:

select_statement EXCEPT [ ALL ] select_statement

select_statement can be any SELECT statement without the FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, or FOR KEY SHARE clause.

The EXCEPT operator computes the set of rows that are in the result of the left SELECT statement but not in the result of the right one.

The result of EXCEPT does not contain any duplicate rows unless the ALL clause is declared. To execute ALL, a row that has m duplicates in the left table and n duplicates in the right table will appear MAX(m–n, 0) times in the result set.

Multiple EXCEPT operators in the same SELECT statement are evaluated left to right, unless parentheses dictate otherwise. EXCEPT binds at the same level as UNION.

Currently, the FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, and FOR KEY SHARE clauses cannot be specified for the result of EXCEPT or any input of EXCEPT.
MINUS clause
Has the same function and syntax as EXCEPT clause.
ORDER BY clause
Sorts data retrieved by SELECT in descending or ascending order. If the ORDER BY expression contains multiple columns:
- If two columns are equal according to the leftmost expression, they are compared according to the next expression and so on.
- If they are equal according to all specified expressions, they are returned in an implementation-dependent order.
- When used with the DISTINCT keyword, the columns to be sorted in ORDER BY must be included in the columns of the result set retrieved by the SELECT statement.
- When used with the GROUP BY clause, the columns to be sorted in ORDER BY must be included in the columns of the result set retrieved by the SELECT statement.
To support Chinese pinyin order, set the encoding format to UTF-8, GB18030, GB18030_2022, or GBK when initializing the database. The statements are as follows:
```
initdb –E UTF8 –D ../data –locale=zh_CN.UTF-8, initdb -E GB18030 -D ../data -locale=zh_CN.GB18030, initdb -E GB18030_2022 -D ../data -locale=zh_CN.GB18030, or initdb –E GBK –D ../data –locale=zh_CN.GBK.
```
LIMIT clause
Consists of two independent sub-clauses:

LIMIT { count | ALL }

OFFSET start count specifies the maximum number of rows to return, while start specifies the number of rows to skip before starting to return rows. When both are specified, start rows are skipped before starting to count the count rows to be returned.

ROWNUM cannot be used as count or offset in the LIMIT clause.
OFFSET clause
The SQL: 2008 standard has introduced a different clause:

OFFSET start { ROW | ROWS }

start specifies the number of rows to skip before starting to return rows.
FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY
If count is omitted in a FETCH clause, it defaults to 1.
Locking clause
The FOR UPDATE clause locks the rows retrieved by SELECT. This prevents these rows from being modified or deleted by other transactions before the current transaction ends. That is, other transactions that attempt to run UPDATE, DELETE, SELECT FOR UPDATE, SELECT FOR NO KEY UPDATE, SELECT FOR SHARE, or SELECT FOR KEY SHARE for these rows will be blocked until the current transaction ends. Any DELETE on a row will also acquire the FOR UPDATE locking mode, as will UPDATE that modifies values on the primary key column. Conversely, SELECT FOR UPDATE waits for concurrent transactions that have run the preceding commands on the same row, and then locks and returns the updated row (there may be no row because the row may have been deleted).

FOR NO KEY UPDATE behaves similarly to FOR UPDATE, except that it acquires a weaker lock that will not block SELECT FOR KEY SHARE that attempts to acquire the lock on the same row. Any UPDATE that does not acquire the FOR UPDATE lock will also acquire this locking mode.

FOR SHARE behaves similarly, except that it acquires a shared rather than exclusive lock on each retrieved row. A shared lock blocks other transactions from executing UPDATE, DELETE, SELECT FOR UPDATE, or SELECT FOR NO KEY UPDATE, but does not block SELECT FOR SHARE or SELECT FOR KEY SHARE.

FOR KEY SHARE is similar to FOR SHARE except that its lock is weak. SELECT FOR UPDATE is blocked but SELECT FOR NO KEY UPDATE is not blocked. A key-shared lock blocks other transactions from executing DELETE or UPDATE that modifies the key value, but does not block UPDATE, SELECT FOR NO KEY UPDATE, SELECT FOR SHARE, or SELECT FOR KEY SHARE.

To prevent the operation from waiting for the commit of other transactions, you can use NOWAIT. If the selected row cannot be locked immediately, an error is reported immediately and there is no waiting. If you use WAIT N and the selected row cannot be locked immediately, the operation needs to wait for N seconds (the value of N is of the int type with a range of 0 ≤ N ≤ 2147483). If the lock is obtained within N seconds, the operation is performed normally. Otherwise, an error is reported.

If specified tables are named in a locking clause, then only rows coming from those tables are locked; any other tables used in SELECT are simply read as usual. Otherwise, locking all tables in the statement.

If a locking clause is applied to a view or sub-query, it affects all tables used in the view or sub-query.

Multiple locking clauses can be written if it is necessary to specify different locking behaviors for different tables.

If a table appears (or implicitly appears) in multiple clauses at the same time, the strongest lock is used. Similarly, a table is processed as NOWAIT if that is specified in any of the clauses affecting it.
- Only FOR SHARE and FOR UPDATE can be used to query the Ustore table.
- For the FOR UPDATE/SHARE statements whose subquery is a stream plan, the same locked row cannot be concurrently updated.
NLS_SORT
Specifies that a field is sorted in a special order. Currently, only Chinese Pinyin and case-insensitive sorting are supported. To support this sorting mode, you need to set the encoding format to UTF8, GB18030, GB18030_2022, or GBK when creating a database. If you set the encoding format to another format, for example, SQL_ASCII, an error may be reported or the sorting mode may be invalid.

Value range:
- SCHINESE_PINYIN_M, sorted by Pinyin order.
- generic_m_ci: sorted in case-insensitive order (optional; only English characters are supported in the case-insensitive order.)
PARTITION clause
Queries data in the specified partition in a partitioned table.

Examples

-- Create a schema.
gaussdb=# CREATE SCHEMA tpcds;

-- Create the tpcds.reason table.
gaussdb=# CREATE TABLE tpcds.reason
(
  r_reason_sk      integer,
  r_reason_id      character(16),
  r_reason_desc    character(100)
);

-- Insert multiple records into the table.
gaussdb=# INSERT INTO tpcds.reason values(3,'AAAAAAAABAAAAAAA','reason 1'),(10,'AAAAAAAABAAAAAAA','reason 2'),(4,'AAAAAAAABAAAAAAA','reason 3'),(10,'AAAAAAAABAAAAAAA','reason 4'),(10,'AAAAAAAABAAAAAAA','reason 5'),(20,'AAAAAAAACAAAAAAA','N%reason 6'),(30,'AAAAAAAACAAAAAAA','W%reason 7');

-- Obtain the temp_t temporary table by a subquery and query all records in this table.
gaussdb=# WITH temp_t(name,isdba) AS (SELECT usename,usesuper FROM pg_user) SELECT * FROM temp_t;

-- Query all r_reason_sk records in the tpcds.reason table and delete duplicate records.
gaussdb=# SELECT DISTINCT(r_reason_sk) FROM tpcds.reason;

-- Example of a LIMIT clause: Obtain a record from the table.
gaussdb=# SELECT * FROM tpcds.reason LIMIT 1;

-- Query all records and sort them in alphabetic order.
gaussdb=# SELECT r_reason_desc FROM tpcds.reason ORDER BY r_reason_desc;

-- Use table aliases to obtain data from the pg_user and pg_user_status tables:
gaussdb=# SELECT a.usename,b.locktime FROM pg_user a,pg_user_status b WHERE a.usesysid=b.roloid;

-- Example of the FULL JOIN clause: Join data in the pg_user and pg_user_status tables.
gaussdb=# SELECT a.usename,b.locktime,a.usesuper FROM pg_user a FULL JOIN pg_user_status b on a.usesysid=b.roloid;

-- Example of the GROUP BY clause: Filter data based on query conditions, and group the results.
gaussdb=# SELECT r_reason_id, AVG(r_reason_sk) FROM tpcds.reason GROUP BY r_reason_id HAVING AVG(r_reason_sk) > 25;

-- Example of the GROUP BY CUBE clause: Filter data based on query conditions, and group the results.
gaussdb=# SELECT r_reason_id,AVG(r_reason_sk) FROM tpcds.reason GROUP BY CUBE(r_reason_id,r_reason_sk);

-- Example of the GROUP BY GROUPING SETS clause: Filter data based on query conditions, and group the results.
gaussdb=# SELECT r_reason_id,AVG(r_reason_sk) FROM tpcds.reason GROUP BY GROUPING SETS((r_reason_id,r_reason_sk),r_reason_sk);

-- Example of the UNION clause: Merge the names started with W and N in the r_reason_desc column in the tpcds.reason table.
gaussdb=# SELECT r_reason_sk, tpcds.reason.r_reason_desc
    FROM tpcds.reason
    WHERE tpcds.reason.r_reason_desc LIKE 'W%'
UNION
SELECT r_reason_sk, tpcds.reason.r_reason_desc
    FROM tpcds.reason
    WHERE tpcds.reason.r_reason_desc LIKE 'N%';

-- Example of the NLS_SORT clause: Sort by Chinese Pinyin.
gaussdb=# SELECT * FROM tpcds.reason ORDER BY NLSSORT( r_reason_desc, 'NLS_SORT = SCHINESE_PINYIN_M');


-- sorting by case-insensitive order (optional; only English characters are supported in the case-insensitive order.)
gaussdb=# SELECT * FROM tpcds.reason ORDER BY NLSSORT( r_reason_desc, 'NLS_SORT = generic_m_ci');

-- Create partitioned table tpcds.reason_p.
gaussdb=# CREATE TABLE tpcds.reason_p
(
  r_reason_sk integer,
  r_reason_id character(16),
  r_reason_desc character(100)
)
PARTITION BY RANGE (r_reason_sk)
(
  partition P_05_BEFORE values less than (05),
  partition P_15 values less than (15),
  partition P_25 values less than (25),
  partition P_35 values less than (35),
  partition P_45_AFTER values less than (MAXVALUE)
)
;

-- Insert data.
gaussdb=# INSERT INTO tpcds.reason_p values(3,'AAAAAAAABAAAAAAA','reason 1'),(10,'AAAAAAAABAAAAAAA','reason 2'),(4,'AAAAAAAABAAAAAAA','reason 3'),(10,'AAAAAAAABAAAAAAA','reason 4'),(10,'AAAAAAAABAAAAAAA','reason 5'),(20,'AAAAAAAACAAAAAAA','reason 6'),(30,'AAAAAAAACAAAAAAA','reason 7');

-- Example of the PARTITION clause: Obtain data from the P_05_BEFORE partition in the tpcds.reason_p table.
gaussdb=#  SELECT * FROM tpcds.reason_p PARTITION (P_05_BEFORE);
 r_reason_sk |   r_reason_id    |   r_reason_desc                   
-------------+------------------+------------------------------------
           4 | AAAAAAAABAAAAAAA | reason 3                          
           3 | AAAAAAAABAAAAAAA | reason 1                          
(2 rows)

-- Example of the GROUP BY clause: Group records in the tpcds.reason_p table by r_reason_id, and count the number of records in each group.
gaussdb=# SELECT COUNT(*),r_reason_id FROM tpcds.reason_p GROUP BY r_reason_id;
 count |   r_reason_id    
-------+------------------
     2 | AAAAAAAACAAAAAAA
     5 | AAAAAAAABAAAAAAA
(2 rows)

-- Example of the GROUP BY CUBE clause: Filter data based on query conditions, and group the results.
gaussdb=# SELECT * FROM tpcds.reason GROUP BY  CUBE (r_reason_id,r_reason_sk,r_reason_desc);

-- Example of the GROUP BY GROUPING SETS clause: Filter data based on query conditions, and group the results.
gaussdb=# SELECT * FROM tpcds.reason GROUP BY  GROUPING SETS ((r_reason_id,r_reason_sk),r_reason_desc);

-- Example of the HAVING clause: Group records in the tpcds.reason_p table by r_reason_id, count the number of records in each group, and display only values whose number of r_reason_id is greater than 2.
gaussdb=# SELECT COUNT(*) c,r_reason_id FROM tpcds.reason_p GROUP BY r_reason_id HAVING c>2;
 c |   r_reason_id    
---+------------------
 5 | AAAAAAAABAAAAAAA
(1 row)

-- Example of the IN clause: Group records in the tpcds.reason_p table by r_reason_id, count the number of records in each group, and display only the numbers of records whose r_reason_id is AAAAAAAABAAAAAAA or AAAAAAAADAAAAAAA.
gaussdb=# SELECT COUNT(*),r_reason_id FROM tpcds.reason_p GROUP BY r_reason_id HAVING r_reason_id IN('AAAAAAAABAAAAAAA','AAAAAAAADAAAAAAA'); 
count |   r_reason_id    
-------+------------------
     5 | AAAAAAAABAAAAAAA
(1 row)

-- Example of the INTERSECT clause: Query records whose r_reason_id is AAAAAAAABAAAAAAA and whose r_reason_sk is smaller than 5.
gaussdb=# SELECT * FROM tpcds.reason_p WHERE r_reason_id='AAAAAAAABAAAAAAA' INTERSECT SELECT * FROM tpcds.reason_p WHERE r_reason_sk<5;
 r_reason_sk |   r_reason_id    |     r_reason_desc                 
-------------+------------------+------------------------------------
           4 | AAAAAAAABAAAAAAA | reason 3                           
           3 | AAAAAAAABAAAAAAA | reason 1                           
(2 rows)

-- Example of the EXCEPT clause: Query records whose r_reason_id is AAAAAAAABAAAAAAA and whose r_reason_sk is greater than or equal to 4.
gaussdb=# SELECT * FROM tpcds.reason_p WHERE r_reason_id='AAAAAAAABAAAAAAA' EXCEPT SELECT * FROM tpcds.reason_p WHERE r_reason_sk<4;
r_reason_sk |   r_reason_id    |      r_reason_desc                  
-------------+------------------+------------------------------------
          10 | AAAAAAAABAAAAAAA | reason 2                          
          10 | AAAAAAAABAAAAAAA | reason 5                          
          10 | AAAAAAAABAAAAAAA | reason 4                          
           4 | AAAAAAAABAAAAAAA | reason 3                          
(4 rows)

-- Specify the operator (+) in the WHERE clause to indicate a left join.
gaussdb=# select t1.sr_item_sk ,t2.c_customer_id from store_returns t1, customer t2 where t1.sr_customer_sk  = t2.c_customer_sk(+) 
order by 1 desc limit 1;
 sr_item_sk | c_customer_id
------------+---------------
      18000 |
(1 row)

-- Specify the operator (+) in the WHERE clause to indicate a right join.
gaussdb=# select t1.sr_item_sk ,t2.c_customer_id from store_returns t1, customer t2 where t1.sr_customer_sk(+)  = t2.c_customer_sk 
order by 1 desc limit 1;
 sr_item_sk |  c_customer_id
------------+------------------
            | AAAAAAAAJNGEBAAA
(1 row)

-- Specify the operator (+) in the WHERE clause to indicate a left join and add a join condition.
gaussdb=# select t1.sr_item_sk ,t2.c_customer_id from store_returns t1, customer t2 where t1.sr_customer_sk  = t2.c_customer_sk(+) and t2.c_customer_sk(+) < 1 order by 1  limit 1;
 sr_item_sk | c_customer_id
------------+---------------
          1 |
(1 row)

-- If the operator (+) is specified in the WHERE clause, do not use expressions connected through AND/OR.
gaussdb=# select t1.sr_item_sk ,t2.c_customer_id from store_returns t1, customer t2 where not(t1.sr_customer_sk  = t2.c_customer_sk(+) and t2.c_customer_sk(+) < 1);
ERROR:  Operator "(+)" can not be used in nesting expression.
LINE 1: ...tomer_id from store_returns t1, customer t2 where not(t1.sr_...
                                                             ^
-- If the operator (+) is specified in the WHERE clause which does not support expression macros, an error will be reported.
gaussdb=# select t1.sr_item_sk ,t2.c_customer_id from store_returns t1, customer t2 where (t1.sr_customer_sk  = t2.c_customer_sk(+))::bool;
ERROR:  Operator "(+)" can only be used in common expression.

-- If the operator (+) is specified on both sides of an expression in the WHERE clause, an error will be reported.
gaussdb=# select t1.sr_item_sk ,t2.c_customer_id from store_returns t1, customer t2 where t1.sr_customer_sk(+)  = t2.c_customer_sk(+);
ERROR:  Operator "(+)" can't be specified on more than one relation in one join condition
HINT:  "t1", "t2"...are specified Operator "(+)" in one condition.

-- Delete the table.
gaussdb=# DROP TABLE tpcds.reason_p;

-- Example of a flashback query
-- Create the tpcds.time_table table.
gaussdb=#  create table tpcds.time_table(idx integer, snaptime timestamp, snapcsn bigint, timeDesc character(100));
-- Insert records into the tpcds.time_table table.
gaussdb=#  INSERT INTO tpcds.time_table select 1, now(),int8in(xidout(next_csn)), 'time1' from gs_get_next_xid_csn();
gaussdb=#  INSERT INTO tpcds.time_table select 2, now(),int8in(xidout(next_csn)), 'time2' from gs_get_next_xid_csn();
gaussdb=#  INSERT INTO tpcds.time_table select 3, now(),int8in(xidout(next_csn)), 'time3' from gs_get_next_xid_csn();
gaussdb=#  INSERT INTO tpcds.time_table select 4, now(),int8in(xidout(next_csn)), 'time4' from gs_get_next_xid_csn();
gaussdb=#  select * from tpcds.time_table;

 idx |          snaptime          | snapcsn |                                               timedesc
-----+----------------------------+---------+------------------------------------------------------------------------------------------------------
   1 | 2021-04-25 17:50:05.360326 |  107322 | time1
   2 | 2021-04-25 17:50:10.886848 |  107324 | time2
   3 | 2021-04-25 17:50:16.12921  |  107327 | time3
   4 | 2021-04-25 17:50:22.311176 |  107330 | time4
(4 rows)
gaussdb=#  delete tpcds.time_table;
DELETE 4
gaussdb=#  SELECT * FROM tpcds.time_table TIMECAPSULE TIMESTAMP to_timestamp('2021-04-25 17:50:22.311176','YYYY-MM-DD HH24:MI:SS.FF');
 idx |          snaptime          | snapcsn |                                               timedesc
-----+----------------------------+---------+------------------------------------------------------------------------------------------------------
   1 | 2021-04-25 17:50:05.360326 |  107322 | time1
   2 | 2021-04-25 17:50:10.886848 |  107324 | time2
   3 | 2021-04-25 17:50:16.12921  |  107327 | time3
(3 rows)
gaussdb=#  SELECT * FROM tpcds.time_table TIMECAPSULE CSN 107330;
 idx |          snaptime          | snapcsn |                                               timedesc
-----+----------------------------+---------+------------------------------------------------------------------------------------------------------
   1 | 2021-04-25 17:50:05.360326 |  107322 | time1
   2 | 2021-04-25 17:50:10.886848 |  107324 | time2
   3 | 2021-04-25 17:50:16.12921  |  107327 | time3
(3 rows)

-- Example of a WITH RECURSIVE query: Calculate the accumulated value from 1 to 100.
gaussdb=#  WITH RECURSIVE t1(a) as (
 select 100
),
t(n) AS (
    VALUES (1)
  UNION ALL
    SELECT n+1 FROM t WHERE n < (select max(a) from t1)
)
SELECT sum(n) FROM t;
 sum
------
 5050
(1 row)

-- Example of the UNPIVOT clause: Convert the MATH and PHY columns in the table p1 to the rows (class, score).
gaussdb=# CREATE TABLE p1(id int, math int, phy int);
gaussdb=# INSERT INTO p1 values(1,20,30);
gaussdb=# INSERT INTO p1 values(2,30,40);
gaussdb=# INSERT INTO p1 values(3,40,50);
gaussdb=# SELECT * FROM p1;
 id | math | phy
----+------+-----
  1 |   20 |  30
  2 |   30 |  40
  3 |   40 |  50
(3 rows)

gaussdb=# SELECT * FROM p1 UNPIVOT(score FOR class IN(math, phy));
 id | class | score
----+-------+-------
  1 | MATH  |    20
  1 | PHY   |    30
  2 | MATH  |    30
  2 | PHY   |    40
  3 | MATH  |    40
  3 | PHY   |    50
(6 rows)

-- Example of the PIVOT clause: Convert the rows (class, score) in the table p2 to the math and phy columns.
gaussdb=# CREATE TABLE p2(id int, class varchar(10), score int);
gaussdb=# INSERT INTO p2 SELECT * FROM p1 UNPIVOT(score FOR class IN(math, phy));
gaussdb=# SELECT * FROM p2;
 id | class | score
----+-------+-------
  1 | MATH  |    20
  1 | PHY   |    30
  2 | MATH  |    30
  2 | PHY   |    40
  3 | MATH  |    40
  3 | PHY   |    50
(6 rows)

gaussdb=# SELECT * FROM p2 PIVOT(max(score) FOR class IN ('MATH', 'PHY'));
 id | 'MATH' | 'PHY'
----+--------+-------
  1 |     20 |    30
  3 |     40 |    50
  2 |     30 |    40
(3 rows)

-- Delete the table.
gaussdb=# DROP TABLE tpcds.reason;

-- Delete a schema.
gaussdb=# DROP SCHEMA tpcds CASCADE;

Parent topic: SQL Syntax

Previous topic: SAVEPOINT

Next topic: SELECT INTO

Feedback

Was this page helpful?

Helpful Not helpful

Provide feedback

Thank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.

The system is busy. Please try again later.

Which of the following issues have you encountered?

Content is inconsistent with the product UI

Unclear descriptions

Lack of examples or code

Incorrect steps

Can't find what I need

Lack of best practices

Feedback (optional)

0/500

Select at least one type of issue, and enter your comments or suggestions.

Enter a maximum of 500 characters.

Submit Cancel

For any further questions, feel free to contact us through the chatbot.

Chatbot