Help Center/ GaussDB/ M-Compatibility Developer Guide(Distributed)/ SQL Reference/ SQL Syntax/ SQL Statements/ S/ SELECT

Updated on 2025-10-23 GMT+08:00

View PDF

SELECT

Description

SELECT retrieves data from tables or views.

Serving as an overlaid filter on database tables, the SELECT statement uses SQL keywords to filter data tables and extract the required data.

Precautions

The owner of the table, users granted the SELECT permission on the table, and users granted the SELECT ANY TABLE permission can read data in the table or view. System administrators have the permission to read data in the table or view by default when separation of duties is disabled.
SELECT can join ordinary tables, but cannot join ordinary and GDS foreign tables. That is, the SELECT statement cannot contain both an ordinary table and a GDS foreign table.
You must have the SELECT permission on all columns in the SELECT statement.

The UPDATE permission is required for the use of FOR UPDATE or FOR SHARE.

Syntax

Query data.

[ WITH [ RECURSIVE ] with_query [, ...] ]
SELECT [/*+ plan_hint */]
    [ALL | DISTINCT | DISTINCTROW]
    [SQL_NO_CACHE]
    select_expr [, select_expr] ...
    [FROM from_item [,...]]
    [WHERE where_condition]
    [GROUP BY {col_name | expr | position}]
    [HAVING condition]
    [ORDER BY {col_name | expression | position}
      [ASC | DESC][ NULLS { FIRST | LAST } ], ...]
    [[LIMIT {[offset,] row_count | row_count OFFSET offset}] | FETCH {FIRST | NEXT} [(expr)] {ROW| ROWS} ONLY]
    [into_option]
    [ FOR READ ONLY | 
        [ {FOR { UPDATE | SHARE } [ OF table_name [, ...] ] [ NOWAIT | WAIT N | SKIP LOCKED ]} [...] ] |
        LOCK IN SHARE MODE]

select_expr:
    expression [ [AS] output_name ]

where_condition:
expr

SELECT ... UNION:

SELECT ... 
     UNION [ALL | DISTINCT] SELECT ... 
     [UNION [ALL | DISTINCT] SELECT ...]

Restrictions on using aliases in the SELECT clause in the WHERE, HAVING, and GROUP BY clauses are as follows:

Table creation statements are initialized.

m_db=# DROP TABLE IF EXISTS t0,t1;
m_db=# CREATE TABLE t0 (c0 INT, c1 INT);
m_db=# CREATE TABLE t1 (c0 INT, c1 INT);

Reference only within the same level.

m_db=# SELECT r FROM t0 WHERE (select c0 as r from t0) = 1; -- error

Reference only the aliases in the target list.

m_db=# SELECT * FROM t0 WHERE (SELECT c0 AS r FROM t0) = 1 AND r = 2; -- error

Reference a prior expression in a subsequent expression.
```
m_db=# SELECT r AS d, c0 AS r FROM t0; -- error
```

The volatile function cannot be used.

m_db=# SELECT uuid() AS r FROM t0 WHERE r = 1; -- error

The Window function cannot be used.

m_db=# SELECT pg_catalog.rank() over(PARTITION BY c0 ORDER BY c1) AS r FROM t0 WHERE r = 1; -- error

Aliases cannot be referenced in the condition of JOIN ON.

m_db=# SELECT uuid() AS r FROM t0 JOIN t1 ON r = 1; -- error

An error is reported if the target list contains multiple aliases to be referenced.
```
m_db=# SELECT c0 AS r AS u AS v FROM t0; -- error
```

The subquery with_query is as follows:

with_query_name [ ( column_name [, ...] ) ]
    AS ( {select} )

The specified query source from_item is as follows:

{ table_name  [ partition_clause ] [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
|( select ) [ AS ] alias [ ( column_alias [, ...] ) ]
|with_query_name [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
|function_name ( [ argument [, ...] ] ) [ AS ] alias [ ( column_alias [, ...] | column_definition [, ...] ) ]
|function_name ( [ argument [, ...] ] ) AS ( column_definition [, ...] )
|joined_table

table_name contains the virtual table dual. An M-compatible database allows you to specify dual as the virtual table name when no table is referenced.

joined_table is as follows:

joined_table: {
   table_reference [INNER | CROSS] JOIN table_factor [join_specification]
  | table_reference {LEFT|RIGHT} [OUTER] JOIN table_reference join_specification
  | table_reference NATURAL [{LEFT|RIGHT} [OUTER]] JOIN table_factor
}
join_specification: {
   ON join_condition
  | USING (join_column_list)
}
join_column_list:
    column_name [, column_name] ...

The GROUP clause is as follows:
```
| expression
| ( expression [, ...] )
```
The specified partition partition_clause in from_item is as follows:
```
PARTITION { ( { partition_name | subpartition_name } [, ...] ) | FOR (  partition_value [, ...] ) }
```
- Specifying partitions applies only to partitioned tables.
- If PARTITION specifies multiple partition names, the partition names can be the same. The union set of the partition ranges is used.

Parameters

plan_hint
Follows the SELECT keyword in the /*+ */ format. It is used to optimize the plan of a SELECT statement block based on hints. In each statement, only the first /*+ plan_hint */ comment block takes effect as a hint. Multiple hints can be written.

WITH [ RECURSIVE ] with_query [, ...]
Specifies one or more subqueries that can be referenced by name in the main query, which are equivalent to temporary tables. This subquery statement structure is called the common table expression (CTE) structure. When this structure is used, the execution plan contains the CTE SCAN content.

If RECURSIVE is specified, it allows a SELECT subquery to reference itself by name.

The detailed format for with_query is as follows:
```
with_query_name [ ( column_name [, ...] ) ] AS ( {select} )
```
- with_query_name specifies the name of the result set generated by a subquery. Such names can be used to access the result sets of subqueries in a query.
- column_name specifies the column names displayed in the subquery result set.
- You can use SELECT in the subqueries.
- RECURSIVE can appear only after WITH. If there are multiple CTEs, you only need to specify RECURSIVE at the first CTE.
- When using RECURSIVE, the data types of the subquery results on the left and right of UNION ALL or UNION [DISTINCT] in the CTE subquery must be converted to the same data type using the CAST function. For example, WITH RECURSIVE cte (n) AS ( SELECT cast(id as signed int) from table_1 UNION ALL SELECT cast((n + 1) as signed int) FROM cte WHERE n < 5 ) SELECT * FROM cte. For details about the type conversion generated by operators, see Specification Constraints of Logical Operators, Specification Constraints of Bitwise Operators, and Specification Constraints of Arithmetic Operators.
ALL
Specifies that all rows that meet the conditions are returned. The keyword can be omitted because all rows that meet the conditions are returned by default.
DISTINCT | DISTINCTROW
Removes all duplicate rows from the result set of the SELECT statement so that each row in the result set is unique.

The DISTINCT|DISTINCTROW expression is interpreted using the same rules as ORDER BY. Unless you use ORDER BY to guarantee that the required row appears first, you cannot know what the first row is.
SELECT list
Specifies the column names in the queried table. It can be the names of some or all of the columns. To specify all columns, use the wildcard (*).

You may use the AS output_name clause to assign an alias to the output column. The alias is usually used for displaying the output column. The keywords NAME, VALUE, and TYPE can be used as column aliases.

Column names can be:
- Manually input column names, where multiple names are separated by commas (,).
- Columns computed by the FROM clause.
FROM clause
Specifies one or more source tables for SELECT.

The FROM clause contains the following elements:
- table_name
  Specifies the name of a table or view. The schema name can be added before the table name or view name, for example, schema_name.table_name.
- alias
  Assigns a temporary alias to a table or a complex table reference so that the reference of the table by other queries is easier.
  
  An alias is used for abbreviation or eliminating ambiguity in self-joins. If an alias is assigned, it completely hides the actual name of the table.
- column_alias
  Specifies the column alias.
- PARTITION
  Queries data in the specified partition in a partitioned table.
- partition_name
  Specifies the name of the partition.
- partition_value
  Specifies the partition key value. If a partitioned table has multiple partition keys, you can use PARTITION FOR to uniquely identify a partition.
- subquery
  Performs a subquery in the FROM clause. A temporary table is created to save the output of the subquery.
- with_query_name
  The WITH clause can also be the source of the FROM clause and can be referenced by the name of the WITH query.
- function_name
  Specifies the function name. Functions can be used in a FROM clause.
- The types of JOIN are as follows:
  - [ INNER ] JOIN
    A JOIN clause combines two FROM items. You can use parentheses to determine the order of nesting. In the absence of parentheses, JOIN nests left-to-right.
    
    In any case, JOIN binds more tightly than FROM items that are separated by commas.
  - LEFT [ OUTER ] JOIN
    Returns all rows that meet the join conditions in the Cartesian product, as well as the rows in the left table that do not match the right table rows according to join conditions. Therefore, the rows on the left represent the full length of the joined table, where the corresponding columns in the right table are filled with NULL. Note that the condition of the JOIN clause is used only when matches are being calculated. The condition of the outer layer is applied after the calculation is complete.
  - RIGHT [ OUTER ] JOIN
    Returns all result rows of INNER JOIN, as well as all rows on the right that do not have matches (rows on the left are filled with NULL).
    
    This is just a notational convenience because you can always convert it to a LEFT OUTER JOIN by switching the left and right inputs.
  - CROSS JOIN
    CROSS JOIN is equivalent to INNER JOIN ON(TRUE), that is, rows that are not deleted by the conditions. This type of JOIN has the same effect as FROM and WHERE, and is more convenient in symbol usage.
    
    You must specify a join condition for INNER JOIN and OUTER JOIN, that is, one of the following: NATURAL ON, join_condition, and USING (join_column [, ...]).
  CROSS JOIN and INNER JOIN produce a simple Cartesian product, the same result as you get from listing the two items at the top level of FROM.
  
  You can use multiple JOIN clauses consecutively.
- ON join_condition
  Specifies the join condition, defining which rows match. Example: ON left_table.a = right_table.a You are advised not to use numeric types such as int for join_condition, because such types can be implicitly converted to Boolean values (non-zero values are implicitly converted to true and 0 is implicitly converted to false), which may cause unexpected results.
- USING(join_column[, ...])
  Abbreviation for ON left_table.a = right_table.a AND left_table.b = right_table.b ... Corresponding columns must have the same name.
- NATURAL
  A shorthand for a USING list that contains all columns in the two tables that have the same names.
- from item
  Specifies the name of the query source object used for the join.
WHERE clause
The WHERE clause forms an expression for row selection to narrow down the query range of SELECT. condition indicates any expression that returns a value of Boolean type. Rows that do not meet this condition will not be retrieved. You are advised not to use numeric types such as int for condition, because such types can be implicitly converted to Boolean values (non-zero numeric values are implicitly converted to true and 0 is implicitly converted to false), which may cause unexpected results.

When using a LIKE operator in a WHERE clause, add a slash (\) before each special character if any of the following is queried in LIKE: %_\
GROUP BY clause
Groups the query results based on the values of one or more columns. The query results with the same value are in the same group.

If a SELECT list expression references ungrouped columns and no aggregate function is used, an error is displayed. This is because multiple values may be returned for ungrouped columns.
HAVING clause
Used together with the GROUP BY clause to select special groups. The HAVING clause compares some attributes of groups with a constant. Only groups that match the logical expression in the HAVING clause are extracted. HAVING can only reference columns in the GROUP BY clause or columns used in aggregate functions.
UNION clause
Computes the set union of the rows returned by multiple SELECT statements.

The UNION clause has the following constraints:
- The default result of UNION does not contain any duplicate rows unless the ALL clause is specified.
- UNION operators in the same SELECT statement are evaluated from left to right unless specified by parentheses.
- FOR UPDATE cannot be specified either for a UNION result or for any input of a UNION.
General expression:
```
select_statement UNION [ALL] select_statement
```
- select_statement can be any SELECT statement without an ORDER BY, LIMIT, or FOR UPDATE statement.
- ORDER BY and LIMIT can be attached to subexpressions if enclosed in parentheses.
ORDER BY clause
Sorts data retrieved by SELECT in ascending or descending order. If the ORDER BY expression contains multiple columns:
- The values in the leftmost column are sorted first. If the values in this column are the same, they are then compared based on the next expression until the comparison is complete.
- If the values are equal for all specified expressions, they are returned in a random order.
- When used with the DISTINCT keyword, the columns to be sorted in ORDER BY must be included in the columns of the result set retrieved by the SELECT statement.
- The columns to be sorted by the ORDER BY clause where the GROUP BY clause is also used must be included in the columns grouped by GROUP BY.
If the SELECT result set contains aggregate functions, the ORDER BY clause, not used together with GROUP BY, takes effect only for the columns where the functions that return the set are located.
```
m_db=# CREATE TABLE t1(c1 int, c2 int);
m_db=# EXPLAIN SELECT sum(c1) c FROM t1 ORDER BY c, c2; -- Ignores ORDER BY.
m_db=# EXPLAIN SELECT sum(c1) c, generate_series (1,5) s FROM t1 ORDER BY c, c2; -- Ignores ORDER BY.
The m_db=# EXPLAIN SELECT sum(c1) c, generate_series (1,5) s FROM t1 ORDER BY c, s; -- Sorts the s column in the result.
```
- NULLS FIRST
  Specifies that null values appear before non-null values in the sort ordering. This is the default when DESC is specified.
- NULLS LAST
  Specifies that null values appear after non-null values in the sort ordering. This is the default when DESC is not specified.
To support Chinese Pinyin sorting, you need to specify the gb18030 and GBK character sets when initializing the M-compatible database. For details, see Database-level Character Sets and Collations. In addition, you need to specify the gb18030_chinese_ci and gbk_chinese_ci collations during comparison.
LIMIT clause
Consists of two independent clauses:

LIMIT { count | ALL } limits the number of rows to be returned. count specifies the number of rows to be returned. The effect of LIMIT ALL is the same as that of omitting the LIMIT clause.

LIMIT start count specifies the maximum number of rows to be returned, while start specifies the number of rows to skip before starting to return rows. When both are specified, start rows are skipped before count rows to be returned start to be counted. ROWNUM cannot be used as count or offset in the LIMIT clause.
FETCH {FIRST | NEXT} [(expr)] {ROW| ROWS} ONLY
If count is omitted in a FETCH clause, it defaults to 1, indicating that the clause specifies the total number of rows counting from the first row of the returned query result.

This attribute is not supported when the version compatibility control parameter m_format_dev_version of an M-compatible database is set to 's1' or later.
OFFSET clause
SQL: A different syntax was proposed in 2008.

OFFSET start { ROW | ROWS }

start specifies the number of rows to skip before starting to return rows.
Locking clause
The FOR UPDATE clause locks the rows retrieved by SELECT. This ensures that the rows cannot be modified or deleted by other transactions until the current transaction ends, that is, other transactions that attempt to perform UPDATE, DELETE, SELECT FOR UPDATE, or SELECT FOR SHARE on these rows will be blocked until the current transaction ends. Any DELETE on the same row will also acquire the FOR UPDATE lock mode, as will UPDATE that modifies values on the primary key column. Conversely, SELECT FOR UPDATE will wait for concurrent transactions that are running the preceding commands on the same row, and will continue to lock and return the updated rows (there may be no row because the rows may be deleted).

FOR SHARE behaves similarly, except that it requires a shared lock rather than an exclusive lock on each retrieved row. A shared lock prevents other transactions from performing UPDATE, DELETE, or SELECT FOR UPDATE on the rows, but does not prevent them from performing SELECT FOR SHARE.

To prevent the operation from waiting for the commit of other transactions, you can use NOWAIT. If the selected row cannot be locked immediately, an error is reported immediately. If you use WAIT n and the selected row cannot be locked immediately, the operation needs to wait for n seconds (the value of n is an integer ranging from 0 to 2147483). If the lock is obtained within n seconds, the operation is performed normally. Otherwise, an error is reported. If you use SKIP LOCKED, locked rows are skipped when a table is locked. Only rows that are locked using a row lock can be skipped. In scenarios where locks do not block each other in different transactions, for example, SELECT FOR SHARE - SELECT FOR SHARE, rows that are locked with preceding locks are not skipped though SKIP LOCKED is specified.

If table names are specified in a locking clause, then only those tables will be locked. Other tables used in SELECT will not be locked. Otherwise, all tables used in the command will be locked.

If a locking clause is applied to a view or subquery, it will lock all tables used in the view or subquery.

Multiple locking clauses can be used to specify different lock modes for different tables.

If a table appears (or implicitly appears) in multiple clauses, the strongest lock is used. Similarly, if there is NOWAIT in any of the clauses that affect a table, then NOWAIT will apply to the table.
- The LOCK IN SHARE MODE clause is equivalent to the FOR SHARE clause in which no other options are specified.
- Only FOR SHARE/FOR UPDATE/FOR READ ONLY/LOCK IN SHARE MODE can be used in querying the Ustore table.
- For the FOR UPDATE or FOR SHARE statements whose subquery is a stream plan, the same row that is locked cannot be concurrently updated.
- SELECT FOR UPDATE and SELECT FOR SHARE cannot be used together with UNION, EXCEPT, DISTINCT, GROUP BY, or HAVING.
PARTITION clause
Queries data in the corresponding partition in a partitioned table.

Parent topic: S

Previous topic: SAVEPOINT

Next topic: Simple Queries

Feedback

Was this page helpful?

Helpful Not helpful

Provide feedback

Thank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.

The system is busy. Please try again later.

Which of the following issues have you encountered?

Content is inconsistent with the product UI

Unclear descriptions

Lack of examples or code

Incorrect steps

Can't find what I need

Lack of best practices

Feedback (optional)

0/500

Select at least one type of issue, and enter your comments or suggestions.

Enter a maximum of 500 characters.

Submit Cancel

For any further questions, feel free to contact us through the chatbot.

Chatbot