Updated on 2023-10-23 GMT+08:00

INSERT

Function

INSERT inserts new rows into a table.

Precautions

  • The owner of a table, users granted with the INSERT permission on the table, or users granted with the INSERT ANY TABLE permission can insert data into the table. The system administrator has the permission to insert data into the table by default.
  • Use of the RETURNING clause requires the SELECT permission on all columns mentioned in RETURNING.
  • For column-store tables, the RETURNING clause is currently not supported.
  • If ON DUPLICATE KEY UPDATE is used, you must have the SELECT and UPDATE permissions on the table and the SELECT permission on the unique constraint (primary key or unique index).
  • If you use the query clause to insert rows from a query, you need to have the SELECT permission on any table or column used in the query.
  • If you use the query clause to insert data from the dynamic data anonymization column, the inserted result is the anonymized value and cannot be restored.
  • When you connect to a database compatible to Teradata and td_compatible_truncation is on, a long string will be automatically truncated. If later INSERT statements (not involving foreign tables) insert long strings to columns of char- and varchar-typed columns in the target table, the system will truncate the long strings to ensure no strings exceed the maximum length defined in the target table.

    If inserting multi-byte character data (such as Chinese characters) to a database with the character set byte encoding (SQL_ASCII, LATIN1), and the character data crosses the truncation position, the string is truncated based on its bytes instead of characters. Unexpected result will occur in tail after the truncation. If you want correct truncation result, you are advised to adopt encoding set such as UTF8, which has no character data crossing the truncation position.

Syntax

1
2
3
4
5
6
7
[ WITH [ RECURSIVE ] with_query [, ...] ]
INSERT [/*+ plan_hint */] INTO table_name [ AS alias ] [ ( column_name [, ...] ) ]
    { DEFAULT VALUES
    | VALUES {( { expression | DEFAULT } [, ...] ) }[, ...] 
    | query }
    [ ON DUPLICATE KEY UPDATE { NOTHING | { column_name = { expression | DEFAULT } } [, ...] [ WHERE condition ] } ]
    [ RETURNING {* | {output_expression [ [ AS ] output_name ] }[, ...]} ];

Parameter Description

  • WITH [ RECURSIVE ] with_query [, ...]

    Specifies one or more subqueries that can be referenced by name in the main query, which is equivalent to a temporary table.

    If RECURSIVE is specified, it allows a SELECT subquery to reference itself by name.

    Format of with_query:

    with_query_name [ ( column_name [, ...] ) ] AS [ [ NOT ] MATERIALIZED ]
    ( {select | values | insert | update | delete} )

    with_query_name specifies the name of the result set generated by a subquery. Such names can be used to access the result sets of subqueries in a query.

    -- column_name specifies the column name displayed in the subquery result set.

    Each subquery can be a SELECT, VALUES, INSERT, UPDATE or DELETE statement.

    – You can use MATERIALIZED or NOT MATERIALIZED to modify the CTE.
    • If MATERIALIZED is specified, the WITH query will be materialized, and a copy of the subquery result set is generated. The copy is directly queried at the reference point. Therefore, the WITH subquery cannot be jointly optimized with the SELECT statement trunk (for example, predicate pushdown and equivalence class transfer). In this scenario, you can use NOT MATERIALIZED for modification. If the WITH query can be executed as a subquery inline, the preceding optimization can be performed.
    • If the user does not explicitly declare the materialized attribute, comply with the following rules: If the CTE is referenced only once in the trunk statement to which it belongs and semantically supports inline execution, it will be rewritten as subquery inline execution. Otherwise, the materialized execution will be performed in CTE Scan mode.

    INSERT ON DUPLICATE KEY UPDATE does not support the WITH and WITH RECURSIVE clauses.

  • plan_hint clause

    Follows the INSERT keyword in the /*+ */ format. It is used to optimize the plan of an INSERT statement block. For details, see Hint-based Tuning. In each statement, only the first /*+ plan_hint */ comment block takes effect as a hint. Multiple hints can be written.

  • table_name

    Specifies the name of the target table where data will be inserted.

    Value range: an existing table name

  • column_name

    Specifies the name of a column in a table.

    • The column name can be qualified with a subfield name or array subscript, if needed.
    • Each column not present in the explicit or implicit column list will be filled with a default value, either its declared default value or NULL if there is none. Inserting into only some fields of a composite column leaves the other fields null.
    • The target column names column_name can be listed in any order. If no list of column names is given at all, the default is all the columns of the table in their declared order.
    • The target columns are the first N column names, if there are only N columns supplied by the value clause or query.
    • The values provided by the value clause and query are associated with the corresponding columns from left to right in the table.

    Value range: an existing column

  • expression

    Specifies an expression or a value to assign to the corresponding column.

    • In the INSERT ON DUPLICATE KEY UPDATE statement, expression can be VALUES(column_name) or EXCLUDED.column_name, indicating that the value of column_name corresponding to the conflict row is referenced. Note that VALUES(column_name) cannot be nested in an expression (for example, VALUES(column_name)+1). EXCLUDED is not subject to this restriction.
    • If single-quotation marks are inserted in a column, the single-quotation marks need to be used for escape.
    • If the expression for any column is not of the correct data type, automatic type conversion will be attempted. If the attempt fails, data insertion fails, and the system returns an error message.
  • DEFAULT

    Specifies the default value of a field. The value is NULL if no default value is assigned to it.

  • query

    Specifies a query statement (SELECT statement) that uses the query result as the inserted data.

  • RETURNING

    Returns the inserted rows. The syntax of the RETURNING list is identical to that of the output list of SELECT. Note that INSERT ON DUPLICATE KEY UPDATE does not support the RETURNING clause.

  • output_expression

    Specifies an expression used to calculate the output result of the INSERT statement after each row is inserted.

    Value range: The expression can use any field in the table. You can use the asterisk (*) to return all fields of the inserted row.

  • output_name

    Specifies a name to use for a returned column.

    Value range: a string. It must comply with the naming convention.

  • ON DUPLICATE KEY UPDATE

    For a table with a unique constraint (UNIQUE INDEX or PRIMARY KEY), if the inserted data violates the unique constraint, the UPDATE clause is executed to update the conflicting rows. If the clause of UPDATE is NOTHING, no operation will be performed.

    For a table without a unique constraint, only insert is performed.

    • Triggers are not supported. The INSERT or UPDATE trigger of the target table is not fired.
    • The unique constraint or primary key of DEFERRABLE is not supported.
    • If a table has multiple unique constraints and the inserted data violates multiple unique constraints, only the first row that has a conflict is updated. (The check sequence is closely related to index maintenance. Generally, the conflict check is performed on the index that is created first.)
    • Distribution columns and unique index columns cannot be updated.
    • Column-store tables do not support this operation.
    • The WHERE clause of UPDATE does not contain sublinks.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
-- Create the tpcds. reason_t2 table:
openGauss=# CREATE TABLE tpcds.reason_t2
(
  r_reason_sk    integer,
  r_reason_id    character(16),
  r_reason_desc  character(100)
);

-- Insert a record into a table:
openGauss=# INSERT INTO tpcds.reason_t2(r_reason_sk, r_reason_id, r_reason_desc) VALUES (1, 'AAAAAAAABAAAAAAA', 'reason1');

-- Insert a record into the table, which is equivalent to the previous syntax:
openGauss=# INSERT INTO tpcds.reason_t2 VALUES (2, 'AAAAAAAABAAAAAAA', 'reason2');

-- Insert multiple records into the table.
openGauss=# INSERT INTO tpcds.reason_t2 VALUES (3, 'AAAAAAAACAAAAAAA','reason3'),(4, 'AAAAAAAADAAAAAAA', 'reason4'),(5, 'AAAAAAAAEAAAAAAA','reason5');

-- Insert records whose r_reason_sk in the tpcds.reason table is less than 5:
openGauss=# INSERT INTO tpcds.reason_t2 SELECT * FROM tpcds.reason WHERE r_reason_sk <5;

-- Create a unique index for the table:
openGauss=# CREATE UNIQUE INDEX reason_t2_u_index ON tpcds.reason_t2(r_reason_sk);

-- Insert multiple records into the table. If the records conflict, update the r_reason_id field in the conflict data row to BBBBBBBBCAAAAAAA.
openGauss=# INSERT INTO tpcds.reason_t2 VALUES (5, 'BBBBBBBBCAAAAAAA','reason5'),(6, 'AAAAAAAADAAAAAAA', 'reason6') ON DUPLICATE KEY UPDATE r_reason_id = 'BBBBBBBBCAAAAAAA';

-- Delete the tpcds.reason_t2 table.
openGauss=# DROP TABLE tpcds.reason_t2;

Suggestions

VALUES

When you run the INSERT statement to insert data in batches, you are advised to combine multiple records into one statement to improve data loading performance. Example: INSERT INTO sections VALUES (30, 'Administration', 31, 1900),(40, 'Development', 35, 2000), (50, 'Development' , 60 , 2001);

If values of an insert statement are distributed on a DN, GaussDB can push the statement down to the corresponding DN for execution. Currently, only constants, simple expressions, and pushdown functions (provolatile in pg_proc is set to 'i') are supported. If a column in the table has a default value, the value must be a constant or a simple expression. Neither a single-value statement nor a multi-value statement can be pushed down to a single DN.