INSERT

Function

Inserts one or more rows of data into a table.

Precautions

To insert data into a table, you must have the INSERT permission on the table.
To use the RETURNING clause, you must have the SELECT permission on the table.
If using the QUERY clause to insert data rows from a query, you must also have the SELECT permission on the table used in the query.
If using the OVERWRITE clause to insert data in overwrite mode, you must additionally have the SELECT and TRUNCATE permissions on the table.
When connecting to a Teradata-compatible database with the td_compatible_truncation parameter set to on, the auto-truncation feature for excessively long strings is enabled. In subsequent INSERT statements (excluding external table scenarios), if an overly long string is inserted into a char or varchar column in the target table, the system will automatically truncate the string based on the maximum length defined for the corresponding column in the target table.

If multi-byte character data (for example, Chinese characters) is inserted into a database with a byte-based encoding character set (such as SQL_ASCII and LATIN1) and the character data spans the truncation point, the data will be truncated based on byte length, potentially resulting in unexpected results at the truncated end. If accuracy in truncation results is required, you are advised to use an input character set like UTF8, which supports character-level truncation, as the database's encoding set.

For processing ORC and PARQUET table data, you are advised to use the INSERT OVERWRITE syntax. Example:

For inserting data into a non-partitioned table, use INSERT OVERWRITE INTO tablename select_clause.
For inserting data into a partitioned table, use INSERT OVERWRITE INTO tablename PARTITION FOR (partvalue [, partvalue]) select_clause.

Currently, the INSERT syntax does not guarantee transactional integrity for ORC and PARQUET formats. Using the INSERT syntax writes new data incrementally, which may lead to:

Dirty reads, where partially written data is accessed.
Residual data remaining in case of exceptions during the write process, compromising data consistency.

Syntax

    
         [ WITH  with_query [, ...] ]
INSERT [/*+ plan_hint */] [ IGNORE | OVERWRITE ] INTO table_name [ partition_clause ] [ AS alias ] [ ( column_name [, ...] ) ]
    { DEFAULT VALUES
    | VALUES {( { expression | DEFAULT } [, ...] ) }[, ...] 
    | query }
    [ ON DUPLICATE KEY duplicate_action | ON CONFLICT [ conflict_target ] conflict_action ]
    [ RETURNING {* | {output_expression [ [ AS ] output_name ] }[, ...]} ];

where partition_clause can be:
    
PARTITION FOR ( partition_key_value [, ...] )

and duplicate_action can be:

    UPDATE { column_name = { expression | DEFAULT } |
             ( column_name [, ...] ) = ( { expression | DEFAULT } [, ...] )
           } [, ...]

and conflict_target can be one of:

    ( { index_column_name | ( index_expression ) } [ COLLATE collation ] [ opclass ] [, ...] ) [ WHERE index_predicate ]
    ON CONSTRAINT constraint_name

and conflict_action is one of:

    DO NOTHING
    DO UPDATE SET { column_name = { expression | DEFAULT } |
                    ( column_name [, ...] ) = ( { expression | DEFAULT } [, ...] )
                  } [, ...]
              [ WHERE condition ]

Parameter Description

WITH with_query [, ...]
Declares one or more subqueries that can be referenced by name in the main query, acting as temporary tables.

Detailed format of with_query:
```
with_query_name [ ( column_name [, ...] ) ] AS
( {select | values | insert | update | delete} )
```
– with_query_name specifies the name of the result set generated by the subquery, which can be used to access the subquery's result set in the query.

– column_name specifies the column names displayed in the subquery result set.

– Each subquery can be a SELECT, VALUES, INSERT, UPDATE, or DELETE statement.
plan_hint clause
Appears after keywords in the form of /*+ */. Used to provide hints for optimizing the execution plan of a specific statement block. Refer to Plan Hint Optimization for detailed usage.
IGNORE
Ignores data conflicts caused by primary keys or unique constraints.
OVERWRITE
Indicates the overwrite insertion mode. When using this mode, existing data in the target table is cleared, retaining only the newly inserted data.

OVERWRITE supports inserting specified columns. Other columns take default values or NULL if no default exists.
- Do not concurrently execute OVERWRITE with real-time write operations like INSERT INTO, as it risks unintended clearing of real-time written data.
- OVERWRITE is suitable for batch data import scenarios and is not recommended for inserting small amounts of data.
- Do not concurrently perform INSERT OVERWRITE operations on the same table to prevent errors like "tuple concurrently updated.".
- During cluster scaling, if the table targeted by INSERT OVERWRITE requires data redistribution, INSERT OVERWRITE clears current data and automatically distributes new data across resized nodes. Concurrent execution of INSERT OVERWRITE and data redistribution interrupts the latter.
- OVERWRITE allows overwriting external OBS tables, deleting original directory data files.
table_name
Name of the target table where data is to be inserted.

Range: an existing table name.
AS
Alias for the target table table_name. alias is the alias name.
column_name
Column name in the target table.
- The column name can be modified by a subfield name or array subscript.
- Each column not present in the column list is filled with a system default value or a declared default value, or NULL if none is present. For example, if you insert data into some columns of a composite type, other columns will be NULL.
- The target column (column_name) can be sorted in order. If no column is listed, all columns are listed by default, and the order is the same as that of the table declaration.
- If only N columns are provided in the value clause and query, the target columns are the first N columns.
- The values provided by the value clause and query are associated to the corresponding columns from left to right in the table.
Range: an existing column name.
partition_key_value
Partition key value.

The values specified by the PARTITION FOR ( partition_key_value [, ...] ) clause can uniquely identify a partition.

Range: value range of the partition key of the partition to be renamed.

expression

Assign a valid expression or value to the corresponding column:

When inserting single quotation marks into a column in a table, you need to use the single quotation marks for escape.
If the expression for inserting a row is not of the correct data type, the system attempts to convert the data type. If the conversion fails, the data fails to be inserted and the system returns an error message.

Example:

      
           CREATE TABLE tt01 (id int,content varchar(50)) store AS orc;

INSERT INTO tt01 values (1,'Jack say ''hello''');
INSERT 0 1
INSERT INTO tt01 values (2,'Rose do 50%');
INSERT 0 1
INSERT INTO tt01 values (3,'Lilei say ''world''');
INSERT 0 1
INSERT INTO tt01 values (4,'Hanmei do 100%');
INSERT 0 1

SELECT * FROM tt01;
 id |      content
----+-------------------
  3 | Lilei say 'world'
  4 | Hanmei do 100%
  1 | Jack say 'hello'
  2 | Rose do 50%
(4 rows)

DEFAULT
Default value of the corresponding field name. If no default value is available, the value is NULL.
query
A query statement (SELECT statement) that uses the query result as the inserted data.
ON DUPLICATE KEY
Used to update conflicting data when a primary key or unique constraint conflict occurs.

duplicate_action specifies the columns and data to be updated.
ON CONFLICT
Ignores or updates conflicting data when a primary key or unique constraint conflict occurs.

conflict_target is used to specify the column name index_column_name, the expression index_expression that contains multiple column names, or the constraint name constraint_name. It is used to infer whether there is a unique index from the column name, expression that contains multiple column names, or constraint name. index_column_name and index_expression comply with the index column format of CREATE INDEX.

conflict_action specifies the policy to be executed when a primary key or unique constraint conflict occurs. The options are:
- DO NOTHING: Ignore the conflict.
- DO UPDATE SET: Update data upon a conflict. The columns to be updated and the data to be updated are specified later.
RETURNING
Returns the inserted rows. The syntax of the RETURNING list is identical to that of the output list of SELECT.
output_expression
An expression used by the INSERT command to calculate the output result after each row is inserted.

Range: The expression can use any field in the table. You can use * to return all columns of the inserted row.
output_name
Output name of a column.

Range: a string. It must comply with the naming convention.

Examples

Create the reason_t1 table.

    
         CREATE TABLE reason_t1
(
    TABLE_SK          INTEGER               ,
    TABLE_ID          VARCHAR(20)           ,
    TABLE_NA          VARCHAR(20)
) store AS orc;

Insert a record into the table.

    
         INSERT INTO reason_t1(TABLE_SK, TABLE_ID, TABLE_NA) VALUES (1, 'S01', 'StudentA');

Insert a record into the table. The syntax is equivalent to the previous syntax.

    
         INSERT INTO reason_t1 VALUES (1, 'S01', 'StudentA');

Insert records whose TABLE_SK is less than 1 into the table.

    
         INSERT INTO reason_t1 SELECT * FROM reason_t1 WHERE TABLE_SK < 1;

Insert multiple records into the table.

    
         INSERT INTO reason_t1 VALUES (1, 'S01', 'StudentA'),(2, 'T01', 'TeacherA'),(3, 'T02', 'TeacherB');
SELECT * FROM reason_t1 ORDER BY 1;
 TABLE_SK | TABLE_ID | TABLE_NAME
----------+----------+------------
        1 |      S01 |   StudentA
        2 |      T01 |   TeacherA
        3 |      T02 |   TeacherB
(3 rows)

Use INSERT OVERWRITE to update data in a table, that is, insert data in overwrite mode.

    
         INSERT OVERWRITE INTO reason_t1 values (4, 'S02', 'StudentB');
SELECT * FROM reason_t1 ORDER BY 1;
 TABLE_SK | TABLE_ID | TABLE_NAME
----------+----------+------------
        4 |      S02 |   StudentB
(1 rows)

Insert data back into the reason_t1 table.

INSERT INTO reason_t1 SELECT * FROM reason_t1;

Specify the default values for independent columns.

INSERT INTO reason_t1 VALUES (5, 'S03', DEFAULT);

To insert some data in a table to another table, perform the WITH subquery to obtain the temporary table temp_t, and then insert all data in the temporary table temp_t to the reason_t1 table.

WITH temp_t AS (SELECT * FROM reason_t1) INSERT INTO reason_t1 SELECT * FROM temp_t ORDER BY 1;

Insert data into a specified partition of a partitioned table.

    
         CREATE TABLE test_value_row(a int) PARTITION BY (d int) store AS orc;
INSERT OVERWRITE INTO test_value_row VALUES(55,51);
INSERT OVERWRITE INTO test_value_row VALUES(85,80);
INSERT OVERWRITE INTO test_value_row partition for (80) VALUES(85,80);

ALTER TABLE test_value_row DROP PARTITIONS (d=51);