Updated on 2024-05-29 GMT+08:00

INSERT

Syntax

INSERT { INTO | OVERWRITE } [TABLE] table_name [(column_list)] [ PARTITION (partition_clause)] {select_statement | VALUES (value [, value ...]) [, (value [, value ...]) ...] }

FROM from_statement INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement

FROM from_statement INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) select_statement

Remarks

If there is only one field in the data table and the field type can be row, struct, uniontype, use row to encapsulate the type when inserting data.
-- When a complex type is inserted into a single field table, the complex type must be wrapped by row().
CREATE TABLE test_row (id row(c1 int, c2 string));

INSERT INTO test_row values row(row(1, 'test'));

CREATE TABLE test_union (id uniontype<int, string>);

INSERT INTO test_union values row(uniontype<0,1,'test'>);

--The complex type of a multi-field table can be directly inserted.
CREATE TABLE test_multy_value(id int, col row(c1 int, c2 string));

INSERT INTO test_multy_value values (1,row(1,'test'));

Description

  • This statement is used to insert a new data row into a table.
  • If a list of column names is specified, the list of column names must exactly match the name of the column list generated by the query statement. The value of each column that is not in the column name list is set to null.
  • If no column name list is specified, the column generated by the query statement must exactly match the column to be inserted.
  • When INSERT INTO is used, data is added to the table. When INSERT OVERWRITE is used, if table property auto.purge is set to true, data in the original table data is directly deleted and new data is written to the table.
  • If the object table is a partitioned table, insert overwrite deletes the data in the corresponding partition instead of all data.
  • The table keyword following INSERT INTO is optional to be compatible with the Hive syntax.

Example

  • Create the fruit and fruit_copy tables.
    create table fruit (name varchar,price double);
    create table fruit_copy (name varchar,price double);
  • Inserts a row of data into the fruit table.
    insert into fruit values('LIchee',32);
    --The following is an example of the compatible format with the table keyword:
    insert into table fruit values('Cherry',88);
  • Insert multiple lines of data into the fruit table.
    insert into fruit values('banana',10),('peach',6),('lemon',12),('apple',7);
  • Load the data lines in the fruit table to the fruit_copy table. After the execution, there are five records in the table.
    insert into fruit_copy select * from fruit;
  • Clear the fruit_copy table, and then load the data in the fruit table to the table. After the execution, there are two records in the fruit_copy table.
    insert overwrite fruit_copy select *  from fruit limit 2;
  • For the VARCHAR type, INSERT can be used only when the column length defined in the target table is greater than the actual column length of the source table. SELECT queries data from the source table and inserts the data to the target table.
    create table varchar50(c1 varchar(50));
    insert into varchar50 values('hetuEngine');
    create table varchar100(c1 varchar(100));
    insert into varchar100 select * from varchar50;
  • When the insert overwrite statement is used for a partitioned table, only the data in the partition where the inserted value is located is cleared, not the entire table.
    --Create a table.
    create table test_part (id int, alias varchar) partitioned by (dept_id int, status varchar);
    
    insert into test_part  partition(dept_id=10, status='good') values (1, 'xyz'), (2, 'abc');
    
    select * from test_part order by id;
     id | alias | dept_id | status 
    ----|-------|---------|--------
      1 | xyz   |      10 | good   
      2 | abc   |      10 | good   
    (2 rows)
    
    --Clear the partition(dept_id=25, status='overwrite') partition and insert a data record.
    insert overwrite test_part (id, alias, dept_id, status) values (3, 'uvw', 25, 'overwrite');
     select * from test_part ;
     id | alias | dept_id |  status   
    ----|-------|---------|-----------
      1 | xyz   |      10 | good      
      2 | abc   |      10 | good      
      3 | uvw   |      25 | overwrite
    
    --Clear the partition(dept_id=10, status='good') partition and insert a data record.
    insert overwrite test_part (id, alias, dept_id, status) values (4, 'new', 10, 'good');
    select * from test_part ordr;
     id | alias | dept_id |  status   
    ----|-------|---------|-----------
      3 | uvw   |      25 | overwrite 
      4 | new   |      10 | good      
    (2 rows)
    
    -- Insert data to a partitioned table.
    create table test_p_1(name string, age int) partitioned by (provice string, city string);
    
     create table test_p_2(name string, age int) partitioned by (provice string, city string);
     
    -- Add data to test_p_1.
     insert into test_p_1 partition (provice = 'hebei', city= 'baoding') values ('xiaobei',15),( 'xiaoming',22);
    -- Insert data into test_p_2 based on test_p_1.
    
    -- Method 1
    from test_p_1 insert into table test_p_2 partition (provice = 'hebei', city= 'baoding') select name,age;
     
    -- Method 2
    insert into test_p_2 partition(provice = 'hebei', city= 'baoding') select name,age from test_p_1;

Precautions

By default, data cannot be inserted into external tables. To insert data into external tables, add configurations to the data source.

  • Co-deployment

    Log in to FusionInsight Manager, choose Cluster > Services > HetuEngine > Dashboard, and click the HSConsole link on the HSConsole Web UI to go to the compute instance page.

    Choose Data Source > Hive > Edit. On the Custom Configuration page, click Add to add custom configuration item hive.non-managed-table-writes-enabled and set it to true.

  • Independent deployment of Hive

    Log in to FusionInsight Manager, choose Cluster > Services > HetuEngine > Dashboard, and click the HSConsole link on the HSConsole Web UI to go to the compute instance page.

    Choose Data Source > Hive data source name > Edit. On the Custom Configuration page, click Add to add custom configuration item hive.non-managed-table-writes-enabled and set it to true.