MERGE INTO

Scenario

In data processing and analysis scenarios, you often need to merge data from one source into another—for example, integrating incremental data into a main table. However, due to the lack of efficient single-statement support, users typically resort to combining multiple statements to achieve this functionality. This approach not only reduces processing efficiency but may also compromise data consistency and accuracy. Addressing these challenges requires ensuring both efficiency and reliability during data operations. To simplify this process, Fabric SQL introduces MERGE INTO syntax for Iceberg tables, enabling you to perform insert, update, or delete operations with a single statement. This enhancement streamlines data merging workflows, improving both efficiency and reliability.

Function

MERGE INTO matches data between target and source tables based on specified conditions. If a match is found, it performs an UPDATE or DELETE operation on the target table. If no match exists, it executes an INSERT. By consolidating UPDATE and INSERT operations into a single step, MERGE INTO enhances execution efficiency.

Precautions

To execute MERGE INTO, you must have UPDATE and INSERT permissions on the target table and SELECT permission on the source table.
The target table must be in Iceberg format.

Syntax

MERGE [/*+ plan_hint */] INTO table_name [ partition_clause ] [ [ AS ] alias ]
USING { { table_name | view_name } | subquery } [ partition_clause ] [ [ AS ] alias ]
ON ( condition )
{ when_matched_clause | when_not_matched_clause } [, ...]

where when_matched_clause can be:
WHEN MATCHED [AND condition] THEN
{ UPDATE SET { column_name = { expression | DEFAULT } | ( column_name [, ...] ) = ( { expression | DEFAULT } [, ...] ) } [, ...] | DELETE }

and when_not_matched_clause can be:
WHEN NOT MATCHED [AND condition] THEN
INSERT { DEFAULT VALUES | [ ( column_name [, ...] ) ] VALUES ( { expression | DEFAULT } [, ...] ) [, ...] }

Parameter Description

plan_hint clause
Appears after keywords in the form of /*+ */. Used to provide hints for optimizing the execution plan of a specific statement block. Refer to Plan Hint Optimization for detailed usage.

INTO clause
Defines the target table for updates or inserts:
- table_name
  Name of the target table.
- partition_clause
  Specifies partitions within the target table.
  
  The partition must already exist or correspond to valid partition key values.
- alias
  Alias of the target table.
USING clause
Identifies the source table, which can be a table, view, or subquery.
- partition_clause
  Specifies partitions within the source table.
  
  Only applicable if the source is a partitioned table. Views and subqueries do not support partitioning.
- alias
  Alias of the source table.
ON clause
Defines the join condition between the target and source tables. Columns used in the join condition cannot be updated.
WHEN MATCHED clause
Executes UPDATE or DELETE operations when records in the source and target tables match based on the join condition.
- Distribution columns cannot be updated. System tables and system columns are also not updatable.
- An error will occur if a row is updated multiple times.
WHEN NOT MATCHED clause
The WHEN NOT MATCHED clause is used to INSERT data into the target table when no match is found between the source and target tables based on the given conditions.

The INSERT clause does not support multiple VALUES statements.

Both WHEN MATCHED and WHEN NOT MATCHED clauses can be reordered, omitted individually (but not both), or specified multiple times as needed.
DEFAULT
Use this option to populate a field with its default value.
AND condition
Conditions for WHEN MATCHED and WHEN NOT MATCHED clauses ensure operations are performed only when the criteria are met. These conditions are optional.

Examples

Create the target table products and the source table newproducts, then insert sample data.

    
     
       
       
         CREATE TABLE products
(
product_id INTEGER,
product_name VARCHAR2(60),
category VARCHAR2(60)
) STORE AS iceberg;

INSERT INTO products VALUES (1501, 'vivitar 35mm', 'electrncs'),
(1502, 'olympus is50', 'electrncs'),
(1600, 'play gym', 'toys'),
(1601, 'lamaze', 'toys'),
(1666, 'harry potter', 'dvd');

CREATE TABLE newproducts
(
product_id INTEGER,
product_name VARCHAR2(60),
category VARCHAR2(60)
) STORE AS iceberg;

INSERT INTO newproducts VALUES (1502, 'olympus camera', 'electrncs'),
(1601, 'lamaze', 'toys'),
(1666, 'harry potter', 'toys'),
(1700, 'wait interface', 'books');

        

      

    
   

Perform a MERGE INTO operation: Update rows where product_id matches, but only if the target table's product_name is not 'play gym' and the source table's product_name is not 'lamaze'. Delete rows where the source table's product_name is 'lamaze'. Insert rows where product_id does not match and the source table's category is 'books'.

    
         MERGE INTO products p
USING newproducts np
ON (p.product_id = np.product_id)
WHEN MATCHED AND p.product_name != 'play gym' and np.product_name != 'lamaze' THEN
  UPDATE SET product_name = np.product_name, category = np.category
WHEN MATCHED AND np.product_name = 'lamaze' THEN
  DELETE
WHEN NOT MATCHED AND np.category = 'books' THEN
  INSERT VALUES (np.product_id, np.product_name, np.category)
;

Query results show updates: Rows with product_id values 1502 and 1666 were updated, row 1601 was deleted, and row 1700 was newly inserted.

    
         SELECT * FROM products ORDER BY product_id;
 product_id |  product_name  | category  
------------+----------------+-----------
       1501 | vivitar 35mm   | electrncs
       1502 | olympus camera | electrncs
       1600 | play gym       | toys
       1666 | harry potter   | toys
       1700 | wait interface | books
(5 rows)