Help Center> Data Warehouse Service (DWS)> Troubleshooting> Database Use> "ERROR: Non-deterministic UPDATE" Is Reported During Update
Updated on 2024-01-25 GMT+08:00

"ERROR: Non-deterministic UPDATE" Is Reported During Update

Symptom

"ERROR: Non-deterministic UPDATE" is reported when the UPDATE statement is executed

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
CREATE TABLE public.t1(a int, b int) WITH(orientation = column);
CREATE TABLE

CREATE TABLE public.t2(a int, b int) WITH(orientation = column);
CREATE TABLE

INSERT INTO public.t1 VALUES (1, 1);
INSERT INTO public.t2 VALUES (1, 1),(1, 2);

UPDATE t1 SET t1.b = t2.b FROM t2 WHERE t1.a = t2.a;
ERROR: Non-deterministic 
UPDATEDETAIL:  multiple updates to a row by a single query for column store table.

Possible Causes

If a tuple in an SQL statement is updated for multiple times, the error message "ERROR: Non-deterministic UPDATE" is displayed.

The update operation is divided into two steps:

  1. Search for tuples that meet the update conditions through joins.
  2. Perform the update operation.

In the preceding case, for the tuple (1, 1) in table public.t1, there are two records that meet the update condition t1.a = t2.a in table public.t2: (1, 1) and (1, 2). According to the executor, the tuple (1, 1) in the logic table t2 needs to be updated twice. This will result in the following two scenarios:

  1. When tables public.t1 and public.t2 are joined, (1, 1) is hit first and then (1, 2) is hit. In this case, the tuple (1, 1) of public.t1 is updated to (1,1) and then to (1,2). The final result is (1, 2).
  2. When tables public.t1 and public.t2 are joined, (1, 2) is hit first and then (1, 1) is hit. In this case, the tuple (1, 1) of public.t1 is updated to (1,2) and then to (1,1). The final result is (1, 1).

During the actual execution, the sequence of the output result set of table public.t2 affects the final output of the UPDATE statement (the location of table public.t2 in actual execution may be a complex subquery). As a result, the execution result of the UPDATE statement is random, which is unacceptable.

Solution

You are advised to adjust the UPDATE statement based on the site requirements. For example, analyze the meaning of the columns of table public.t2 to determine the target column to be updated. In the preceding case, if you want to update the value of column b in public.t1 to the maximum value in public.t2 when the values of column a are the same, you can modify the logic as follows:

1
2
3
4
5
6
7
UPDATE t1 SET t1.b = t2.b_max FROM (SELECT a, max(b) AS b_max FROM t2 GROUP BY a) t2 WHERE t1.a = t2.a;
UPDATE 1
SELECT * FROM public.t1;
 a | b
---+---
 1 | 2
(1 row)