Updated on 2024-06-07 GMT+08:00

Undo Space Recycling Blocked by Long Transactions

Symptom

  1. The following error information is printed in gs_log:
    snapshot too old! the undo record has been forcibly discarded
    xid xxx, the undo size xxx of the transaction exceeds the threshold xxx. trans_undo_threshold_size xxx,undo_space_limit_size xxx. 

    In the actual error information, xxx indicates the actual data.

  2. The value of global_recycle_xid (global recycling XID of the Undo subsystem) does not change for a long time.

  3. Long transactions exist in the pg_running_xacts and pg_stat_activity views, blocking the progress of oldestxmin and global_recycle_xid. If the value of xmin obtained by querying active transactions in pg_running_xacts is the same as that of gs_txid_oldestxmin and the thread execution time obtained by querying pg_stat_activity based on a PID is too long, the recycling is suspended by a long transaction.
    SELECT * FROM pg_running_xacts where xmin::text::bigint<>0 and vacuum <> 't' order by xmin::text::bigint asc limit 5;
    SELECT * FROM gs_txid_oldestxmin();
    SELECT * FROM pg_stat_activity WHERE pid = Thread PID where the long transaction exists

Solution

Use pg_terminate_session(pid, sessionid) to terminate the sessions of the long transactions. (Note: There is no fixed quick restoration method for long transactions. Forcibly ending the execution of SQL statements is a common but high-risk operation. Exercise caution when performing this operation. Before performing this operation, please confirm with the administrator and Huawei technical personnel to prevent service failures or errors.)